On August 15, 2017, I did a radio interview with the Seattle NPR station on a retrospective of the June 1999 Olympic Pipeline rupture in Bellingham, WA, and how the results are still relevant. A summary of the Bellingham incident is in my book Protecting Industrial Control Systems from Electronic Threats. Bellingham was a cyber incident that shut down SCADA, which then set all of the sensors to average values. Because the sensors were set to average values, they could not follow actual process conditions. Consequently, the operators had no view of the process (SCADA was inoperable) and no safety system. However, because the displays didn't show "NOT UPDATING" for the sensors, the operators did not know they had no safety system. This scenario illustrates a hole in the safety analysis methodology, and the NTSB did not address this problem in its final report on the Olympic Pipeline rupture. Setting sensors to “fail as-is”, “fail upscale”, or “fail downscale” is common practice. However, if safety systems are involved, setting sensors to a fixed value can cause a Loss of Safety condition.
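To make the "NOT UPDATING" point concrete, here is a minimal Python sketch of how a display layer might label a sensor channel whose data has gone stale or suspiciously flat. It is an illustration only, not a description of how the Olympic Pipeline system worked. The class name `SensorChannel`, the status strings other than "NOT UPDATING", and the thresholds `max_age_s` and `flatline_count` are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SensorChannel:
    """Hypothetical wrapper around one process sensor feeding a display."""
    name: str
    max_age_s: float = 5.0      # assumed: oldest acceptable sample age, in seconds
    flatline_count: int = 10    # assumed: identical samples in a row before suspicion
    _last_update_s: Optional[float] = None
    _recent: List[float] = field(default_factory=list)

    def record(self, value: float, timestamp_s: float) -> None:
        """Store a sample and keep only the most recent flatline_count values."""
        self._last_update_s = timestamp_s
        self._recent.append(value)
        self._recent = self._recent[-self.flatline_count:]

    def status(self, now_s: float) -> str:
        """Return the quality label an operator display could show."""
        # No sample yet, or the newest sample is too old: the value is stale.
        if self._last_update_s is None or now_s - self._last_update_s > self.max_age_s:
            return "NOT UPDATING"
        # Samples keep arriving but never change: possibly a substituted fixed value.
        window_full = len(self._recent) == self.flatline_count
        if window_full and len(set(self._recent)) == 1:
            return "SUSPECT: FLATLINE"
        return "OK"


# Example: a pressure channel that starts live, then repeats one value, then goes silent.
channel = SensorChannel("pressure", max_age_s=5.0, flatline_count=3)
channel.record(100.0, timestamp_s=0.0)
channel.record(101.0, timestamp_s=1.0)
print(channel.status(now_s=1.5))   # OK
channel.record(101.0, timestamp_s=2.0)
channel.record(101.0, timestamp_s=3.0)
print(channel.status(now_s=3.5))   # SUSPECT: FLATLINE
print(channel.status(now_s=20.0))  # NOT UPDATING
```

The flatline check matters because a substituted average value can arrive with a fresh timestamp, so an age check alone would still report the channel as healthy. A real deployment would need thresholds tuned to the noise and update rate of each sensor.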
It is vitally important to fully understand the multiple levels of failure present in the Bellingham case, since it is likely that similar factors currently exist in many other infrastructure systems. First, the SCADA system was vulnerable to cyber intrusion, which rendered it inoperable. Second, the subsequent setting of sensors (by design) to “average values”, or other arbitrary fixed settings, caused the loss of vital sensing functions for the safety systems. Lastly, with operation continuing without effective view, control, or safety override, the pipeline rupture was almost inevitable.
Safety and resiliency depend on redundancy and diversity. An inherent assumption is that sensor values are correct even though process sensors have neither authentication nor security. Cyber concerns with sensors include cyber-vulnerable protocols and smart transmitter functionality that introduces additional vulnerabilities. Currently, ICS cyber security assumes process sensor input is correct. If sensor values are incorrect, either because of unintentional issues such as sensor drift or miscalibration or because the sensors have been compromised by a cyber attack, then resiliency, safety, and security can be defeated. This includes sensors not reaching their setpoints (a safety problem like what happened at Bellingham), sensors inadvertently reaching setpoints and causing systems to shut down (a reliability problem), or sensor output being modified in ways that compromise product quality. Following a rigorous approach to understanding complex system behavior over the full life cycle and under numerous threat scenarios will expose such risks and support mitigation decisions. It should also be noted that because of the lack of cyber forensics at the sensor level, it would be difficult at best to determine whether a sensor anomaly was caused by an unintentional malfunction or a targeted cyber attack.
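One simple way to stop taking a single sensor value on faith is to compare redundant measurements of the same process variable. The Python sketch below uses median voting across three or more sensors and flags any reading that strays too far from the median. It is a generic illustration, not a method taken from any standard or from the Bellingham system. The function name `cross_check`, the tag names, and the tolerance are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple


def cross_check(readings: Dict[str, float], tolerance: float) -> Tuple[float, List[str]]:
    """Return the median reading and the names of sensors deviating beyond tolerance.

    `readings` maps a (hypothetical) sensor tag to its current value; all sensors
    are assumed to measure the same process variable in the same units.
    """
    if len(readings) < 3:
        raise ValueError("need at least three redundant readings to vote")
    mid = median(readings.values())
    # A sensor is flagged when it disagrees with the median by more than tolerance.
    outliers = sorted(tag for tag, value in readings.items() if abs(value - mid) > tolerance)
    return mid, outliers


# Example: three pressure transmitters on the same line, one reading low.
voted, suspects = cross_check(
    {"PT-1A": 100.2, "PT-1B": 99.8, "PT-1C": 87.0},
    tolerance=2.0,
)
print(voted)     # 99.8
print(suspects)  # ['PT-1C']
```

A check like this only reports that a sensor disagrees with its peers. It cannot say whether the cause is drift, miscalibration, or a cyber attack, and it offers no protection if all of the redundant sensors share the same compromised transmitter type or communication path, which is why diversity matters as much as redundancy.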
It is critical to know the validity of the sensor readings, as the sensors are the eyes and ears of the actual process. The ultimate lesson of the entire Bellingham episode is the knowledge and awareness that was lacking in the design phase of the system, that is, the lack of a full understanding of system behavior under various conditions of the “environment”, including the damaging effects of cyber attacks. Assuming that the Bellingham control system was conceived and developed in an era preceding many considerations for cyber threats (most likely the 1980s and 1990s), one can understand such gaps in awareness. However, today we have various threat scenarios that confront our entire national critical infrastructure. We must also recognize that many legacy systems in operation today are still based on 30-year-old designs with the same flawed logic. To prevent further disasters of this kind, it is essential to review and re-examine such failure modes in many of our vital infrastructure systems. I am not aware of any cyber security, safety, or resiliency standard in any industry that addresses process sensor cyber security. This needs to change.
Joe Weiss