Achilles’ heel of high reliability management

Which do you find to be simpler: the radio that goes on when you turn a single knob, or the one that won’t work because the parts are all lined up on the floor?
William H. Gass, novelist and essayist

The weakest point in risk and uncertainty management by critical infrastructures operating under high reliability mandates is any assumption that the infrastructures aim to ensure their users need no longer worry about risk, uncertainty and failure scenarios for the service provided.[1]

–Why? Because the considerable strengths of control rooms are at the same time blind-spots for society’s expectations of them.

Yes, control rooms represent unique system knowledge, but that real-time knowledge is difficult to convey to or distill for the public, let alone experts committed to checklists and protocols. Yes, their skills and requirements are so knowledge-intensive as to make control operators professionals in their own right, but that also means they cannot be expected to know the requirements of other control rooms with the same degree of breadth and depth. Yes, reliability professionals are virtuosi in managing real time (and it is true that professionals who cannot manage the short term should not be expected to manage for the long term), but reliability professionals are the first to recognize the need for more long-term planning and analysis.

Yes, the evolutionary advantage of control room operators to operationally redesign defective technology and regulation so as to ensure system reliability in real time is often under-recognized, but this does not make reliability professionals experts in altogether repurposing infrastructures when it comes to adding new services or instituting new infrastructures to provide the same service. Yes, control rooms are central to intra- and interinfrastructural reliability, but some critical infrastructures under very real mandates for high reliability do not have control rooms.

Yes, there is that sense in which a control room is like the weather vane taking all those lightning strikes to protect the house underneath, but that means control operators must be able to absorb the shocks and be protected in doing so. This protection—along with the longer-term perspective and interinfrastructural oversight responsibilities—is what we expect from leaders, regulators and policymakers.

–Indeed, control rooms are our leaders’ seismograph for registering major activity along the fault-lines of our critical infrastructures, such that without these centers those fault-lines are not as reliably managed when it matters the most. Paul Schulman and I have highlighted control rooms as unique organizational formations and social institutions in their own right, meriting society’s protection, even during (especially during) continued attack.

–There will be, however, tension between control operators for whom recovery means interorganizational resumption of critical services to a new normal and leaders (including the regulators of record) for whom recovery occasions the major repurposing of infrastructures (again, with respect to new services or entirely new infrastructures for the same service). In the latter case, “resumption of service” would be further complicated by the inevitable operational redesigns in real time undertaken with premature innovations.

Nonetheless, we would expect leaders to be involved should a new precluded events standard of reliability be adopted or, in related fashion, should the definition of what is acceptable risk change, either in terms of what are new risks or acceptable bandwidths for existing ones.[2] To pick one of many examples, the prime minister of Japan during the Fukushima disaster concluded: “It is impossible to ensure safety to sufficiently prevent the risk of a national collapse,” said Naoto Kan. “Experiencing the accident convinced me that the best way to make nuclear plants safe is not to rely on them, but rather to get rid of them”.

[1] Oops. “As the economist Gary Gorton has put it, banking does for the nonexpert in finance what the electricity grid does for the nonexpert in electricity: . . .enabling most who ultimately bear the risk not to have to worry about it on a day-to-day basis. But as we discovered once the [2008 financial] crisis broke, it was not just nonexperts who had stopped worrying about risk on a day-to-day basis. Most professional investors had also gotten into the habit of not worrying about it either. . . Before we rail against their stupidity, we should remember that not worrying about risk is precisely what a modern banking system enables its customers to do. That the lack of worrying had gone too far is now undeniable, but it happened precisely because of how impressively the modern banking system works when it is working well.” (Paul Seabright in Foreign Policy accessed online on May 26, 2018 at

[2] One way for the precluded events standard to change has been to shift the burden of proof when it comes to demonstrating reliability. Such has happened over the years with respect to health and food safety in the U.S. (e.g., requiring manufacturers to show before product introduction that no serious harm would be produced in its use).
