Introduction
How do you know you’ve made a mistake if caught in the grip of everything else being uncertain? You know more, of course, after the fact when consequences are clearer in hindsight. But how do you know in real time and in these fogs of struggle and strife that this or that action on your part is a mistake to be avoided, right now and here?
It is highly relevant for the purposes of policy and management to insist that real-time error avoidance is possible even under particular (but not all) conditions of widespread systemwide complexity and uncertainties.
Research Findings
I
Paul Schulman and I have been undertaking research on a set of interconnected critical infrastructures in Oregon and Washington State. The upshot is that not only do major uncertainties and risks change with shifting interconnectivities, but new errors to be avoided emerge as well, and clearly so for some cases.
Our interviews with infrastructure control room operators and emergency managers indicate that real-time surprises are widespread in flooding, wildfire, road and other transportation disruptions, levee breaches, and transmission failures in electricity and water.
But, as many also told us, there can be and often are an urgency, clarity and logic about what to do by way of just-in-time or just-for-now interventions. What needs to be done is at times evident to front-line infrastructure staff and emergency management professionals, even when it is not so to those in incident command centers or higher-level management or official positions. For these experienced front-line staff and in these circumstances, not doing what needs to be done constitutes an error to be avoided in real time. These are, in other words, opportunities that cannot be missed.
II
What are those circumstances and conditions of urgency, clarity and logic?
Those identified by our interviewees focus on an infrastructure’s key interconnectivities with the infrastructures it depends upon and those that depend upon it. More specifically, this focus and concern center on shifts in the interconnectivity involving their respective systemwide control variables, like frequency and voltage for electricity transmission, main-pipe pressures for large-scale water supply or natural gas systems, and network bandwidth in telecommunications.
During normal operations, these control variables are already interconnected. What defines system disruption and failure is when the interconnectivities shift in unignorable ways. Fire-fighters setting their firebreaks along more accessible rights-of-way, which are the same rights-of-way created for electricity transmission lines, can create conflict between backfires needed by the fire-fighters and the voltage and flow paths along the transmission lines. Similarly, clearing a river passage for ongoing marine transport and re-opening a major port along the way are important to both infrastructures, because they share the same waterway.
When these systems as systems are disrupted, or fail outright, restoring or recovering what had already been interconnected system control variables requires urgent and often improvisatory behavior by all manner of infrastructure operators from the control rooms on down to field staff. These improvisations may be last-second one-offs saving the day, like those seen in battle. From our perspective, these are better understood as part and parcel of the wide range of workarounds that line operators and field staff undertake–beginning in normal operations, routine maintenance and non-routine repair–to ensure safe and reliable operations at the system level.
III
In particular, we found:
–Under conditions of shifting or shifted interconnectivity, it would be an error for infrastructure operators and emergency managers not to establish lateral communications with one another and undertake improvisational and shared restoration activities where needed, even if no official arrangement exists to do so.
–In addition, there are also errors of anticipation and planning. It would be a management error not to provide robust and contingent interinfrastructure communication capabilities, including phone connections between the control rooms of interconnected infrastructures. This communication, it has been demonstrated, is also greatly facilitated by establishing lateral interinfrastructure personnel contacts prior to emergencies.
–Further, it would be an error not to have some contingent resources for restoration and recovery activities such as vehicles, portable generators and movable cell towers in differing locations available across infrastructures if needed, particularly where chokepoints of interconnected infrastructures are adjacent to each other.
While these three errors are not the entire set, our interviews and prior research convince us that they are of primary importance and are to be avoided because they seriously degrade effective resilience in emergency prevention and response.
Three Important Policy and Management Implications
I. Error avoidance is not risk management
Let’s start with a US example. It would be an error not to put into the mandated county/city hazard mitigation plan a proposal to replace a highly vulnerable culvert with a new bridge, should the former be washed away in new flooding and when federal funds would be available for bridge replacement under those conditions. Put this way, there is a role for forward planning in anticipating and taking advantage of these already existing funding and construction opportunities.
Or from the other direction, a rural town that did not anticipate accelerated gentrification after a major wildfire in its hazard mitigation plan will have to deal with the consequences of not having prepared for this gentrification (e.g. newly added residential water and wastewater demands and transportation requirements).
In both cases and from this perspective, the mandated hazard mitigation plan is a problem definition, parts of which are latent until activated during immediate emergency response, initial service restoration or longer-term recovery. Collapsing either example under the category of “risk management” is to miss the fact that these errors (or, if you prefer, missed opportunities) are not to be managed more or less, like risks, but rather managed categorically as yes or no. Did you avoid them or did you not?
II. An example of how distinguishing between error avoidance and risk management is also important for locality residents affected by the disaster
Friends are telling us wonderful things about their recent move to a rural area in the Pacific Northwest. They were also surprised, given all the rain, about the high fire hazard risk mapped for their area and nearby environs. As in California, such maps created political and insurance company push-back. And there are methodological issues in mapping fuel loads without knowing point-of-ignition information in advance.
So what to do? In their case, they talked about how they and their neighbors agreed in advance to help each other should a wildfire threaten (ignited, say, by vehicle sparks along the roadside). If one neighbor were threatened, all would move to that site to help out.
Such self-organizing happens all over the world and there is nothing extraordinary in this example, except one thing that deserves highlighting: What is going on here (and, I suspect, in many other examples) is not managing the risks associated with fire hazards but rather avoiding known errors when faced with fire hazards, whatever the associated risks.
Avoiding these errors means meeting the aforementioned need for robust communications, in this case among the neighbors, and the need to have firefighting tools and associated equipment distributed and accessible beforehand. In addition, it is hoped that here too they and other residents use their county’s hazard mitigation plan to seek federal and state support for improving their lifeline infrastructures (water, electricity, roads and telecoms), should fires and other disasters actually undermine them in the future.
To repeat, it is an error to have missed really-existing opportunities for more robust communications, more dispersed equipment and tools, and greater use of existing planning and funding mechanisms. But why is that distinction important? It implies that there should be dedicated support and staffing to assist such locality-based error avoidance, in addition to and separate from risk management efforts, not least of which are those fire hazard maps.
III. The special institutional niche for infrastructures in error-avoiding disaster management
Those who study major earthquakes, tsunamis, or other place-based catastrophes often remark about how populations left behind self-organize by way of saving lives and providing what relief they can on their own. We have seen this too. What is less recognized, I believe, is the institutional niche that critical infrastructures hold in this group adaptive behavior.
In some cases, the self-organization of groups takes place because there is little government presence to begin with, let alone as the disaster unfolds. One thinks of the media attention given to earthquakes in some low-income countries.
Self-organizing groups, however, are also observed in disaster situations that destroy longstanding critical infrastructures in middle to high-income countries. Error-avoiding behavior in the form of increased lateral communication and improvisation is witnessed, in particular, among front-line infrastructure staff, emergency managers and some local communities.
I want to suggest that group adaptation in these latter cases differs in at least one under-acknowledged respect. A major part of that self-organization of field crews and the public is to provide initial restoration of some kind of electricity, water, road, communications and other so-called lifeline services, like medical care. This niche of critical infrastructures is already established.
Indeed, what better acknowledgement of society’s institutional niche for interconnected critical infrastructures than the immediate emergency response of trying to avoid all manner of errors in restoring the backbone infrastructures of electricity, water, telecoms and roads?
————-
Acknowledgement. My thanks to Paul Schulman for working through and crafting a number of these points. All errors–!–remain mine.
Reference. For an initial discussion of topics in this blog entry and its source material, see: E. Roe and P.R. Schulman (2023). “An Interconnectivity Framework for Analyzing and Demarcating Real-Time Operations Across Critical Infrastructures and Over Time.” Safety Science (available online at https://doi.org/10.1016/j.ssci.2023.106308)