Complex: a short manifesto

When I and others call for better recognition and accommodation of complexity, we mean that the complex, as well as the uncertain, unfinished and conflicted, must be particularized and contextualized if we are to analyze and to manage case by granular case.

When I and others say we need more findings that can be replicated across a range of cases, we are calling for identification not only of emerging better practices across cases, but also of greater equifinality: finding multiple but different pathways to achieve similar objectives, given case diversity.

What I and others mean by calling for greater collaboration is not just more teamwork or working with more and different stakeholders, but that team members and stakeholders “bring the system into the room” for the purposes of making the services in question reliable and safe.

When I and others call for more system integration, we mean the need to recouple the decoupled activities in ways that better mimic but can never reproduce the coupled nature of the wider system environment.

When I and others call for more flexibility, we mean the need for greater maneuverability across different performance modes in the face of changing system volatility and options to respond to those changes. (“Only the middle road does not lead to Rome,” said the composer Arnold Schoenberg.)

Where we need more experimentation, we do not mean a trial-and-error learning where the next systemwide error ends up being the last systemwide trial destroying survival.

Where others talk about risks in a system’s hazardous components, we point to different systemwide reliability standards and only then, to the different risks and uncertainties that follow from the different standards.

Rethinking the status quo

Consider those who are stopped short by “the unimaginability of any alternative to the neoliberal status quo.” Surely that’s a glove pulled inside-out. Neoliberalism generates such contingency and uncertainty as to undermine any status quo. It’s “the” status quo as has been understood that is unimaginable.

And here’s what’s helpful in that realization. When have status quos ever been as real in practice as they are in theory? To paraphrase the international relations theorist, Hans Morgenthau: Excuse me, but just what status quo have the people committed themselves to? They haven’t, irrespective of what systems are said to do on their own.

Appeals to anything like prolonged stability in the midst of collectively-evident turbulence have to be read symptomatically, if we are to move something as complicated as actually-existing betterment.

What they don’t tell you in Safety Culture 101: when operator error is not a mistake

–There is a key but under-appreciated virtue in control room operators working within their shared comfort zone of team situation awareness: their knowing when it is a mistake to comply with a regulation or protocol that would work against system reliability and safety.

Which just goes to show that it’s a mistake to think all errors are mistakes.

–Noncompliance may be a regulatory error for the regulator of record; the same noncompliance may be an important option for system reliability when the task environment indicates the regulation to be defective. It’s not a control room mistake if system high reliability compels the real-time commission of a noncompliance error.

What needs to be distinguished are the volatility conditions and reliability mandates under which “operator error” is forced. Yes, you can’t un-ring the bell once rung, but it’s always been more complex than that.

Luck in real-time infrastructure operations

Ensuring systemwide service reliability has always involved luck in major critical infrastructures. This, the control room operators will tell you. At its most abstract, good luck can be defined as the non-occurrence of system failure in the absence of exercising failure avoidance options, while bad luck is the occurrence of failure in the presence of exercising those options.

But luck also favors the well-prepared, and well-prepared operators make a difference. Consider how a senior operations engineer for a transmission grid described a close call to us:

. . . We nearly caused a voltage collapse all over the western grid. Everything was going up and down, we were trying to get power from all the nuclear units in the western grid. Life flashed before our eyes. And then the gen dispatcher did intuitively the right thing. He said, Shut one pump down. How he figured that, I still don’t understand. It was something we had never seen before. We had no procedures. . .We went back and looked at it, and the planner said, Oh yeah, you should never have been running three pumps, and we said, Where did you get that from? So we started writing new procedures.

If luck is when talent meets opportunity, then this good luck is the value added by professionals in stopping close calls and near misses from becoming system failures.

When regulation and policy are run-on sentences

–Whether or not it is true that most markets are mostly efficient most of the time, inefficient markets are hardly the solution for weak regulation. That solution is to manage the resulting mess since markets and regulations fail to live up to the promises for them.

–In all the talk about the need for prudential risk regulation systemwide, few seem ready to face this challenge: The larger and more complex the system, the less the regulation of risk will be the focus of management.

Attention will unavoidably shift to the now greater surprises and unknown unknowns well outside frequency distributions or worst-case scenarios. Indeed, to equate system uncertainties and unknown-unknowns with systemic risk is the disaster to forestall rather than unintentionally hasten.

–If fireflies can do it, why can’t humans? If fireflies can self-organize and flash in unison, why can’t humans better regulate their own behavior?

Well then let’s carry that to its logical conclusion: If earthworms can do it, why can’t humans? If earthworms can move tons of soil, why can’t humans do the same in the name of economic development?

Self-regulation, as a wit put it, stands in relation to regulation the way self-importance stands to importance.

–George Bernard Shaw, in one of his polemics against the U.S. Constitution, counseled Americans to farm out the important stuff to Europeans: “Some years ago I suggested as a remedy that the American cities should be managed from Europe by committees of capable Europeans trained in municipal affairs in London, Berlin, Paris, etc. San Francisco rejected my advice and tried an earthquake instead. . .”

–I remember a tense meeting during the height of the California electricity crisis. Rumors were about as to whether some organizations would survive the upheaval. A senior executive at the state grid transmission center told the group: “My view is that we are jumping under the table and the earthquake is happening and what we have to do is to hope this isn’t a nuclear attack and that the rubble will settle and when it does, we get up and be the only ones around who know what to do.”

–Policy and regulation look much more like the rubbished artist studios of Edgar Degas and Francis Bacon. It is a world in which clutter and worse can be used for wildly different ends depending on in whose way it has been sorted. Sigmund Freud and H. Rider Haggard were enthusiastic collectors of hundreds of carvings and antiquities to inspire their work. Freud at his improvising best gave one of his patients Haggard’s She to get her thinking, but she already had read it.

–My California Driver Handbook reads: “Remember, if a pedestrian makes eye contact with you, they are ready to cross the street. Yield to the pedestrian.” ARE THEY CRAZY? These days, the last thing a pedestrian wants to do is have eye contact with on-coming drivers. “They’ll just speed up if you do that and head right for you!”

–I remember the speed-up to decolonization as one of the most significant developments and turning points in the last half of the previous century. Little did we know the now-‘then.’

If the operations of critical infrastructures could talk, this is the story they’d want us to hear

“I grew up in the old oil country of north-western Pennsylvania. Learned to drive on steep, curvy two-lane roads that snake through the area. When you’re a teenage boy, that’s alright!

“Anyhow, I know something about driving too fast for the conditions. But I was nothing compared to my high school buddy, Kenny. He loved cars. There wasn’t one he couldn’t fix. And if he didn’t have the right part, he could figure out how to patch something together. None of us had his talent.

“Kenny ended up making a living driving around drilling and fixing water wells and installing pumps in the Pennsylvania hills. When business was good, you’d find him deep off the back roads working. When things were slow, he’d be making up for lost time repairing the van or drilling rig.

“Sometimes he’d have to drive a couple hundred miles a day. He’d been on them in every season and every kind of weather, night or day. He did that for 15 years, and then one day he spun out, his rig went through the guardrail and down into the valley.

“We were all surprised. No one could believe it. Kenny really knew what he was doing – knew his van and rig, knew the roads, knew just how far to push it to get there with his load in time. Everybody had a different theory about what went wrong.

“Kirkpatrick, who owned the well company, blamed Kenny. ‘I mean, he was good, but always the cowboy. Liked to push everything to the limit, and then some more. I tried to tell him ‘safety first,’ but short of riding along, how could I know when he was pushing it too hard?’

“Kenny’s mom blamed Kirkpatrick. ‘It was that man’s greed killed my son. Always shoving Kenny to cover more territory in less time. That boss had no idea what he was asking of my boy. Never once had he gone out on the road to see what it was really like. And that van he had Kenny drive was a piece of junk!’

“James Rathbone, running for county supervisor that year, blamed government. ‘Those roads were never designed for hauling heavy equipment. Traffic increased 100% in the last 10 years and we’re still trying to get by on small-lane roads. And where were the police? They’re supposed to enforce the law.’

“Del, a state trooper and one of Kenny’s oldest friends, took offense. ‘We barely got enough officers to patrol the highway. And besides, if anybody knew what he was doing on those back roads, it was Kenny. It was just bad luck that night – a patch of black ice, maybe. You can’t see that stuff until it’s too late.’

“Mr. Kirkpatrick’s son, Stuart, with his MBA, blamed competition – all those other drilling and plumbing companies that sprung up in the good times. ‘Listen, I’m sorry about Kenny, but there’s no getting around it. The name of this game is increasing the productivity of every man and piece of equipment out there. And, yes, my dad still has to cut costs. It’s grow or die.’

“In one sense, they were all right. The truck was old, the roads inadequate. God only knows what it was like to haul a drilling rig under all the conditions. Kenny was fast, and he took pride in covering the distance and finishing the job. But every time he did a little more, it raised the bar in Kirkpatrick’s mind. And the son was right. The company’s under a lot of competition.

“And Kenny’s mother was right. Neither Kirkpatrick nor the MBA-toting kid knew the first thing about driving, even if they knew something about the business end. For sure neither of them had experienced the driving risks first hand.

“But none of this is bottom line for me. Sure, it was an accident waiting to happen, but no one is talking about all the accidents Kenny prevented. We’re all in the big race for time and money and that takes skill. That means having very good drivers, especially when the roads have more traffic than they were built for and there are too few cops on all the roads that matter.

“We need more Kennys, not fewer, if we’re going to lick into shape this mess we’ve gotten ourselves into.”

Glossary of key concepts for mess and reliability management

Anticipations (see pattern recognition)

Contingency scenarios (see scenario formulation)

Coupling-decoupling-recoupling dynamic: process whereby tightly coupled system variables must be decoupled from each other in order to be managed, program by program or agency by agency. Administrative decoupling, however, ends up highlighting how connected the issues are and how important it is to deal with them in a linked way. Where initial coupling generated pressure to decouple, decoupling reinforces pressure to recouple—the dynamic’s third element. A positive feature of decoupling is to render more transparent what can and needs to be recoupled operationally. How operational recoupling (real-time mess and reliability management) occurs is case-by-case, since the dynamics of coupling, decoupling and recoupling are site-specific.

Critical infrastructures: assets or systems deemed essential by government for the provision of vital societal services, including but not limited to engineered supplies for water, electricity, transportation and financial services.

Decoupling (see coupling-decoupling-recoupling dynamic)

Domain of competence: unique knowledge that mess and reliability professionals and their networks have to manage a critical service reliably. It is bounded by skills of networked professionals in terms of their abilities to recognize systemwide patterns and practices and to formulate contingency scenarios based on local application of broader principles and precepts.

Equifinality: behavior of a system to produce an end state through differing means.

***Maximum equifinality: ability to maintain equifinality in reserve, just in case the system needs to use it later. This is typical of just-in-case behavior under conditions of high network options and low system volatility.

***Adaptive equifinality: ability to assemble different options just in time to achieve reliability of service. Liquidity in finance economics is adaptive equifinality, when liquidity represents the ability of a seller to assemble a deal even in distressed times (as in assembling resources at the last moment).

***Zero equifinality: lack of equifinality, as in only one (set of) means to produce the end state of reliable service provision. This is typical of command and control behavior in just-this-way performance under conditions of low network options and enforced low system volatility.

Just-for-now performance: activities undertaken by mess and reliability professionals when system volatility remains high but network options with which to respond are few. This performance is unstable and not to be prolonged, since fire-fighting, band aids, and quick fixes mean operators end up doing one thing to achieve and maintain reliability, just for now, that can make other factors worse off.

Just-in-case performance: activities undertaken by mess and reliability professionals when system volatility is low and when network options to respond to the volatility are many and varied. This performance is characterized by multiple resources (including strategies) held in store, just in case they are needed to achieve and maintain reliability if something bad happens. (See also maximum equifinality.)

Just-in-time performance: activities undertaken by mess and reliability professionals when system volatility is high and network options to respond to that volatility remain many and varied. This performance is characterized by operator flexibility in assembling different options up to the last moment to achieve and maintain reliability, just in time. (See adaptive equifinality.)

Just-this-way performance: activities undertaken by mess and reliability professionals to ensure system volatility is low when network options to respond are few. This performance is characterized by command and control, such as emergency declarations. Such interventions are to be complied with, just this way, to reduce task environment volatility so that reliability can be achieved and maintained with the few resources available. (See also zero equifinality.)

Macro-design: position in mess and reliability space from which formal deductive principles are applied at the system level to govern the achievement and maintenance of processes for critical service provision. (See mess and reliability space.)

Mess (as in policy mess): informally, any controversy or issue which is uncertain, complex, incomplete and disputed at the same time. Formally, any controversy or issue, the multiple and conflicting standpoints over which can be sorted out around four nodes of macro-design, micro-operations, scenario formulation and pattern recognition.

            Bad mess:  informally, a mess that cannot be used. Formally, a mess centered around:  few if any systemwide patterns or localized scenarios; single standpoints taken in mess and reliability space or confusion over nodes in the space; and/or leaps of faith across nodes that bypass unique knowledge base of mess and reliability professionals or are uninformed by learning there. In terms of the performance modes, a bad mess is what we see in just-for-now performance.

            Good mess: informally, a mess that can be used. Formally, making a mess better or stopping a mess from going bad (just-in-time) or a bad mess from worsening (just-this-way).  The preceding means protecting mess and reliability professionals, becoming one yourself, favoring network-centered over problem-centered decisionmaking, and learning how to undertake better management of organizational setbacks.

            Best mess: Formally, being able to operate within and across all performance modes and well inside the domain of competence of known patterns and scenarios.

            Worst mess: Formally, having to operate outside the performance modes and the domain of competence. If prolonged, just-for-now performance can become the worst mess possible.

Mess and reliability management: operations and activities undertaken by professionals in networks skilled in pattern recognition and scenario formulation and their translation into reliable critical service provision. In their domain of competence, professionals manage by working together so as to recouple activities across separate programs or agencies. In doing so, professionals are required to move across performance modes as task environment and resource conditions change in order to achieve and maintain the reliability of a critical service.

Mess and reliability professionals: Operators or managers, working in a network with others, who are skilled in (1) recognizing patterns (including practices) emerging across a systemwide run of cases, (2) formulating contingency scenarios based on design principles but localized to case at hand; and (3) translating the systemwide patterns recognized and the localized scenarios formulated into the reliable provision of a critical service.

Mess and reliability space: cognitive field in which reliability in critical service provision is known and across which it is realized. It has two dimensions:  (1) the type of knowledge brought to bear on efforts to make the service reliable, and (2) the scope (or scale) of attention of the reliability efforts.

The knowledge from which reliable performance is pursued can range from formal or representational knowledge, in which key efforts are understood through abstract principles and deductive models based upon the principles, to experience, based on informal, tacit understanding. Scope of attention ranges from a purview which embraces reliability as an entire system output, encompassing many variables and elements, to a case-by-case focus in which each case is viewed as a particular event with distinct properties or features.

The two continua of knowledge and scope bound a cognitive space in which four nodes (hubs) are identified for achieving reliability: macro-design (principles at the system level), scenario formulation (contingency scenarios at the case level based on the localized modification of macro-principles); pattern recognition (including systemwide anticipations and practices emerging from the identified patterns of behavior and experience); and micro-operations (reactive behavior or experience at the case level).
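As a reading aid only, the four nodes can be laid out as positions on the two dimensions just described. The sketch below is a simplification: it treats each continuum as having just two poles, and the labels "formal", "experience", "system" and "case", along with the names `Node`, `NODES` and `node_at`, are illustrative assumptions rather than terms fixed by the glossary.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """A position in mess and reliability space (two-pole simplification)."""
    knowledge: str  # "formal" (principles, deductive models) or "experience" (tacit)
    scope: str      # "system" (systemwide) or "case" (case-by-case)


# Hypothetical lookup table; the placements follow the glossary's descriptions.
NODES = {
    "macro-design": Node("formal", "system"),
    "scenario formulation": Node("formal", "case"),
    "pattern recognition": Node("experience", "system"),
    "micro-operations": Node("experience", "case"),
}


def node_at(knowledge: str, scope: str) -> str:
    """Return the name of the node sitting at the given pair of poles."""
    target = Node(knowledge, scope)
    for name, position in NODES.items():
        if position == target:
            return name
    raise ValueError(f"no node at knowledge={knowledge!r}, scope={scope!r}")
```

For instance, `node_at("experience", "system")` returns "pattern recognition". The real dimensions are continua, so positions between the poles are lost in this table.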

Micro-operations, reactive: the position in mess and reliability space from which individual operators and managers with tacit knowledge and experience achieve and maintain reliability of a critical service, albeit reactively.

Network-centered decisionmaking: process where a sense of urgency drives professionals to intervene in an issue over which exists little prior agreement. Decisionmaking focuses on ensuring agreement on the rules to be followed in dealing with the issue by networked professionals, recognizing that goals will be defined along the way through negotiation over what is relevant knowledge and practices to deal with the issue as it evolves. Options are kept open and further rounds of negotiations might be needed, with new opportunities as the network’s understanding of the issue changes.

Options variety: amount of different resources that networked professionals have with which to respond to the system volatility they face. Resources could be monetary, personnel and/or strategies with which the managers respond.

Pattern recognition, systemwide: position in mess and reliability space from which trends, configurations and generalizations emerge across a run of cases and which network professionals use as basis for anticipations and better practices (where evident) to achieve and maintain reliable critical service provision.

Performance modes: See just-in-case, just-in-time, just-for-now and just-this-way performance.
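The four performance modes defined in this glossary amount to a two-by-two of system volatility (high or low) against network options (many or few). A minimal sketch of that classification follows, assuming each dimension can be reduced to a yes/no judgment; the names `Mode`, `performance_mode` and `EQUIFINALITY` are hypothetical, introduced only for illustration.

```python
from enum import Enum


class Mode(Enum):
    JUST_IN_CASE = "just-in-case"
    JUST_IN_TIME = "just-in-time"
    JUST_FOR_NOW = "just-for-now"
    JUST_THIS_WAY = "just-this-way"


def performance_mode(volatility_high: bool, options_many: bool) -> Mode:
    """Classify a situation by the glossary's two conditions.

    Assumption: both conditions are treated as booleans, although in
    practice they are matters of degree.
    """
    if volatility_high:
        return Mode.JUST_IN_TIME if options_many else Mode.JUST_FOR_NOW
    # Low volatility. In the just-this-way case the low volatility is
    # enforced (command and control) rather than simply found.
    return Mode.JUST_IN_CASE if options_many else Mode.JUST_THIS_WAY


# Equifinality type the glossary pairs with each mode. Just-for-now has
# no named equifinality type in the glossary, so it is left out.
EQUIFINALITY = {
    Mode.JUST_IN_CASE: "maximum",
    Mode.JUST_IN_TIME: "adaptive",
    Mode.JUST_THIS_WAY: "zero",
}
```

For instance, `performance_mode(volatility_high=True, options_many=False)` returns `Mode.JUST_FOR_NOW`, the unstable mode the glossary says is not to be prolonged.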

Problem-centered decisionmaking: process of defining the problem, the goals to be achieved in solving it, and the evidence that exists for use in the solution, where a decision follows from the prior problem statement, goals of analysis, and information assembled for its solution. Once decision has been locked in, its effectiveness is measured by how well implementation of the decision meets the predefined goals.

Professional challenges: fourfold for mess and reliability professionals, namely to manage complexity; build analytic and management capacity into the network of professionals; capitalize on diverse communities and stakeholders when managing; and operate in real time more effectively.

Realism 1, 2 and 3: Realism 1 is that of you as the observing subject looking out onto reality. Realism 2 is when you become the object of reality, in what psychologists call “the grip of the real.” Realism 3 is the situational awareness that comes when you are operating within a network of like professionals all operating under mandates of ensuring real-time reliability.

Recoupling (see coupling-decoupling-recoupling dynamic)

Reliability: safe and continuous provision of a critical service, even during peak demand or turbulent periods.

*Real-time reliability: Safe and continuous provision of critical service during just-in-time and just-for-now performance.

Resilience (and anticipation): absorbing or bouncing back from a shock (and planning and preparing for the next step ahead at the same time).

Scenario formulation, localized: position in mess and reliability space from which macro-principles are contextualized by networked professionals to a local case, thereby enabling professionals to formulate scenarios and protocols that embrace a wider range of contingencies when achieving and maintaining reliability in critical service provision.

Setback management: managing a setback (a sudden or unanticipated check on organizational behavior) that could lead to a bad or worse mess unless managed.  This means trying to pull the good mess out of one that could go bad by treating the setback as a design probe, test-bed for something better, an interruption from which the organization learns, or an obstacle to overcome so that the organization moves to a new stage of its life cycle.

Volatility (system): degree to which the network managing for reliability faces uncontrollable changes or unpredictable conditions in the task environment that threaten provision of the critical service.

Abolishing infrastructures

Recognizing the climate emergency does not mean taking its worst-case scenarios the same way.

Assume a 3-degree or more increase in global warming. There are some who might then expect that the rising sea-levels and increased storm surges would sink northern California’s refinery hub at Concord much sooner. One problem solved. So too for the Critical Energy Infrastructure hub in Portland, assuming a magnitude 9.0 Cascadia earthquake doesn’t trigger the toxic flaming sludge first.

Such scenarios for abolishing infrastructures would, of course, entail the massive destruction of ecosystems alongside, but then again the worst-case supposition would be that they’d never be restored anyway within a fossil-fuel status quo.

One recasting not to favor: Anthropocene as wartime

[S]eeing the Anthropocene more as a ‘boundary’ position that intersects with histories of capitalism, empire and the evolution of human life on earth, rather than a clearly distinct new epoch. . .allows more room for experimentation with the various processes of re-orientation that might be required of politics within, rather than beyond, these boundary lines, by not pre-determining one singular course of action as unambiguously correct.

In recasting the epoch as boundary process, it also reflects another connection back to World War I, when analogous ideas of revolution as processual events, instead of one-time incidents, became part of the mainstream. Indeed, the concept of wartime itself suggests a processual and extendable temporality, rather than a straightforward binary. This is the case since the division between wartime and peacetime is never as clear cut as any formal cessation of hostilities or signing of a treaty would suggest. World War I clearly did not end with the Armistice, and neither did it cease with the signing of the Versailles Treaty. For some, the World War has never really ended at all given that its promises of meaningful forms of (particularly racial and gender) equality as recompense for serving one’s country have still failed to materialize. The war had an enormous impact both upon the fabric of the earth and natural resources, while its legacy for the ways such categories as state, democracy, representation and capitalism, have become fixed parts of Euro-American political thinking, has been equally profound. It might therefore be productive to think about the Anthropocene as a form of ‘deep-war time’, both practically and intellectually. This means considering the Anthropocene as an ongoing battle over what it means to think across both planetary and global perspectives, and across the arc spanning World War I and into the present.

D. Kelly (2022). Wartime for the Planet? Journal of Modern European History (DOI: 10.1177/16118944221113281; excerpted above without embedded footnotes)

Emergencies are one thing, like that for the climate. But not all emergencies are wars. If the Anthropocene is recast as its own wartime, then how is this war different than all the other wars, namely, as massive engines of unpredictable, unimaginable and ungovernable contingencies? We might as well recommend more poverty and hunger as ways to accentuate the contradictions and “thereby” bring about a war sooner. Why ever would we say wartime better captures there being no real boundary between war and peace, when the Anthropocene is also about neither human war nor human peace only?

If the “planetary” is as much a human construction as “local” and “global” are–or if you prefer, planetary and global and local are not thorough-going human constructions (remember “the irreducible particularity of being”?)–then we’re well advised not to dismiss policy and management as if they were the low and mean cunning of local and global alone.

Indeed, cunning looks much the more viable option when compared to “This war has to be different! Failure is not an option! We just have to have the political will to make it happen!” These claims to exceptions deny that we can better prepare for other unavoidably broad patterns we see and other unavoidably local scenarios we face, when both clearly contradict “There is no alternative but to do. . .”

Unions and unionized

Assume that evidence can be generalized as follows: Unionized firms as compared to nonunionized firms have lower rates of productivity, employment creation, and investment, other things equal. Even putting aside all the contrary evidence, we still ask: So what?

These are generalizations only. Localized scenarios in which the opposite holds are possible and counter-cases available. Considerable evidence suggests that the “union/nonunion” dichotomy masks great variability in collective bargaining laws and wage arrangements across countries and regions.

That variability, in turn, suggests we take a deeper look at the macro-design standpoints with respect to unions or not. What human rights, for instance, are at issue when one talks about unionization? In reviewing the literature, one quickly realizes that the rights concerned relate less to any “right to unionization” and more to traditional rights of collective bargaining and freedom of association.

Taking the latter as the point of departure surfaces an issue missed by some observers.

First, focusing on different rights means the earlier starting focus on empirical generalizations about unionization and economic growth is narrow. We should also be looking at the evidence related to economic growth and collective bargaining arrangements, both generally and specifically. We would then better understand why local conditions are so variable with respect to “unions,” now variously defined and found.