Showing posts with label safety. Show all posts

Saturday, December 7, 2019

Why do regulatory organizations fail?


Why is Charles Perrow a pessimist about government regulation?

Perrow is a leading researcher in the sociology of organizations, and he is a singular expert on accidents and failures. Several of his books are classics in their field -- Normal Accidents: Living with High-Risk Technologies, The Next Catastrophe: Reducing Our Vulnerabilities to Natural, Industrial, and Terrorist Disasters, Organizing America: Wealth, Power, and the Origins of Corporate Capitalism. So why is he so gloomy about the ability of governmental organizations to protect the public from large failures and disasters of various kinds -- hurricanes, floods, chemical plant fires, software failures, terrorism? He is not a relentless critic of organizations such as the EPA, the Department of Justice, or the Food and Drug Administration, but his assessment of their capacity for success is dismal.
We should not expect too much of organizations, but the DHS is extreme in its dysfunctions. As with all organizations, the DHS has been used by its masters and outsiders for purposes that are beyond its mandate, and the usage of the DHS has been extreme. One major user of the DHS is Congress. While Congress is the arm of the government that is closest to the people, it is also the one that is most influenced by corporations and local interest groups that do not have the interests of the larger community in mind. (The Next Catastrophe, kl 205)
I don't think that Perrow's views derive from the general skeptical view that organizations never succeed in accomplishing the functions we assign to them -- hospitals, police departments, labor unions, universities, public health departments. (And in fact his important book Complex Organizations: A Critical Essay provided a constructive description of the field of organizational studies when it appeared in 1972 and was updated in 2014 (link).)

Instead, there seem to be particular reasons why large governmental organizations designed to protect the public are likely to fail, in Perrow's assessment. It is organizations that are designed to regulate risky activities and those that are charged to create prudent long-term plans for the future that seem particularly vulnerable, in his account. So what are the reasons for failure in these kinds of organizations?

FEMA is faulted, for example, because of its failure to adequately plan for and provide emergency relief to the people of New Orleans and other parts of the Gulf region from the effects of Hurricane Katrina. Poor planning, incompetent executives at the top, politicized directions coming from the White House, poor coordination across sub-units, and poor internal controls eventually resulted in a historic failure. These are fairly routine organizational failures that could happen within the United Parcel Service corporate headquarters as easily as Washington.

The Nuclear Regulatory Commission is faulted for its oversight of safety in nuclear plants, including Three Mile Island, Davis-Besse, and Shoreham. Key organizational faults include regulatory capture by owners and the nuclear industry, excessive dependence on specific key legislators, commissioners who are politically beholden, and insufficient personnel to carry out intensive inspection regimes.

Perrow's key ideas about failures in the industrial systems themselves seem not to be central in his negative assessment of government regulatory organizations. The features of "complex systems" and "tightly coupled processes" that are so central to his theory of normal accidents in industrial systems like nuclear power plants play only incidental roles in his analysis of regulatory failure. Agencies are neither complex nor tightly coupled in the way a petroleum processing plant is. In fact, an outside observer might hypothesize that a somewhat more tightly coupled system in the NRC or the EPA (a more direct connection among the scientists, engineering experts, inspectors, and commissioners) might actually improve performance.

Instead, his analysis of regulatory failure depends on a different set of axes: interests, influence, and power. Regulatory agencies fail, in Perrow's accounts, when their top administrators have bureaucratic interests and dependencies that diverge from the mission of safety, when powerful outsiders and owners have the capacity to influence rules, policies, and implementation, and when political and economic power is deployed to protect the interests of powerful actors. (All these defects are apparent in Trump administration appointments to federal agencies with regulatory responsibilities.)

Interestingly, these factors have also played a central role in his sociological thinking about the emergence of the twentieth-century corporation; he views corporations as vehicles for the concentration of power:
Our economic organizations -- business and industry -- concentrate wealth and power; socialize employees and customers alike to meet their needs; and pass off to the rest of society the cost of their pollution, crowding, accidents, and encouragement of destructive life styles. In the vaunted "free market" economy of the United States, regulation of business and industry to prevent or mitigate this market failure is relatively ineffective, as compared to that enacted by other industrialized countries. (Organizing America, 1-2)
So the primary foundation of Perrow's assessment of the likelihood of organizational failure when it comes to government regulation derives from the role that economic and political power plays in deforming the operations of major government organizations to serve the interests of the powerful. Regulatory agencies are "captured" by the powerful industries they are supposed to oversee, whether through influence on the executive branch or through merciless lobbying of the legislative branch. Commissioners are often very sympathetic to the business needs of the sector they regulate, and strive to avoid "undue regulatory burden".

This leads us to a fascinating question: is there a powerful constituency for safety that could be a counterweight to corporate power and a bulwark for honest, scientifically guided regulatory regimes? Is a more level playing field between economic interests and the public's interests in effective safety regulation possible?

We may want to invoke the public at large, and it is true that public opinion sometimes effectively demands government intervention for safety. But the public is generally limited in several important ways. Only a small set of issues manages to become salient for the public. Further, issues only remain salient for a limited period of time. And the salience of an issue is often geographically and demographically bounded. There was intense opposition to the Shoreham nuclear plant siting decision on Long Island, but the public in Chicago and Dallas did not mobilize around the issue. Sometimes vocal public opinion prevails, but much more common is the scenario where public interest wanes and profit-motivated corporate interests persist. (Pepper Culpepper lays out the logic of salience and unequal power between a diffuse public and a concentrated corporate interest in Quiet Politics and Business Power: Corporate Control in Europe and Japan.)

Other pertinent voices for safety are public interest organizations -- the Union of Concerned Scientists, Friends of the Earth, the Bulletin of the Atomic Scientists. Organizations like these have succeeded in creating a national base of support, they have drawn resources in support of their efforts, and they have a greater organizational capacity to persist over an extended period of time. (In another field of advocacy, organizations like the Anti-Defamation League and the Southern Poverty Law Center have succeeded in maintaining organizational focus on the dangers of hate-based movements.) So public interest organizations sometimes have the capacity and staying power to advocate for stronger regulation.

Investigative journalism and a free press are also highly relevant in exposing regulatory failures and enhancing performance of safety organizations. The New York Times and Washington Post coverage of the FAA's role in certification of the 737 Max will almost certainly lead to improvements in this area of aircraft safety. (Significantly, when I made this statement concerning the link between industrial safety in China and a free press, I was told that "this is a sensitive subject in China.")

(These examples are drawn from the national level of government. Sometimes local government agencies -- e.g. police departments and zoning boards -- are captured as well, when organized crime "firms" and land developers are able to distort regulations and enforcement in their favor. But it may be that organizations at this level of government are a bit more visible to their publics, and therefore somewhat less likely to bend to the dictates of powerful local interests. Jessica Trounstine addresses these kinds of issues in Political Monopolies in American Cities: The Rise and Fall of Bosses and Reformers (link).)

Thursday, November 14, 2019

Electronic Health Records and medical mistakes


Electronic Health Record systems (EHRs) have been broadly implemented by hospitals and health systems around the country as a way of increasing the accuracy, availability, and timeliness of patient health status and treatment information. (These systems are also sometimes called "Digital Medical Records" (DMRs).) They are generally regarded as an important forward step in improving the quality of healthcare. Here is a description of the advantages of Electronic Health Record systems, according to Athena Health:

The advantages of electronic health records in the clinical setting are numerous and important. In the 2012 edition of the Physician Sentiment Index™, published by athenahealth and Sermo, 81% of physicians said they believe EHRs improve access to clinical data. More than two-thirds said an EHR can actually improve patient care.

The use of an electronic health records system offers these clinical advantages:
  • No bulky paper records to store, manage and retrieve
  • Easier access to clinical data
  • The ability to establish and maintain effective clinical workflows
  • Fewer medical errors, improved patient safety and stronger support for clinical decision-making
  • Easier participation in Meaningful Use, Patient-Centered Medical Home (PCMH) and other quality programs, with electronic prompts ensuring that required data is recorded at the point of care
  • The ability to gather and analyze patient data that enables outreach to discrete populations
  • The opportunity to interact seamlessly with affiliated hospitals, clinics, labs and pharmacies
Considering all the advantages of electronic health records, and the rapidly growing electronic interconnectedness of the health care world, even if EHRs had not been mandated by health care reform, their development and eventual ubiquity in the health care industry was inevitable.

And yet, like any software system, EHR systems are capable of creating new errors; and some of those errors can be harmful to patients.

Nancy Leveson is an important expert on software system safety who has written extensively on the challenges of writing highly reliable software in safety-critical applications. Here are a few apt observations from her book Safeware: System Safety and Computers (1995).
Although it might seem that automation would decrease the risk of operator error, the truth is that automation does not remove people from systems -- it merely moves them to maintenance and repair functions and to higher-level supervisory control and decision making. The effects of human decisions and actions can then be extremely serious. At the same time, the increased system complexity makes the decision-making process more difficult. (10)
The increased pace of change lessens opportunity to learn from experience. Small-scale and relatively nonhazardous systems can evolve gradually by trial and error. But learning by trial and error is not possible for many modern products and processes because the pace of change is too fast and the penalties of failure are too great. Design and operating procedures must be right the first time when there is potential for a major fire, explosion, or release of toxic materials. (12)
(To the last statement we might add "or harm to hospital patients through incorrect prescriptions or failed transmission of lab results".)
The safety implications of computers exercising direct control over potentially dangerous processes are obvious. Less obvious are the dangers when ... software generated data is used to make safety-critical decisions, ... software is used in design analysis, ... safety-critical data (such as blood bank data) is stored in computer databases. The FDA has received reports of software errors in medical instruments that led to mixing up patient names and data, as well as reports of incorrect outputs from laboratory and diagnostic instruments (such as patient monitors, electrocardiogram analyzers, and imaging devices). (23)
Automatic control systems [like aircraft autopilots] are designed to cope with the immediate effects of a deviation in the process -- they are feedback loops that attempt to maintain a constant system state, and as such, they mask the occurrence of a problem in its early stages. An operator will be aware of such problems only if adequate information to detect them is provided. That such information is often not provided may be the result of the different mental models of the designers and experienced operators, or it may merely reflect financial pressures on designers due to the cost of providing operators with independent information. (117)
One of the cases examined in detail in Safeware is the Therac-25 radiation-therapy device, which, due to a minor software flaw in the treatment-entry plan module, began seriously injuring patients with excessive doses of radiation in 1985-87 (515 ff.). It had operated without incident thousands of times before the first accident.

So Leveson gives ample reason to be cautious about the safety implications of DMRs and the "fault pathways" through which their normal functioning might harm patients. What has been the experience so far, now that the healthcare industry has witnessed widespread adoption of DMR systems?

Two specific issues involving EHR errors affecting patient care have been identified in the past several years. The first is in the area of errors in the administration of prescription drugs, and the second is in the area of the handling and routing of medical test results. Both errors have the potential for harming the patient.

Jennifer Bresnick (link) summarizes the results of a report by the Pennsylvania Patient Safety Authority concerning medication errors caused by DMR systems. Medication errors (wrong medication, wrong dose, wrong patient, wrong frequency) can occur at several stages of the clinical process, including prescribing, transcribing, dispensing, and administration. Although the digital medical record is intended to dramatically reduce all these sources of error, the Pennsylvania study shows that the DMR can also contribute to errors at each of these stages.
While EHRs and other technologies are intended to reduce errors and improve the safe, standardized, and well-documented delivery of care, some stakeholders believe that digital tools can simply serve to swap one set of mistakes for another. Poor implementation and lackluster user training can leave patients just as vulnerable to medication errors as they were when providers used paper charts, commented Staley Lawes, PharmD, BCPS, Patient Safety Analyst, and Matthew Grissinger, RPh, FISMP, FASCP, Manager of Medication Safety Analysis in the brief. (link)
Part of the blame, according to the Pennsylvania report, belongs to the design of the user interface:
For this reason, it is important to design a system with an intuitive user interface to minimize the risk for human error. Users should be able to easily enter and retrieve data and share information with other healthcare professionals.  When systems are designed without these considerations in mind, patients are subject to undue risk. (link)
The report contains several specific design standards that would improve the safety of the DMR system:
“The interaction between clinician and software is a key component that is to be taken into consideration when trying to improve the safety of health IT,” the report says. “Incident reports can provide valuable information about the types of HIT-related issues that can cause patient harm, and ongoing HIT system surveillance can help in developing medication safety interventions.” (link)
It is clear that ongoing health IT system surveillance and remedial interventions are needed. Efforts to improve health IT safety should include attention to software interoperability, usability, and workflow. The relationship between clinician and software includes complex interactions that must be considered to optimize health IT’s contribution to medication safety.
Yackel and Embi (link) treat the problem of test result management errors in "Unintended errors with EHR-based result management: a case series". Here is their abstract:
Test result management is an integral aspect of quality clinical care and a crucial part of the ambulatory medicine workflow. Correct and timely communication of results to a provider is the necessary first step in ambulatory result management and has been identified as a weakness in many paper-based systems. While electronic health records (EHRs) hold promise for improving the reliability of result management, the complexities involved make this a challenging task. Experience with test result management is reported, four new categories of result management errors identified are outlined, and solutions developed during a 2-year deployment of a commercial EHR are described. Recommendations for improving test result management with EHRs are then given.
They identify test management errors at four stages of the clinical process:
  • results not correctly communicated to provider;
  • results communicated but never received or reviewed by the provider;
  • results reviewed, but appropriate action not recommended by provider;
  • appropriate recommendation made by provider, but action not carried out.
They make several key recommendations for improving the performance of DMR systems in managing test results: Develop fault-tolerant systems that automatically report delivery failures; use robust testing to find rare errors that occur both within and between systems; implement tracking mechanisms for critical tests, such as cancer screening and diagnostics; and deliver results directly to patients.
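
The first and third of these recommendations (automatic reporting of delivery failures, and tracking of critical tests) can be made concrete with a small sketch. It is purely illustrative and is not drawn from Yackel and Embi or from any commercial EHR: the names `Stage`, `TrackedResult`, and `ResultTracker` and the time limits are hypothetical assumptions. The idea is that each result must advance through stages matching the four error categories above, and any result that sits at one stage longer than its limit is flagged rather than silently lost.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Dict, List


class Stage(Enum):
    """Hypothetical stages; a result stuck at one maps to an error category."""
    SENT = 1            # stuck here: result not communicated to provider
    DELIVERED = 2       # stuck here: communicated but never reviewed
    REVIEWED = 3        # stuck here: reviewed, no action recommended
    ACTION_ORDERED = 4  # stuck here: recommendation not carried out
    ACTION_DONE = 5     # loop closed


@dataclass
class TrackedResult:
    result_id: str
    critical: bool
    stage: Stage
    updated_at: datetime


class ResultTracker:
    """Illustrative tracker: flags results that stop advancing."""

    def __init__(self, routine_limit: timedelta, critical_limit: timedelta):
        # Assumed policy: critical tests get a shorter limit than routine ones.
        self.routine_limit = routine_limit
        self.critical_limit = critical_limit
        self._results: Dict[str, TrackedResult] = {}

    def send(self, result_id: str, critical: bool, now: datetime) -> None:
        self._results[result_id] = TrackedResult(
            result_id, critical, Stage.SENT, now
        )

    def advance(self, result_id: str, stage: Stage, now: datetime) -> None:
        result = self._results[result_id]
        if stage.value <= result.stage.value:
            raise ValueError("a result can only move forward through stages")
        result.stage = stage
        result.updated_at = now

    def stalled(self, now: datetime) -> List[TrackedResult]:
        """Return every open result that has exceeded its time limit."""
        flagged = []
        for result in self._results.values():
            if result.stage is Stage.ACTION_DONE:
                continue
            limit = self.critical_limit if result.critical else self.routine_limit
            if now - result.updated_at > limit:
                flagged.append(result)
        return flagged


if __name__ == "__main__":
    start = datetime(2019, 11, 14, 9, 0)
    tracker = ResultTracker(timedelta(days=7), timedelta(days=1))
    tracker.send("biopsy-1", critical=True, now=start)
    for item in tracker.stalled(now=start + timedelta(days=2)):
        print(item.result_id, "stalled at", item.stage.name)
```

The design choice worth noting is that the tracker does not trust the absence of an error message. It treats silence as the failure signal, which is the fault-tolerant posture the authors recommend.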

These are just two types of errors that can arise in digital medical record management systems. It is evident that the designers and implementers of DMRs need to take the systems-safety approach described by Nancy Leveson and implement comprehensive safety failure analysis, both in terms of "safety case analysis" (discovery of failure scenarios) and after-event investigation to identify the source of the failure in the software and its human interface.

These examples are not intended to suggest that DMRs are hazardous and should be avoided. On the contrary, the consolidation and convenient presentation of patient information for the provider is clearly an important step forward. But it is crucial that designers and implementers keep safety at the center of their attention and have a healthy respect for the ways in which automated systems can incorporate incorrect assumptions, can produce unintended interactions among components, and can present information to the human provider in such a confusing way that patient care is harmed.

(Here is a case of treatment involving several different errors conveyed through the digital medical record system that involved attaching biopsy and test results to the wrong patient, leading to the wrong treatment for the patient. It is interesting to read because it reflects some of the complexity identified by Leveson in other system failures.) 

Sunday, November 3, 2019

O-rings and production pressure


Allan McDonald's Truth, Lies, and O-Rings: Inside the Space Shuttle Challenger Disaster (2009) has given me a somewhat different understanding of the Challenger launch disaster than I've gained from other sources, including Diane Vaughan's excellent book The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. McDonald is a Morton Thiokol (MTI) insider who was present through virtually all aspects of the evolving solid rocket program at NASA in the two years leading up to the explosion in January 1986. He was director of the Space Shuttle Solid Rocket Motor Project during part of this time and he represented MTI at the formal Launch Readiness Review panels (LRRs) for several shuttle launches, including the fateful Challenger launch. He was senior management representative for MTI for the launch of STS-51L Challenger. His account gives a great deal of engineering detail about the Morton Thiokol engineering group's ongoing concerns about the O-rings in the months preceding the Challenger disaster. This serves as a backdrop for a detailed analysis of the dysfunctions in decision-making in both NASA and Morton Thiokol that led to an insufficient priority being given to safety assessments.

It is worth noting that O-rings were a key part of other large solid-fuel rockets, including the Titan rocket. So there was a large base of engineering and test experience with the performance of the O-rings when exposed to the high temperatures and pressures of ignition and firing.

The biggest surprise to me is the level of informed, rigorous, and evidence-based concern that MTI engineers had about the reliability of the joint seal afforded by the primary and secondary O-rings on the solid rocket motors on the Shuttle system. These specialists had a very good and precise understanding of the mechanics of the problem. Further, there was a good engineering understanding of the expected (and required) time-sequence performance of the O-rings during ignition and firing. If the sealing action were delayed by even a few hundredths of a second, hot gas would be able to penetrate past the seal. These were not hypothetical worries, but instead were based on data from earlier launches demonstrating O-ring erosion, and soot between the primary and secondary rings showing that super-hot gases had penetrated the primary seal. The worst damage and evidence of blowby had occurred on flight STS-51C (January 25, 1985), one year earlier, the lowest-temperature launch yet attempted. And that launch took place when the temperature was 53 degrees.

Launch temperatures for the rescheduled January 28 launch were projected to be extremely cold -- 22-26 degrees was forecast on January 27, roughly 30 degrees colder than the previous January launch. The projected temperatures immediately raised alarm with the Utah-based engineering team and with McDonald himself concerning the potential effects on the O-rings. A teleconference meeting was scheduled for January 27 to receive recommendations about the minimum acceptable temperature for launch from the Utah-based Morton Thiokol engineers who were focused on the O-ring problem (95).
I tried to reach Larry Mulloy at his hotel but failed, so I called Cecil Houston, the NASA/MSFC Resident Manager at KSC. I alerted him of our concerns about the sealing capability of the field-joint O-rings at the predicted cold temperatures and asked him to set up the teleconference. (96)
The teleconference began at 8:30 pm on the evening before the launch. McDonald was present in Cape Canaveral for the Flight Readiness Review panel and participated in the teleconference involving the analysis and recommendations from MTI engineering, leading to a recommendation against launching in the expected cold weather conditions.
Thiokol's engineering presentation consisted of about a dozen charts summarizing the history of the performance of the field-joints, some engineering analysis on the operation of the joints, and some laboratory and full-scale static test data relative to the performance of the O-rings at various temperatures. About half the charts had been prepared by Roger Boisjoly, our chief seal expert on the O-ring Seal Task Force and staff engineer to Jack Kapp, Manager of Applied Mechanics. The remainder were presented by Arnie Thompson, the supervisor of our Structures Section under Jack Kapp, and by Brian Russell, a program manager working for Bob Ebeling. (97)
Boisjoly's next chart showed how cold temperature would reduce all the factors that helped maintain a good seal in the joint: lower O-ring squeeze due to thermal shrinkage of the O-ring; thicker and more viscous grease around the O-ring, making it slower to move across the O-ring groove; and higher O-ring hardness due to low temperature, making it more difficult for the O-ring to extrude dynamically into the gap for proper sealing. All of these things increased the dynamic actuation time, or timing function, of the O-ring, when at the very same time the O-ring could be eroding, creating a situation where the secondary seal might not be able to seal the motor, not if the primary O-ring was sufficiently eroded to prevent sealing in the joint. (99)
Based on their concerns about temperature and effectiveness of the seals in the critical half-second of ignition, MTI engineering staff prepared the foundation for a recommendation not to launch in temperatures lower than 53 degrees. Their conclusion as presented at the January 27 teleconference was unequivocally against launch under these temperature conditions:
The final chart included the recommendations, which resulted in several strong comments and many very surprising reactions from the NASA participants in the teleconference. The first statement on the “Recommendations” chart stated that the O-ring temperature must be equal to or greater than 53° at launch, and this was primarily based upon the fact that SRM-15, which was the best simulation of this condition, worked at 53°. The chart ended with a statement that we should project the ambient conditions (temperature and wind) to determine the launch time. (102)

NASA lead Larry Mulloy contested the analysis and evidence in the slides and expressed great concern about the negative launch recommendation, and he asserted that the data were "inconclusive" in establishing a relationship between temperature and O-ring failure.
Mulloy immediately said he could not accept the rationale that was used in arriving at that recommendation. Stan Reinartz then asked George Hardy, Deputy Director of Science and Engineering at NASA/MSFC, for his opinion. Hardy said he was “appalled” that we could make such a recommendation, but that he wouldn't fly without Morton Thiokol's concurrence. Hardy also stated that we had only addressed the primary O-ring, and did not address the secondary O-ring, which was in a better position to seal because of the leak-check. Mulloy then shouted, “My God, Thiokol, when do you want me to launch, next April?” He also stated that “the eve of a launch is a helluva time to be generating new launch commit criteria!” Stan Reinartz entered the conversation by saying that he was under the impression that the solid rocket motors were qualified from 40° to 90° and that the 53° recommendation certainly was not consistent with that. (103)
Joe Kilminster, VP of Space Booster Programs at MTI, then requested a short caucus for the engineering team in Utah to reevaluate the data and consider their response to the skepticism voiced by NASA officials. McDonald did not participate in the caucus, but his reconstruction based on the memories of persons present paints a clear picture. The engineering experts did not change their assessment, and they were overridden by MTI executives Cal Wiggins (VP and General Manager of the Space Division) and Jerry Mason (Senior VP of Wasatch Operations). In opening the caucus discussion, Mason is quoted as saying "we need to make a management decision". Engineers Boisjoly and Thompson reiterated their technical concerns about the functionality of the O-ring seals at low temperature, with no response from the senior executives. No members of the engineering team spoke up to support a decision to launch. Mason polled the senior executives, including Bob Lund (VP of Engineering), and said to Lund, "It's time for you, Bob, to take off your engineering hat and put on your management hat." (111) A positive launch recommendation was then conveyed to NASA, and the process in Florida resumed towards launch.

McDonald spends considerable time indicating the business pressure that MTI was subject to from its largest customer, NASA. NASA was considering creating a second-source option that would let competing companies supply the solid-fuel motors then provided by MTI, and had also delayed signing a large contract (Buy-III fixed cost bid) for the next batch of motors. The collective impact of these actions by NASA could cost MTI over a billion dollars. So MTI management appears to have been under great pressure to accommodate NASA managers' preferences concerning the launch decision. And it is hard to avoid the conclusion that their decision placed business interests first and the professional judgments of their safety engineers second. In doing so they placed the lives of seven astronauts at risk, with tragic consequences.

And what about NASA? Here the pressures are somewhat less fully developed than in Vaughan's account, but the driving commitment to achieve a 24-launch per year schedule seems to have been a primary motivation. Delayed launches significantly undermined this goal, which threatened the prestige of NASA, the hope of significant commercial revenue for the program, and the assurance of continuing funding from Congress.

McDonald was not a participant in the caucus conference call. But he provides a reconstruction based on information provided by participants. In his understanding the engineers continued to defend their recommendation based on very concrete concerns about the effectiveness of the O-rings in extreme cold. Senior managers indicated their lack of support for this engineering judgment, and in the end Jerry Mason indicated that this would need to be a management decision. The FRR team was then informed that MTI had reconsidered its negative recommendation concerning launch. McDonald refused to sign the launch recommendation document, which was signed by his boss Joe Kilminster and faxed to the LRR team.


In hindsight it seems clear that both MTI executives and NASA executives deferred to business pressures of their respective organizations in the face of well-supported doubts about the safety of the launch. Is this a case of 20-20 vision after the fact? It distinctly appears not to be. The depth of knowledge, analysis, and rational concern that was present in the engineering group for at least a year prior to the Challenger disaster gave very specific and evidence-based reasons to abort this launch. This was not some intuitive, unspecific set of worries; it was an ongoing research problem that greatly concerned the engineers who were directly involved. And it appears there was no significant disagreement or uncertainty among them.

So it is hard to avoid a rather terrible conclusion, that the Challenger disaster was avoidable and should have been prevented. And the culpability lies with senior NASA and MTI executives who placed production pressures and business interests ahead of normal safety assessment procedures, and ahead of safety itself.

It is worth noting that Diane Vaughan's assessment is directly at odds with this one. She writes:
We now return to the eve of the launch. Accounts emphasizing valiant attempts by Thiokol engineers to stop the launch, actions of a few powerful managers who overruled a unanimous engineering position, and managerial failure to pass information about the teleconference to senior NASA administrators, coupled with news of economic strain and production pressure at NASA, led many to suspect that NASA managers had acted as amoral calculators, knowingly violating rules and taking extraordinary risk with human lives in order to keep the shuttle on schedule. However, like the history of decision making, I found that events on the eve of the launch were vastly more complex than the published accounts and media representations of it. From the profusion of information available after the accident, some actions, comments, and actors were brought repeatedly to public attention, finding their way into recorded history. Others, receiving less attention or none, were omitted. The omissions became, for me, details of social context essential for explanation. (LC 6215)
Young, Cook, Boisjoly, and Feynman. Concluding this list of puzzles and contradictions, I found that no one accused any of the NASA managers associated with the launch decision of being an amoral calculator. Although the Presidential Commission report extensively documented and decried the production pressures under which the Shuttle Program operated, no individuals were confirmed or even alleged to have placed economic interests over safety in the decision to launch the Space Shuttle Challenger. For the Commission to acknowledge production pressures and simultaneously fail to connect economic interests and individual actions is, prima facie, extremely suspect. But NASA’s most outspoken critics—Astronaut John Young, Morton Thiokol engineers Al McDonald and Roger Boisjoly, NASA Resource Analyst Richard Cook, and Presidential Commissioner Richard Feynman, who frequently aired their opinions to the media—did not accuse anyone of knowingly violating safety rules, risking lives on the night of January 27 and morning of January 28 to meet a schedule commitment. (kl 1627)
Vaughan's account includes many of the pivot-points of McDonald's narrative, but she assigns a different significance to many of them. She prefers her "normalization of deviance" explanation over the "amoral calculator" explanation.

(The Rogers Commission report and supporting documents are available online. Here is a portion of the hearings transcripts in which senior NASA officials provide testimony; link. This segment is critical to the issues raised in McDonald's account, since it addresses the January 27, 1986 teleconference FRR session in which a recommendation against launch was put forward by MTI engineering and was challenged by NASA senior administrators.)

Monday, October 28, 2019

Regulatory delegation at the FAA


Earlier posts have focused on the role of inadequate regulatory oversight as part of the tragedy of the Boeing 737 MAX (link, link). (Also of interest is an earlier discussion of the "quiet power" through which business achieves its goals in legislation and agency rules (link).) Reporting in the New York Times this week by Natalie Kitroeff and David Gelles provides a smoking gun for the idea that industry has captured the regulatory agency established to ensure its safe operations (link). The article quotes a former attorney in the FAA office of chief counsel:
“The reauthorization act mandated regulatory capture,” said Doug Anderson, a former attorney in the agency’s office of chief counsel who reviewed the legislation. “It set the F.A.A. up for being totally deferential to the industry.”
Based on exhaustive investigative journalism, Kitroeff and Gelles provide a detailed account of the lobbying strategy and efforts by Boeing and the aircraft manufacturing industry group that led to the incorporation of industry-favored language into the FAA Reauthorization Act of 2018, and it is a profoundly discouraging account for anyone interested in the idea that the public good should drive legislation. The new paragraphs introduced into the final legislation stipulate full implementation of the philosophy of regulatory delegation and establish an industry-centered group empowered to oversee the agency's performance and to make recommendations about FAA employees' compensation. "Now, the agency, at the outset of the development process, has to hand over responsibility for certifying almost every aspect of new planes." Under the new legislation the FAA is forbidden from taking back control of the certification process for a new aircraft without a full investigation or inspection justifying such an action.

As the article notes, the 737 MAX was certified under the old rules. The new rules give the FAA even fewer oversight powers and responsibilities for the certification of new aircraft and major redesigns of existing aircraft. And the fact that the MCAS system was never fully reviewed by the FAA, based on assurances of its safety from Boeing, reduces even further our confidence in the effectiveness of the FAA process. From the article:
The F.A.A. never fully analyzed the automated system known as MCAS, while Boeing played down its risks. Late in the plane’s development, Boeing made the system more aggressive, changes that were not submitted in a safety assessment to the agency.
Boeing, the Aerospace Industries Association, and the General Aviation Manufacturers Association exercised influence on the 2018 legislation through a variety of mechanisms. Legislators and lobbyists alike were guided by a report on regulation authored by Boeing itself. Executives and lobbyists exercised their ability to influence powerful senators and members of Congress through person-to-person interactions. And elected representatives from both parties favored "less regulation" as a way of supporting the economic interests of businesses in their states. For example:
They also helped persuade Senator Maria Cantwell, Democrat of Washington State, where Boeing has its manufacturing hub, to introduce language that requires the F.A.A. to relinquish control of many parts of the certification process.
And, of course, it is important not to forget about the "revolving door" from industry to government to lobbying firm. Ali Bahrami was an FAA official who subsequently became a lobbyist for the aerospace industry; Stephen Dickson is a former executive of Delta Air Lines who now serves as Administrator of the FAA; and in 2007 former FAA Administrator Marion Blakey became CEO of the Aerospace Industries Association, the industry's chief advocacy and lobbying group (link). It is hard to envision neutral, objective judgment in ensuring the safety of the public from such appointments.
Boeing and its allies found a receptive audience in the head of the House transportation committee, Bill Shuster, a Pennsylvania Republican staunchly in favor of deregulation, and his aide working on the legislation, Holly Woodruff Lyons.
These kinds of influence on legislation and agency action provide crystal-clear illustrations of the mechanisms cited by Pepper Culpepper in Quiet Politics and Business Power: Corporate Control in Europe and Japan explaining the political influence of business. Here is my description of his views in an earlier post:
Culpepper unpacks the political advantage residing with business elites and managers in terms of acknowledged expertise about the intricacies of corporate organization, an ability to frame the issues for policy makers and journalists, and ready access to rule-writing committees and task forces. These factors give elite business managers positional advantage, from which they can exert a great deal of influence on how an issue is formulated when it comes into the forum of public policy formation.
It seems abundantly clear that the "regulatory delegation" movement and its underlying effort to reduce regulatory burden on industry have gone too far in the case of aviation; and the same seems true in other industries such as the nuclear industry. The much harder question is organizational: what form of regulatory oversight would permit a regulatory agency to genuinely enhance the safety of the regulated industry and protect the public from unnecessary hazards? Even if we could take the anti-regulation ideology that has governed much public discourse since the Reagan years out of the picture, there are the continuing issues of expertise, funding, and industry power of resistance that make effective regulation a huge challenge.

Thursday, October 10, 2019

Organizational culture


It is of both intellectual and practical interest to understand how organizations function and how the actors within them choose the actions that they pursue. A common answer to these questions is to refer to the rules and incentives of the organization, and then to attempt to understand the actor's choices through the lens of rational preference theory. However, it is now increasingly clear that organizations embody distinctive "cultures" that significantly affect the actions of the individuals who operate within their scope. Edgar Schein is a leading expert on the topic of organizational culture. Here is how he defines the concept in Organizational Culture and Leadership. Organizational culture, according to Schein, consists of a set of "basic assumptions about the correct way to perceive, think, feel, and behave, driven by (implicit and explicit) values, norms, and ideals" (Schein, 1990).
Culture is both a dynamic phenomenon that surrounds us at all times, being constantly enacted and created by our interactions with others and shaped by leadership behavior, and a set of structures, routines, rules, and norms that guide and constrain behavior. When one brings culture to the level of the organization and even down to groups within the organization, one can see clearly how culture is created, embedded, evolved, and ultimately manipulated, and, at the same time, how culture constrains, stabilizes, and provides structure and meaning to the group members. These dynamic processes of culture creation and management are the essence of leadership and make one realize that leadership and culture are two sides of the same coin. (3rd edition, p. 1)
According to Schein, there is a cognitive and affective component of action within an organization that has little to do with rational calculation of interests and more to do with how the actors frame their choices. The values and expectations of the organization help to shape the actions of the participants. And one crucial aspect of leaders, according to Schein, is the role they play in helping to shape the culture of the organizations they lead.

It is intriguing that several pressing organizational problems have been found to revolve around the culture of the organization within which behavior takes place. The prevalence of sexual and gender harassment appears to depend a great deal on the culture of respect and civility that an organization has embodied -- or has failed to embody. The ways in which accidents occur in large industrial systems seem to depend in part on the culture of safety that has been established within the organization. And the incidence of corrupt and dishonest practices within businesses seems to be influenced by the culture of integrity that the organization has managed to create. In each instance experience seems to demonstrate that "good" culture leads to less socially harmful behavior, while "bad" culture leads to more such behavior.

Consider first the prominent role that the idea of safety culture has come to play in the nuclear industry after Three Mile Island and Chernobyl. Here are a few passages from a review document authored by the Advisory Committee on Reactor Safeguards (link).
There also seems to be a general agreement in the nuclear community on the elements of safety culture. Elements commonly included at the organization level are senior management commitment to safety, organizational effectiveness, effective communications, organizational learning, and a working environment that rewards identifying safety issues. Elements commonly identified at the individual level include personal accountability, questioning attitude, and procedural adherence. Financial health of the organization and the impact of regulatory bodies are occasionally identified as external factors potentially affecting safety culture. 
The working paper goes on to consider two issues: has research validated the causal relationship between safety culture and safe performance? And should the NRC create regulatory requirements aimed at observing and enhancing the safety culture in a nuclear plant? They note that current safety statistics do not permit measurement of the association between safety culture and safe performance, but that experience in the industry suggests that the answers to both questions are probably affirmative:
On the other hand, even at the current level of industry maturity, we are confronted with events such as the recent reactor vessel head corrosion identified so belatedly at the Davis-Besse Nuclear Power Plant. Problems subsequently identified in other programmatic areas suggest that these may not be isolated events, but the result of a generally degraded plant safety culture. The head degradation was so severe that a major accident could have resulted and was possibly imminent. If, indeed, the true cause of such an event proves to be degradation of the facility's safety culture, is it acceptable that the reactor oversight program has to wait for an event of such significance to occur before its true root cause, degraded culture, is identified? This event seems to make the case for the need to better understand the issues driving the culture of nuclear power plants and to strive to identify effective performance indicators of resulting latent conditions that would provide leading, rather than lagging, indications of future plant problems. (7-8)
Researchers in the area of sexual harassment have devoted quite a bit of attention to the topic of workplace culture as well. This theme is emphasized in the National Academy study on sexual and gender harassment (link); the authors make the point that gender harassment is chiefly aimed at expressing disrespect towards the target rather than sexual exploitation. This has an important implication for institutional change. An institution that creates a strong core set of values emphasizing civility and respect is less conducive to gender harassment. They summarize this analysis in the statement of findings as well:
Organizational climate is, by far, the greatest predictor of the occurrence of sexual harassment, and ameliorating it can prevent people from sexually harassing others. A person more likely to engage in harassing behaviors is significantly less likely to do so in an environment that does not support harassing behaviors and/or has strong, clear, transparent consequences for these behaviors. (50)
Ben Walsh is representative of this approach. Here is the abstract of a research article by Walsh, Lee, Jensen, McGonagle, and Samnani on workplace incivility (link):
Scholars have called for research on the antecedents of mistreatment in organizations such as workplace incivility, as well as the theoretical mechanisms that explain their linkage. To address this call, the present study draws upon social information processing and social cognitive theories to investigate the relationship between positive leader behaviors—those associated with charismatic leadership and ethical leadership—and workers’ experiences of workplace incivility through their perceptions of norms for respect. Relationships were separately examined in two field studies using multi-source data (employees and coworkers in study 1, employees and supervisors in study 2). Results suggest that charismatic leadership (study 1) and ethical leadership (study 2) are negatively related to employee experiences of workplace incivility through employee perceptions of norms for respect. Norms for respect appear to operate as a mediating mechanism through which positive forms of leadership may negatively relate to workplace incivility. The paper concludes with a discussion of implications for organizations regarding leader behaviors that foster norms for respect and curb uncivil behaviors at work.
David Hess, an expert on corporate corruption, takes a similar approach to the problem of corruption and bribery by officials of multinational corporations (link). Hess argues that bribery often has to do with organizational culture and individual behavior, and that effective steps to reduce the incidence of bribery must proceed on the basis of an adequate analysis of both culture and behavior. And he links this issue to fundamental problems in the area of corporate social responsibility.
Corporations must combat corruption. By allowing their employees to pay bribes they are contributing to a system that prevents the realization of basic human rights in many countries. Ensuring that employees do not pay bribes is not accomplished by simply adopting a compliance and ethics program, however. This essay provided a brief overview of why otherwise good employees pay bribes in the wrong organizational environment, and what corporations must focus on to prevent those situations from arising. In short, preventing bribe payments must be treated as an ethical issue, not just a legal compliance issue, and the corporation must actively manage its corporate culture to ensure it supports the ethical behavior of employees.
As this passage emphasizes, Hess believes that controlling corrupt practices requires changing incentives within the corporation while equally changing the ethical culture of the corporation; he believes that the ethical culture of a company can have effects on the degree to which employees engage in bribery and other corrupt practices.

What these examples have in common -- and other examples are available as well -- is that intangible features of the work environment are likely to influence the behavior of the actors in that environment, and thereby affect the favorable and unfavorable outcomes of the organization's functioning as well. Moreover, if we take the lead offered by Schein and work on the assumption that leaders can influence culture through their advocacy for the values that the organization embodies, then leadership has a core responsibility to facilitate a work culture that supports these favorable outcomes. Work culture can be cultivated to encourage safety and to discourage bad outcomes like sexual harassment and corruption.

Tuesday, September 3, 2019

The US Chemical Safety Board


The Federal agency responsible for investigating chemical and petrochemical accidents in the United States is the Chemical Safety Board (link). The mission of the Board is described in these terms:
The CSB is an independent federal agency charged with investigating industrial chemical accidents. Headquartered in Washington, DC, the agency’s board members are appointed by the President and confirmed by the Senate.
The CSB’s mission is to “drive chemical safety change through independent investigation to protect people and the environment.”
The CSB’s vision is “a nation safe from chemical disasters.”
The CSB conducts root cause investigations of chemical accidents at fixed industrial facilities. Root causes are usually deficiencies in safety management systems, but can be any factor that would have prevented the accident if that factor had not occurred. Other accident causes often involve equipment failures, human errors, unforeseen chemical reactions or other hazards. The agency does not issue fines or citations, but does make recommendations to plants, regulatory agencies such as the Occupational Safety and Health Administration (OSHA) and the Environmental Protection Agency (EPA), industry organizations, and labor groups. Congress designed the CSB to be non-regulatory and independent of other agencies so that its investigations might, where appropriate, review the effectiveness of regulations and regulatory enforcement.
CSB was legislatively conceived in analogy with the National Transportation Safety Board, and its sole responsibility is to conduct investigations of major chemical accidents in the United States and report its findings to the public. It is not subordinate to OSHA or EPA, but it collaborates with those (and other) Federal agencies as appropriate (link). It has no enforcement powers; its sole function is to investigate, report, and recommend when serious chemical or petrochemical accidents have occurred.

One of its most important investigations concerned the March 23, 2005 Texas City BP refinery explosion. A massive explosion resulted in the deaths of 15 workers, injuries to over 170 workers, and substantial destruction of the refinery infrastructure. CSB conducted an extensive investigation into the “root causes” of the accident, and assigned substantial responsibility to BP’s corporate management of the facility. Here is the final report of that investigation (link), and here is a video prepared by CSB summarizing its main findings (link).

The key findings of the CSB report focus on the responsibility of BP management for the accident. Here is a summary of the CSB assessment of root causes:

The BP Texas City tragedy is an accident with organizational causes embedded in the refinery’s culture. The CSB investigation found that organizational causes linked the numerous safety system failures that extended beyond the ISOM unit. The organizational causes of the March 23, 2005, ISOM explosion are

  • BP Texas City lacked a reporting and learning culture. Reporting bad news was not encouraged, and often Texas City managers did not effectively investigate incidents or take appropriate corrective action.
  • BP Group lacked focus on controlling major hazard risk. BP management paid attention to, measured, and rewarded personal safety rather than process safety.
  • BP Group and Texas City managers provided ineffective leadership and oversight. BP management did not implement adequate safety oversight, provide needed human and economic resources, or consistently model adherence to safety rules and procedures.
  • BP Group and Texas City did not effectively evaluate the safety implications of major organizational, personnel, and policy changes.
Underlying almost all of these failures to manage this complex process with a priority on “process safety” rather than simply personal safety is a corporate mandate for cost reduction:
In late 2004, BP Group refining leadership ordered a 25 percent budget reduction “challenge” for 2005. The Texas City Business Unit Leader asked for more funds based on the conditions of the Texas City plant, but the Group refining managers did not, at first, agree to his request. Initial budget documents for 2005 reflect a proposed 25 percent cutback in capital expenditures, including on compliance, HSE, and capital expenditures needed to maintain safe plant operations.[208] The Texas City Business Unit Leader told the Group refining executives that the 25 percent cut was too deep, and argued for restoration of the HSE and maintenance-related capital to sustain existing assets in the 2005 budget. The Business Unit Leader was able to negotiate a restoration of less than half the 25 percent cut; however, he indicated that the news of the budget cut negatively affected workforce morale and the belief that the BP Group and Texas City managers were sincere about culture change. (176)
And what about corporate accountability? What did BP have to pay in recompense for its faulty management of the Texas City refinery and the subsequent damages to workers and local residents? The answer is, remarkably little. OSHA assessed a fine of $50.6 million against BP for violations of safety regulations (link, link), and BP committed to spend at least $500M to take corrective steps within the plant to protect the safety of workers. This was a record fine at the time; and yet it might very well be seen by BP corporate executives as a modest cost of doing business in this industry. It does not seem to be of the magnitude that would lead to fundamental change of culture, action, and management within the company.

BP commissioned a major review of BP refinery safety in all five of its US-based refineries following release of the CSB report. This study became the Baker Panel report, Report of the BP U.S. Refineries Independent Safety Review Panel (January 2007) (link). The Baker Panel consisted of fully qualified experts on industrial and technological safety who were in a very good position to assess the safety management and culture of BP in its operations of its five US-based refineries. The Baker Panel was specifically directed to refrain from attempting to analyze responsibility for the Texas City disaster and to focus its efforts on assessing the safety culture and management direction that were currently to be found in BP's five refineries. Here are some central findings:
  • Based on its review, the Panel believes that BP has not provided effective process safety leadership and has not adequately established process safety as a core value across all its five U.S. refineries.
  • BP has not always ensured that it identified and provided the resources required for strong process safety performance at its U.S. refineries. Despite having numerous staff at different levels of the organization that support process safety, BP does not have a designated, high-ranking leader for process safety dedicated to its refining business.
  • The Panel also found that BP did not effectively incorporate process safety into management decision-making. BP tended to have a short-term focus, and its decentralized management system and entrepreneurial culture have delegated substantial discretion to U.S. refinery plant managers without clearly defining process safety expectations, responsibilities, or accountabilities.
  • BP has not instilled a common, unifying process safety culture among its U.S. refineries.
  • While all of BP’s U.S. refineries have active programs to analyze process hazards, the system as a whole does not ensure adequate identification and rigorous analysis of those hazards.
  • The Panel’s technical consultants and the Panel observed that BP does have internal standards and programs for managing process risks. However, the Panel’s examination found that BP’s corporate safety management system does not ensure timely compliance with internal process safety standards and programs at BP’s five U.S. refineries.
  • The Panel also found that BP’s corporate safety management system does not ensure timely implementation of external good engineering practices that support and could improve process safety performance at BP’s five U.S. refineries. (Summary of findings, xii-xiii)
These findings largely validate and support the critical assessment of BP's safety management practices in the CSB report.

It seems clear that an important part of the substantial improvement that has occurred in aviation safety in the past fifty years is the effective investigation and reporting provided by the NTSB. NTSB is an authoritative and respected bureau of experts whom the public trusts when it comes to discovering the causes of aviation disasters. The CSB has a much shorter institutional history -- it was created in 1990 -- but we need to ask a parallel question here as well: Does the CSB provide a strong lever for improving safety practices in the chemical and petrochemical industries through its accident investigations; or are industry actors largely free to continue their poor management practices indefinitely, safe in the realization that large chemical accidents are rare and the costs of occasional liability judgments are manageable?

Monday, August 12, 2019

Testing the NRC


Serious nuclear accidents are rare but potentially devastating to people, land, and agriculture. (It appears that minor to moderate nuclear accidents are not nearly so rare, as James Mahaffey shows in Atomic Accidents: A History of Nuclear Meltdowns and Disasters: From the Ozark Mountains to Fukushima.) Three Mile Island, Chernobyl, and Fukushima are disasters that have given the public a better idea of how nuclear power reactors can go wrong, with serious and long-lasting effects. Reactors are also among the most complex industrial systems around, and accidents are common in complex, tightly coupled industrial systems. So how can we have reasonable confidence in the safety of nuclear reactors?

One possible answer is that we cannot have reasonable confidence at all. However, there are hundreds of large nuclear reactors in the world, and 98 active nuclear reactors in the United States alone. So it is critical to have highly effective safety regulation and oversight of the nuclear power industry. In the United States that regulatory authority rests with the Nuclear Regulatory Commission. So we need to ask the question: how good is the NRC at regulating, inspecting, and overseeing the safety of nuclear reactors in our country?

One would suppose that there would be excellent and detailed studies within the public administration literature that attempt to answer this question, and we might expect that researchers within the field of science and technology studies might have addressed it as well. However, this seems not to be the case. I have yet to find a full-length study of the NRC as a regulatory agency, and the NRC is mentioned only twice in the 600-plus page Oxford Handbook of Regulation. However, we can get an oblique view of the workings of the NRC through other sources. One set of observers who are in a position to evaluate the strengths and weaknesses of the NRC are nuclear experts who are independent of the nuclear industry. For example, publications from the Bulletin of the Atomic Scientists include many detailed reports on the operations and malfunctions of nuclear power plants that permit a degree of assessment of the quality of oversight provided by the NRC (link). And a detailed (and scathing) report by the General Accounting Office on the near-disaster at the Davis-Besse nuclear power plant is another expert assessment of NRC functioning (link).

David Lochbaum, Edwin Lyman, and Susan Stranahan fit the description of highly qualified independent scientists and observers, and their detailed case history of the Fukushima disaster provides a degree of insight into the workings of the NRC as well as the Japanese nuclear safety agency. Their book, Fukushima: The Story of a Nuclear Disaster, is jointly written by the authors under the auspices of the Union of Concerned Scientists, one of the best informed networks of nuclear experts we have in the United States. Lochbaum is director of the UCS Nuclear Safety Project and author of Nuclear Waste Disposal Crisis. The book provides a careful and scientific treatment of the unfolding of the Fukushima disaster hour by hour, and highlights the background errors that were made by regulators and owners in the design and operation of the Fukushima plant as well. The book makes numerous comparisons to the current workings of the NRC which permit a degree of assessment of the US regulatory agency.

In brief, Lochbaum and his co-authors appear to have a reasonably high opinion of the technical staff, scientists, and advisors who prepare recommendations for NRC consideration, but a low opinion of the willingness of the five commissioners to adopt costly recommendations that are strongly opposed by the nuclear industry. The authors express frustration that the nuclear safety agencies in both countries appear to have failed to learn important lessons from the Fukushima disaster:
The [Japanese] government simply seems in denial about the very real potential for another catastrophic accident.... In the United States, the NRC has also continued operating in denial mode. It turned down a petition requesting that it expand emergency evacuation planning to twenty-five miles from nuclear reactors despite the evidence at Fukushima that dangerous levels of radiation can extend at least that far if a meltdown occurs. It decided to do nothing about the risk of fire at over-stuffed spent fuel pools. And it rejected the main recommendation of its own Near-Term Task Force to revise its regulatory framework. The NRC and the industry instead are relying on the flawed FLEX program as a panacea for any and all safety vulnerabilities that go beyond the “design basis.” (kl 117)
They believe that the NRC is excessively vulnerable to influence by the nuclear power industry and to elected officials who favor economic growth over hypothetical safety concerns, with the result that it tends to err in favor of the economic interests of the industry.
Like many regulatory agencies, the NRC occupies uneasy ground between the need to guard public safety and the pressure from the industry it regulates to get off its back. When push comes to shove in that balancing act, the nuclear industry knows it can count on a sympathetic hearing in Congress; with millions of customers, the nation’s nuclear utilities are an influential lobbying group. (36)
They note that the NRC has consistently declined to undertake more substantial reform of its approach to safety, as recommended by its own panel of experts. The key recommendation of the Near-Term Task Force (NTTF) was that the regulatory framework should be anchored in a more strenuous standard of accident prevention, requiring plant owners to address "beyond-design-basis accidents". The Fukushima earthquake and tsunami events were "beyond-design-basis"; nonetheless, they occurred, and the NTTF recommended that safety planning should incorporate consideration of these unlikely but possible events.
The task force members believed that once the first proposal was implemented, establishing a well-defined framework for decision making, their other recommendations would fall neatly into place. Absent that implementation, each recommendation would become bogged down as equipment quality specifications, maintenance requirements, and training protocols got hashed out on a case-by-case basis. But when the majority of the commissioners directed the staff in 2011 to postpone addressing the first recommendation and focus on the remaining recommendations, the game was lost even before the opening kickoff. The NTTF’s Recommendation 1 was akin to the severe accident rulemaking effort scuttled nearly three decades earlier, when the NRC considered expanding the scope of its regulations to address beyond-design accidents. Then, as now, the perceived need for regulatory “discipline,” as well as industry opposition to an expansion of the NRC’s enforcement powers, limited the scope of reform. The commission seemed to be ignoring a major lesson of Fukushima Daiichi: namely, that the “fighting the last war” approach taken after Three Mile Island was simply not good enough. (kl 253)
As a result, "regulatory discipline" (essentially the pro-business ideology that holds that regulation should be kept to a minimum) prevailed, and the primary recommendation was tabled. The issue was of great importance, in that it involved setting the standard of risk and accident severity for which the owner needed to plan. By staying with the lower standard, the NRC left the door open to the most severe kinds of accidents.

The task force also addressed the issue of "delegated regulation", in which the agency defers to the industry on many issues of certification and risk assessment. (Here is the FAA's definition of delegated regulation; link.)
The task force also wanted the NRC to reduce its reliance on industry voluntary initiatives, which were largely outside of regulatory control, and instead develop its own “strong program for dealing with the unexpected, including severe accidents.” (252)
Other more detail-oriented recommendations were refused as well -- for example, a requirement to install reliable hardened containment vents in boiling water reactors, with a requirement that these vents should incorporate filters to remove radioactive gas before venting. 
But what might seem a simple, logical decision—install a $15 million filter to reduce the chance of tens of billions of dollars’ worth of land contamination as well as harm to the public—got complicated. The nuclear industry launched a campaign to persuade the NRC commissioners that filters weren’t necessary. A key part of the industry’s argument was that plant owners could reduce radioactive releases more effectively by using FLEX equipment.... In March 2013, they voted 3–2 to delay a requirement that filters be installed, and recommended that the staff consider other alternatives to prevent the release of radiation during an accident. (254)
The NRC voted against requiring filters on containment vents, a decision based on industry arguments that the filters were unnecessary and their cost excessive.

The authors argue that the NRC needs to significantly rethink its standards of safety and foreseeable risk.
What is needed is a new, commonsense approach to safety, one that realistically weighs risks and counterbalances them with proven, not theoretical, safety requirements. The NRC must protect against severe accidents, not merely pretend they cannot occur. (257)
Their recommendation is to make use of an existing and rigorous plan for reactor safety incorporating the results of "severe accident mitigation alternatives" (SAMA) analysis already performed -- but largely disregarded.

However, they are not optimistic that the NRC will be willing to undertake these substantial changes that would significantly enhance safety and make a Fukushima-scale disaster less likely. Reporting on a post-Fukushima conference sponsored by the NRC, they write:
But by now it was apparent that little sentiment existed within the NRC for major changes, including those urged by the commission’s own Near-Term Task Force to expand the realm of “adequate protection.”
Lochbaum and his co-authors also make an intriguing series of points about the use of modeling and simulation in the effort to evaluate safety in nuclear plants. They agree that simulation methods are an essential part of the toolkit for nuclear engineers seeking to evaluate accident scenarios; but they argue that the simulation tools currently available (or perhaps ever available) fall far short of the precision sometimes attributed to them. So simulation tools sometimes give a false sense of confidence in the existing safety arrangements in a particular setting.
Even so, the computer simulations could not reproduce numerous important aspects of the accidents. And in many cases, different computer codes gave different results. Sometimes the same code gave different results depending on who was using it. The inability of these state-of-the-art modeling codes to explain even some of the basic elements of the accident revealed their inherent weaknesses—and the hazards of putting too much faith in them. (263)
In addition to specific observations about the functioning of the NRC, the authors identify chronic failures in the nuclear power system in Japan that should be of concern in the United States as well. Conflict of interest, falsification of records, and punishment of whistleblowers were part of the culture of nuclear power and nuclear regulation in Japan, and the same problems can arise here. Here are examples of the problems they identify in the Japanese nuclear power system; it is a valuable exercise to ask whether each of them arises in the US regulatory environment too.

Non-compliance and falsification of records in Japan
Headlines scattered over the decades built a disturbing picture. Reactor owners falsified reports. Regulators failed to scrutinize safety claims. Nuclear boosters dominated safety panels. Rules were buried for years in endless committee reviews. “Independent” experts were financially beholden to the nuclear industry for jobs or research funding. “Public” meetings were padded with industry shills posing as ordinary citizens. Between 2005 and 2009, as local officials sponsored a series of meetings to gauge constituents’ views on nuclear power development in their communities, NISA encouraged the operators of five nuclear plants to send employees to the sessions, posing as members of the public, to sing the praises of nuclear technology. (46)
The authors do not provide evidence about similar practices in the United States, though the history of the Davis-Besse nuclear plant in Ohio suggests that similar things happen in the US industry. Charles Perrow treats the Davis-Besse near-disaster in a fair amount of detail; link. Descriptions of the Davis-Besse nuclear incident can be found here, here, here, and here.
Conflict of interest
Shortly after the Fukushima accident, Japan’s Yomiuri Shimbun reported that thirteen former officials of government agencies that regulate energy companies were currently working for TEPCO or other power firms. Another practice, known as amaagari, “ascent to heaven,” spins the revolving door in the opposite direction. Here, the nuclear industry sends retired nuclear utility officials to government agencies overseeing the nuclear industry. Again, ferreting out safety problems is not a high priority.
Punishment of whistle-blowers
In 2000, Kei Sugaoka, a nuclear inspector working for GE at Fukushima Daiichi, noticed a crack in a reactor’s steam dryer, which extracts excess moisture to prevent harm to the turbine. TEPCO directed Sugaoka to cover up the evidence. Eventually, Sugaoka notified government regulators of the problem. They ordered TEPCO to handle the matter on its own. Sugaoka was fired. (47)
There is a similar story in the Davis-Besse plant history.

Factors that interfere with effective regulation

In summary: there appear to be several structural factors that make nuclear regulation less effective than it needs to be.

First is the political power and influence of the nuclear industry itself. This was a major factor in the background of the Chernobyl disaster as well, where generals and party officials pushed incessantly for rapid completion of reactors (Serhii Plokhy, Chernobyl: The History of a Nuclear Catastrophe). Lochbaum and his collaborators demonstrate the power that TEPCO had in shaping the regulations under which it built the Fukushima complex, including the assumptions that were incorporated about earthquake risk and tsunami risk. Charles Perrow demonstrates a comparable ability by the nuclear industry in the United States to influence the rules and procedures that govern its use of nuclear power (link). This influence permits the owners of nuclear power plants to shape the content of regulation as well as the systems of inspection and oversight that the agency adopts.

Second, and related, is the set of influences and lobbying points that come from the needs of the economy and the production pressures of the energy industry. (Interestingly enough, this was also a major influence on Soviet decision-making in choosing the graphite-moderated, light-water-cooled reactor for use at Chernobyl and numerous other plants in the 1960s; Serhii Plokhy, Chernobyl: The History of a Nuclear Catastrophe.)

Third is the fact emphasized by Charles Perrow that the NRC is primarily governed by Congress, and legislators are themselves vulnerable to the pressures and blandishments of the industry and demands for a low-regulation business environment. This makes it difficult for the NRC to carry out its role as independent guarantor of the health and safety of the public. Here is Perrow's description of the problem in The Next Catastrophe: Reducing Our Vulnerabilities to Natural, Industrial, and Terrorist Disasters (quoting Lochbaum from a 2004 Union of Concerned Scientists report):
With utilities profits falling when the NRC got tough after the Time story, the industry not only argued that excessive regulation was the problem, it did something about what it perceived as harassment. The industry used the Senate subcommittee that controls the agency’s budget, headed by a pro-nuclear Republican senator from New Mexico, Pete Domenici. Using the committee’s funds, he commissioned a special study by a consulting group that was used by the nuclear industry. It recommended cutting back on the agency’s budget and size. Using the consultant’s report, Domenici “declared that the NRC could get by just fine with a $90 million budget cut, 700 fewer employees, and a greatly reduced inspection effort.” (italics supplied) The beefed-up inspections ended soon after the threat of budget cuts for the agency. (Mangels 2003) And the possibility for public comment was also curtailed, just for good measure. Public participation in safety issues once was responsible for several important changes in NRC regulations, says David Lochbaum, a nuclear safety engineer with the Union of Concerned Scientists, but in 2004, the NRC, bowed to industry pressure and virtually eliminated public participation. (Lochbaum 2004) As Lochbaum told reporter Mangels, “The NRC is as good a regulator as Congress permits it to be. Right now, Congress doesn’t want a good regulator.”  (The Next Catastrophe, kl 2799)
A fourth important factor is a pervasive complacency within the professional nuclear community about the inherent safety of nuclear power. This is a factor mentioned by Lochbaum:
Although the accident involved a failure of technology, even more worrisome was the role of the worldwide nuclear establishment: the close-knit culture that has championed nuclear energy—politically, economically, socially—while refusing to acknowledge and reduce the risks that accompany its operation. Time and again, warning signs were ignored and near misses with calamity written off. (kl 87)
This is what we might call an ideological or cultural factor, in that it describes a mental framework for thinking about the technology and the public. It is a very real factor in decision-making, both within the industry and in the regulatory world. Senior nuclear engineering experts at major research universities seem to share the view that the public "fear" of nuclear power is entirely misplaced, given the safety record of the industry. They believe the technical problems of nuclear power generation have been solved, and that a rational society would embrace nuclear power without anxiety. For a rebuttal to this complacency, see Rose and Sweeting's report in the Bulletin of the Atomic Scientists, "How safe is nuclear power? A statistical study suggests less than expected" (link). Here is the abstract to their paper:
After the Fukushima disaster, the authors analyzed all past core-melt accidents and estimated a failure rate of 1 per 3704 reactor years. This rate indicates that more than one such accident could occur somewhere in the world within the next decade. The authors also analyzed the role that learning from past accidents can play over time. This analysis showed few or no learning effects occurring, depending on the database used. Because the International Atomic Energy Agency (IAEA) has no publicly available list of nuclear accidents, the authors used data compiled by the Guardian newspaper and the energy researcher Benjamin Sovacool. The results suggest that there are likely to be more severe nuclear accidents than have been expected and support Charles Perrow’s “normal accidents” theory that nuclear power reactors cannot be operated without major accidents. However, a more detailed analysis of nuclear accident probabilities needs more transparency from the IAEA. Public support for nuclear power cannot currently be based on full knowledge simply because important information is not available.
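The arithmetic behind the abstract's "more than one such accident ... within the next decade" can be sketched as follows. This is only an illustration, not Rose and Sweeting's own method: the rate of 1 per 3704 reactor-years comes from the abstract, but the fleet size of 440 reactors and the Poisson assumption (accidents treated as independent events at a constant rate) are assumptions introduced here, and the function names are hypothetical.

```python
import math

# Rate quoted in the abstract: one core-melt accident per 3704 reactor-years.
RATE_PER_REACTOR_YEAR = 1 / 3704

# ASSUMPTION (not from the abstract): roughly 440 reactors operating worldwide.
ASSUMED_REACTORS = 440


def expected_accidents(reactors: int, years: float) -> float:
    """Expected number of core-melt accidents over the given reactor-years."""
    return reactors * years * RATE_PER_REACTOR_YEAR


def prob_at_least_one(reactors: int, years: float) -> float:
    """Chance of one or more accidents, ASSUMING a Poisson process."""
    return 1 - math.exp(-expected_accidents(reactors, years))


if __name__ == "__main__":
    decade = 10
    print("expected accidents:", round(expected_accidents(ASSUMED_REACTORS, decade), 2))
    print("P(at least one):", round(prob_at_least_one(ASSUMED_REACTORS, decade), 2))
```

Under these assumptions a decade amounts to 4,400 reactor-years, which gives an expected count of about 1.19 accidents. That is consistent with the abstract's claim, though the result scales directly with whatever fleet size is assumed.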
Lee Clarke's book on planning for disaster on the basis of unrealistic models and simulations is relevant here. In Mission Improbable: Using Fantasy Documents to Tame Disaster, Clarke argues that much of the planning currently in place for large-scale disasters depends upon models, simulations, and scenario-building tools in which we should have very little confidence.

The complacency about nuclear safety mentioned here makes safety regulation more difficult and, paradoxically, makes the safe use of nuclear power more unlikely. Only when the risks are confronted with complete transparency and honesty will it be possible to design regulatory systems that do an acceptable job of ensuring the safety and health of the public.

In short, Lochbaum and his co-authors seem to provide evidence for the conclusion that the NRC is not in a position to perform its primary function: to establish a rational and scientifically well-grounded set of standards for safe reactor design and operation. Further, its ability to enforce those standards through inspection seems impaired as well by the power and influence the nuclear industry can deploy through Congress to resist its regulatory efforts. Good expert knowledge is canvassed through the NRC's processes; but the policy recommendations that flow from this scientific analysis are all too often short-circuited by the ability of the industry to fend off new regulatory requirements. Lochbaum's comment quoted by Perrow above seems all too true: “The NRC is as good a regulator as Congress permits it to be. Right now, Congress doesn’t want a good regulator.”

It is very interesting to read the transcript of a 2014 hearing of the Senate Committee on Environment and Public Works titled "NRC'S IMPLEMENTATION OF THE FUKUSHIMA NEAR-TERM TASK FORCE RECOMMENDATIONS AND OTHER ACTIONS TO ENHANCE AND MAINTAIN NUCLEAR SAFETY" (link). Senator Barbara Boxer, California Democrat and chair of the committee, opened the meeting with these words:
Although Chairman Macfarlane said, when she announced her resignation, she had assured that ‘‘the agency implemented lessons learned from the tragic accident at Fukushima.’’ She said, ‘‘the American people can be confident that such an accident will never take place here.’’

I say the reality is not a single one of the 12 key safety recommendations made by the Fukushima Near-Term Task Force has been implemented. Some reactor operators are still not in compliance with the safety requirements that were in place before the Fukushima disaster. The NRC has only completed its own action [on] 4 of the 12 task force recommendations.
This is an alarming assessment, and one that is entirely in accord with the observations made by Lochbaum above.

Thursday, August 1, 2019

Pervasive organizational and regulatory failures


It is intriguing to observe how pervasive organizational and regulatory failures are in our collective lives. Once you are sensitized to these factors, you see them everywhere. A good example is in the business section of today's print version of the New York Times, August 1, 2019. There are at least five stories in this section that reflect the consequences of organizational and regulatory failure.

The first and most obvious story is one that has received frequent mention in Understanding Society, the Boeing 737 Max disaster. In a story titled "FAA oversight of Boeing scrutinized", the reporters give information about a Senate hearing on FAA oversight earlier this week. Members of the Senate Appropriations Committee questioned the FAA's current process for certifying new aircraft.
Citing the Times story, Ms. Collins raised concerns over “instances in which FAA managers appeared to be more concerned with Boeing’s production timeline, rather than the safety recommendations of its own engineers.”
Senator Jack Reed referred to the need for a culture change to rebalance the relationship between regulator and industry. Agency officials continued to defend the certification process, which delegates 96% of the work of certification to the manufacturer.

This story highlights two common sources of organizational and regulatory failure. There is first the fact of “production pressure” coming from the owner of a risky process, involving timing, supply of product, and profitability. This pressure leads the owner to push the organization hard in an effort to achieve goals -- often leading to safety and design failures. The second factor identified here is the structural imbalance that exists between powerful companies running complex and costly processes, and the safety agencies tasked to oversee and regulate their behavior. The regulatory agency, in this case the FAA, is under-resourced and lacks the expert staff needed to carry out serious, in-depth technical oversight. The article does not identify a third factor which has been noted in prior posts on the Boeing disaster: the influence which Boeing has on legislators, government officials, and the executive branch.

A second relevant story (on the same page as the Boeing story) refers to charges filed in Germany against the former CEO of Audi concerning his role in the vehicle emissions scandal. This is part of the long-standing deliberate effort by Volkswagen to deceive regulators about the emissions characteristics of its diesel engine and exhaust systems. The charges against the Audi executive involve ordering the development of software designed to cheat diesel emissions testing for the company's vehicles. This ongoing story is primarily a story about corporate dysfunction, in which corporate leaders were involved in unethical and dishonest activities on behalf of the company. Regulatory failure is not a prominent part of this story, because the efforts at deception were so carefully calculated that it is difficult to see how normal standards of regulatory testing could have defeated them. Here the pressing problem is to understand how professional, experienced executives could have been led to undertake such actions, and how the corporation was vulnerable to this kind of improper behavior at multiple levels. Presumably there were staff at multiple levels within these automobile companies who were aware of improper behavior. The story quotes a mid-level staff person who writes in an email that “we won’t make it without a few dirty tricks.” So the difficult question for these corporations is why their internal systems failed to take note of dangerously improper behavior. The costs to Volkswagen and Audi in liability judgments and government penalties are truly vast, and surely outweigh the possible gains of the deception. These costs in the United States alone exceed $22 billion.

A similar story, this time from the tech industry, concerns a settlement by Cisco Systems of civil claims “that it sold video surveillance technology that it knew had a significant security flaw to federal, state and local government agencies.” Here again we find a case of a corporation's dishonesty concerning some of its central products, leading to a public finding of malfeasance. The hard question is, what systems are in place for companies like Cisco that ensure ethical and honest presentation of the characteristics and potential defects of the products that they sell? The imperatives of working always to maximize profits and reduce costs lead to many kinds of dysfunctions within organizations, but this is a well understood hazard. So profit-based companies need to have active and effective programs in place that encourage and enforce honest and safe practices by managers, executives, and frontline workers. Plainly those programs broke down at Cisco, Volkswagen, and Audi. (One of the very useful features of Tom Beauchamp's book Case Studies in Business, Society, and Ethics is the light Beauchamp sheds through case studies on the genesis of unethical and dishonest behavior within a corporate setting.)

Now we go on to Christopher Flavelle's story about home-building in flood zones. From a social point of view, it makes no sense to continue to build homes, hotels, and resorts in flood zones. The increasing destruction of violent storms and extreme weather events has been evident at least since the devastation of Hurricane Katrina. Flavelle writes:
There is overwhelming scientific consensus that rising temperatures will increase the frequency and severity of coastal flooding caused by hurricanes, storm surges, heavy rain and tidal floods. At the same time there is the long-term threat of rising seas pushing the high-tide line inexorably inland.
However, Flavelle reports research by Climate Central that shows that the rate of home-building in flood zones since 2010 exceeds the rate of home-building in non-flood zones in eight states. So what are the institutional and behavioral factors that produce this amazingly perverse outcome? The article refers to incentives of local municipalities in generating property-tax revenues and of potential homeowners subject to urban sprawl and desires for second-home properties on the water. Here is a tragically short-sighted development official in Galveston who finds that "the city has been able to deal with the encroaching water, through the installation of pumps and other infrastructure upgrades": "You can build around it, at least for the circumstances today. It's really not affected the vitality of things here on the island at all." The factor that is not emphasized in this article is the role played by the National Flood Insurance Program in the problem of coastal (and riverine) development. If flood insurance rates were calculated in terms of the true riskiness of the proposed residence, hotel, or resort, then it would no longer be economically attractive to do the development. But, as the article makes clear, local officials do not like that answer because it interferes with "development" and property tax growth. ProPublica has an excellent 2013 story on the perverse incentives created by the National Flood Insurance Program, and its inequitable impact on wealthier home-owners and developers (link). Here is an article by Christine Klein and Sandra Zellmer in the SMU Law Review on the dysfunctions of Federal flood policy (link):
Taken together, the stories reveal important lessons, including the inadequacy of engineered flood control structures such as levees and dams, the perverse incentives created by the national flood insurance program, and the need to reform federal leadership over flood hazard control, particularly as delegated to the Army Corps of Engineers.
Here is a final story from the business section of the New York Times illustrating organizational and regulatory dysfunctions -- this time from the interface between the health industry and big tech. The story here is an effort that is being made by DeepMind researchers to use artificial intelligence techniques to provide early diagnosis of otherwise mysterious medical conditions like "acute kidney injury" (AKI). The approach proceeds by analyzing large numbers of patient medical records and attempting to identify precursor conditions that would predict the occurrence of AKI. The primary analytical tool mentioned in the article is the set of algorithms associated with neural networks. In this instance the organizational / regulatory dysfunction is latent rather than explicit and has to do with patient privacy. DeepMind is a business unit within Alphabet, the Google empire of businesses. DeepMind researchers gained access to large volumes of patient data from the UK National Health Service. There is now regulatory concern in the UK and the US concerning the privacy of patients whose data may wind up in the DeepMind analysis and ultimately in Google's direct control. "Some critics question whether corporate labs like DeepMind are the right organization to handle the development of technology with such broad implications for the public." Here the issue is a complicated one. It is of course a good thing to be able to diagnose disorders like AKI in time to be able to correct them. But the misuse and careless custody of user data by numerous big tech companies, including especially Facebook, suggests that sensitive personal data like medical files need to be carefully secured by effective legislation and regulation. And so far the regulatory system appears to be inadequate for the protection of individual privacy in a world of massive databases and large-scale computing capabilities.
The recent FTC $5 billion settlement imposed on Facebook, large as it is, may not suffice to change the business practices of Facebook (link).

(I didn't find anything in the sports section today that illustrates organizational and regulatory dysfunction, but of course these kinds of failures occur in professional and college sports as well. Think of doping scandals in baseball, cycling, and track and field, sexual abuse scandals in gymnastics and swimming, and efforts by top college football programs to evade NCAA regulations on practice time and academic performance.)