Thursday, November 14, 2019

Electronic Health Records and medical mistakes


Electronic Health Record systems (EHRs) have been broadly implemented by hospitals and health systems around the country as a way of increasing the accuracy, availability, and timeliness of patient health status and treatment information. (These systems are also sometimes called "Digital Medical Records" (DMRs).) They are generally regarded as an important forward step in improving the quality of healthcare. Here is a description of the advantages of Electronic Health Record systems, according to Athena Health:

The advantages of electronic health records in the clinical setting are numerous and important. In the 2012 edition of the Physician Sentiment Index™, published by athenahealth and Sermo, 81% of physicians said they believe EHRs improve access to clinical data. More than two-thirds said an EHR can actually improve patient care.

The use of an electronic health records system offers these clinical advantages:
  • No bulky paper records to store, manage and retrieve
  • Easier access to clinical data
  • The ability to establish and maintain effective clinical workflows
  • Fewer medical errors, improved patient safety and stronger support for clinical decision-making
  • Easier participation in Meaningful Use, Patient-Centered Medical Home (PCMH) and other quality programs, with electronic prompts ensuring that required data is recorded at the point of care
  • The ability to gather and analyze patient data that enables outreach to discreet populations
  • The opportunity to interact seamlessly with affiliated hospitals, clinics, labs and pharmacies
Considering all the advantages of electronic health records, and the rapidly growing electronic interconnectedness of the health care world, even if EHRs had not been mandated by health care reform, their development and eventual ubiquity in the health care industry was inevitable.

And yet, like any software system, EHR systems are capable of creating new errors; and some of those errors can be harmful to patients.

Nancy Leveson is an important expert on software system safety who has written extensively on the challenges of writing highly reliable software in safety-critical applications. Here are a few apt observations from her book Safeware: System Safety and Computers (1995).
Although it might seem that automation would decrease the risk of operator error, the truth is that automation does not remove people from systems -- it merely moves them to maintenance and repair functions and to higher-level supervisory control and decision making. The effects of human decisions and actions can then be extremely serious. At the same time, the increased system complexity makes the decision-making process more difficult. (10)
The increased pace of change lessens opportunity to learn from experience. Small-scale and relatively nonhazardous systems can evolve gradually by trial and error. But learning by trial and error is not possible for many modern products and processes because the pace of change is too fast and the penalties of failure are too great. Design and operating procedures must be right the first time when there is potential for a major fire, explosion, or release of toxic materials. (12)
(To the last statement we might add "or harm to hospital patients through incorrect prescriptions or failed transmission of lab results".)
The safety implications of computers exercising direct control over potentially dangerous processes are obvious. Less obvious are the dangers when ... software generated data is used to make safety-critical decisions, ... software is used in design analysis, ... safety-critical data (such as blood bank data) is stored in computer databases. The FDA has received reports of software errors in medical instruments that led to mixing up patient names and data, as well as reports of incorrect outputs from laboratory and diagnostic instruments (such as patient monitors, electrocardiogram analyzers, and imaging devices). (23)
Automatic control systems [like aircraft autopilots] are designed to cope with the immediate effects of a deviation in the process -- they are feedback loops that attempt to maintain a constant system state, and as such, they mask the occurrence of a problem in its early stages. An operator will be aware of such problems only if adequate information to detect them is provided. That such information is often not provided may be the result of the different mental models of the designers and experienced operators, or it may merely reflect financial pressures on designers due to the cost of providing operators with independent information. (117)
One of the cases examined in detail in Safeware is the Therac-25 radiation-therapy device, which, due to a minor software flaw in the treatment-entry plan module, began seriously injuring patients with excessive doses of radiation in 1985-87 (515 ff.). It had operated without incident thousands of times before the first accident.

So Leveson gives ample reason to be cautious about the safety implications of DMRs and the "fault pathways" through which their normal functioning might harm patients. What has been the experience so far, now that the healthcare industry has witnessed widespread adoption of DMR systems?

Two specific issues involving EHR errors affecting patient care have been identified in the past several years. The first is in the area of errors in the administration of prescription drugs, and the second is in the area of the handling and routing of medical test results. Both kinds of error have the potential to harm the patient.

Jennifer Bresnick (link) summarizes the results of a report by the Pennsylvania Patient Safety Authority concerning medication errors caused by DMR systems. Medication errors (wrong medication, wrong dose, wrong patient, wrong frequency) can occur at several stages of the clinical process, including prescribing, transcribing, dispensing, and administration. The digital medical record is intended to dramatically reduce all these sources of error, but the Pennsylvania study shows that the DMR can also contribute to errors at each of these stages.
While EHRs and other technologies are intended to reduce errors and improve the safe, standardized, and well-documented delivery of care, some stakeholders believe that digital tools can simply serve to swap one set of mistakes for another. Poor implementation and lackluster user training can leave patients just as vulnerable to medication errors as they were when providers used paper charts, commented Staley Lawes, PharmD, BCPS, Patient Safety Analyst, and Matthew Grissinger, RPh, FISMP, FASCP, Manager of Medication Safety Analysis in the brief. (link)
Part of the blame, according to the Pennsylvania report, belongs to the design of the user interface:
For this reason, it is important to design a system with an intuitive user interface to minimize the risk for human error. Users should be able to easily enter and retrieve data and share information with other healthcare professionals.  When systems are designed without these considerations in mind, patients are subject to undue risk. (link)
The report contains several specific design standards that would improve the safety of the DMR system:
“The interaction between clinician and software is a key component that is to be taken into consideration when trying to improve the safety of health IT,” the report says. “Incident reports can provide valuable information about the types of HIT-related issues that can cause patient harm, and ongoing HIT system surveillance can help in developing medication safety interventions.” (link)
It is clear that ongoing health IT system surveillance and remedial interventions are needed. Efforts to improve health IT safety should include attention to software interoperability, usability, and workflow. The relationship between clinician and software includes complex interactions that must be considered to optimize health IT’s contribution to medication safety.
Yackel and Embi (link) treat the problem of test result management errors in "Unintended errors with EHR-based result management: a case series". Here is their abstract:
Test result management is an integral aspect of quality clinical care and a crucial part of the ambulatory medicine workflow. Correct and timely communication of results to a provider is the necessary first step in ambulatory result management and has been identified as a weakness in many paper-based systems. While electronic health records (EHRs) hold promise for improving the reliability of result management, the complexities involved make this a challenging task. Experience with test result management is reported, four new categories of result management errors identified are outlined, and solutions developed during a 2-year deployment of a commercial EHR are described. Recommendations for improving test result management with EHRs are then given.
They identify test management errors at four stages of the clinical process:
  • results not correctly communicated to provider;
  • results communicated but never received or reviewed by the provider;
  • results reviewed, but appropriate action not recommended by provider;
  • appropriate recommendation made by provider, but action not carried out.
They make several key recommendations for improving the performance of DMR systems in managing test results: Develop fault-tolerant systems that automatically report delivery failures; use robust testing to find rare errors that occur both within and between systems; implement tracking mechanisms for critical tests, such as cancer screening and diagnostics; and deliver results directly to patients.
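The tracking recommendation can be made concrete with a small sketch. This is only an illustration of the idea, not anything described by Yackel and Embi: the names `Stage`, `TrackedResult`, and `overdue`, and the 4-hour and 24-hour time limits, are hypothetical choices. Each stage corresponds to one of the four points above at which a result can be lost, and any result that sits too long at a stage is reported rather than silently dropped.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import List, Optional, Tuple


class Stage(Enum):
    """Hypothetical stages; a stall at each one matches one error category."""
    SENT = 1                # stalled here: result not communicated to provider
    DELIVERED = 2           # stalled here: result never reviewed by provider
    REVIEWED = 3            # stalled here: no action recommended
    ACTION_RECOMMENDED = 4  # stalled here: action not carried out
    ACTION_COMPLETED = 5    # loop closed


@dataclass
class TrackedResult:
    result_id: str
    critical: bool          # e.g. cancer screening or diagnostics
    sent_at: datetime
    stage: Stage = Stage.SENT
    updated_at: Optional[datetime] = None

    def advance(self, stage: Stage, when: datetime) -> None:
        # Stages must be acknowledged in order, so a skipped step is an error.
        if stage.value != self.stage.value + 1:
            raise ValueError(f"cannot move from {self.stage.name} to {stage.name}")
        self.stage = stage
        self.updated_at = when


def overdue(
    results: List[TrackedResult],
    now: datetime,
    limit: timedelta = timedelta(hours=24),          # assumed limit, routine tests
    critical_limit: timedelta = timedelta(hours=4),  # assumed limit, critical tests
) -> List[Tuple[str, Stage]]:
    """Return (result_id, stalled stage) for results that have waited too long."""
    flagged = []
    for r in results:
        if r.stage is Stage.ACTION_COMPLETED:
            continue
        last_change = r.updated_at or r.sent_at
        allowed = critical_limit if r.critical else limit
        if now - last_change > allowed:
            flagged.append((r.result_id, r.stage))
    return flagged
```

The design choice worth noticing is that the absence of an acknowledgement is itself treated as a reportable event. That is the fault-tolerant posture the authors recommend: a failed delivery should surface as a flagged item instead of depending on someone noticing that a result never arrived.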

These are just two types of errors that can arise in digital medical record management systems. It is evident that the designers and implementers of DMRs need to take the systems-safety approach described by Nancy Leveson and implement comprehensive safety failure analysis, both in terms of "safety case analysis" (discovery of failure scenarios) and after-event investigation to identify the source of the failure in the software and its human interface.

These examples are not intended to suggest that DMRs are hazardous and should be avoided. On the contrary, the consolidation and convenient presentation of patient information for the provider is clearly an important step forward. But it is crucial that designers and implementers keep safety at the center of their attention and have a healthy respect for the ways in which automated systems can incorporate incorrect assumptions, can produce unintended interactions among components, and can be presented in such a confusing way to the human provider that patient care is harmed.

(Here is a case of treatment involving several different errors conveyed through the digital medical record system that involved attaching biopsy and test results to the wrong patient, leading to the wrong treatment for the patient. It is interesting to read because it reflects some of the complexity identified by Leveson in other system failures.) 

Wednesday, November 6, 2019

Twelve years of Understanding Society



Understanding Society has now reached its twelfth anniversary of continuous publication. This represents 1,271 posts, and over 1.3 million words. According to Google Blogspot statistics, the blog has gained over 11 million pageviews since 2010. Just over half of visitors came from the United States, Great Britain, and Canada, with the remainder spread out over the rest of the world. The most popular posts are "Lukes on power" (134K) and "What is a social structure?" (124K).

I've continued to find writing the blog to be a great way of keeping several different lines of thought and research going. My current interest in "organizational causes of technology failures" has had a large presence in the blog in the past year, with just under half of the posts in 2019 on this topic. Likewise, a lot of the thinking I've done on the topic of "a new ontology of government" has unfolded in the blog. Other topic areas include the philosophy of social science, philosophy of technology, and theories of social ontology. A theme that was prominent in 2018 that is not represented in the current year is "Democracy and the politics of hate", but I'm sure I'll return to this topic in the coming months because I'll be teaching a course on this subject in the spring.

I continue to look at academic blogging as a powerful medium for academic communication, creativity, and testing out new ideas. I began in 2007 by describing the blog as "open-source philosophy", and it still has that character for me. And I continue to believe that my best thinking finds expression in Understanding Society. Every post that I begin starts with an idea or a question that is of interest to me on that particular day, and it almost always leads me to learning something new along the way.

I've also looked at the blog as a kind of experiment in exploration of social media for serious academic purposes. Can blogging platforms and social media platforms like Twitter or Facebook contribute to academic progress? So it is worth examining the reach of the blog over time, and the population of readers whom it has touched. The graph of pageviews over time is interesting in this respect.


Traffic to the blog increased in a fairly linear way from the beginning date of the data collection in 2010 through about 2017, and then declined more steeply from 2017 through to the present. (The data points are pageviews per month.) At its peak the blog received about 150K pageviews per month, and it seems to be stabilizing now at about 100K pageviews per month. My impression is that a lot of the variation has to do with unobserved changes in search engine page ranking algorithms, resulting in falling numbers of referrals. The Twitter feed associated with the blog has just over 2,100 followers (@dlittle30), and the Facebook page for the blog registers 12,800 followers. The Facebook page is not a very efficient way of disseminating new posts from the blog, though, because Facebook's algorithm for placing an item into the feed of a "follower" is extremely selective and opaque. A typical item may be fed into 200-400 of the feeds of the almost 13,000 individuals who have expressed interest in the page.

A surprising statistic is that about 75% of pageviews on the blog came through desktop requests rather than mobile requests (phone and tablet). We tend to think that most web viewing is occurring on mobile devices now, but that does not seem to be the case. Also interesting is that the content of the blog is mirrored to a WordPress platform (www.undsoc.org), and the traffic there is a small fraction of the traffic on the Blogspot platform (1,500 pageviews versus 80,000 pageviews).

So thanks to the readers who keep coming back for more, and thanks as well to those other visitors who come because of an interest in a very specific topic. It's genuinely rewarding and enjoyable to be connected to an international network of people, young and old, who share an interest in how the social world works.

Sunday, November 3, 2019

O-rings and production pressure


Allan McDonald's Truth, Lies, and O-Rings: Inside the Space Shuttle Challenger Disaster (2009) has given me a somewhat different understanding of the Challenger launch disaster than I've gained from other sources, including Diane Vaughan's excellent book The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. McDonald is a Morton Thiokol (MTI) insider who was present through virtually all aspects of the evolving solid rocket program at NASA in the two years leading up to the explosion in January 1986. He was director of the Space Shuttle Solid Rocket Motor Project during part of this time and he represented MTI at the formal Launch Readiness Review panels (LRRs) for several shuttle launches, including the fateful Challenger launch. He was senior management representative for MTI for the launch of STS-51L Challenger. His account gives a great deal of engineering detail about the Morton Thiokol engineering group's ongoing concerns about the O-rings in the months preceding the Challenger disaster. This serves as a backdrop for a detailed analysis of the dysfunctions in decision-making in both NASA and Morton Thiokol that led to an insufficient priority being given to safety assessments.

It is worth noting that O-rings were a key part of other large solid-fuel rockets, including the Titan rocket. So there was a large base of engineering and test experience with the performance of the O-rings when exposed to the high temperatures and pressures of ignition and firing.

The biggest surprise to me is the level of informed, rigorous, and evidence-based concern that MTI engineers had about the reliability of the joint seal afforded by the primary and secondary seals on the solid rocket motors on the Shuttle system. These specialists had a very good and precise understanding of the mechanics of the problem. Further, there was a good engineering understanding of the expected (and required) time-sequence performance of the O-rings during ignition and firing. If the sealing action were delayed by even a few hundredths of a second, hot gas would be able to penetrate past the seal. These were not hypothetical worries, but instead were based on data from earlier launches demonstrating O-ring erosion and soot between the primary and secondary rings showing that super-hot gases had penetrated the primary seal. The worst damage and evidence of blowby had occurred on flight STS-51C on January 24, 1985, one year earlier, the lowest-temperature launch yet attempted. And that launch took place when the temperature was 53 degrees.

Launch temperatures for the rescheduled January 28 launch were projected to be extremely cold -- 22-26 degrees was forecast on January 27, roughly 30 degrees colder than the previous January launch. The projected temperatures immediately raised alarm concerning the potential effects on the O-rings with the Utah-based engineering team and with McDonald himself. A teleconference meeting was scheduled for January 27 to receive recommendations from the Utah-based Morton Thiokol engineers who were focused on the O-rings problem about the minimum acceptable temperature for launch (95).
I tried to reach Larry Mulloy at his hotel but failed, so I called Cecil Houston, the NASA/MSFC Resident Manager at KSC. I alerted him of our concerns about the sealing capability of the field-joint O-rings at the predicted cold temperatures and asked him to set up the teleconference. (96)
The teleconference began at 8:30 pm on the evening before the launch. McDonald was present in Cape Canaveral for the Flight Readiness Review panel and participated in the teleconference involving the analysis and recommendations from MTI engineering, leading to a recommendation against launching in the expected cold weather conditions.
Thiokol's engineering presentation consisted of about a dozen charts summarizing the history of the performance of the field-joints, some engineering analysis on the operation of the joints, and some laboratory and full-scale static test data relative to the performance of the O-rings at various temperatures. About half the charts had been prepared by Roger Boisjoly, our chief seal expert on the O-ring Seal Task Force and staff engineer to Jack Kapp, Manager of Applied Mechanics. The remainder were presented by Arnie Thompson, the supervisor of our Structures Section under Jack Kapp, and by Brian Russell, a program manager working for Bob Ebeling. (97)
Boisjoly's next chart showed how cold temperature would reduce all the factors that helped maintain a good seal in the joint: lower O-ring squeeze due to thermal shrinkage of the O-ring; thicker and more viscous grease around the O-ring, making it slower to move across the O-ring groove; and higher O-ring hardness due to low temperature, making it more difficult for the O-ring to extrude dynamically into the gap for proper sealing. All of these things increased the dynamic actuation time, or timing function, of the O-ring, when at the very same time the O-ring could be eroding, creating a situation where the secondary seal might not be able to seal the motor, not if the primary O-ring was sufficiently eroded to prevent sealing in the joint. (99)
Based on their concerns about temperature and effectiveness of the seals in the critical half-second of ignition, MTI engineering staff prepared the foundation for a recommendation to not launch in temperatures lower than 53 degrees. Their conclusion as presented at the January 27 teleconference was unequivocal against launch under these temperature conditions:
The final chart included the recommendations, which resulted in several strong comments and many very surprising reactions from the NASA participants in the teleconference. The first statement on the “Recommendations” chart stated that the O-ring temperature must be equal to or greater than 53° at launch, and this was primarily based upon the fact that SRM-15, which was the best simulation of this condition, worked at 53 °. The chart ended with a statement that we should project the ambient conditions (temperature and wind) to determine the launch time. (102)

NASA lead Larry Mulloy contested the analysis and evidence in the slides and expressed great concern about the negative launch recommendation, and he asserted that the data were "inconclusive" in establishing a relationship between temperature and O-ring failure.
Mulloy immediately said he could not accept the rationale that was used in arriving at that recommendation. Stan Reinartz then asked George Hardy, Deputy Director of Science and Engineering at NASA/MSFC, for his opinion. Hardy said he was “appalled” that we could make such a recommendation, but that he wouldn't fly without Morton Thiokol's concurrence. Hardy also stated that we had only addressed the primary O-ring, and did not address the secondary O-ring, which was in a better position to seal because of the leak-check. Mulloy then shouted, “My God, Thiokol, when do you want me to launch, next April?” He also stated that “the eve of a launch is a helluva time to be generating new launch commit criteria!” Stan Reinartz entered the conversation by saying that he was under the impression that the solid rocket motors were qualified from 40° to 90° and that the 53° recommendation certainly was not consistent with that. (103)
Joe Kilminster, VP of Space Booster Programs at MTI, then requested a short caucus for the engineering team in Utah to reevaluate the data and consider their response to the skepticism voiced by NASA officials. McDonald did not participate in the caucus, but his reconstruction based on the memories of persons present paints a clear picture. The engineering experts did not change their assessment, and they were overridden by MTI executives Cal Wiggins (VP and General Manager of the Space Division) and Jerry Mason (Senior VP of Wasatch Operations). In opening the caucus discussion, Mason is quoted as saying "we need to make a management decision". Engineers Boisjoly and Thompson reiterated their technical concerns about the functionality of the O-ring seals at low temperature, with no response from the senior executives. No members of the engineering team spoke up to support a decision to launch. Mason polled the senior executives, including Bob Lund (VP of Engineering), and said to Lund, "It's time for you, Bob, to take off your engineering hat and put on your management hat." (111) A positive launch recommendation was then conveyed to NASA, and the process in Florida resumed towards launch.

McDonald spends considerable time describing the business pressure that MTI was subject to from its largest customer, NASA. NASA was considering a second-source option that would allow competing companies to supply the solid-fuel motors then provided by MTI, and it had also delayed signing a large contract (Buy-III fixed cost bid) for the next batch of motors. The collective impact of these actions by NASA could cost MTI over a billion dollars. So MTI management appears to have been under great pressure to accommodate NASA managers' preferences concerning the launch decision. And it is hard to avoid the conclusion that their decision placed business interests first and the professional judgments of their safety engineers second. In doing so they placed the lives of seven astronauts at risk, with tragic consequences.

And what about NASA? Here the pressures are somewhat less fully developed than in Vaughan's account, but the driving commitment to achieve a schedule of 24 launches per year seems to have been a primary motivation. Delayed launches significantly undermined this goal, which threatened the prestige of NASA, the hope of significant commercial revenue for the program, and the assurance of continuing funding from Congress.

McDonald was not a participant in the caucus conference call. But he provides a reconstruction based on information provided by participants. In his understanding the engineers continued to defend their recommendation based on very concrete concerns about the effectiveness of the O-rings in extreme cold. Senior managers indicated their lack of support for this engineering judgment, and in the end Jerry Mason indicated that this would need to be a management decision. The FRR team was then informed that MTI had reconsidered its negative recommendation concerning launch. McDonald refused to sign the launch recommendation document, which was signed by his boss Joe Kilminster and faxed to the LRR team.


In hindsight it seems clear that both MTI executives and NASA executives deferred to business pressures of their respective organizations in the face of well-supported doubts about the safety of the launch. Is this a case of 20-20 vision after the fact? It distinctly appears not to be. The depth of knowledge, analysis, and rational concern that was present in the engineering group for at least a year prior to the Challenger disaster gave very specific and evidence-based reasons to abort this launch. This was not some intuitive, unspecific set of worries; it was an ongoing research problem that greatly concerned the engineers who were directly involved. And it appears there was no significant disagreement or uncertainty among them.

So it is hard to avoid a rather terrible conclusion, that the Challenger disaster was avoidable and should have been prevented. And the culpability lies with senior NASA and MTI executives who placed production pressures and business interests ahead of normal safety assessment procedures, and ahead of safety itself.

It is worth noting that Diane Vaughan's assessment is directly at odds with this conclusion. She writes:
We now return to the eve of the launch. Accounts emphasizing valiant attempts by Thiokol engineers to stop the launch, actions of a few powerful managers who overruled a unanimous engineering position, and managerial failure to pass information about the teleconference to senior NASA administrators, coupled with news of economic strain and production pressure at NASA, led many to suspect that NASA managers had acted as amoral calculators, knowingly violating rules and taking extraordinary risk with human lives in order to keep the shuttle on schedule. However, like the history of decision making, I found that events on the eve of the launch were vastly more complex than the published accounts and media representations of it. From the profusion of information available after the accident, some actions, comments, and actors were brought repeatedly to public attention, finding their way into recorded history. Others, receiving less attention or none, were omitted. The omissions became, for me, details of social context essential for explanation. (LC 6215)
Young, Cook, Boisjoly, and Feynman. Concluding this list of puzzles and contradictions, I found that no one accused any of the NASA managers associated with the launch decision of being an amoral calculator. Although the Presidential Commission report extensively documented and decried the production pressures under which the Shuttle Program operated, no individuals were confirmed or even alleged to have placed economic interests over safety in the decision to launch the Space Shuttle Challenger. For the Commission to acknowledge production pressures and simultaneously fail to connect economic interests and individual actions is, prima facie, extremely suspect. But NASA’s most outspoken critics—Astronaut John Young, Morton Thiokol engineers Al McDonald and Roger Boisjoly, NASA Resource Analyst Richard Cook, and Presidential Commissioner Richard Feynman, who frequently aired their opinions to the media—did not accuse anyone of knowingly violating safety rules, risking lives on the night of January 27 and morning of January 28 to meet a schedule commitment. (kl 1627)
Vaughan's account includes many of the pivot-points of McDonald's narrative, but she assigns a different significance to many of them. She prefers her "normalization of deviance" explanation over the "amoral calculator" explanation.

(The Rogers Commission report and supporting documents are available online. Here is a portion of the hearings transcripts in which senior NASA officials provide testimony; link. This segment is critical to the issues raised in McDonald's account, since it addresses the January 27, 1986 teleconference FRR session in which a recommendation against launch was put forward by MTI engineering and was challenged by NASA senior administrators.)