
Wednesday, June 17, 2020

ABM models for the COVID-19 pandemic


In an earlier post I mentioned that agent-based models provide a substantially different way of approaching the problem of pandemic modeling. ABM models are generative simulations of processes that work incrementally through the behavior of discrete agents; so modeling an epidemic using this approach is a natural application.

In an important recent research effort, Gianluca Manzo and Arnout van de Rijt have undertaken to provide an empirically calibrated ABM model of the pandemic in France that pays attention to the properties of the social networks found in France. (The draft paper is posted on ArXiv; link; they have updated the manuscript since posting.) They note that traditional approaches to modeling epidemic disease often work on the basis of average population statistics. Diseases, however, travel through social networks, and individuals within a society differ substantially in the number of contacts they have in a typical day or week. Intuitively, then, the transmission of a disease through a population should be influenced by the social networks found within that population and by the variation across individuals in the number of contacts they have in a given time period. Manzo and van de Rijt argue that this feature of disease spread through a community is crucial when attempting to model the progression of the disease. More importantly, they argue that attention to contact variation across a population suggests public health strategies that might succeed in reducing the spread of a disease at lower social and public cost.

Manzo offers a general framework for this approach in "Complex Social Networks are Missing in the Dominant COVID-19 Epidemic Models," published last month in Sociologica (link). Here is the abstract for this article:
In the COVID-19 crisis, compartmental models have been largely used to predict the macroscopic dynamics of infections and deaths and to assess different non-pharmaceutical interventions aimed to contain the microscopic dynamics of person-to-person contagions. Evidence shows that the predictions of these models are affected by high levels of uncertainty. However, the link between predictions and interventions is rarely questioned and a critical scrutiny of the dependency of interventions on model assumptions is missing in public debate. In this article, I have examined the building blocks of compartmental epidemic models so influential in the current crisis. A close look suggests that these models can only lead to one type of intervention, i.e., interventions that indifferently concern large subsets of the population or even the overall population. This is because they look at virus diffusion without modelling the topology of social interactions. Therefore, they cannot assess any targeted interventions that could surgically isolate specific individuals and/or cutting particular person-to-person transmission paths. If complex social networks are seriously considered, more sophisticated interventions can be explored that apply to specific categories or sets of individuals with expected collective benefits. In the last section of the article, I sketch a research agenda to promote a new generation of network-driven epidemic models. (31)
Manzo's central concern about what he calls compartmental models (SIR models) is that "the variants of SIR models used in the current crisis context address virus diffusion without modelling the topology of social interactions realistically" (33).

Manzo offers an interesting illustration of why a generic SIR model has trouble reproducing the dynamics of an infectious-disease epidemic by comparing the situation to the problem of traffic congestion:
It is as if we pretended realistically to model car flows at a country level, and potentially associated traffic jams, without also modelling the networks of streets, routes, and freeways. Could this type of models go beyond recommendations advising everyone not to use the car or allowing only specific fractions of the population to take the route at specific times and days? I suspect they could not. One may also anticipate that many drivers would be highly dissatisfied with such generic and undifferentiated instructions. SIR models currently in use put each of us in a similar situation. The lack of route infrastructure within my fictive traffic model corresponds to the absence of the structure of social interactions with dominant SIR models. (42)
The key innovation in the models constructed by Manzo and van de Rijt is the use of detailed data on contact patterns in France. They make highly pertinent use of a study of close-range contacts that was done in France in 2012 and published in 2015 (Béraud et al link). This study allows for estimation of the frequency of contacts possessed by French adults and children and the extensive variation that exists across individuals. Here is a graph illustrating the dispersion that exists in number of contacts for individuals in the study:

This graph demonstrates the very wide variance that exists among individuals when it comes to "number of contacts"; and this variation in turn is highly relevant to the spread of an infectious disease.

Manzo and van de Rijt make use of the data provided in this COMES-F study to empirically calibrate their agent-based model of the diffusion of the disease, and to estimate the effects of several different strategies designed to slow down the spread of the disease following relaxation of extreme social distancing measures.

The most important takeaway from this article is the strategy it suggests for managing the reopening of social interaction after the peak of the epidemic. Key to transmission is the frequency of close contact, and these models show that a small number of individuals have a disproportionate effect on the spread of an infectious disease because of the high number of contacts they have. Manzo and van de Rijt ask a hypothetical question: are there strategies for managing an epidemic that work by selecting a relatively small number of individuals for immunization? (Immunization might take the form of an effective but scarce vaccine, or it might take the form of testing, isolation, and intensive contact tracing.) But how would it be possible to identify the "high contact" individuals? M&R represent three strategies within their base model of the epidemic. The baseline strategy, "NO-TARGET", chooses a certain number of individuals at random for immunization and then lets the process of infection play out. The "CONTACT-TARGET" strategy selects the same number of individuals for immunization, but through a process that makes it more likely that the selected individuals will have higher-than-average numbers of contacts: a random group of individuals is drawn from the population, and each is asked to nominate one of their contacts for immunization. It is demonstrable that this procedure yields a group of individuals with higher-than-average numbers of contacts. The third strategy, "HUB-TARGET", selects the same number of individuals for treatment from occupations that have high levels of contact. Both targeted strategies show dramatic improvement over the random baseline in the number of infected individuals over time.
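A minimal Python sketch of the logic behind CONTACT-TARGET may be helpful (my own illustration, not the authors' code): because choosing "a contact of a randomly chosen person" samples individuals in proportion to their number of contacts, the nominated group has a higher average number of contacts than a random sample. The population size and the lognormal contact distribution below are illustrative assumptions, not values from the paper.

```python
import random

random.seed(1)

# Hypothetical population: contact counts drawn from a heavy-tailed
# (lognormal) distribution, echoing the skew reported in the French
# contact survey.  All numbers here are invented for illustration.
N = 100_000
degrees = [max(1, int(random.lognormvariate(1.5, 0.9))) for _ in range(N)]

# NO-TARGET-like selection: immunize people chosen uniformly at random.
random_targets = random.sample(range(N), 1000)
mean_random = sum(degrees[i] for i in random_targets) / len(random_targets)

# CONTACT-TARGET-like selection: pick random people and immunize one of
# their contacts.  Sampling "a contact of someone" selects individuals in
# proportion to their number of contacts (size-biased sampling), so the
# nominated group has above-average degree -- the friendship paradox.
stubs = [i for i, k in enumerate(degrees) for _ in range(k)]
nominated_targets = [random.choice(stubs) for _ in range(1000)]
mean_nominated = sum(degrees[i] for i in nominated_targets) / len(nominated_targets)

print(f"mean contacts, random targets:    {mean_random:.1f}")
print(f"mean contacts, nominated targets: {mean_nominated:.1f}")
```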

The simulation is run multiple times for each of the three treatment strategies, using four different "budgets" that determine the number of individuals to be treated in each scenario. The results are presented here, and they are dramatic. Both contact-sensitive strategies of treatment result in a substantial reduction in the total number of individuals infected over the course of 50, 100, and 150 days. And this in turn translates into a substantial reduction in the number of ICU beds required under each strategy.


Here is how Manzo and van de Rijt summarize their findings:
As countries exit the Covid-19 lockdown many have limited capacity to prevent flare-ups of the coronavirus. With medical, technological, and financial resources to prevent infection of only a fraction of its population, which individuals should countries target for testing and tracking? Together, our results suggest that targeting individuals characterized by high frequencies of short-range contacts dramatically improves the effectiveness of interventions. An additional known advantage of targeting hubs with medical testing specifically is that they serve as an early-warning device that can detect impending or unfolding outbreaks (Christakis & Fowler 2010; Kitsak et al. 2010).
This conclusion is reached by moving away from the standard compartmental models that rely on random mixing assumptions toward a network-based modeling framework that can accommodate person-to-person differences in infection risks stemming from differential connectedness. The framework allows us to model rather than average out the high variability of close-contact frequencies across individuals observed in contact survey data. Simulation results show that consideration of realistic close-contact distributions with high skew strongly impacts the expected impact of targeted versus general interventions, in favor of the former.
If these simulation results are indeed descriptive of the corresponding dynamics of the spread of this disease through a population of socially connected people, then the research seems to provide an important hint about how public health authorities can effectively manage disease spread in a post-COVID world without recourse to the complete shut-down of economic and social life that was necessary in the first half of 2020 in many parts of the world.

*     *     *


Here is a very interesting set of simulations of the spread of infectious disease by Grant Sanderson on YouTube (link). The video is presented with truly fantastic graphics allowing sophisticated visualization of the dynamics of the disease under different population assumptions. Sanderson doesn't explain the nature of the simulation, but it appears to be an agent-based model with parameters representing the probability of infection through proximity. It is very interesting to look at this simulation through the eyes of the Manzo-van de Rijt critique: this model ignores exactly the factor that Manzo and van de Rijt take to be crucial -- differences across agents in number of contacts and the networks and hubs through which agents interact. This is reflected in the fact that every agent moves randomly across space and every agent has the same average probability of passing on infection to those he/she encounters.
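For concreteness, here is a minimal sketch of the kind of homogeneous proximity model the video appears to implement (my own reconstruction, not Sanderson's code, and all parameter values are invented): agents wander randomly, and every infectious agent transmits with the same probability to any susceptible agent within a fixed radius. What the sketch omits is exactly what Manzo and van de Rijt add: persistent, highly unequal contact networks.

```python
import random
import math

random.seed(0)

# Illustrative parameters only (not taken from the video)
N, STEPS = 200, 300
INFECTION_RADIUS, P_TRANSMIT, RECOVERY_TIME = 0.03, 0.2, 50

# Each agent: position, state ('S', 'I', 'R'), time of infection
agents = [{"x": random.random(), "y": random.random(), "state": "S", "t_inf": 0}
          for _ in range(N)]
agents[0]["state"] = "I"   # seed one infection

for t in range(STEPS):
    # Random movement: every agent takes a small random step inside the unit square
    for a in agents:
        a["x"] = min(1, max(0, a["x"] + random.uniform(-0.01, 0.01)))
        a["y"] = min(1, max(0, a["y"] + random.uniform(-0.01, 0.01)))
    # Transmission: the same probability for every infectious-susceptible pair in range
    infectious = [a for a in agents if a["state"] == "I"]
    for i in infectious:
        for a in agents:
            if a["state"] == "S":
                d = math.hypot(a["x"] - i["x"], a["y"] - i["y"])
                if d < INFECTION_RADIUS and random.random() < P_TRANSMIT:
                    a["state"], a["t_inf"] = "I", t
    # Recovery after a fixed infectious period
    for a in infectious:
        if t - a["t_inf"] > RECOVERY_TIME:
            a["state"] = "R"

print({s: sum(a["state"] == s for a in agents) for s in "SIR"})
```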

Monday, May 11, 2020

Thinking about pandemic models


One thing that is clear from the pandemic crisis that is shaking the world is the crucial need we have for models that allow us to estimate the future behavior of the epidemic. The dynamics of the spread of an epidemic are simply not amenable to intuitive estimation. So it is critical to have computational models that permit us to project the near- and middle-term behavior of the disease, based on available data and assumptions.

Scott Page is a complexity scientist at the University of Michigan who has written extensively on the uses and interpretation of computational models in the social sciences. His book, The Model Thinker: What You Need to Know to Make Data Work for You, does a superlative job of introducing the reader to a wide range of models. One of his key recommendations is that we should consider many models when we are trying to understand a particular kind of phenomenon. (Here is an earlier discussion of the book; link.) Page contributed a very useful article to the Washington Post this week that sheds light on the several kinds of pandemic models currently being used to understand and predict the course of the pandemic at global, national, and regional levels ("Which pandemic model should you trust?"; link). Page describes the logic of "curve-fitting" models like the Institute for Health Metrics and Evaluation (IHME) model as well as epidemiological models that proceed on the basis of assumptions about the causal and social processes through which disease spreads. The latter attempt to represent the process by which individuals move from susceptible to infected to recovered. (Page refers to these as "microfoundational" models.) Page points out that all models involve a range of probable error and missing data, and it is crucial to make use of a range of different models in order to lay a foundation for sound public health policies. Here are his summary thoughts:
All this doesn’t mean that we should stop using models, but that we should use many of them. We can continue to improve curve-fitting and microfoundation models and combine them into hybrids, which will improve not just predictions, but also our understanding of how the virus spreads, hopefully informing policy. 
Even better, we should bring different kinds of models together into an “ensemble.” Different models have different strengths. Curve-fitting models reveal patterns; “parameter estimation” models reveal aggregate changes in key indicators such as the average number of people infected by a contagious individual; mathematical models uncover processes; and agent-based models can capture differences in peoples’ networks and behaviors that affect the spread of diseases. Policies should not be based on any single model — even the one that’s been most accurate to date. As I argue in my recent book, they should instead be guided by many-model thinking — a deep engagement with a variety of models to capture the different aspects of a complex reality. (link)
Page's description of the workings of these models is very helpful for anyone who wants to have a better understanding of the way a pandemic evolves. Page has also developed a valuable series of videos that go into greater detail about the computational architecture of these various types of models (link). These videos are very clear and eminently worth viewing if you want to understand epidemiological modeling better.
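For readers who want to see the skeleton of such a model, here is a minimal sketch of a compartmental SIR model of the sort Page describes as "microfoundational", integrated with simple Euler steps. The parameter values are illustrative assumptions only, not estimates for COVID-19.

```python
# Minimal SIR compartmental model (Euler integration).
# beta and gamma are illustrative values, not estimates for COVID-19.
def run_sir(beta=0.3, gamma=0.1, population=1_000_000, initial_infected=100,
            days=180, dt=0.1):
    s, i, r = population - initial_infected, initial_infected, 0
    trajectory = []
    for step in range(int(days / dt)):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        if step % int(1 / dt) == 0:        # record the compartments once per day
            trajectory.append((s, i, r))
    return trajectory

traj = run_sir()
peak_day, peak = max(enumerate(t[1] for t in traj), key=lambda x: x[1])
print(f"peak infections ~{peak:,.0f} around day {peak_day}")
```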

Social network analysis is crucial to addressing the challenge of how to restart businesses and other social organizations. Page has created "A Leader's Toolkit For Reopening: Twenty Strategies to Reopen and Reimagine", a valuable set of network tools and strategies offering concrete advice about steps to take in restarting businesses safely and productively. Visit this site to see how tools of network analysis can help make us safer and healthier in the workplace (link). 

Another useful recent resource on the logic of pandemic models is Jonathan Fuller's recent article "Models vs. evidence" in Boston Review (link). Fuller is a philosopher of science who undertakes two tasks in this piece: first, to ask how we can use evidence to evaluate alternative models; and second, to account for the disagreements that exist in the academic literature over the validity of several classes of models. Fuller has in mind essentially the same distinction as Page does, between curve-fitting and microfoundational models. Fuller characterizes the former as "clinical epidemiological models" and the latter as "infectious disease epidemiological models", and he argues that the two research communities have very different ideas about what constitutes appropriate use of empirical evidence in evaluating a model. Essentially Fuller believes that the two approaches embody two different philosophies of science with regard to computational models of epidemics, one more strictly empirical and the other more amenable to a combination of theory and evidence in developing and evaluating the model. The article provides a level of detail that would make it ideal for a case study in a course on the philosophy of social science.

Joshua Epstein, author of Generative Social Science: Studies in Agent-Based Computational Modeling, gave a brief description in 2009 of the application of agent-based models to pandemics in "Modelling to Contain Pandemics" (link). Epstein describes a massive ABM of a global pandemic, the Global-Scale Agent Model (GSAM), which was used to model the spread of the 2009 H1N1 virus. Here is a video in which Miles Parker explains and demonstrates the model (link).

Another useful resource is this video on "Network Theory: Network Diffusion & Contagion" (link), which provides greater detail about how the structure of social networks influences the spread of an infectious disease (or ideas, attitudes, or rumors).

My own predilections in the philosophy of science lean towards scientific realism and the importance of identifying underlying causal mechanisms. This leaves me more persuaded by the microfoundational / infectious disease models than the curve-fitting models. The criticisms that Nancy Cartwright and Jeremy Hardie offer in Evidence-Based Policy: A Practical Guide to Doing It Better of the uncritical methodology of randomized controlled trials (link) seem relevant here as well. The IHME model is calibrated against data from Wuhan and more recently northern Italy; but circumstances were very different in each of those locales, making it questionable that the same inflection points will show up in New York or California. As Cartwright and Hardie put the point, "The fact that causal principles can differ from locale to locale means that you cannot read off that a policy will work here from even very solid evidence that it worked somewhere else" (23). But, as Page emphasizes, it is valuable to have multiple models working from different assumptions when we are attempting to understand a phenomenon as complex as epidemic spread. Fuller makes much the same point in his article:
Just as we should embrace both models and evidence, we should welcome both of epidemiology’s competing philosophies. This may sound like a boring conclusion, but in the coronavirus pandemic there is no glory, and there are no winners. Cooperation in society should be matched by cooperation across disciplinary divides. The normal process of scientific scrutiny and peer review has given way to a fast track from research offices to media headlines and policy panels. Yet the need for criticism from diverse minds remains.

Thursday, January 2, 2020

The power of case studies in system safety



Images: Andrew Hopkins titles



Images: Other safety sources

One of the genuinely interesting aspects of the work of Andrew Hopkins is the extensive case studies he has conducted of the causation of serious industrial accidents. A good example is his analysis of the explosion at an Esso natural gas processing plant in Longford, Australia in 1998, presented in Lessons from Longford: The ESSO Gas Plant Explosion, with key findings also summarized in this video. Also valuable is Hopkins' analysis of the Deepwater Horizon blowout in the Gulf of Mexico (link). Here he dispassionately walks through the steps of the accident and identifies faults at multiple levels (operator, engineering, management, corporate policy).

In addition to these books about major accidents and disasters, Hopkins has also created a number of very detailed videos based on the analysis presented in the case studies. These videos offer vivid recreation of the accidents along with a methodical and evidence-based presentation of Hopkins' analysis of the causes of the accidents at multiple levels.

It is intriguing to consider whether it would be possible to substantially improve the "safety thinking" of executives and managers in high-risk industries through an intensive training program based on case studies like these. Intensive system safety training for executives and managers is clearly needed. If complex processes are to be managed in a way that avoids catastrophic failures, executives and managers need to have a much more sophisticated understanding of safety science. Further, they need more refined skills in designing and managing risky processes. And yet much training about industrial safety focuses on the wrong level of accidents -- shop floor accidents, routine injuries, and days-lost metrics -- whereas there is a consensus among safety experts that the far larger source of hazard in complex industrial processes lies at the system level.

We might think of Hopkins' case studies (and others that are available in the literature) as the basis of cognitive and experiential training for executives and managers on the topic of system safety, helping them gain a broader understanding of the kinds of failures that are known to lead to major accidents and better mental skills for managing risky processes. This might be envisioned in analogy with the training that occurs through scenario-based table-top exercises for disaster response for high-level managers, where the goal is to give participants a practical and experiential exposure to the kinds of rare situations they may be suddenly immersed in and a set of mental tools through which to respond. (My city's top fire official and emergency manager once said to a group of senior leaders at my university at the end of a presentation about the city's disaster planning: "When disaster strikes, your IQ will drop by 20 points. So it is imperative that you work with lots of scenarios and develop a new set of skills that will allow you to respond quickly and appropriately to the circumstances that arise. And by the way -- a tornado has just blown the roof off the humanities building, and there are casualties!")

Consider a program of safety training for managers along these lines: simulation-based training, based on detailed accident scenarios, with a theoretical context introducing the ideas of system accidents, complexity, tight coupling, communications failures, lack of focus on organizational readiness for safety, and the other key findings of safety research. I would envision a week-long training offering exposure to the best current thinking about system safety, along with exposure to extensive case studies and a number of interactive simulations based on realistic scenarios.

I taught a graduate course in public policy on "Organizational causes of large technology failures" this year that made substantial use of case materials like these. Seeing the evolution that master's-level students underwent in the sophistication of their understanding of the causes of large failures, it seems very credible that senior-manager training like that described here would indeed be helpful. The learning these students achieved was evident in the quality of their group projects on disasters. Small teams undertook to research and analyze failures as diverse as the V-22 Osprey program, the State of Michigan Unemployment Insurance disaster (in which the state's software system wrongly classified thousands of applicants as having submitted fraudulent claims), and the Chinese melamine milk adulteration disaster. Their work products were highly sophisticated, and very evidently showed the benefits of studying experts such as Diane Vaughan, Charles Perrow, Nancy Leveson, and Andrew Hopkins. I feel confident that these students will be able to take these perspectives and skills into the complex organizations in which they may work in the future, and their organizations will be safer as a result.

This kind of training would be especially useful in sectors that involve inherently high risks of large-scale accidents -- for example, the rail industry, marine shipping, aviation and space design and manufacturing, chemical and petrochemical processing, hospitals, banking, the electric power grid, and the nuclear industry.

(I should note that Hopkins himself provides training materials and consultation on the subject of system safety through FutureMedia Training Resources (link).)

Saturday, December 28, 2019

High-reliability organizations


Charles Perrow takes a particularly negative view of the possibility of safe management of high-risk technologies in Normal Accidents: Living with High-Risk Technologies. His summary of the Three Mile Island accident is illustrative: “The system caused the accident, not the operators” (12). Perrow’s account of TMI is chiefly an account of complex and tightly-coupled system processes, and the difficulty these processes create for operators and managers when they go wrong. And he is doubtful that the industry can safely manage its nuclear plants.

It is interesting to note that systems engineer and safety expert Nancy Leveson addresses the same features of “system accidents” that Perrow addresses, but with a greater level of confidence about the possibility of creating engineering and organizational enhancements. A recent expression of her theory of technology safety is provided in Engineering a Safer World: Systems Thinking Applied to Safety (Engineering Systems) and Resilience Engineering: Concepts and Precepts.

In examining the safety of high-risk industries, our goal should be to identify some of the behavioral, organizational, and regulatory dysfunctions that increase the likelihood and severity of accidents, and to consider organizational and behavioral changes that would serve to reduce the risk and severity of accidents. This is the approach taken by a group of organizational theorists, engineers, and safety experts who explore the idea and practice of a “high reliability organization”. Scott Sagan describes the HRO approach in these terms in The Limits of Safety:
The common assumption of the high reliability theorists is not a naive belief in the ability of human beings to behave with perfect rationality, it is the much more plausible belief that organizations, properly designed and managed, can compensate for well-known human frailties and can therefore be significantly more rational and effective than can individuals. (Sagan, 16)
Sagan lists several conclusions advanced by HRO theorists, based on a small number of studies of high-risk organizational environments. Researchers have identified a set of organizational features that appear to be common among HROs:
  • Leadership safety objectives: priority on avoiding serious operational failures altogether
  • Organizational leaders must place a high priority on safety and communicate this objective clearly and consistently to the rest of the organization
  • Redundancy: multiple and independent channels of communication, decision-making, and implementation can produce a highly reliable overall system
  • Decentralization: authority must be decentralized in order to permit rapid and appropriate responses to dangers by individuals closest to the problems
  • Culture: recruit individuals who help maintain a strong organizational culture emphasizing safety and reliability
  • Continuity: maintain continuous operations, vigilance, and training
  • Organizational learning: learn from prior accidents and near-misses
  • Simulation: improve the use of simulation and the imagination of failure scenarios
Here is Sagan's effort to compare Normal Accident Theory with High Reliability Organization Theory:


The genuinely important question here is whether there are indeed organizational arrangements, design principles, and behavioral practices that are consistently effective in significantly reducing the incidence and harmfulness of accidents in high-risk enterprises, or whether on the other hand, the ideal of a "High Reliability Organization" is more chimera than reality.

A respected organizational theorist who has written on high-reliability organizations and practices extensively is Karl Weick. He and Kathleen Sutcliffe attempt to draw some useable maxims for high reliability in Managing the Unexpected: Sustained Performance in a Complex World. They use several examples of real-world business failures to illustrate their central recommendations, including an in-depth case study of the Washington Mutual financial collapse in 2008.

The chief recommendations of their book come down to five maxims for enhancing reliability:
  1. Pay attention to weak signals of unexpected events
  2. Avoid extreme simplification
  3. Pay close attention to operations
  4. Maintain a commitment to resilience
  5. Defer to expertise
Maxim 1 (preoccupation with failure) encourages a style of thinking -- an alertness to unusual activity or anomalous events and a commitment to learning from near-misses in the past. This alertness is both individual and organizational; individual members of the organization need to be alert to weak signals in their areas, and managers need to be receptive to hearing the "bad news" when ominous signals are reported. By paying attention to "weak signals" of possible failure, managers will have more time to design solutions to failures when they emerge.

Maxim 2 addresses the common cognitive mistake of subsuming unusual or unexpected outcomes under more common and harmless categories. Managers should be reluctant to accept simplifications. The Columbia space shuttle disaster seems to fall in this category, where senior NASA managers dismissed evidence of foam strike during lift-off by subsuming it under many earlier instances of debris strikes.

Maxim 3 addresses the organizational failure associated with distant management -- top executives who are highly "hands-off" in their knowledge and actions with regard to ongoing operations of the business. (The current Boeing story seems to illustrate this failure; even the decision to move the corporate headquarters to Chicago, very distant from the engineering and manufacturing facilities in Seattle, illustrates a hands-off attitude towards operations.) Executives who look at their work as "the big picture" rather than ensuring high-quality activity within the actual operations of the organization are likely to oversee disaster at some point.

Maxim 4 is both cognitive and organizational. "Resilience" refers to the "ability of an organization (system) to maintain or regain a dynamically stable state, which allows it to continue operations after a major mishap and/ or in the presence of a continuous stress". A resilient organization is one where process design has been carried out in order to avoid single-point failures, where resources and tools are available to address possible "off-design" failures, and where the interruption of one series of activities (electrical power) does not completely block another vital series of activities (flow of cooling water). A resilient team is one in which multiple capable individuals are ready to work together to solve problems, sometimes in novel ways, to ameliorate the consequences of unexpected failure.

Maxim 5 emphasizes the point that complex activities and processes need to be managed by teams incorporating experience, knowledge, and creativity in order to be able to confront and surmount unexpected failures. Weick and Sutcliffe give telling examples of instances where key expertise was lost at the frontline level through attrition or employee discouragement, and where senior executives substituted their judgment for the recommendations of more expert subordinates.

These maxims involve a substantial dose of cognitive practice, changing the way that employees, managers, and executives think: paying attention to signs of unexpected outcomes (pumps that repeatedly fail in a refinery), learning from near-misses, making full use of the expertise of members of the organization, and so forth. It is also possible to see how various organizations could be evaluated in terms of their performance on these five maxims -- before a serious failure has occurred -- and could improve their performance accordingly.

It is interesting to observe, however, that Weick and Sutcliffe do not highlight some factors that have been given strong priority in other treatments of high-reliability organizations: the importance of establishing a high priority for system safety at the highest management levels of the organization (which unavoidably competes with cost and profit pressures); an empowered safety executive who stands outside the scope of production and business executives in the organization; the possible benefits of a somewhat decentralized system of control; the possible benefits of redundancy; the importance of well-designed training aimed at enhancing system safety as well as personal safety; and the importance of creating a culture of honesty and compliance when it comes to safety. When mid-level managers are discouraged from bringing forward their concerns about the "signals" they perceive in their areas, this is a pre-catastrophe situation.

There is a place in the management literature for a handbook of research on high-reliability organizations; at present, such a resource does not exist.

(See also Sagan and Blandford's volume Learning from a Disaster: Improving Nuclear Safety and Security after Fukushima.)

Friday, November 30, 2018

Modeling the social


One of the most interesting authorities on social models and simulations is Scott Page. This month he published a major book on this topic, The Model Thinker: What You Need to Know to Make Data Work for You, and it is a highly valuable contribution. The book corresponds roughly to the content of Page's very successful Coursera course on models and simulations, and it serves as an excellent introduction to many different kinds of mathematical models in the social sciences.

Page's fundamental premise in the book is that we need many models, and many intellectual perspectives, to make sense of the social world. Mathematical modeling is a way of getting disciplined about the logic of our theories and hypotheses about various processes in the world, including the physical, biological, and social realms. No single approach will be adequate to understanding the complexity of the world; rather, we need multiple hypotheses and models to disentangle the many concurrent causal and systemic processes that are under way at a single time. As Page puts the point:
As powerful as single models can be, a collection of models accomplishes even more. With many models, we avoid the narrowness inherent in each individual model. A many-models approach illuminates each component model's blind spots. Policy choices made based on single models may ignore important features of the world such as income disparity, identity diversity, and interdependencies with other systems. (2)
Social ontology supports this approach in a fundamental way. The way I would put the point is this: social processes are almost invariably heterogeneous in their causes, temporal characters, and effects. So we need to have a way of theorizing society that is well suited to the forms of heterogeneity, and the many-models approach does exactly that.

Page proposes that there are multiple reasons why we might turn to models of a situation (physical, ecological, social, ...): to "reason, explain, design, communicate, act, predict, and explore" (15). We might simplify this list by saying that models can enhance theoretical understanding of complex phenomena (explanation, discovery of truth, exploration of hypotheses) and they may also serve practical purposes involving prediction and control.



Especially interesting are topics taken up in later chapters of the book, including the discussion of network models and broadcast, diffusion, and contagion models (chapters 9-10). These are all interesting because they represent different approaches to a common social phenomenon, the spread of a property through a population (ideas, disease, rebellion, hate and intolerance). These are among the most fundamental mechanisms of social change and stability, and Page's discussion of relevant models is insightful and accessible.
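To make the contrast concrete, here is a minimal sketch (my own illustration, not code from The Model Thinker) of the difference between a broadcast process, in which adoption comes from a common external source, and a contagion process, in which adoption spreads person to person. All rates and sizes are invented.

```python
import random

random.seed(2)

# Illustrative parameters only -- none of these values come from the book.
N, STEPS = 10_000, 30
P_BROADCAST = 0.10        # chance per period of adopting from a common external source
P_CONTAGION = 0.05        # chance of adopting per contact with an adopter
CONTACTS_PER_PERIOD = 4

def broadcast_model():
    # Adoption driven by a central source (e.g., mass media): r-shaped curve.
    adopted, series = 0, []
    for _ in range(STEPS):
        adopted += int((N - adopted) * P_BROADCAST)
        series.append(adopted)
    return series

def contagion_model(seed_adopters=10):
    # Adoption spreads person to person: s-shaped curve (slow start, then takeoff).
    adopted, series = seed_adopters, []
    for _ in range(STEPS):
        new = 0
        for _ in range(N - adopted):
            contacts_with_adopters = sum(
                random.random() < adopted / N for _ in range(CONTACTS_PER_PERIOD))
            if any(random.random() < P_CONTAGION for _ in range(contacts_with_adopters)):
                new += 1
        adopted += new
        series.append(adopted)
    return series

print("broadcast adopters by period:", broadcast_model()[:10])
print("contagion adopters by period:", contagion_model()[:10])
```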

Page describes the constructs he considers as models, or abstract representations analogous to mathematical expressions. But we might also think of them as mini-theories of social mechanisms. Many of these examples illustrate a single kind of process that is found in real social situations, though rarely in a pure form. Games of coordination are a good example (chapter 15): the challenge of coordinating behavior with another purposive actor in order to bring about a beneficial outcome for both is a common social circumstance. Game theory provides an abstract analysis of how coordination can be achieved between rational agents; and the situation is more complicated when we consider imperfectly rational actors.

Another distinction that might be relevant in sorting the models that Page describes is that between "micro" and "macro". Some of the models Page presents have to do with individual-level behavior (and interactions between individuals); whereas others have to do with transitions among aggregated social states (market states, political regimes, ecological populations). The majority of the models considered have to do with individual choice, decision rules, and information sharing -- a micro-level approach comparable to agent-based modeling techniques. Several of the systems-dynamics models fall at the macro-end of the spectrum. Page treats this issue with the concept of "granularity": the level of structure and action at which the model's abstraction is couched (222).

The book closes with two very interesting examples of important social phenomena that can be analyzed using some of the models in the book. The first is the opioid epidemic in the United States, and the second is the rapid increase in economic inequality over the last four decades. Thomas Schelling's memorable phrase, "the inescapable mathematics of musical chairs", is relevant to both problems. Once we recognize the changing rates of prescription of opioids, the clustering of opioid users, and the probability of transitioning from usage to addiction, the explosion of addiction rates and mortality is inevitable.
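A toy calculation, with invented numbers chosen purely to illustrate the "musical chairs" logic rather than drawn from Page's book, shows how fixed transition rates compound into a determinate aggregate outcome.

```python
# Toy Markov-style calculation with invented numbers: if a fixed share of
# patients receiving opioid prescriptions transitions to long-term use, and a
# share of those to addiction, the aggregate count is fixed by the rates alone.
new_prescriptions_per_year = 10_000_000   # hypothetical
p_long_term_use = 0.05                    # hypothetical transition probability
p_addiction_given_long_term = 0.20        # hypothetical transition probability

new_addictions_per_year = (new_prescriptions_per_year
                           * p_long_term_use
                           * p_addiction_given_long_term)
print(f"expected new addictions per year: {new_addictions_per_year:,.0f}")
# -> 100,000: once the rates are set, the aggregate outcome follows inescapably.
```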

Early in the book Page notes the current vogue for "big data" as a solution to the problem of understanding and forecasting large social trends and changes. He rightly argues that the data do not speak for themselves. Instead, it is necessary to bring analytical techniques to bear in order to identify relevant patterns, and we need to use imagination and rigor in creating hypotheses about the social mechanisms that underlie the patterns we discover. The Model Thinker is indeed a model of an approach to analyzing and understanding the complex world of social action and interaction that we inhabit.

Friday, March 23, 2018

Machine learning


The Center for the Study of Complex Systems at the University of Michigan hosted an intensive day-long training on some of the basics of machine learning for graduate students and interested faculty and staff. Jake Hofman, a Microsoft researcher who also teaches this subject at Columbia University, was the instructor, and the session was both rigorous and accessible (link). Participants were asked to load a copy of R, a software environment designed for statistical computing and widely used for the computations involved in machine learning and applied statistics, and numerous data sets were used as examples throughout the day. (Here is a brief description of R; link.) Thanks, Jake, for an exceptionally stimulating workshop.

So what is machine learning? Most crudely, it is a handful of methods through which researchers can sift through a large collection of events or objects, each of which has a very large number of properties, in order to arrive at a predictive sorting of the events or objects into a set of categories. The objects may be email texts or hand-printed numerals (the examples offered in the workshop), the properties may be the presence/absence of a long list of words or the presence of a mark in a bitmap grid, and the categories may be "spam/not spam" or the numerals between 0 and 9. But equally, the objects may be Facebook users, the properties "likes/dislikes" for a very large list of webpages, and the categories "Trump voter/Clinton voter". There is certainly a lot more to machine learning -- for example, these techniques don't shed light on the ways that AI Go systems improve their play. But it's good to start with the basics. (Here is a simple presentation of the basics of machine learning; link.)

Two intuitive techniques form the core of basic machine learning theory. The first makes use of measured conditional probabilities in conjunction with Bayes' theorem to assign a probability that the object belongs to category Phi given the presence of properties xi. The second uses regressions with very large numbers of predictors to calculate a probability that the object belongs to Phi given the estimated regression coefficients ci.
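Here is a minimal sketch of the first technique, a naive-Bayes classifier applied to a tiny invented spam example; the messages and word counts are purely illustrative.

```python
import math
from collections import Counter

# Tiny invented training corpus: (words in message, label)
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch next week", "ham"),
]

# Count class frequencies and per-class word frequencies
class_counts = Counter(label for _, label in training)
word_counts = {"spam": Counter(), "ham": Counter()}
for text, label in training:
    word_counts[label].update(text.split())

vocab = {w for text, _ in training for w in text.split()}

def log_posterior(text, label):
    # log P(label) + sum of log P(word | label), with add-one smoothing
    logp = math.log(class_counts[label] / len(training))
    total = sum(word_counts[label].values())
    for w in text.split():
        logp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
    return logp

msg = "cheap money meeting"
scores = {label: log_posterior(msg, label) for label in ("spam", "ham")}
print(max(scores, key=scores.get), scores)
```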

Another basic technique is to treat the classification problem spatially. Use the large number of variables to define an n-dimensional space; then classify the object according to the average or majority value of its m-closest neighbors. (The neighbor number m might range from 1 to some manageable number such as 10.)
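And here is an equally minimal sketch of the nearest-neighbor idea, with invented points in a two-dimensional feature space and m = 3.

```python
import math
from collections import Counter

# Invented labeled points in a 2-dimensional feature space
labeled_points = [((1.0, 1.2), "A"), ((0.8, 1.0), "A"), ((1.1, 0.9), "A"),
                  ((3.0, 3.2), "B"), ((3.1, 2.8), "B"), ((2.9, 3.0), "B")]

def nearest_neighbor_classify(x, m=3):
    # Sort the labeled points by distance to x and take a majority vote
    # among the m closest neighbors.
    neighbors = sorted(labeled_points, key=lambda p: math.dist(x, p[0]))[:m]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

print(nearest_neighbor_classify((1.0, 1.1)))   # -> "A"
print(nearest_neighbor_classify((3.0, 3.0)))   # -> "B"
```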


There are many issues of methodology and computational technique raised by this approach to knowledge. But these are matters of technique, and smart data science researchers have made great progress on them. More interesting here are epistemological issues: how good and how reliable are the findings produced by these approaches to the algorithmic treatment of large data sets? How good is the spam filter or the Trump voter detector when applied to novel data sets? What kind of errors would we anticipate this approach to be vulnerable to?

One important observation is that these methods are explicitly anti-theoretical. There is no place for discovery of causal mechanisms or underlying explanatory processes in these calculations. The researcher is not expected to provide a theoretical hypothesis about how this system of phenomena works. Rather, the techniques are entirely devoted to the discovery of persistent statistical associations among variables and the categories of the desired sorting. This is as close to Baconian induction as we get in the sciences (link). The approach is concerned about classification and prediction, not explanation. (Here is an interesting essay where Jake Hofman addresses the issues of prediction versus explanation of social data; link.)

A more specific epistemic concern that arises is the possibility that the training set of data may have had characteristics that are importantly different from comparable future data sets. This is the familiar problem of induction: will the future resemble the past sufficiently to support predictions based on past data? Spam filters developed in one email community may work poorly in an email community in another region or profession. We can label this as the problem of robustness.

Another limitation of this approach has to do with problems where our primary concern is with a singular event or object rather than a population. If we want to know whether NSA employee John Doe is a Russian mole, it isn't especially useful to know that his nearest neighbors in a multi-dimensional space of characteristics are moles; we need to know more specifically whether Doe himself has been corrupted by the Russians. If we want to know whether North Korea will explode a nuclear weapon against a neighbor in the next six months the techniques of machine learning seem to be irrelevant.

The statistical and computational tools of machine learning are indeed powerful, and seem to lead to results that are both useful and sometimes surprising. One should not imagine, however, that machine learning is a replacement for all other forms of research methodology in the social and behavioral sciences.

(Here is a brief introduction to a handful of the algorithms currently in use in machine-learning applications; link.)

Saturday, July 29, 2017

Dynamics of medieval cities


Cities provide a good illustration of the ontology of the theory of assemblage (link). Many forms of association, production, logistics, governance, and population processes came together from independent origins and with different causal properties. So one might imagine that unexpected dynamics of change are likely to be found in all urban settings.

The medieval period is not known for its propensity for innovation, out-of-the-box thinking, or dynamic tendencies towards change. One thinks rather of the placid, continuing social relations of the English countryside, the French village, or the Italian town. There is the idea that a stultifying social order made innovation and change difficult. However, studies of medieval cities over the past century have cast some doubt on this stereotype. Henri Pirenne's lectures on the medieval city in 1923 were collected in Medieval Cities: Their Origins and the Revival of Trade, and there are numerous clues indicating that Pirenne found ample features of dynamic change in the medieval city from the eleventh century forward. Here are a few examples:
The eleventh century, in fact, brings us face to face with a real commercial revival. This revival received its impetus from two centers of activity, one located in the south and the other in the north: Venice on one side and the Flemish coast on the other. (82) 
Trade was thus forced upon them [Venice] by the very conditions under which they lived. And they had the energy and the genius to turn to profit the unlimited possibilities which trade offered them. (83)
 Constantinople, even in the eleventh century, appears not only as a great city, but as the greatest city of the whole Mediterranean basin. Her population was not far from reaching the figure of a million inhabitants, and that population was singularly active. She was not content, as had been the population of Rome under the Republic and the Empire, to consume without producing. She gave herself over, with a zeal which the fiscal system shackled but did not choke, not only to trading but to industry. (84)
The geographical situation of Flanders, indeed, put her in a splendid position to become the western focus for the commerce of the seas of the north. It formed the natural terminus of the voyage for ships arriving from Northern England or which, having crossed the Sound after coming out of the Baltic, were on their way to the south. (97)
It was only in the twelfth century that, gradually but definitely, Western Europe was transformed. The economic development freed her from the traditional immobility to which a social organization, depending solely on the relations of man to the soil, had condemned her. Commerce and industry did not merely find a place alongside of agriculture; they reacted upon it.... The rigid confines of the demesnial system, which had up to now hemmed in all economic activity, were broken down and the whole social order was patterned along more flexible, more active and more varied lines. (101-102)
Large or small, [cities] were to be met everywhere; one was to be found, on the average, in every twenty-five square leagues of land. They had, in fact, become indispensable to society. They had introduced into it a division of labor which it could no longer do without. Between them and the country was established a reciprocal exchange of services. (102)
So trade, finance, manufacturing, and flexible labor led to a dynamic of change that resulted in real economic and urban development in medieval European cities. Pirenne emphatically does not give a rendering of the medieval city that features a rigid social order impeding social and economic change.

A recent study provides modern evidence that the stereotyped impression of social stasis in the urban world of the middle ages is incorrect (link). Rudolf Cesaretti and his co-authors of "Population-Area Relationship for Medieval European Cities" provide a strikingly novel view of the medieval city (link). Their key finding is an unexpected similarity to modern urban centers in the population and spatial characteristics of medieval cities. They have collected data on 173 medieval cities across Western Europe:


Here is how they frame their finding in the Introduction:
This research suggests that, at a fundamental level, cities consist of overlapping social and physical networks that are self-consistently bounded by settled physical space [55–57]. Here, we investigate whether the relationships between settlement population and settled land area predicted by scaling theory—and observed in contemporary cities—also characterized medieval European cities. In this paper, we analyze the relationship between the extent of built-up area and resident populations of 173 settlements located in present-day Belgium, France, England, Switzerland, Germany, and Italy, ca. AD 1300. Previous scholarship has produced population estimates for a large number medieval European cities [58,59]. We build on this work by linking population estimates with estimates for the built-up area compiled from historical and archaeological sources.
The authors focus on a common belief about medieval cities -- the idea that social interactions among residents are structured by powerful social institutions. Guilds, ethnicities, family groups, and religion provide examples of such institutions. If the net effect of social institutions like these is to reduce the likelihood of interaction of pairs of individuals, then medieval cities should display different patterns of spatial distribution of population and growth; if this effect is not significant, then medieval cities should resemble modern cities in these respects. This study finds the latter to be the case. Fundamentally they are interested in the topic of "scaling of settlement area with population size". Here is a plot of area and population for the cities they studied, separated by region:


Their central finding is that the data about population density and spatial distribution do not support the hypothesis that medieval social institutions substantially inhibited social interactions to an extent that hindered urban growth and development. Rather, medieval cities look in their population and spatial structures to be very similar to modern cities.
Table 1 shows that the point estimates of the scaling coefficients for all four regional groups and for the pooled dataset fall within the 2/3 ≤ a ≤ 5/6 range predicted by the social reactor model... Thus, medieval cities across Western Europe exhibit, on average, economies of scale with respect to spatial agglomeration such that larger cities were denser on average. This pattern is similar to that observed for modern cities.
Even though medieval cities were structured by hierarchical institutions that are ostensibly not so dominant today, we interpret this finding as excluding a strongly segregating role for medieval social institutions. This would suggest that the institutions of Western European urban systems ca. 1300 did not substantially constrain social mixing, economic integration, or the free flow of people, ideas, and information. We take these findings as an indication that the underlying micro-level social dynamics of medieval cities were fundamentally similar to those of contemporary cities. (discussion)
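The quantity being estimated is the exponent a in a power-law relation, area = c · N^a, fit by ordinary least squares on logged values. Here is a minimal sketch of that estimation with synthetic data standing in for the study's 173 cities; the numbers are invented, not the authors' dataset.

```python
import math
import random

random.seed(3)

# Synthetic (population, built-up area) pairs with a "true" exponent of 0.75,
# standing in for the study's 173 medieval cities; these are invented numbers.
true_a, true_c = 0.75, 0.02
cities = []
for _ in range(173):
    pop = random.randint(1_000, 100_000)
    area = true_c * pop ** true_a * math.exp(random.gauss(0, 0.2))  # multiplicative noise
    cities.append((pop, area))

# Ordinary least squares fit of log(area) = log(c) + a * log(pop)
xs = [math.log(p) for p, _ in cities]
ys = [math.log(area) for _, area in cities]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
numer = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
denom = sum((x - mean_x) ** 2 for x in xs)
a_hat = numer / denom

print(f"estimated scaling exponent a = {a_hat:.2f}"
      " (social reactor model predicts 2/3 <= a <= 5/6)")
```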
This study presents a fascinating contemporary test of a thesis that would surely have interested Pirenne almost a century ago: did medieval cities develop spatially in ways that reflect a reasonable degree of freedom of choice among residents about where they lived and worked? And the data seem to confirm a "yes" for this question.

(I haven't attempted to summarize the methods used in this study, and the full article bears reading for anyone interested in the question of interpreting urban history from a quantitative point of view.)

Saturday, April 22, 2017

Complexity and contingency


One of the more intriguing currents of social science research today is the field of complexity theory. Scientists like John Holland (Complexity: A Very Short Introduction), John Miller and Scott Page (Complex Adaptive Systems: An Introduction to Computational Models of Social Life), and Joshua Epstein (Generative Social Science: Studies in Agent-Based Computational Modeling) make bold and interesting claims about how social processes embody the intricate interconnectedness of complex systems.

John Holland describes some of the features of behavior of complex systems in these terms in Complexity:
  • self-organization into patterns, as occurs with flocks of birds or schools of fish  
  • chaotic behaviour where small changes in initial conditions (‘ the flapping of a butterfly’s wings in Argentina’) produce large later changes (‘ a hurricane in the Caribbean’)  
  • ‘fat-tailed’ behaviour, where rare events (e.g. mass extinctions and market crashes) occur much more often than would be predicted by a normal (bell-curve) distribution  
  • adaptive interaction, where interacting agents (as in markets or the Prisoner’s Dilemma) modify their strategies in diverse ways as experience accumulates. (p. 5)
In CAS the elements are adaptive agents, so the elements themselves change as the agents adapt. The analysis of such systems becomes much more difficult. In particular, the changing interactions between adaptive agents are not simply additive. This non-linearity rules out the direct use of PDEs in most cases (most of the well-developed parts of mathematics, including the theory of PDEs, are based on assumptions of additivity). (p. 11)
Miller and Page put the point this way:
One of the most powerful tools arising from complex systems research is a set of computational techniques that allow a much wider range of models to be explored. With these tools, any number of heterogeneous agents can interact in a dynamic environment subject to the limits of time and space. Having the ability to investigate new theoretical worlds obviously does not imply any kind of scientific necessity or validity— these must be earned by carefully considering the ability of the new models to help us understand and predict the questions that we hold most dear. (Complex Adaptive Systems, kl 199)
Much of the focus of complex systems is on how systems of interacting agents can lead to emergent phenomena. Unfortunately, emergence is one of those complex systems ideas that exists in a well-trodden, but relatively untracked, bog of discussion. The usual notion put forth underlying emergence is that individual, localized behavior aggregates into global behavior that is, in some sense, disconnected from its origins. Such a disconnection implies that, within limits, the details of the local behavior do not matter to the aggregate outcome. Clearly such notions are important when considering the decentralized systems that are key to the study of complex systems. Here we discuss emergence from both an intuitive and a theoretical perspective. (Complex Adaptive Systems, kl 832)
As discussed previously, we have access to some useful “emergence” theorems for systems that display disorganized complexity. However, to fully understand emergence, we need to go beyond these disorganized systems with their interrelated, helter-skelter agents and begin to develop theories for those systems that entail organized complexity. Under organized complexity, the relationships among the agents are such that through various feedbacks and structural contingencies, agent variations no longer cancel one another out but, rather, become reinforcing. In such a world, we leave the realm of the Law of Large Numbers and instead embark down paths unknown. While we have ample evidence, both empirical and experimental, that under organized complexity, systems can exhibit aggregate properties that are not directly tied to agent details, a sound theoretical foothold from which to leverage this observation is only now being constructed. (Complex Adaptive Systems, kl 987)
And here is Joshua Epstein's description of what he calls "generative social science":
The agent-based computational model— or artificial society— is a new scientific instrument. 1 It can powerfully advance a distinctive approach to social science, one for which the term “generative” seems appropriate. I will discuss this term more fully below, but in a strong form, the central idea is this: To the generativist, explaining the emergence2 of macroscopic societal regularities, such as norms or price equilibria, requires that one answer the following question:  
The Generativist's Question 
*     How could the decentralized local interactions of heterogeneous autonomous agents generate the given regularity?  
The agent-based computational model is well-suited to the study of this question since the following features are characteristics. (5)
Here Epstein refers to the characteristics of heterogeneity of actors, autonomy, explicit space, local interactions, and bounded rationality. And he believes that it is both possible and necessary to show how higher-level social characteristics emerge from the rule-governed interactions of the agents at a lower level.
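To make the generativist recipe concrete, here is a minimal sketch in Python (an illustration of the approach, not Epstein's own Sugarscape model): heterogeneous agents of two types occupy an explicit grid, interact only with their local neighbors, and follow a simple bounded-rationality relocation rule in the spirit of Schelling. The grid size, tolerance threshold, step count, and helper names are all assumed values chosen for the illustration.

```python
# A minimal Schelling-style sketch of the generativist recipe (illustrative
# only; not Epstein's Sugarscape): heterogeneous agents of two types occupy an
# explicit grid, interact only with their Moore neighbors, and follow a simple
# bounded-rationality relocation rule. Grid size, tolerance, and step count are
# assumed values chosen for the illustration.
import random

SIZE, EMPTY_FRAC, TOLERANCE, STEPS = 30, 0.1, 0.4, 30
random.seed(1)

def init_grid():
    cells = [None] * int(SIZE * SIZE * EMPTY_FRAC)
    rest = SIZE * SIZE - len(cells)
    cells += ["A"] * (rest // 2) + ["B"] * (rest - rest // 2)
    random.shuffle(cells)
    return [cells[r * SIZE:(r + 1) * SIZE] for r in range(SIZE)]

def neighbors(grid, r, c):
    """Occupied Moore neighbors on a wrap-around grid."""
    cells = [grid[(r + dr) % SIZE][(c + dc) % SIZE]
             for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    return [n for n in cells if n is not None]

def unhappy(grid, r, c):
    """Purely local rule: unhappy if fewer than TOLERANCE of the occupied
    neighbors share the agent's type."""
    occupied = neighbors(grid, r, c)
    if not occupied:
        return False
    return sum(n == grid[r][c] for n in occupied) / len(occupied) < TOLERANCE

def like_neighbor_share(grid):
    """Macro-level regularity: average share of same-type neighbors."""
    shares = []
    for r in range(SIZE):
        for c in range(SIZE):
            if grid[r][c] is None:
                continue
            occupied = neighbors(grid, r, c)
            if occupied:
                shares.append(sum(n == grid[r][c] for n in occupied) / len(occupied))
    return sum(shares) / len(shares)

def step(grid):
    """Unhappy agents relocate to a randomly chosen empty cell."""
    empties = [(r, c) for r in range(SIZE) for c in range(SIZE) if grid[r][c] is None]
    movers = [(r, c) for r in range(SIZE) for c in range(SIZE)
              if grid[r][c] is not None and unhappy(grid, r, c)]
    random.shuffle(movers)
    for r, c in movers:
        if not empties:
            break
        er, ec = empties.pop(random.randrange(len(empties)))
        grid[er][ec], grid[r][c] = grid[r][c], None
        empties.append((r, c))

grid = init_grid()
print("initial like-neighbor share:", round(like_neighbor_share(grid), 2))
for _ in range(STEPS):
    step(grid)
print("final like-neighbor share:  ", round(like_neighbor_share(grid), 2))
```

With these assumed settings the like-neighbor share typically climbs from roughly one half to well above it: a macroscopic regularity that no individual agent intends or even perceives, which is exactly the kind of answer the generativist's question calls for.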

There are differences across these approaches. But generally these authors bring together two rather different ideas -- the curious unpredictability of even fairly small interconnected systems familiar from chaos theory, and the idea that simple higher-level patterns can be discovered and explained on the basis of the turbulent behavior of the constituents. And they believe that it is possible to construct simulation models that allow us to trace out the interactions and complexities that constitute social systems.

So does complexity science create a basis for a general theory of society? And does it provide a basis for understanding the features of contingency, heterogeneity, and plasticity that I have emphasized throughout? I think the answer eventually comes out to "no" on both counts.

Start with the fact of social contingency. Complexity models often give rise to remarkable and unexpected outcomes and patterns. Does this mean that complexity science demonstrates the origin of contingency in social outcomes? By no means; in fact, the opposite is true. The outcomes these models display are nothing more than computational derivations of the consequences of their premises. The surprises they generate only appear contingent; they are fully determined by the properties of the constituents. They are simulacra of contingency, not the real thing.
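The point can be illustrated with a toy example (a deterministic chaotic map rather than a social simulation, so an added illustration rather than anything drawn from the authors discussed above): the trajectory looks erratic and "surprising," yet rerunning it from the same premises reproduces it exactly.

```python
# A toy illustration of why model "surprises" are derivations rather than
# genuine contingency: the logistic map behaves erratically, yet identical
# premises always reproduce the identical trajectory.
def logistic_trajectory(x0, r=3.9, steps=10):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

run1 = logistic_trajectory(0.2)
run2 = logistic_trajectory(0.2)      # same premises...
print(run1 == run2)                  # ...same "surprising" outcome: True
print([round(x, 3) for x in run1])
```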

Second, what about heterogeneity? Does complexity science illustrate or explain the heterogeneity of social things? Not particularly. The heterogeneity of social things -- organizations, value systems, technical practices -- does not derive from complex-system effects; it derives from the interventions of individual actors and from contingent exogenous influences.

Finally, consider the feature of plasticity -- the fact that social entities can "morph" over time into substantially different structures and functions. Does complexity theory explain the feature of social plasticity? It does not. This is simply another consequence of the substrate of the social world itself: the fact that social structures and forces are constituted by the actors that make them up. This is not a systems characteristic, but rather a reflection of the looseness of social interaction. The linkages within a social system are weak and fragile, so the resulting structures can take many forms and are subject to change over time.

The tools of simulation and modeling that complexity theorists are in the process of developing are valuable contributions, and they need to be included in the toolbox. However, they do not constitute the basis of a complete and comprehensive methodology for understanding society. Moreover, there are important examples of social phenomena that are not at all amenable to treatment with these tools.

This leads to a fairly obvious conclusion, and one that I believe complexity theorists would accept: that complexity theories and the models they have given rise to are a valuable contribution; but they are only a partial answer to the question, how does the social world work?


Monday, December 28, 2015

ANT-style critique of ABM


A short recent article in the Journal of Artificial Societies and Social Simulation by Venturini, Jensen, and Latour lays out a critique of the explanatory strategy associated with agent-based modeling of complex social phenomena (link). (Thanks to Mark Carrigan for the reference via Twitter; @mark_carrigan.) Tommaso Venturini is an expert on digital media networks at Sciences Po (link), Pablo Jensen is a physicist who works on social simulations, and Bruno Latour is -- Bruno Latour. Readers who recall recent posts here on the strengths and weaknesses of ABM models as a basis for explaining social conflict will find the article interesting (link). VJ&L argue that agent-based models -- really, all simulations that proceed from the micro to the macro -- are both flawed and unnecessary. They are flawed because they unavoidably resort to assumptions about agents and their environments that reduce the complexity of social interaction to an unacceptably simple common denominator; and they are unnecessary because it is now possible to trace directly the kinds of processes of social interaction that simulations are designed to model. The "big data" available concerning individual-to-individual interactions permits direct observation of most large social processes, or so they appear to hold.

Here are the key criticisms of ABM methodology that the authors advance:
  • Most of them, however, partake of the same conceptual approach in which individuals are taken as discrete and interchangeable 'social atoms' (Buchanan 2007) out of which social structures emerge as macroscopic characteristics (viscosity, solidity...) emerge from atomic interactions in statistical physics (Bandini et al. 2009). (1.2)
  • most simulations work only at the price of simplifying the properties of micro-agents, the rules of interaction and the nature of macro-structures so that they conveniently fit each other. (1.4)
  • micro-macro models assume by construction that agents at the local level are incapable to understand and control the phenomena at the global level. (1.5)
And here is their key claim:
  • Empirical studies show that, contrarily to what most social simulations assume, collective action does not originate at the micro level of individual atoms and does not end up in a macro level of stable structures. Instead, actions distribute in intricate and heterogeneous networks that fold and deploy creating differences but not discontinuities. (1.11)
This final statement could serve as a high-level paraphrase of actor-network theory, as presented by Latour in Reassembling the Social: An Introduction to Actor-Network-Theory. (Here is a brief description of actor-network theory and its minimalist social ontology; link.)

These criticisms parallel some of my own misgivings about simulation models, though I am somewhat more sympathetic to their use than VJ&L. Here are some of the concerns raised in earlier posts about the validity of various ABM approaches to social conflict (link, link):
  • Simulations often produce results that appear to be artifacts rather than genuine social tendencies.
  • Simulations leave out important features of the social world that are prima facie important to outcomes: for example, quality of leadership, quality and intensity of organization, content of appeals, differential pathways of appeals, and variety of political psychologies across agents.
  • The influence of organizations is a particularly important and non-local factor.
  • Simulations need to incorporate actors at a range of levels, from individual to club to organization.
And here is the conclusion I drew in that post:
  • But it is very important to recognize the limitations of these models as predictors of outcomes in specific periods and locations of unrest. These simulation models probably don't shed much light on particular episodes of contention in Egypt or Tunisia during the Arab Spring. The "qualitative" theories of contention that have been developed probably shed more light on the dynamics of contention than the simulations do at this point in their development.
But the confidence expressed by VJ&L in the new observability of social processes through digital tracing seems excessive to me. They offer a few good examples that support their case -- opinion change, for example (1.9). Here they argue that it is possible to map or track opinion change directly through digital footprints of interaction (Twitter, Facebook, blogging), and that this is superior to abstract modeling of opinion change through social networks. No doubt we can learn something important about the dynamics of opinion change through this means.
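For concreteness, here is a minimal sketch of the kind of abstract network model of opinion change that serves as VJ&L's foil: a simple DeGroot-style averaging process on a randomly generated contact network. The network, the number of agents, and the updating rule are illustrative assumptions rather than anything calibrated to empirical data; digital tracing, on their view, would replace the synthetic network with observed ties.

```python
# A minimal sketch of an abstract opinion-change model on a synthetic network:
# DeGroot-style averaging toward neighbors' opinions. All parameters are
# illustrative assumptions, not empirical estimates.
import random

N, NEIGHBORS, ROUNDS = 200, 6, 25
random.seed(3)

# Random contact network: each agent is linked to a handful of others.
network = {i: set() for i in range(N)}
for i in range(N):
    for j in random.sample([k for k in range(N) if k != i], NEIGHBORS):
        network[i].add(j)
        network[j].add(i)

# Initial opinions drawn uniformly from [0, 1].
opinion = [random.random() for _ in range(N)]

def spread(values):
    return max(values) - min(values)

print("initial spread of opinions:", round(spread(opinion), 3))
for _ in range(ROUNDS):
    # Synchronous update: each agent moves halfway toward the average
    # opinion of its network neighbors.
    opinion = [0.5 * opinion[i] +
               0.5 * sum(opinion[j] for j in network[i]) / len(network[i])
               for i in range(N)]
print("spread after averaging:    ", round(spread(opinion), 3))
```

The spread of opinions collapses toward consensus under these assumptions; the question VJ&L press is whether such synthetic dynamics tell us anything that direct observation of the digital traces would not.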

But this is a very special case. Can we similarly "map" the spread of new political ideas and slogans during the Arab Spring? No, because the vast majority of those present in Tahrir Square were not tweeting and texting their experiences. Can we map the spread of anti-Muslim attitudes in Gujarat in 2002 leading to massive killings of Muslims in a short period of time? No, for the same reason: activists and nationalist gangs did not do us the historical courtesy of posting their thought processes in their Twitter feeds either. Can we study the institutional realities of the fiscal system of the Indonesian state through its digital traces? No. Can we study the prevalence and causes of official corruption in China through digital traces? Again, no.

In other words, there is a huge methodological problem with the idea of digital traceability, deriving from the fact that most social activity leaves no digital traces. There are problem areas where the traces are more accessible and more indicative of the underlying social processes; but this is a far cry from the utopia of total social legibility that appears to underlie the viewpoint expressed here.

So I'm not persuaded that the tools of digital tracing provide the full alternative to social simulation that these authors assert. And this implies that social simulation tools remain an important component of the social scientist's toolbox.