Tuesday, October 7, 2014

Verisimilitude in models and simulations

Modeling always requires abstraction and simplification. We need to arrive at a scheme for representing the components of a system, the laws of action that describe their evolution and interaction, and a way of aggregating the results of the representation of the components and their interactions. Simplifications are required in order to permit us to arrive at computationally feasible representations of the reality in question; but deciding which simplifications are legitimate is a deeply pragmatic and contextual question. Ignoring air resistance is a reasonable simplification when we are modeling the trajectories of dense, massive projectiles through the atmosphere; it is wholly unreasonable if we are interested in modeling the fall of a leaf or a feather under the influence of gravity (link).
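
The air-resistance point can be made concrete with a small numerical sketch. The Python below compares the time to fall 10 metres with and without quadratic drag for two objects. Every number in it (masses, cross-sectional areas, drag coefficients, air density) is an assumed, order-of-magnitude value chosen for illustration rather than a measurement, and the function names are hypothetical.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2
RHO_AIR = 1.2   # assumed air density near sea level, kg/m^3

def vacuum_fall_time(height):
    """Time to fall `height` metres from rest when drag is ignored."""
    return math.sqrt(2.0 * height / G)

def fall_time(height, mass, area, drag_coeff, dt=1e-3):
    """Time to fall `height` metres from rest with quadratic drag.

    Uses simple fixed-step Euler integration; downward is positive.
    """
    k = 0.5 * RHO_AIR * drag_coeff * area   # drag force = k * v**2
    y, v, t = height, 0.0, 0.0
    while y > 0.0:
        a = G - (k / mass) * v * v          # net downward acceleration
        v += a * dt
        y -= v * dt
        t += dt
    return t

# Assumed illustrative objects (not measured values).
ball = dict(mass=0.26, area=math.pi * 0.02 ** 2, drag_coeff=0.47)  # small steel ball
leaf = dict(mass=0.001, area=0.005, drag_coeff=1.2)                # flat dry leaf

if __name__ == "__main__":
    t0 = vacuum_fall_time(10.0)
    for name, obj in (("ball", ball), ("leaf", leaf)):
        ratio = fall_time(10.0, **obj) / t0
        print(f"{name}: fall time is {ratio:.2f} x the no-drag prediction")
```

Under these assumptions the no-drag model is off by well under two percent for the ball but by a factor of several for the leaf, which is the sense in which the same simplification is legitimate in one context and not in the other.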

Modeling the social world is particularly challenging for a number of reasons. Not all social actors are the same; actors interact with each other in ways that are difficult to represent formally; and actors change their propensities for behavior as a result of their interactions. They learn, adapt, and reconfigure; they acquire new preferences and new ways of weighing their circumstances; and they sometimes change the frames within which they deliberate and choose.

Modeling the social world certainly requires the use of simplifying assumptions. There is no such thing as what we might call a Borges-class model -- one that represents every feature of the terrain. This means that the scientist needs to balance realism, tractability, and empirical adequacy in arriving at a set of assumptions about the actor and the environment, both natural and social. These judgments are influenced by several factors, including the explanatory and theoretical goals of the analysis. Is the analysis intended to serve as an empirical representation of an actual domain of social action -- the effects on habitat of the grazing strategies of a vast number of independent herders, say? Or is it intended to isolate the central tendency of a few key factors -- short term cost-benefit analysis in a context of a limited horizon of environmental opportunities, say?

If the goal of the simulation is to provide an empirically adequate reconstruction of the complex social situation, permitting adjustment of parameters in order to answer "what-if" questions, then it is reasonable to expect that the baseline model needs to be fairly detailed. We need to build in enough realism about the intentions and modes of reasoning of the actors, and we need a fair amount of detail concerning the natural, social, and policy environments in which they choose.

The discipline of economic geography provides good examples of both extremes of abstraction and realism of assumptions. At one extreme we have the work of von Thünen in his treatment of the Isolated State, producing a model of habitation, agriculture, and urbanization that reflects the economic rationality of the actors.

At the other extreme we have calibrated agent-based models of land use that build in more differentiated assumptions about the intentions of the actors and the legal and natural environment in which they make their plans and decisions. A very good and up-to-date volume dedicated to the application of calibrated agent-based models in economic geography is Alison Heppenstall, Andrew Crooks, Linda See, and Michael Batty, Agent-Based Models of Geographical Systems. The contribution by Crooks and Heppenstall provides an especially good introduction to the approach ("Introduction to Agent-Based Modelling"). Crooks and Heppenstall describe the distinguishing features of the approach in these terms:
To understand geographical problems such as sprawl, congestion and segregation, researchers have begun to focus on bottom-up approaches to simulating human systems, specifically researching the reasoning on which individual decisions are made. One such approach is agent-based modelling (ABM) which allows one to simulate the individual actions of diverse agents, and to measure the resulting system behaviour and outcomes over time. The distinction between these new approaches and the more aggregate, static conceptions and representations that they seek to complement, if not replace, is that they facilitate the exploration of system processes at the level of their constituent elements. (86)
The volume also pays a good deal of attention to the problem of validation and testing of simulations. Here is how Manson, Sun, and Bonsal approach the problem of validation of ABMs in their contribution, "Agent-Based Modeling and Complexity":
Agent-based complexity models require careful and thorough evaluation, which is comprised of calibration, verification, and validation (Manson 2003). Calibration is the adjustment of model parameters and specifications to fit certain theories or actual data. Verification determines whether the model runs in accordance with design and intention, as ABMs rely on computer code susceptible to programming errors. Model verification is usually carried out by running the model with simulated data and with sensitivity testing to determine if output data are in line with expectations. Validation involves comparing model outputs with real-world situations or the results of other models, often via statistical and geovisualization analysis. Model evaluation has more recently included the challenge of handling enormous data sets, both for the incorporation of empirical data and the production of simulation data. Modelers must also deal with questions concerning the relationship between pattern and process at all stages of calibration, verification, and validation. Ngo and See (2012) discuss these stages in ABM development in more detail. (125)
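
The three stages named in this passage can be sketched on a deliberately tiny model. The Python below is an illustration only, not the procedure used by Manson, Sun, and Bonsal: the one-parameter diffusion model, the synthetic "observations" (generated from the model itself plus a small fixed perturbation, not real data), and all names are assumptions made for the example.

```python
def simulate(rate, steps, initial_share=0.05):
    """Toy diffusion model: share of land converted to a new use each period."""
    shares = [initial_share]
    for _ in range(steps):
        s = shares[-1]
        shares.append(s + rate * s * (1.0 - s))
    return shares

def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

# Synthetic stand-in for observed data (assumed; not a real data set).
TRUE_RATE = 0.30
observed = [s + 0.01 * (-1) ** i for i, s in enumerate(simulate(TRUE_RATE, 20))]

# Verification: does the code behave as designed on inputs with known answers?
assert simulate(0.0, 5) == [0.05] * 6                   # zero rate means no change
assert all(0.0 <= s <= 1.0 for s in simulate(0.5, 50))  # shares stay in [0, 1]

# Calibration: choose the rate that best fits the first 11 observations.
candidates = [i / 100 for i in range(1, 100)]
best_rate = min(candidates, key=lambda r: rmse(simulate(r, 10), observed[:11]))

# Validation: compare the calibrated model with observations it was not fitted to.
predicted = simulate(best_rate, 20)
validation_error = rmse(predicted[11:], observed[11:])

if __name__ == "__main__":
    print("calibrated rate:", best_rate)
    print("held-out RMSE:", round(validation_error, 4))
```

Keeping the calibration data and the validation data separate is the design choice that matters here: a model that is scored only against the data it was tuned on says little about its empirical adequacy.
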
An interesting current illustration of the value of agent-based modeling in analysis and explanation of historical data is presented by Kenneth Sylvester, Daniel Brown, Susan Leonard, Emily Merchant, and Meghan Hutchins in "Exploring agent-level calculations of risk and return in relation to observed land-use changes in the US Great Plains, 1870-1940" (link). Their goal is to see whether it is possible to reproduce important features of land use in several Kansas counties by making specific assumptions about decision-making by the farmers, and specific information about the changing weather and policy circumstances within which choices were made. 

Here is how Sylvester and co-authors describe the problem of formulating a representation of the actors in their simulation:
Understanding the processes by which farming households made their land-use decisions is challenging because of the complexity of interactions between people and the places in which they lived and worked, and the often insufficient resolution of observed information. Complexity characterizes land-use processes because observed historical behaviors often represent accumulated decisions of heterogeneous actors who were affected by a wide range of environmental and human factors, and by specific social and spatial interactions. (1)
Here is a graph of the results of the Sylvester et al. agent-based model, simulating the allocation of crop land across five different crops given empirical weather and rainfall data.
So how well does this calibrated agent-based model do as a simulation of the observed land use patterns? Not particularly well, according to the authors' concluding remarks; their key finding is sobering:
Our base model, assuming profit maximization as the motive for land-use decision making, reproduced the historical record rather poorly in terms of both land use shares and farm size distributions in each township. We attribute the differences to deviations in decision making from profit-maximizing behavior. Each of the subsequent experiments illustrates how relatively simple changes in micro-level processes lead to different aggregate outcomes. With only minor adjustments to simple mechanisms, the pace, timing, and trajectories of land use can be dramatically altered.
However, they argue that this lack of fit does not discredit the ABM approach, but rather disconfirms the behavioral assumption that farmers are simple maximizers of earnings. They argue, as sociologists would likely agree, that "trajectories of land-use depended not just on economic returns, but other slow processes of change, demographic, cultural, and ecological feedbacks, which shaped the decisions of farmers before and long after the middle of the twentieth century." It is therefore necessary to provide more nuanced representations of actor intentionality if the model is to do a good job of reproducing the historical results and the medium-term behavior of the system.
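
The general point, that a small change in the micro-level decision rule can reshape the aggregate trajectory, can be illustrated with a toy sketch. The Python below is not the Sylvester et al. model: the two-crop setup, the return series, the switching thresholds, and the function name are all invented for illustration. Agents with a zero threshold behave as pure profit maximizers; agents with positive thresholds switch crops only when the gain is large enough to overcome habit.

```python
def simulate_shares(returns_a, returns_b, thresholds):
    """Share of agents planting crop A in each year.

    Every agent starts with crop A and switches only when the other crop's
    return exceeds its current crop's return by more than its own threshold.
    """
    choices = ["A"] * len(thresholds)
    shares = []
    for ra, rb in zip(returns_a, returns_b):
        for i, thr in enumerate(thresholds):
            if choices[i] == "A" and rb - ra > thr:
                choices[i] = "B"
            elif choices[i] == "B" and ra - rb > thr:
                choices[i] = "A"
        shares.append(choices.count("A") / len(choices))
    return shares

# Hypothetical per-acre returns for two crops over six years (made-up numbers).
returns_a = [10, 10, 10, 10, 10, 10]
returns_b = [8, 11, 12, 14, 9, 13]

# Ten pure profit maximizers versus ten agents with heterogeneous inertia.
maximizers = simulate_shares(returns_a, returns_b, [0.0] * 10)
inertial = simulate_shares(returns_a, returns_b, [0.5 * i for i in range(10)])

if __name__ == "__main__":
    print("profit maximizers:", maximizers)
    print("inertial agents:  ", inertial)
```

With identical returns, the maximizers flip between crops as a block while the inertial population shifts gradually and never fully reverses, so the pace and timing of aggregate land use differ even though only the decision rule changed.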

(In an earlier post I discussed a set of formal features that have been used to assess the adequacy of formal models in economics and other mathematized social sciences (link). These criteria are discussed more fully in On the Reliability of Economic Models: Essays in the Philosophy of Economics.)

(Above I mentioned the whimsical idea of "Borges-class models" -- the unrealizable ideal of a model that reproduces every aspect of the phenomena that it seeks to simulate. Here is the relevant quotation from Jorge Luis Borges.

On Exactitude in Science
Jorge Luis Borges, Collected Fictions, translated by Andrew Hurley.

…In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast Map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.
-- Borges quoting Suarez Miranda, Viajes de varones prudentes, Libro IV, Cap. XLV, Lerida, 1658)
