Tuesday, December 29, 2015

Do all roads lead to Rome?


Here is a fascinating data visualization experiment by moovel lab testing a piece of ancient wisdom, "All roads lead to Rome" (link). The experiment is discussed in the CityLab blog of the Atlantic. It is not a full map of the auto routes of Europe; instead, it is a construction of the routes that exist from every grid point on the map of Europe to the destination of Rome. So properly speaking, it doesn't confirm that "all roads lead to Rome"; instead it demonstrates that "you can get to Rome from virtually every point in Europe through a dense system of tributaries". It's an amazing representation of the capillaries of transportation throughout the continent.

Imagine what the system would look like if the destination were Stockholm instead. I imagine that the coverage of the map would be equally complete; "you can get to Stockholm from every point in Europe through a dense system of tributaries". But I also imagine that there would be some important structural differences in the two maps, with a different set of most-travelled primary capillaries.

What about it, moovel lab folks -- is this an experiment that could be readily performed?

Here is a Google map of the Roman Empire prepared by the Pelagios Project, showing a much sparser system of roads (link):


It appears visually that it is possible to align the two maps. Major roads in ancient Europe seem to follow the same course today.

It has sometimes been observed that, for the Romans, it might not have been such a good thing that all roads lead to Rome. This same system of roads served as conduits of invasion by waves of Germanic armies.



Here is a video by Mary Beard on the historical importance of the Roman road system.


Monday, December 28, 2015

ANT-style critique of ABM


A short recent article in the Journal of Artificial Societies and Social Simulation by Venturini, Jensen, and Latour lays out a critique of the explanatory strategy associated with agent-based modeling of complex social phenomena (link). (Thanks to Mark Carrigan for the reference via Twitter; @mark_carrigan.) Tommaso Venturini is an expert on digital media networks at Sciences Po (link), Pablo Jensen is a physicist who works on social simulations, and Bruno Latour is -- Bruno Latour. Readers who recall recent posts here on the strengths and weaknesses of ABM models as a basis for explaining social conflict will find the article interesting (link). VJ&L argue that agent-based models -- really, all simulations that proceed from the micro to the macro -- are both flawed and unnecessary. They are flawed because they unavoidably resort to assumptions about agents and their environments that reduce the complexity of social interaction to an unacceptably simple common denominator; and they are unnecessary because it is now possible to trace directly the kinds of processes of social interaction that simulations are designed to model. The "big data" available concerning individual-to-individual interactions permits direct observation of most large social processes, or so they appear to hold.

Here are the key criticisms of ABM methodology that the authors advance:
  • Most of them, however, partake of the same conceptual approach in which individuals are taken as discrete and interchangeable 'social atoms' (Buchanan 2007) out of which social structures emerge as macroscopic characteristics (viscosity, solidity...) emerge from atomic interactions in statistical physics (Bandini et al. 2009). (1.2)
  • most simulations work only at the price of simplifying the properties of micro-agents, the rules of interaction and the nature of macro-structures so that they conveniently fit each other. (1.4)
  • micro-macro models assume by construction that agents at the local level are incapable to understand and control the phenomena at the global level. (1.5)
And here is their key claim:
  • Empirical studies show that, contrarily to what most social simulations assume, collective action does not originate at the micro level of individual atoms and does not end up in a macro level of stable structures. Instead, actions distribute in intricate and heterogeneous networks that fold and deploy creating differences but not discontinuities. (1.11)
This final statement could serve as a high-level paraphrase of actor-network theory, as presented by Latour in Reassembling the Social: An Introduction to Actor-Network-Theory. (Here is a brief description of actor-network theory and its minimalist social ontology; link.)

These criticisms parallel some of my own misgivings about simulation models, though I am somewhat more sympathetic to their use than VJ&L. Here are some of the concerns raised in earlier posts about the validity of various ABM approaches to social conflict (link, link):
  • Simulations often produce results that appear to be artifacts rather than genuine social tendencies.
  • Simulations leave out important features of the social world that are prima facie important to outcomes: for example, quality of leadership, quality and intensity of organization, content of appeals, differential pathways of appeals, and variety of political psychologies across agents.
  • The influence of organizations is a particularly important and non-local factor.
  • Simulations need to incorporate actors at a range of levels, from individual to club to organization.
And here is the conclusion I drew in that post:
  • But it is very important to recognize the limitations of these models as predictors of outcomes in specific periods and locations of unrest. These simulation models probably don't shed much light on particular episodes of contention in Egypt or Tunisia during the Arab Spring. The "qualitative" theories of contention that have been developed probably shed more light on the dynamics of contention than the simulations do at this point in their development.
But the confidence expressed by VJ&L in the new observability of social processes through digital tracing seems excessive to me. They offer a few good examples that support their case -- opinion change, for example (1.9). Here they argue that it is possible to map or track opinion change directly through digital footprints of interaction (Twitter, Facebook, blogging), and this is superior to abstract modeling of opinion change through social networks. No doubt we can learn something important about the dynamics of opinion change through this means.

But this is a very special case. Can we similarly "map" the spread of new political ideas and slogans during the Arab Spring? No, because the vast majority of those present in Tahrir Square were not tweeting and texting their experiences. Can we map the spread of anti-Muslim attitudes in Gujarat in 2002 leading to massive killings of Muslims in a short period of time? No, for the same reason: activists and nationalist gangs did not do us the historical courtesy of posting their thought processes in their Twitter feeds either. Can we study the institutional realities of the fiscal system of the Indonesian state through its digital traces? No. Can we study the prevalence and causes of official corruption in China through digital traces? Again, no.

In other words, there is a huge methodological problem with the idea of digital traceability, deriving from the fact that most social activity leaves no digital traces. There are problem areas where the traces are more accessible and more indicative of the underlying social processes; but this is a far cry from the utopia of total social legibility that appears to underlie the viewpoint expressed here.

So I'm not persuaded that the tools of digital tracing provide the full alternative to social simulation that these authors assert. And this implies that social simulation tools remain an important component of the social scientist's toolbox.

Monday, December 21, 2015

Quantum cognition?



Alexander Wendt proposes a radical idea in his Quantum Mind and Social Science: Unifying Physical and Social Ontology: that we should reconsider fundamentals of the social sciences to reflect emerging research on "quantum consciousness" and cognition. He describes his aim in these terms:
In this book I explore the possibility that this [classical physics] foundational assumption of social science is a mistake, by re-reading social science "through the quantum." More specifically, I argue that human beings and therefore social life exhibit quantum coherence -- in effect, that we are walking wave functions. (3)
A keystone of Wendt's argument is what he regards as the credibility and predictive success of "quantum decision theory". The foundational text in this field is Busemeyer and Bruza, Quantum Models of Cognition and Decision. Busemeyer and Bruza argue here, and elsewhere, that the mathematics and concepts of quantum mechanics in physics have seemingly relevant application to the field of cognition and judgment as well. For example, the idea of "wave function collapse" appears to be analogous to the resolution of uncertainty into decision by a human cognitive agent. Busemeyer and Bruza offer six fundamental analogies between quantum mechanics and cognition:
  • judgments are based on indefinite states
  • judgments create rather than record
  • judgments disturb each other, introducing uncertainty
  • judgments do not always obey classic logic
  • judgments do not obey the principles of unicity
  • cognitive phenomena may not be decomposable
For these and related reasons Busemeyer and Bruza argue that the mathematics, logic, and concepts of quantum mechanics may give us better traction on the processes of belief acquisition and judgment that constitute human cognition. So far so good -- there may be a mathematical homology between quantum states in the micro-physical world and states of belief acquisition and judgment at the level of the cognitive agent.
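To make the formal point concrete, here is a minimal sketch (my own toy illustration, not an example from Busemeyer and Bruza) of how the quantum probability formalism generates question-order effects: two judgments are represented as non-commuting projectors acting on a belief state, and the probability of answering "yes" to both questions depends on the order in which they are posed. The state and the angles are arbitrary choices made only for illustration.

```python
import numpy as np

def projector(angle):
    """Rank-one projector onto the unit vector at the given angle in a real 2-D 'belief space'."""
    v = np.array([np.cos(angle), np.sin(angle)])
    return np.outer(v, v)

# A belief state represented as a unit vector (angle chosen arbitrarily).
psi = np.array([np.cos(0.2), np.sin(0.2)])

P_A = projector(0.0)        # "yes" answer to question A
P_B = projector(np.pi / 5)  # "yes" answer to question B

# Probability of answering "yes" to A and then "yes" to B, and in the reverse order.
p_A_then_B = np.linalg.norm(P_B @ P_A @ psi) ** 2
p_B_then_A = np.linalg.norm(P_A @ P_B @ psi) ** 2

print(p_A_then_B, p_B_then_A)  # the two probabilities differ: a question-order effect
```

Nothing in the sketch involves physics; it uses only the Hilbert-space probability rules, which is exactly the sense in which these models are described as "quantum-like."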

However, Busemeyer and Bruza are entirely explicit in saying that they regard this solely as a formal analogy -- not a hypothesis about the real underlying structure of human thought. They explicitly deny that they find evidence to support the idea that consciousness is a quantum phenomenon at the sub-molecular level. They are "agnostic toward the so-called 'quantum mind' hypothesis" (kl 156). Their use of the mathematics of quantum mechanics is formal rather than substantive -- more akin to using the mathematics of fluid dynamics to represent flow through a social network than arriving at a theory of the real constitution of a domain as a basis for explaining its characteristics.
This book is not about quantum physics per se, but instead it explores the application of the probabilistic dynamic system created by quantum theory to a new domain – the field of cognition and decision making. (kl 245)
So the application is heuristic rather than realistic:
We motivate the use of quantum models as innovative abstractions of existing problems. That is all. These abstractions have the character of idealizations in the sense there is no claim as to the validity of the idealization “on the ground.” (kl 171)
Instead [our theory] turns to quantum theory as a fresh conceptual framework for explaining empirical puzzles, as well as a rich new source of alternative formal tools. To convey the idea that researchers in this area are not doing quantum mechanics, various modifiers have been proposed to describe this work, such as quantum-like models of cognition, cognitive models based on quantum structure, or generalized quantum models. (kl 156)
Given the key role this body of research plays in Wendt's arguments about the social sciences, it is worth considering how it has been received in the relevant academic communities. H. Van Dyke Parunak reviews the work in Computing Reviews (link). Parunak emphasizes the point made here, that the book is explicit in declaring that it does not provide support for the idea of "quantum cognition" as a manifestation of underlying quantum physical processes. He observes that "a more accurate title, but much less exciting, would be Hilbert space models of cognition and decision," emphasizing the purely formal and mathematical nature of their arguments. Quantum mechanics provides a computational model for cognition based on quantum probability theory in their work, not an ontology of the cognitive process. Here is a short piece by Trueblood, Pothos, and Busemeyer in Frontiers in Psychology that spells out the mathematical assumptions that are invoked here (link).
What is perhaps less known is that the ingenious physicists who developed quantum mechanics also invented a new theory of probability, since classical probability (CP) theory was inconsistent with their bold new theory of the physical world. QP theory refers to the rules for assigning probabilities to events from quantum mechanics, without the physics. QP theory is potentially applicable to any area where there is a need to compute probabilities. ("Quantum probability theory as a common framework for reasoning and similarity")
Here is a review article that proposes a series of tests of "quantum-like" models of judgment (link). Here is how the authors describe the field of quantum-like models of cognition:
Recently, a research field that rely on so-called “quantum” or “quantum-like” models has developed to account for such behaviors. The qualifier “quantum” is used to indicate that the models exploit the mathematics of a contemporary physical theory, quantum mechanics. Note that only some mathematical tools of quantum mechanics are employed, and that the claim is not that these models are justified by an application of quantum physics to the brain. For that reason, we shall prefer to call them “quantum-like” models. Such models put into question two classical characteristics recalled above: they abandon Bayesian probabilities for others which are similar to probabilities in quantum mechanics, and they allow for preferences or attitudes to be undetermined. Quantum-like models have received much interest from psychologists, physicists, economists, cognitive scientists and philosophers. For example, new theoretical frameworks have been proposed in decision theory and bounded rationality (Danilov and Lambert-Mogiliansky 2008 and 2010, Yukalov and Sornette 2011). (2)
This description too emphasizes the purely formal nature of this theory; it is an attempt to apply some of the mathematical models and constructs of quantum theory to the empirical problems of cognition and judgment. They go beyond this observation, however, by attempting to assess the ability of the mathematics to fit the data. Their overall judgment about the applicability of these mathematical tools to the available data on specific aspects of belief formation is skeptical (22). "After performing the test against available data, the result is quite clear: non-degenerate models are not an option, being not empirically adequate or not needed."

This is all relevant to a discussion of Wendt's work, because Wendt's premise is solidly realist: he wants to seriously consider the possibility or likelihood of "quantum consciousness". This is the idea that thought and mental activity are the manifestations of subatomic quantum effects.
Quantum brain theory takes known effects at the sub-atomic level and scales them upward to the macroscopic level of the brain. (31) 
Hence the central question(s) of this book: (a) how might a quantum theoretic approach explain consciousness and by extension intentional phenomena, and thereby unify physical and social ontology, and (b) what are some implications of the result for contemporary debates in social theory? (29)
For the price of the two claims of quantum consciousness theory -- that the brain is a quantum computer and that consciousness inheres in matter at the fundamental level -- we get solutions to a host of intractable problems that have dogged the social sciences from the beginning. These claims are admittedly speculative, but neither is precluded by what we currently know about the brain or quantum physics, and given the classical materialist failure to make progress on the mind-body problem, at this point they look no more speculative than the orthodoxy -- and the potential pay-off is huge. (35)
These are tantalizing ideas. It is clear that they are intended as substantive, not merely formal or mathematical. We are asked to take seriously, as an empirical hypothesis, the idea that the brain is a quantum machine and its gross behavior (memory, belief, judgment) is substantively influenced by that quantum substrate. But it is fundamentally unclear whether the findings of Busemeyer and Bruza or other practitioners of quantum probability in the field of cognition provide any support at all for the substantive quantum-consciousness hypothesis.

Friday, December 18, 2015

Von Neumann on the brain


image: representation of a mammalian brain neural network 

After World War II John von Neumann became interested in the central nervous system as a computing organ. Ironically, more was probably known about neuroanatomy than about advanced digital computing in the 1940s; that situation has reversed, of course. Now we know a great deal about calculating, recognizing, searching, and estimating in silicon; but relatively less about how these kinds of processes work in the setting of the central nervous system. At the time of his final illness von Neumann was preparing a series of Silliman Lectures at Yale University that focused on the parallels that exist between the digital computer and the brain; these were published posthumously as The Computer and the Brain (CB) in 1958. This topic also comes in for substantial discussion in Theory of Self-Reproducing Automata (TSRA) (edited and published posthumously by Arthur Burks in 1966). It is very interesting to see how von Neumann sought to analyze this problem on the basis of the kinds of information available to him in the 1950s.

Much of CB takes the form of a rapid summary of the state of knowledge about digital computing machines that existed in the 1950s, from Turing to ENIAC. Almost all computers today possess the "von Neumann" architecture along these lines.


Alan Turing provided some of the mathematical and logical foundations of modern digital computing (link). He hypothesized a very simple computing device that consisted of a tape of indefinite length, a tape drive mechanism that permitted moving the tape forwards or backwards one space, and a read-write mechanism that could read the mark in a tape location or erase and re-write the mark in that location. Here is a diagram of a Turing machine:

(Fascinatingly, here is a photo of a working model of a Turing machine (link):)


Turing's fundamental result is that there is a universal Turing machine, a single machine that can simulate any other Turing machine; and, according to the Church-Turing thesis, any function that is effectively computable at all is computable on a Turing machine. So a Turing machine is a universal computing machine. The von Neumann architecture and the computing machines that it spawned -- ENIAC and its heirs -- are implementations of a universal computing machine.
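As a concrete illustration, here is a minimal sketch of a Turing-machine interpreter in Python (my own toy construction, not drawn from Turing or von Neumann): a sparse tape, a read-write head, and a transition table. The example program, which flips every bit it reads and halts at the first blank, is hypothetical and chosen only to show the mechanism.

```python
def run(program, tape, state="start", blank="_"):
    """Run a transition table of the form
    (state, symbol) -> (new_state, symbol_to_write, move), with move in {-1, 0, +1}."""
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    while state != "halt":
        symbol = cells.get(head, blank)
        state, write, move = program[(state, symbol)]
        cells[head] = write
        head += move
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1))

# A toy program: flip 0 <-> 1 while moving right; halt on the first blank cell.
flip_bits = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", 0),
}

print(run(flip_bits, "01101"))   # -> 10010_
```

A universal machine is then simply a program of this same form whose transition table interprets a description of any other machine written on its tape.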

From the time of Frege it has been understood that mathematical operations can be built up as compounds of several primitive operations -- addition, subtraction, etc.; so, for example, multiplication can be defined in terms of a sequence of additions. Programming languages and libraries of subroutines take advantage of this basic logic: new functions are defined as series of more elementary operations embodied in machine states. As von Neumann puts the point in CB:
More specifically: any computing machine that is to solve a complex mathematical problem must be “programmed” for this task. This means that the complex operation of solving that problem must be replaced by a combination of the basic operations of the machine. Frequently it means something even more subtle: approximation of that operation—to any desired (prescribed) degree—by such combinations. (5)
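A trivial sketch of the point in Python (my own illustration, not von Neumann's example): multiplication implemented purely as a combination of the machine's basic operation of addition.

```python
def multiply(a, b):
    """Compute a * b for a non-negative integer b using only repeated addition."""
    total = 0
    for _ in range(b):
        total = total + a   # the only arithmetic primitive used is addition
    return total

print(multiply(6, 7))   # 42
```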
Key questions about the capacities of a computing machine, either electro-mechanical or biological, have to do with estimating its dimensionality: how much space does it occupy, how much energy does it consume, and how much time does it take to complete a given calculation? And this is where von Neumann's analysis took its origin. Von Neumann sought to arrive at realistic estimates of the size and functionality of the components of these two kinds of computation machines. The differences in scale are enormous, whether we consider speed, volume, or energy consumption. Fundamentally, neurons are more numerous by orders of magnitude (10^10 versus 10^4); slower by orders of magnitude (5 msec vs. 10^-3 msec); less energy-intensive by orders of magnitude (10^-3 ergs vs. 10^2 ergs); and computationally less precise by orders of magnitude. (Essentially he estimates that a neural circuit, either analog or digital, is capable of precision of only about 1%.) And yet von Neumann concludes that brains accomplish computational problems faster than digital computers because of their massively parallel structure -- in spite of the comparative slowness of the individual elements of computation (neurons). This implies that the brain embodies a structurally different architecture than the sequential digital computing embodied in the von Neumann model.
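A back-of-the-envelope calculation using the order-of-magnitude figures quoted above shows why parallelism matters so much here. The assumption that every neuron can operate simultaneously, and that the serial machine performs one elementary operation at a time, is my own simplification for illustration, not von Neumann's calculation.

```python
# Order-of-magnitude figures quoted above (1950s estimates).
neurons = 1e10          # elementary organs in the brain
neuron_time = 5e-3      # seconds per elementary neural operation (~5 msec)
tube_time = 1e-6        # seconds per elementary vacuum-tube operation (~10^-3 msec)

# Hypothetical aggregate throughput if all neurons work in parallel,
# versus a strictly serial machine doing one operation at a time.
brain_parallel_rate = neurons / neuron_time   # ~2e12 elementary operations per second
machine_serial_rate = 1 / tube_time           # ~1e6 elementary operations per second

print(f"brain, fully parallel: {brain_parallel_rate:.0e} ops/sec")
print(f"serial machine:        {machine_serial_rate:.0e} ops/sec")
```

On these crude assumptions the slow, imprecise neurons collectively outpace the fast serial machine by about six orders of magnitude, which is the intuition behind von Neumann's conclusion.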

Von Neumann takes the fundamental operator of the brain to be the neuron, and he represents the neuron as a digital device (in spite of its evident analog electrochemical properties). A neuron transmits a pulse. "The nervous pulses can clearly be viewed as (two-valued) markers.... The absence of a pulse then represents one value (say, the binary digit 0), and the presence of one represents the other (say, the binary digit 1)" (42). "The nervous system has a prima facie digital character" (44).

In their introduction to the second edition of CB the Churchlands summarize von Neumann's conclusion somewhat differently by emphasizing the importance of the analog features of the brain: "If the brain is a digital computer with a von Neumann architecture, it is doomed to be a computational tortoise by comparison... [But] the brain is neither a tortoise nor a dunce after all, for it was never a serial, digital machine to begin with: it is a massively parallel analog machine" (kl 397). However, it appears to me that they overstate the importance of analog neural features in von Neumann's account. Certainly vN acknowledges the analog electro-chemical features of neural activity; but I don't find him making a strong statement in this book to the effect that analog features contribute to the better-than-expected computational performance of the brain. This seems to correspond more to a view of the Churchlands than to von Neumann's analysis in the 1950s. Here is their view as expressed in "Could a Machine Think?" in Scientific American in 1990:
First, nervous systems are parallel machines, in the sense that signals are processed in millions of different pathways simultaneously. The retina, for example, presents its complex input to the brain not in chunks of eight, 16 or 32 elements, as in a desktop computer, but rather in the form of almost a million distinct signal elements arriving simultaneously at the target of the optic nerve (the lateral geniculate nucleus), there to be processed collectively, simultaneously and in one fell swoop. Second, the brain’s basic processing unit, the neuron, is comparatively simple. Furthermore, its response to incoming signals is analog, not digital, inasmuch as its output spiking frequency varies continuously with its input signals. Third, in the brain axons projecting from one neuronal population to another are often matched by axons returning from their target population. These descending or recurrent projections allow the brain to modulate the character of its sensory processing. (link, 35)
In considering the brain von Neumann reached several fundamental observations. First, the enormous neural network of the central nervous system is itself a universal computing machine. Von Neumann worked on the assumption that the CNS could be "programmed" to represent the fundamental operations of arithmetic and logic; and therefore it has all the power of a universal computational machine. But second, von Neumann believes his analysis demonstrates that its architecture is fundamentally different from the standard von Neumann architecture. This observation is the more fundamental. It derives from von Neumann's estimates of the base speed rate of calculation available to neurons in comparison to vacuum tubes; a von Neumann machine with components of this time scale would take eons to complete the calculations that the brain performs routinely. And so this underlines the importance of the massively parallel computing that is accomplished by the biological neural network. Ironically, however, it has proven challenging to emulate massively parallel neural nets in digital computing environments; here is an interesting technical report by Paul Fox that identifies communication bandwidth as being the primary limiting factor for such emulations (link). 

(Tsutomu Miki explores some of these issues in Brainware : Bio-Inspired Architecture and Its Hardware Implementation.)

Wednesday, December 9, 2015

John von Neumann and stochastic simulations

source: Monte Carlo method (Wikipedia)

John von Neumann was one of the genuine mathematical geniuses of the twentieth century. A particularly interesting window onto von Neumann's scientific work is provided by George Dyson in his book, Turing's Cathedral: The Origins of the Digital Universe. The book is as much an intellectual history of the mathematics and physics expertise of the Princeton Institute for Advanced Study as it is a study of any one individual, but von Neumann plays a key role in the story. His contribution to the creation of the general-purpose digital computer helped to lay the foundations for the digital world in which we now all live.

There are many interesting threads in von Neumann's intellectual life, but one aspect that is particularly interesting to me is the early application of the new digital computing technology to the problem of simulating large complex physical systems. Weather and climate modeling were topics for which researchers sought solutions using the computational power of first-generation digital computers, and the research needed to understand and design thermonuclear devices had an urgent priority during the war and post-war years. Here is a description of von Neumann's role in designing early applications of ENIAC to weather modeling (P. Lynch, "From Richardson to early numerical weather prediction"; link):
John von Neumann recognized weather forecasting, a problem of both great practical significance and intrinsic scientific interest, as ideal for an automatic computer. He was in close contact with Rossby, who was the person best placed to understand the challenges that would have to be addressed to achieve success in this venture. Von Neumann established a Meteorology Project at the Institute for Advanced Study in Princeton and recruited Jule Charney to lead it. Arrangements were made to compute a solution of a simple equation, the barotropic vorticity equation (BVE), on the only computer available, the ENIAC. Barotropic models treat the atmosphere as a single layer, averaging out variations in the vertical. The resulting numerical predictions were truly ground-breaking. Four 24-hour forecasts were made, and the results clearly indicated that the large-scale features of the mid-tropospheric flow could be forecast numerically with a reasonable resemblance to reality. (Lynch, 9)
image: (link, 10)
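For reference, here is the barotropic vorticity equation in its standard form (my transcription, not Lynch's notation), where ψ is the streamfunction, ∇²ψ the relative vorticity, and f the Coriolis parameter; it states that absolute vorticity is conserved following the nondivergent flow.

```latex
\[
  \frac{\partial}{\partial t}\nabla^{2}\psi
    \;+\; \mathbf{v}_{\psi}\cdot\nabla\!\left(\nabla^{2}\psi + f\right) \;=\; 0,
  \qquad
  \mathbf{v}_{\psi} \;=\; \mathbf{k}\times\nabla\psi .
\]
```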

A key innovation in the 1950s in the field of advanced computing was the invention of Monte Carlo simulation techniques to assist in the invention and development of the hydrogen bomb. Thomas Haigh, Mark Priestley, and Crispin Rope describe the development of the software supporting Monte Carlo simulations in the ENIAC machine in a contribution to the IEEE Annals of the History of Computing (link). Peter Galison offers a detailed treatment of the research communities that grew up around these new computational techniques (link). Developed first as a way of modeling nuclear fission and nuclear explosives, these techniques proved to be remarkably powerful for allowing researchers to simulate and calculate highly complex causal processes. Here is how Galison summarizes the approach:
Christened "Monte Carlo" after the gambling mecca, the method amounted to the use of random, numbers (a la roulette) to simulate the stochastic processes too complex to calculate in full analytic glory. But physicists and engineers soon elevated the Monte Carlo above the lowly status of a mere numerical calculation scheme; it came to constitute an alternative reality--in some cases a preferred one--on which "experimentation" could be conducted. (119) 
At Los Alamos during the war, physicists soon recognized that the central problem was to understand the process by which neutrons fission, scatter, and join uranium nuclei deep in the fissile core of a nuclear weapon. Experiment could not probe the critical mass with sufficient detail; theory led rapidly to unsolvable integro-differential equations. With such problems, the artificial reality of the Monte Carlo was the only solution--the sampling method could "recreate" such processes by modeling a sequence of random scatterings on a computer. (120)
The approach that Ulam, Metropolis, and von Neumann proposed to take for the problem of nuclear fusion involved fundamental physical calculations and statistical estimates of interactions between neutrons and surrounding matter. They proposed to calculate the evolution of the states of a manageable number of neutrons as they traveled from a central plutonium source through spherical layers of other materials. The initial characteristics and subsequent interactions of the sampled neutrons were assigned using pseudo-random numbers. A manageable number of sampled spaces within the unit cube would be "observed" for the transit of a neutron (127) (10^4 observations). If the percentage of fission calculated in the sampled spaces exceeded a certain value, then the reaction would be self-sustaining and explosive. Here is how the simulation would proceed:
Von Neumann went on to specify the way the simulation would run. First, a hundred neutrons would proceed through a short time interval, and the energy and momentum they transferred to ambient matter would be calculated. With this "kick" from the neutrons, the matter would be displaced. Assuming that the matter was in the middle position between the displaced position and the original position, one would then recalculate the history of the hundred original neutrons. This iteration would then repeat until a "self-consistent system" of neutron histories and matter displacement was obtained. The computer would then use this endstate as the basis for the next interval of time, delta t. Photons could be treated in the same way, or if the simplification were not plausible because of photon-matter interactions, light could be handled through standard diffusion methods designed for isotropic, black-body radiation. (129)
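To make the sampling idea concrete, here is a toy Monte Carlo sketch in Python in the spirit of the procedure described above: follow individual neutron histories using pseudo-random numbers and tally the secondaries they produce. Every number in it (the geometry, mean free path, event probabilities, and neutrons per fission) is invented for illustration; it is not a reconstruction of the Los Alamos calculation.

```python
import random

random.seed(1)

RADIUS = 5.0            # cm: hypothetical radius of a bare sphere of fissile material
MEAN_FREE_PATH = 2.0    # cm between interactions (invented value)
P_FISSION = 0.3         # probability that an interaction is a fission (invented)
P_CAPTURE = 0.2         # probability that an interaction is a capture (invented)
NU = 2.5                # average neutrons released per fission

def follow_neutron():
    """Follow one neutron outward from the centre until it escapes or is absorbed;
    return the number of secondary neutrons it produces."""
    r = 0.0
    while True:
        r += random.expovariate(1.0 / MEAN_FREE_PATH)   # sampled distance to next interaction
        if r > RADIUS:
            return 0                                     # escaped the sphere
        u = random.random()
        if u < P_FISSION:
            return NU                                    # fission: secondaries produced
        if u < P_FISSION + P_CAPTURE:
            return 0                                     # captured (absorbed)
        # otherwise a scattering event; this toy ignores the change of direction and continues

histories = 10_000
k = sum(follow_neutron() for _ in range(histories)) / histories
print(f"estimated multiplication factor k = {k:.2f}")    # k > 1 would indicate a growing chain reaction
```

Increasing the number of histories trades computing time for statistical precision, which is the basic economics of the Monte Carlo method.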
Galison argues that there were two fairly different views in play concerning the significance of Monte Carlo methods in the 1950s and 1960s. According to the first view, they were simply a calculating device permitting the "computational physicist" to calculate values for outcomes that could not be observed or theoretically inferred. According to the second view, Monte Carlo methods were interpreted realistically. Their statistical underpinnings were thought to correspond exactly to the probabilistic characteristics of nature; they represented a stochastic view of physics.
King's view--that the Monte Carlo method corresponded to nature (got "back of the physics of the problem") as no deterministic differential equation ever could--I will call stochasticism. It appears in myriad early uses of the Monte Carlo, and clearly contributed to its creation. In 1949, the physicist Robert Wilson took cosmic-ray physics as a perfect instantiation of the method: "The present application has exhibited how easy it is to apply the Monte Carlo method to a stochastic problem and to achieve without excessive labor an accuracy of about ten percent." (146)
This is a very bold interpretation of a simulation technique. Rather than looking at the model as an abstraction from reality, this interpretation looks at the model as a digital reproduction of that reality. "Thus for the stochasticist, the simulation was, in a sense, of a piece with the natural phenomenon" (147).

One thing that is striking in these descriptions of the software developed in the 1950s to implement Monte Carlo methods is the very limited size and computing power of the first-generation general-purpose computing devices. Punch cards represented "the state of a single neutron at a single moment in time" (Haigh et al link 45), and the algorithm used pseudo-random numbers and basic physics to compute the next state of this neutron. The basic computations used third-order polynomial approximations (Haigh et al link 46) to compute future states of the neutron. The simulation described here resulted in the production of one million punched cards. It would seem that today one could use a spreadsheet to reproduce the von Neumann Monte Carlo simulation of fission, with each line being the computed result from the previous line after application of the specified mathematical functions to the data represented in the prior line. So a natural question to ask is -- what could von Neumann have accomplished if he had Excel in his toolkit? Experts -- is this possible?


Friday, December 4, 2015

Historical vs. sociological explanation


Think of the following matrix of explanatory possibilities of social and historical phenomena:

Vertically the matrix divides between historical and sociological explanations, whereas horizontally it distinguishes general explanations and particular explanations. A traditional way of understanding the distinction between historical and sociological explanations was to maintain that sociological explanations provide generalizations, whereas historical explanations provide accounts for particular and unique situations. Windelband and the historicist school referred to this distinction as that between nomothetic and idiographic explanations (link). It was often assumed, further, that the nomothetic / idiographic distinction corresponded as well to the distinction between causal and interpretive explanations.

On this approach, only two of the cells would be occupied: sociological / general and historical / particular. There are no general historical explanations and no particular sociological explanations.




This way of understanding social and historical explanations no longer has a lot of appeal. "Causal" and "nomological" no longer have the affinity with each other that they once had, and "idiographic" and "interpretive" no longer seem to mutually imply each other. Philosophers have come to recognize that the deductive-nomological model does a poor job of explicating causation, and that we are better served by the idea that causal relationships are established by discovering discrete causal mechanisms. And the interpretive approach doesn't line up uniquely with any particular mode of explanation.  

So historical and sociological explanations no longer bifurcate in the way once imagined. All four quadrants invoke both causal mechanisms and interpretation as components of explanation.

In fact it is straightforward to identify candidate explanations in the two "vacant" cells -- particular sociological explanations and general historical explanations. In Fascists Michael Mann asks a number of moderately general questions about the causes of European fascism; but he also asks about historically particular instances of fascism. Historical sociology involves both singular and general explanations. But likewise, historians of the French Revolution or the English Revolution often provide general hypotheses even as they construct a particular narrative leading to the storming of the Bastille (Pincus, Soboul).


There seem to be two important grounds of explanation that cut across all these variants of explanations of human affairs. It is always relevant to ask about the meanings that participants attribute to actions and social events, so interpretation is a resource for both historical and sociological explanations. But likewise, causal mechanisms are invoked in explanations across the spectrum of social and historical explanation, and are relevant to both singular and general explanations. Or in other words, there is no difference in principle between sociological and historical explanatory strategies. 

How do the issues of generalization and particularity arise in the context of causal mechanisms? In several ways. First, explanations based on social mechanisms can take place in both a generalizing and a particular context. We can explain a group of similar social outcomes by hypothesizing the workings of a common causal mechanism giving rise to them; and we can explain a unique event by identifying the mechanisms that produced it in the given unique circumstances. Second, a social-mechanism explanation relies on a degree of lawfulness; but it refrains from the strong commitments of the deductive-nomological method. There are no high-level social regularities. Third, we can refer both to particular individual mechanisms and a class of similar mechanisms. For example, the situation of "easy access to valuable items along with low probability of detection" constitutes a mechanism leading to pilferage and corruption. We can invoke this mechanism to explain a particular instance of corrupt behavior -- a specific group of agents in a business who conspire to issue false invoices -- or a general fact -- the logistics function of a large military organization is prone to repeated corruption. (Sergeant Bilko, we see you!) So mechanisms support a degree of generalization across instances of social activity; and they also depend upon a degree of generalization across sequences of events.

And what about meanings? Human actions proceed on the basis of subjective understandings and motivations. There are some common features of ordinary human experience that are broadly shared. But the variations across groups, cultures, and individuals are very wide, and there is often no substitute for detailed hermeneutic research into the mental frameworks of the actors in specific historical settings. Here again, then, explanations can take the form of either generalized statements or accounts of particular and unique outcomes.

We might say that the most basic difference between historical and sociological explanation is a matter of pragmatics -- intellectual interest rather than fundamental logic. Historians tend to be more interested in the particulars of a historical setting, whereas sociologists -- even historical sociologists -- tend to be more interested in generalizable patterns and causes. But in each case the goal of explanation is to discover an answer to the question, why and how does the outcome occur? And this typically involves identifying both causal mechanisms and human meanings. 



Sunday, November 29, 2015

How to do cephalopod philosophy


How should researchers attempt to investigate non-human intelligence? The image above raises difficult questions. The octopus is manipulating (tenticlating?) the Rubik's cube. But there are a raft of questions that are difficult to resolve on the basis of simple inductive observation. And some of those questions are as much conceptual as they are empirical. Is the octopus "attempting to solve the cube"? Does it understand the goal of the puzzle? Does it have a mental representation of a problem which it is undertaking to solve? Does it have temporally extended intentionality? How does octopus consciousness compare to human consciousness? (Here is a nice website by several biologists at Reed College on the subject of octopus cognition; link.)

An octopus-consciousness theorist might offer a few hypotheses:
  1. The organism possesses a cognitive representation of its environment (including the object we refer to as "Rubik's cube").
  2. The organism possesses curiosity -- a behavioral disposition to manipulate the environment and observe the effects of manipulation.
  3. The organism has a cognitive framework encompassing the idea of cause and effect.
  4. The organism has desires and intentions.
  5. The organism has beliefs about the environment.
  6. The organism is conscious of itself within the environment.
How would any of these hypotheses be evaluated?

One resource that the cephalopod behavior theorist has is the ability to observe octopi in their ordinary life environments and in laboratory conditions. These observations constitute a rich body of data about behavioral capacities and dispositions. For example:



Here we seem to see the organism conveying a tool (coconut shell) to be used for an important purpose later (concealment) (link). This behavior seems to imply several cognitive states: recognition of the physical characteristics of the shell; recognition of the utility those characteristics may have in another setting; and a plan for concealment. The behavior also seems to imply a capacity for learning -- adapting behavior by incorporating knowledge learned at an earlier time.

Another tool available to the cephalopod theorist is controlled experimentation. It is possible to test the perceptual, cognitive, and motor capacities of the organism by designing simple experimental setups inviting various kinds of behavior. The researcher can ask "what-if" questions and frame experiments that serve to answer them -- for example, what if the organism is separated from the shell while it remains in view; will the organism reacquire the shell?

A third tool available to the cephalopod researcher is the accumulated neuro-physiology that is available for the species. How does the perceptual system work? What can we determine about the cognitive system embodied in the organism's central nervous system?

Finally, the researcher might consult with philosophers working on the mind-body problem for human beings, to canvass whether there are useful frameworks in that discipline that might contribute to octopus-mind-body studies. (Thomas Nagel's famous article, "What is it Like to Be a Bat?", comes to mind, in which he walks through the difficulty of imagining the consciousness of a bat whose sensory world depends on echo-location; link.)

In short, it seems that cephalopod cognition is a research field that necessarily combines detailed empirical research with conceptual and theoretical framing; and the latter efforts require as much rigor as the former.


Sunday, November 22, 2015

Are emergence and microfoundations contraries?

image: micro-structure of a nanomaterial (link)

Are there strong logical relationships among the ideas of emergence, microfoundations, generative dependency, and supervenience? It appears that there are.


The diagram represents the social world as a laminated set of layers of entities, processes, powers, and laws. Entities at L2 are composed of or caused by some set of entities and forces at L1. Likewise L3 and L4. Arrows indicate microfoundations for L2 facts based on L1 facts. Diamond-tipped arrows indicate the relation of generative dependence from one level to another. Square-tipped lines indicate the presence of strongly emergent facts at the higher level relative to the lower level. The solid line (L4) represents the possibility of a level of social fact that is not generatively dependent upon lower levels. The vertical ellipse at the right indicates the possibility of microfoundations narratives involving elements at different levels of the social world (individual and organizational, for example).

We might think of these levels as "individuals," "organization, value communities, social networks," "large aggregate institutions like states," etc.

This is only one way of trying to represent the structure of the social world. The notion of a "flat" ontology was considered in an earlier post (link). Another structure that is excluded by this diagram is one in which there is multi-directional causation across levels, both upwards and downwards. For example, the diagram excludes the possibility that L3 entities have causal powers that are original and independent from the powers of L2 or L1 entities. The laminated view described here is the assumption built into debates about microfoundations, supervenience, and emergence. It reflects the language of micro, meso, and macro levels of social action and organization.

Here are definitions for several of the primary concepts.
  • Microfoundations of facts in L2 based on facts in L1 : accounts of the causal pathways through which entities, processes, powers, and laws of L1 bring about specific outcomes in L2. Microfoundations are small causal theories linking lower-level entities to higher-level outcomes.
  • Generative dependence of L2 upon L1: the entities, processes, powers, and laws of L2 are generated by the properties of level L1 and nothing else. Alternatively, the entities, processes, powers, and laws of L1 suffice to generate all the properties of L2. A full theory of L1 suffices to derive the entities, processes, powers, and laws of L2.
  • Reducibility of y to x : it is possible to provide a theoretical or formal derivation of the properties of y based solely on facts about x.
  • Strong emergence of properties in L2 with respect to the properties of L1: L2 possesses some properties that do not depend wholly upon the properties of L1.
  • Weak emergence of properties in L2 with respect to the properties of L1: L2 possesses some properties for which we cannot (now or in the future) provide derivations based wholly upon the properties of L1.
  • Supervenience of L2 with respect to properties of L1: all the properties of L2 depend strictly upon the properties of L1 and nothing else.
We also can make an effort to define some of these concepts more formally in terms of the diagram.


Consider these statements about facts at levels L1 and L2:
  1. UM: all facts at L2 possess microfoundations at L1. 
  2. XM: some facts at L2 possess inferred but unknown microfoundations at L1. 
  3. SM: some facts at L2 do not possess any microfoundations at L1. 
  4. SE: L2 is strongly emergent from L1. 
  5. WE: L2 is weakly emergent from L1. 
  6. GD: L2 is generatively dependent upon L1. 
  7. R: L2 is reducible to L1. 
  8. D: L2 is determined by L1. 
  9. SS: L2 supervenes upon L1. 
Here are some of the logical relations that appear to exist among these statements.
  1. UM => GD 
  2. UM => ~SE 
  3. XM => WE 
  4. SE => ~UM 
  5. SE => ~GD 
  6. GD => R 
  7. GD => D 
  8. SM => SE 
  9. UM => SS 
  10. GD => SS 
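These implications can be checked mechanically. The following sketch (my own illustration) encodes the ten implications as constraints on truth assignments and searches for an admissible assignment in which strong emergence (SE) and generative dependence (GD) both hold; it finds none, which is the contrariety claim drawn at the end of the post.

```python
from itertools import product

# One boolean per thesis about levels L1 and L2.
PROPS = ["UM", "XM", "SM", "SE", "WE", "GD", "R", "D", "SS"]

# The ten implications listed above: (antecedent, consequent, consequent_is_negated).
IMPLICATIONS = [
    ("UM", "GD", False), ("UM", "SE", True),  ("XM", "WE", False),
    ("SE", "UM", True),  ("SE", "GD", True),  ("GD", "R", False),
    ("GD", "D", False),  ("SM", "SE", False), ("UM", "SS", False),
    ("GD", "SS", False),
]

def admissible(world):
    """True if the truth assignment satisfies every implication."""
    for antecedent, consequent, negated in IMPLICATIONS:
        value = not world[consequent] if negated else world[consequent]
        if world[antecedent] and not value:
            return False
    return True

worlds = (dict(zip(PROPS, vals)) for vals in product([True, False], repeat=len(PROPS)))
se_and_gd = [w for w in worlds if admissible(w) and w["SE"] and w["GD"]]
print(len(se_and_gd))   # 0: no admissible assignment makes SE and GD true together
```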
On this analysis, the question of the availability of microfoundations for social facts can be understood to be central to all the other issues: reducibility, emergence, generativity, and supervenience. There are several positions that we can take with respect to the availability of microfoundations for higher-level social facts.
  1. If we have convincing reason to believe that all social facts possess microfoundations at a lower level (known or unknown) then we know that the social world supervenes upon the micro-level; strong emergence is ruled out; weak emergence is true only so long as some microfoundations remain unknown; and higher-level social facts are generatively dependent upon the micro-level.   
  2. If we take a pragmatic view of the social sciences and conclude that any given stage of knowledge provides information about only a subset of possible microfoundations for higher-level facts, then we are at liberty to take the view that each level of social ontology is at least weakly emergent from lower levels -- basically, the point of view advocated under the banner of "relative explanatory autonomy" (link). This also appears to be roughly the position taken by Herbert Simon (link). 
  3. If we believe that it is impossible in principle to fully specify the microfoundations of all social facts, then weak emergence is true; supervenience is false; and generativity is false. (For example, we might believe this to be true because of the difficulty of modeling and calculating a sufficiently large and complex domain of units.) This is the situation that Fodor believes to be the case for many of the special sciences. 
  4. If we have reason to believe that some higher-level facts simply do not possess microfoundations at a lower level, then strong emergence is true; the social world is not generatively dependent upon the micro-world; and the social world does not supervene upon the micro-world. 
In other words, it appears that each of the concepts of supervenience, reduction, emergence, and generative dependence can be defined in terms of the availability or unavailability of microfoundations for some or all of the facts at a higher level based on facts at the lower level. Strong emergence and generative dependence turn out to be logical contraries (witness the final two definitions above).

Thursday, November 19, 2015

Do we still need microfoundations?


For quite a few years I have found the concept of microfoundations to be central for thinking about relationships across levels of social and individual activity. Succinctly, I have argued that, while it is perfectly legitimate to formulate theories and hypotheses about the properties and causal powers of higher-level social entities, it is necessary that those entities should have microfoundations at the level of the structured activities of socially situated individuals. Higher-level social things need microfoundations at the level of the individuals whose actions and thoughts create the social entity or power. (I have also used the idea of "methodological localism" to express this idea; link.) A fresh look at the presuppositions of the concept makes me more doubtful about its validity, however.

This concept potentially plays two different roles within the philosophy of social science. First, it might serve as a methodological requirement about the nature of social explanation: explanations of social phenomena need to take the form of detailed accounts of the pathways that bring them about at the level of individual socially situated actors. Second, it might be understood as an ontological requirement about acceptable social constructs; higher-level social constructs must be such that it is credible that they are constituted by patterns of individual-level activity. Neither is straightforward.

Part of the appeal of the concept of microfoundations derived from a very simple and logical way of understanding certain kinds of social explanation. This was the idea that slightly mysterious claims about macro-level phenomena (holistic claims) can often be given very clear explanations at the micro-level. Marx’s claim that capitalism is prone to crises arising from a tendency for the rate of profit to fall is a good example. Marx himself specifies the incentives facing the capitalist that lead him or her to make investment decisions aimed at increasing profits; he shows that these incentives lead to a substitution of fixed capital for variable capital (machines for labor); profits are created by labor; so the ratio of profit to total capital investment will tend to fall. This is a microfoundational explanation, in that it demonstrates the individual-level decision making and action that lead to the macro-level result.

There is another reason why the microfoundations idea was appealing — the ontological discipline it imposed with respect to theories and hypotheses at the higher level of social structure and causation. The requirement of providing microfoundations was an antidote to lazy thinking in the realm of social theory. Elster’s critique of G. A. Cohen’s functionalism in Karl Marx's Theory of History is a case in point; Elster argued convincingly that a claim that "X exists because it brings about Y benefits for the system in which it exists” can only be supported if we can demonstrate the lower-level causal processes that allow the prospect of future system benefits to influence X (link). Careless functionalism is unsupportable. More generally, the idea that there are social properties that are fundamental and emergent is flawed in the same way that vitalist biology is flawed. Biological facts are embedded within the material biochemistry of the cell and the gene, and claims that postulate a “something extra” over and above biochemistry involve magical thinking. Likewise, social facts are somehow or other embedded within and created by a substratum of individual action.

In short, there are reasons to find the microfoundations approach appealing. However, I'm inclined to think that it is less compelling than it appears to be.

First, methodology. The microfoundations approach is a perfectly legitimate explanatory strategy; but it is only one approach out of many. So searching for microfoundations ought to be considered an explanatory heuristic rather than a methodological necessity. Microfoundational accounts represent one legitimate form of social explanation (micro-to-meso); but so do "lateral" accounts (meso-to-meso explanations) or even "descending" accounts (macro-to-meso explanations). So a search for microfoundations is only one among a number of valid explanatory approaches we might take. Analytical sociology is one legitimate approach to social research; but there are other legitimate approaches as well (link).

Second, social ontology. The insistence that social facts must rest upon microfoundations is one way of expressing the idea of ontological dependency of the social upon the individual level (understanding, of course, that individuals themselves have social properties and constraints). But perhaps there are other and more compelling ways of expressing this idea. One is the idea of ontological individualism. This is the view that social entities, powers, and conditions are all constituted by the actions and thoughts of individual human beings, and nothing else. The social world is constituted by the socially situated individuals who make it up. Brian Epstein articulates this requirement very clearly here: "Ontological individualism is the thesis that facts about individuals exhaustively determine social facts” (link). This formulation makes it evident that individualism and microfoundations are closely linked. In particular, ontological individualism is true if and only if all social facts possess microfoundations at the level of socially situated individuals.

The microfoundations approach seems to suggest a coherent and strong position about the nature of the social world and the nature of social explanation; call this the "strong theory" of microfoundations:
  1. There are discernible and real differences in level in various domains, including the domain of the social.
  2. Higher-level entities depend on the properties and powers of lower-level constituents and nothing else.
  3. The microfoundations of a higher-level thing are the particular arrangements and actions of the lower-level constituents that bring about the properties of the higher-level thing.
  4. The gold standard for an explanation for a higher-level fact is a specification of the microfoundations of the thing.
  5. At the very least we need to be confident that microfoundations exist for the higher-level thing.
  6. There are no "holistic" or non-reducible social entities.
  7. There is no lateral or downward social causation.
Taken together, this position amounts to a fairly specific and narrow view of the social world -- indeed, excessively so. It fully incorporates the assumptions of ontological individualism, it postulates that generative microfoundational explanations are the best kind of social explanation, and it rules out several other credible lines of thought about social causation.

In fact, we might want to be agnostic about ontological individualism and the strong theory of microfoundations for a couple of reasons. One is the possibility of downward and lateral causation from meso or macro level to meso level. Another is the possibility raised by Searle and Epstein that there may be social facts that cannot be disaggregated onto facts about individuals (the validity of a marriage, for example; link). A third is the difficult question of whether there might be reasons for thinking that a lower level of organization (e.g. the cognitive system or neurophysiology) is more compelling than a folk theory of individual behavior. Finally, the metaphor of levels and strata itself may be misleading or incoherent as a way of understanding the realm of the social; it may turn out to be impossible to draw clear distinctions between levels of the social. (This is the rationale for the idea of a "flat" social ontology; link.) So there seem to be a handful of important reasons for thinking that we may want to suspend judgment about the correctness of ontological individualism.

Framed this way, the microfoundations thesis seems to stand or fall with ontological individualism. If ontological individualism is true, then it follows trivially that there are microfoundations for any given social fact. If ontological individualism is false, then the microfoundations thesis is false as well as an ontological thesis -- there will be social properties that lack microfoundations at the individual level. Either way, the key question is the truth or falsity of ontological individualism.
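To make the logical relationship explicit, here is a rough schematic rendering (my own shorthand, not Epstein's formulation), where OI abbreviates ontological individualism and MF(f) says that social fact f has microfoundations at the level of socially situated individuals:

  \[ \mathrm{OI} \;\leftrightarrow\; \forall f\, \mathrm{MF}(f) \]

Reading the biconditional in each direction gives the two branches of the argument just stated: if OI holds, every social fact has microfoundations; if OI fails, some social fact lacks them.

  \[ \mathrm{OI} \;\Rightarrow\; \forall f\, \mathrm{MF}(f) \qquad\qquad \neg\,\mathrm{OI} \;\Rightarrow\; \exists f\, \neg\,\mathrm{MF}(f) \]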

Two things now seem clearer to me than they did some years ago. First, microfoundationalism is not a general requirement on social explanation; it is one explanatory strategy among many. And second, microfoundationalism is not necessarily the best way of articulating the ontology of the social world. A more direct approach is simply to specify that the social world is constituted by the activities and thoughts of individuals and the artifacts they create. The principle of ontological individualism expresses this view well, and when the view is formulated clearly, its possible deficiencies become clear as well. So I am now inclined to think that the idea of microfoundations is less useful than it once appeared to be. This doesn't mean that the concept is incoherent or misleading; but it does mean that it answers to no imperative of social science, whether of methodology or ontology.

Tuesday, November 17, 2015

SSHA 2015 themes



The 40th annual meeting of the SSHA took place in Baltimore this weekend. The Social Science History Association hosts an especially rewarding academic meeting for scholars interested in the intersection between historical processes and the research tools and explanations of the social sciences. The rationale for the organization is to provide a venue that brings together the study of specific historical topics and the tools and methods of the social sciences that help us better understand those episodes. History and social science methods mutually inform one another at the SSHA. The membership is highly interdisciplinary — in fact, interdisciplinarity is the theme for the 2016 meeting in Chicago — and every meeting offers participants a chance to discover new research and new theories relevant to their own areas of work. The overall theme of this year's conference was "Pluralism and Community", and a significant number of panels did indeed strive to shed new light on these topics.

Several large themes were evident in the program. One is the broadening understanding scholars are reaching of the dynamics of human population behavior — historical demography — through new tools for the research and analysis of population and health records. Particularly interesting is the continuing research of the EurAsian Project in Population and Family History (EAP) (link). On a related panel on mortality patterns during the Spanish influenza pandemic, Matthew Miller, a molecular biologist, introduced what was to me a novel concept: viroarchaeology, or the use of serological data from living individuals (antibody levels against the antigens of past influenza strains) to work out the sequence of viral epidemics in the past. Miller showed how antibody levels in living individuals against several varieties of influenza virus might be used to draw inferences about an earlier (and historically unnoted) H1 influenza virus that circulated before the 1918 Spanish flu pandemic. And Svenn-Erik Mamelund demonstrated the degree to which influenza mortality rates reflected indicators of socio-economic status.

Other large themes included fiscal systems and their politics; race and resistance; GIS analysis of historical patterns; conflict and states; and new tools of formal social analysis that may be useful for historical research. My own paper, "Fissioning Community", falls in the category of applying new tools from the social sciences to historical topics; I considered the relevance and applicability of agent-based modeling techniques for understanding processes of ethnic and religious conflict. The paper and slides can be found here.

Several panels bore directly on contemporary social developments. One very interesting session, looking back to Detroit's progressive left of the 1960s and 1970s, was plainly relevant to the contemporary "Black Lives Matter" movement. Austin McCoy offered a fascinating and detailed account of DARE, a multiracial movement for racial justice during that period. And the real-world tragedy in Paris last weekend found its academic counterpart in a panel on ethnic and religious identities in Europe, "Am I Charlie or Am I Ahmed? Comparative and Historical Perspectivism on Pluralism and Communities in Crisis in Contemporary Europe." This panel allowed participants to reflect on the social factors and processes surrounding the formation of community in multi-cultural and multi-religious Europe. Also relevant to this topic was "Rethinking Pluralism in France: The 10th Anniversary of the 2005 Riots", with papers by Patrick Simon, Jean Beaman, and Crystal Fleming.

For many readers of Understanding Society, the Social Science History Association will prove to be a particularly rewarding intellectual destination. The call for papers for the 2016 meeting will appear here as soon as it is available. Here is a link to the organization's journal, Social Science History.


Wednesday, November 11, 2015

Eight years of Understanding Society



This week marks the end of the eighth year of Understanding Society. This year the blog passed the 1,000-post mark — it now stands at 1,029 posts, or well over one million words. The blog continues to be a very good venue for me to develop and share ideas about the foundations of the social sciences and the ways we attempt to understand the social world. (Mark Carrigan captures a lot of the value that a blog can have for a scholar in his recent excellent book, Social Media for Academics. Thanks, Mark, for including Understanding Society in your thinking about academic uses of social media!)

Writing Understanding Society continues to stimulate me to read and think outside the confines of the specific tradition in which I work. The collage above shows just a few of the books I wouldn't have read in the past year if it weren't for the blog. It gives me a lot of pleasure to recall the new ideas learned from working through these books and capturing a few of them for the blog. There is a lot of diversity of content across these many books, but there are surprising cross-connections as well. (If you want to see the post where one of these books is discussed, just search for the author in the search box above.)

There are some common themes among the hundred or so posts in the past twelve months --

  • a focus on causal mechanisms and powers;
  • attention to the theory of critical realism;
  • a continuing interest in China's recent history;
  • an interest in better understanding the dynamics of race in the US;
  • an interest in the mechanisms of social change at the micro-level;
  • an interest in the ways in which knowledge and values play causal roles in society.

I don't have an exact measure, but my impression is that the past year has witnessed a higher number of posts on topics in the philosophy of social science as such, with fewer on more contemporary topics.

I am very grateful to the many readers worldwide who find topics of interest in Understanding Society. Google Analytics reports 72,051 page views of the blog in the past month, and 718,000 page views for the past twelve months. Here is the global distribution of visitors for the month of October; readership is evidently spread fairly widely around the world.


Thank you for visiting, reading, and discussing!