Saturday, November 20, 2010

Consolidated quantitative history

It is fascinating to browse through the sessions on the program at the Social Science History Association this month (link). SSHA is distinguished by its deep embrace of disciplinary and methodological diversity, and there are panels deriving from qualitative, comparative, and theoretical perspectives. But particularly interesting for me this year are the more quantitative subjects -- reflecting the cliometric impulse that led to the formation of the SSHA several decades ago. (Here are a few comments by Julia Adams, Elisabeth Stephanie Clemens, and Anne Shola Orloff, past and current presidents of SSHA, on this history.) There are panels on historical measures of the standard of living in different parts of Eurasia; on fertility, mobility, and population size in small and large regions; on long-term climate and atmospheric fluctuation (the year without a summer in the early nineteenth century); on levels of agricultural productivity over several centuries in several regions; on the degree of inequality in landholding in Scania and North China; and on many other fascinating studies of measurable social properties. And, of course, the papers offer time graphs of the variables that are the subject of the study.

So what if we had a goal of providing a unified and public measurement of factors like these over a large expanse of time and space? What if we set out to synthesize many studies currently underway and arrive at a common set of measures over time for these regions?

To an extent this is the goal of the Eurasian Population and Family History Project: to assemble a large set of research groups across Eurasia, measuring demographic data using comparable methods in the several locations (link). Though the project hasn't yet produced a synthetic volume summarizing all the results, we can hope that this kind of product will eventually be forthcoming. The researchers describe the project in these terms: “New data and new methods … have begun to illuminate the complexities of demographic responses to exogenous stress, economic and otherwise.… Combined time-series and event-history analyses of longitudinal, nominative, microlevel data now allow for the finely grained differentiation of mortality, fertility, and other demographic responses by social class, household context, and other dimensions at the individual level” (Tommy Bengtsson, Cameron Campbell, James Z. Lee, et al., Life under Pressure: Mortality and Living Standards in Europe and Asia, 1700-1900 (Eurasian Population and Family History), 2004, pp. viii-ix). Their goal is an ambitious one: to provide detailed, analytically sophisticated multi-generational studies of a number of populations across Eurasia. The studies are intended to permit the researchers to probe issues of causation as well as to identify important dimensions of similarity and difference across regions and communities. The most recent volume in the series appeared earlier this fall (Noriko Tsuya, Wang Feng, George Alter, James Z. Lee, et al., Prudence and Pressure: Reproduction and Human Agency in Europe and Asia, 1700-1900 (Eurasian Population and Family History)).

Suppose we wanted to go further and create an interactive Wiki site that permitted researchers to upload their findings for a specified set of variables; and suppose the underlying software created a dynamic set of time graphs and maps representing these data over time. And suppose that the data displays could be broken out at different levels of scale -- North China, China, Eurasia. Finally, of course, we would want to specify that the data summaries are tagged with meta-data indicating the studies and methodologies leading to each graph. Could we say that this hypothetical site would then represent the meta-knowledge of the community of economic historians, climate scientists, and historical demographers? And could we speculate that this product would be an enormous benefit for historical researchers in a broad range of disciplines?
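To make the idea a bit more concrete, here is a minimal sketch in Python of what one uploaded finding and one "broken out by scale" display might look like. Everything in it is an assumption made for illustration: the Observation record, the PARENT region table, the series function, and the variable name "grain_yield" are hypothetical, and the numbers are invented placeholders rather than findings from any study. The unweighted average is the crudest possible way to combine studies; a real meta-dataset would need something more defensible, such as weighting by population or area.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Observation:
    """One uploaded data point, tagged with its provenance (hypothetical schema)."""
    variable: str   # e.g. "grain_yield"
    region: str     # the smallest region the study covers
    year: int
    value: float
    source: str     # the study the figure comes from
    method: str     # how the figure was estimated


# Hypothetical nesting of regions; each region points to the larger unit containing it.
PARENT = {
    "North China": "China",
    "Yangzi Delta": "China",
    "China": "Eurasia",
    "Scania": "Eurasia",
}


def ancestors(region):
    """Return the region followed by every larger unit that contains it."""
    chain = [region]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def series(observations, variable, scale):
    """Build a time series for `variable` at the chosen `scale`.

    Values from all studies falling inside `scale` are averaged without
    weights (an assumption), and the contributing sources and methods are
    kept alongside each point so the graph stays traceable to its studies.
    """
    by_year = defaultdict(list)
    for obs in observations:
        if obs.variable == variable and scale in ancestors(obs.region):
            by_year[obs.year].append(obs)
    return {
        year: {
            "value": mean(o.value for o in group),
            "sources": sorted({o.source for o in group}),
            "methods": sorted({o.method for o in group}),
        }
        for year, group in sorted(by_year.items())
    }


# Invented placeholder values, NOT real historical data.
DEMO = [
    Observation("grain_yield", "North China", 1750, 10.0, "Study A", "tax registers"),
    Observation("grain_yield", "Yangzi Delta", 1750, 14.0, "Study B", "estate accounts"),
    Observation("grain_yield", "Scania", 1750, 9.0, "Study C", "tithe records"),
]

china = series(DEMO, "grain_yield", "China")
eurasia = series(DEMO, "grain_yield", "Eurasia")
```

Asking for the "China" series averages the two Chinese studies and lists both as sources; asking for "Eurasia" folds in the Scania study as well. The meta-data travel with every point, which is what would let a reader trace a graph back to the studies and methods behind it.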

We can immediately predict some limitations to such a collective project. Most important is the unavoidable incompleteness of the data. We may have studies on farm productivity that document output for portions of North China and portions of the Yangzi Delta. But, of course, this doesn't tell us much about western China. So we can't realistically aspire to a full and complete representation of the variables across all regions and periods.

Second, there is the problem of methodological inconsistencies across studies. Robert Allen is a leader in attempting to document the standard of living across Europe and Asia (Robert Allen, ed., Living Standards in the Past: New Perspectives on Well-Being in Asia and Europe). And a central problem he faces is that multiple studies estimate consumption and wellbeing in different ways. So forming a composite representation requires an additional set of assumptions and models on the part of the meta-study researcher.

Third, there is the question of defining the role for verbal analysis and reasoning in such a knowledge system. Are we to imagine this collective data set as a universal data appendix to a huge range of verbal historical narratives and analyses? Or might we come to think that the graphs speak for themselves, with no need for verbal analysis and inference?

All that said, I think the hypothetical Wiki site would be enormously valuable. It would provide us with a bird's-eye view of the large structural and material features that defined and constrained Eurasian history. And it has the potential of suggesting new avenues of research and new causal hypotheses about documented processes of change. For example, we may compare the time series of life expectancy and average temperature, and we may hypothesize that mortality and fertility were affected by abnormal climate conditions (through the medium of agricultural performance). But we may also be able to observe suggestive correlations between material variables and behavior -- for example, between ecological crises and the frequency of peasant uprisings. Or, conceivably, our eye might be led to a graph of sex ratios in a region and another of the incidence of banditry, and we might be led to a "bare sticks" hypothesis about social unrest: when there is an excess of unmarried young men, we can expect an upsurge of banditry and crime.
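The comparison of two time series can be sketched very simply. The Python below is only an illustration under stated assumptions: the function names pearson and correlate_series are hypothetical, the two series are invented placeholders rather than historical measurements, and a plain Pearson coefficient over the years the two series share is just one of many ways to look for a relationship. A coefficient of this kind is suggestive at best; it says nothing about causal direction or about lagged effects such as a harvest failure showing up in mortality a year later.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_series(a, b):
    """Correlate two {year: value} series over the years present in both.

    Returns the coefficient together with the list of shared years, so the
    reader can see how much overlap the comparison actually rests on.
    """
    years = sorted(a.keys() & b.keys())
    r = pearson([a[y] for y in years], [b[y] for y in years])
    return r, years


# Invented placeholder series, NOT real historical data.
temperature_anomaly = {1800: 1.0, 1801: 2.0, 1802: 3.0, 1803: 4.0}
mortality_index = {1801: 6.0, 1802: 4.0, 1803: 2.0, 1804: 0.0}

r, shared_years = correlate_series(temperature_anomaly, mortality_index)
```

Returning the shared years alongside the coefficient matters here because of the incompleteness problem noted above: two series drawn from different studies may overlap for only a handful of years, and a striking coefficient computed on three or four points deserves very little weight.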

There are increasingly powerful tools available that permit scholars and the interested public to explore large public datasets such as the US Census or Bureau of Labor Statistics (link).  It is perhaps not wholly unrealistic to imagine a platform that permits multiple researchers to contribute to a meta-dataset for economic and social history of the world.

1 comment:

Agnese said...

I would suggest some links, and also the OECD interactive datasets. Have a nice weekend!