Saturday, July 19, 2025

Artificial intelligence tools for historians


Historical research may seem to be a field in which AI tools will be especially useful. Historians are often confronted with very large unstructured digital collections of documents, letters, images, treaties, legal settlements, contracts, and diplomatic exchanges that far exceed the ability of a single human researcher to sift and analyze for valuable historical insights. Can emerging tools in the AI revolution help to make systematic use of such historical collections?

Earlier applications of new quantitative methods of analysis of historical data

Several earlier periods of innovation in twentieth-century historical research suggest that historians can often borrow fruitfully from new methods and analytical tools developed in other sciences. The cliometric revolution of the 1980s (Fogel and Elton 1983; Rawski 1996; Wright 2015) brought tools of econometrics, demography, and statistics more fully into play in historical inquiry. Historians have made extensive and productive use of quantitative methods borrowed from the social sciences to investigate questions concerning the health status of various historical populations and the standard of living in different cities and regions (Crafts 1980; Lee and Feng 1999; Allen 2000; Allen, Bengtsson, and Dribe 2005). These tools usually depend upon the availability of structured databases of comparable data over time—for example, census data; birth, marriage, and death records; military records of recruits; and price data for representative goods (wheat, rice, salt). There are issues of comparability, reliability, and validity that arise in these applications of large historical datasets, but these issues are no more difficult for historians than for sociologists or political scientists.

Another major area of innovation was the geospatial revolution of the 1990s (Presner and Shepard 2016; Skinner, Henderson, and Yuan 2000; Thill 2020). Efforts to place historical data and events into spatial order have been very productive in suggesting new historical patterns and causal influences not visible in purely narrative accounts. G. William Skinner’s pathbreaking work on the economic regionalization of China is an outstanding example (Skinner 1977), and Peter Bol and colleagues have collaborated in the establishment of a major historical GIS database for China (Bol 2006; Bol 2007). So it is quite understandable that some contemporary historians are interested in the potential value of emerging tools of digital humanities, semantic search, and big-data analytics in their efforts to make sense of very large archives of digitized text and image materials.

However, archival collections of interest to historians present special obstacles to digital research. They are usually unstructured, consisting of heterogeneous text documents, contracts, local regulations, trial documents, imperial decrees, personal letters, and artifacts and images. Moreover, the meaning of legal, political, and religious vocabulary is sometimes unclear from a modern perspective, so translation and interpretation are problematic. The written language of the documents is itself problematic: often handwritten, interspersed with references and asides in other languages, and often using vocabulary that has no exact modern equivalent, the documents are challenging to interpret for the historian and for the software system alike. Are there tools that allow the historian to sift, summarize, categorize, and highlight the texts, sentences, and paragraphs that are included in a large archival collection? Major new capabilities have emerged in each of these areas that have substantially enhanced the ability of historians to classify and analyze very large unstructured text databases and archives. These capabilities involve advances in machine learning, large language models, semantic search tools, and big-data analytics. As with any innovation in methods of inquiry and inference, it is crucial for researchers to carefully evaluate the epistemic reliability of the tools they utilize.

Digital humanities

In the past several decades scholars in the humanities, including comparative literature, art history, and various national literatures, have explored applications of computational tools for the analysis of digital texts that permit a breadth and depth of analysis not previously available. These research efforts are now described as digital humanities. Several handbooks and overviews on digital humanities have appeared (Schreibman, Siemens, Unsworth 2004; Schreibman, Siemens, Unsworth 2016; Eve 2022). The goals of research within the field are varied, but in almost all cases the research involves computational analysis of large databases of text, image, and video documents, with the general goal of discovering large patterns that may be undetectable through traditional tools of literary or art-history analysis. Franco Moretti’s Graphs, Maps, Trees: Abstract Models for a Literary History (2005) and Distant Reading (2013) offer excellent examples. Moretti wishes to explore “world literature”; and the field of documents included in this rubric is too large for any single critic or research team to read closely all the available works in the field. Moretti writes, “A larger literary history requires other skills: sampling; statistics; work with series, titles, concordances, incipits—and perhaps also the ‘trees’ that I discuss in this essay” (2013: 67). In place of the insights of close reading, Moretti emphasizes the value of “distant reading” and the effort to discover broad and long patterns across national literatures and centuries. This requires using analytical tools of computational social science to classify texts, identify word patterns, create spatial networks, and (perhaps) to algorithmically assign markers to topics and styles in the texts subject to analysis. Martin Paul Eve writes, “Under such a model, the idea is that computational detection of style, theme, content, named entities, geographic place names, etc. could be discerned at scale and aggregated into a broader and continuous literary history that would not suffer from the same defects as a model that required one to read everything” (Eve 2022: 130).
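
To give a concrete sense of what the computational side of "distant reading" can involve, here is a purely illustrative sketch that applies a standard unsupervised topic model to a toy corpus. The corpus, the number of topics, and the choice of the scikit-learn library are assumptions made for the example; this is not Moretti's own procedure.

```python
# Illustrative "distant reading" sketch: fit a topic model to a corpus too large
# to read closely, then inspect the most heavily weighted words in each topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "the merchant sailed from the port with a cargo of silk and tea",
    "the heroine refused the marriage arranged by her father",
    "famine spread through the province and grain prices rose sharply",
    # ... in practice, thousands of novels, chapters, or documents
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(corpus)          # document-term count matrix

lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(doc_term)

# print the most heavily weighted words for each inferred topic
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_words = [terms[j] for j in topic.argsort()[-5:]]
    print(f"topic {i}: {top_words}")
```

The interpretive work remains with the researcher: the model only surfaces word clusters at scale, which the critic or historian must then label and evaluate.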

Efforts in the digital humanities have evident relevance to the problems presented by large text and image datasets available in many areas of historical research. One promising area of application involves using big data tools of text analysis—for example, machine learning, content extraction, and semantic search—to systematically survey and classify all the documents in a collection. The impetus and initiatives of the field of “digital or computational history” are described in Siebold and Valleriani 2022 and Graham, Milligan, Weingart, and Martin 2022. The methods currently creating a great deal of interest among historians are based on joining machine learning methods, big-data analytics, and large language models (LLMs) in order to permit analysis and codification of the semantic content of documents. To what extent can emerging computational tools designed for management and analysis of large unstructured text and image databases be adapted to assist the historian in the task of assimilating, interpreting, and analyzing very large databases of historical documents and artifacts?

Pre-processing and information extraction

An avenue of research in computer science that supports analysis of large unstructured datasets containing texts and images is the field of information extraction (Adnan and Akbar 2019). Information extraction technology consists of algorithms developed to analyze patterns in text (and images or videos) to apply labels or tags to segments of the data. These are generally “big data” tools using machine learning to identify patterns in target documents or images. Adnan and Akbar put the goal of information extraction tools in these terms: “It takes collection of documents as input and generates different representations of relevant information satisfying different criteria. IE techniques efficiently analyze the text in free form by extracting most valuable and relevant information in a structured format” (Adnan and Akbar 2019: 6). In general terms, information extraction tools are expected to provide a structured basis for answers to questions like these: What is the document about? What persons or things are mentioned? What relationships are specified within the document? What events are named? The tools are often based on natural-language models that require training on large text datasets and sometimes make use of machine learning based on neural networks (Rithani et al. 2023). “The concept is to automatically extract characteristics from massive artificial neural networks and then use these features to inform choices” (Rithani et al. 2023: 14766).
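
As a rough illustration of the "structured format" that an information-extraction pipeline might produce, here is a hypothetical record schema for a single archival document. The field names and example values are invented for illustration and do not correspond to any particular IE system.

```python
# Hypothetical sketch of a structured record extracted from one document.
from dataclasses import dataclass, field

@dataclass
class ExtractedRecord:
    doc_id: str                                       # archive identifier for the source document
    persons: list[str] = field(default_factory=list)
    places: list[str] = field(default_factory=list)
    dates: list[str] = field(default_factory=list)
    events: list[str] = field(default_factory=list)   # e.g. "famine", "uprising", "treaty signed"
    relations: list[tuple[str, str, str]] = field(default_factory=list)  # (subject, relation, object)

# invented example: what an IE pass over a gazetteer entry might yield
record = ExtractedRecord(
    doc_id="gazetteer-1846-013",
    persons=["Li Wenzhong"],
    places=["Nanchang"],
    dates=["1846"],
    events=["grain riot"],
    relations=[("Li Wenzhong", "reported", "grain riot")],
)
```

A collection of such records, one per document, is what makes later querying, counting, and mapping tractable.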

A useful tool developed within the field of information extraction that has clear relevance for historians attempting to analyze large unstructured databases is named entity recognition and classification (Goyal, Gupta, and Kumar 2018). This is a group of text-analysis algorithms designed to identify meaningful information contained in a given document—for example, “person, organization, location, date/time, quantities, numbers” (Goyal et al. 2018: 22). The named entities may be specialized to a particular content area; for example, public health historians may wish to include disease and symptom names. These tools are used as a basis for pre-processing of a set of documents. The tool creates a meta-file for each document including the named entities and classes that it contains, along with other contextual information. For example, historians interested in the role that agriculture played over long periods of time may be interested in quickly identifying a selection of documents that refer to hunger, famine, or starvation. Goyal, Gupta, and Kumar carefully review the methods currently in use to identify named entities in a body of texts, including rule-based identification of named entities and machine-learning identification, with or without supervision. They emphasize that none of these methods is error-free, and false positives and false negatives continue to arise after training. This means that some lexical items in a document are either missed as referring to a named entity or are incorrectly associated with a given named entity. Nonetheless, a historian can certainly use named-entity recognition and classification to provide a basis for important exploration and discovery in a large unstructured text database.
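
A minimal sketch of off-the-shelf named-entity recognition is shown below, using the open-source spaCy library and its small general-purpose English model. The example sentence is invented, and a real historical corpus would normally require a domain- or period-adapted model; even then it would exhibit the false positives and false negatives that Goyal, Gupta, and Kumar describe.

```python
# Minimal named-entity recognition example with spaCy's small English pipeline.
import spacy

nlp = spacy.load("en_core_web_sm")   # general-purpose model; historical text needs adapted models
doc = nlp("In March 1847 the magistrate of Cork petitioned the Treasury "
          "for relief as famine spread through the district.")

for ent in doc.ents:
    print(ent.text, ent.label_)      # e.g. "March 1847" DATE, "Cork" GPE, "the Treasury" ORG
```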

Keller, Shiue, and Yan (2024) provide a recent example of a machine-learning approach to automatic text analysis based on the most common large language model technique, “bidirectional encoder representations from transformers” (BERT). They use GUWEN-BERT, a BERT model pre-trained on classical Chinese characters. They evaluate the power and accuracy of this tool in analyzing the Veritable Records of the Qing Dynasty to identify events of social unrest. The document archive is vast, encompassing more than 1,200 volumes of records from the late sixteenth century to the end of the Qing Dynasty. Their research task is to identify episodes of social unrest, and then to classify these episodes into three categories—peasant unrest, militia unrest, and secret-society unrest (Keller et al. 2024: 4). This process of event identification and classification then permitted the researchers to seek out correlates of unrest, including fluctuations in grain prices. A useful example applying the same technology is provided by Liu, Wang, and Bol (2023), demonstrating large-scale extraction of biographical information from a large collection of local gazetteers. Machine recognition of handwritten Chinese literary characters and translation of sentences and phrases in classical Chinese have made great progress in the past twenty years (Liu, Jaeger, and Nakagawa 2004; Leung and Leung 2010). This capability represents a major step forward in the ability of Chinese and Asian-language historians to make extensive use of large databases of historical documents such as the Veritable Records archives.
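
The general shape of such a BERT-based classification workflow (not the authors' own code) can be sketched as follows: a pre-trained encoder is fine-tuned on hand-labeled passages and then applied across the full archive. The model path, the label scheme (three unrest categories plus a residual class), and the toy examples are placeholders for illustration.

```python
# Sketch of fine-tuning a pre-trained BERT encoder to classify archival passages.
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import Dataset

MODEL_NAME = "path/to/classical-chinese-bert"   # placeholder: substitute a classical-Chinese BERT checkpoint
LABELS = ["peasant unrest", "militia unrest", "secret-society unrest", "none"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))

# toy labelled examples; real training data would be hand-coded passages from the archive
examples = Dataset.from_dict({
    "text": ["...passage describing a grain riot...", "...routine administrative entry..."],
    "label": [0, 3],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

train_data = examples.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="unrest-classifier", num_train_epochs=3),
    train_dataset=train_data,
)
trainer.train()   # the fine-tuned model is then run over every passage in the collection
```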

RAG, GraphRAG, and vector-similarity search

An important tool that has been of interest to historians exploring digital tools is retrieval-augmented generation (RAG) as a complement to LLM text generation systems. This area of research attempts to provide a basis for joining LLM query engines to specialized databases so that responses to queries will be based on data contained in the associated database. RAG tools are sometimes celebrated as solving two persistent problems arising in the application of natural-language generative chat functions based on large language models: the lack of auditability and the generation of fictitious responses (hallucinations) by the generative chat program. Kim Martineau describes a RAG tool in these terms: “Retrieval-augmented generation (RAG) is an AI framework for improving the quality of LLM-generated responses by grounding the model on external sources of knowledge to supplement the LLM’s internal representation of information. RAG implementation in an LLM-based question-answer system has two main benefits: It ensures that the model has access to the most current, reliable facts, and that users have access to the model’s sources, ensuring that its claims can be checked for accuracy and ultimately trusted” (Martineau 2024). A RAG framework is intended to allow the introduction of real, documented data into a natural language query-and-response system, and it is designed to be auditable. RAG picks up where pre-processing tools discussed previously leave off. RAG tools permit the retriever tool to parse a given query into component questions, and then to retrieve relevant data from pre-existing databases of documents (Lewis et al. 2021; Zhao et al. 2024).
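
A stripped-down sketch of the retrieval step in such a pipeline is given below, assuming the archive has already been divided into text "chunks." The embedding model named here is one common open-source choice, and the final generate() call is a stand-in for whatever LLM endpoint a project actually uses; neither is a feature of any particular RAG product.

```python
# Minimal retrieval sketch: embed archive chunks, retrieve the most similar ones
# for a query, and assemble an auditable prompt for a downstream LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

chunks = [
    "1846: the prefect reports grain shortages and rising rice prices in the county seat.",
    "1851: a memorial describes the suppression of a secret-society uprising.",
    "1847: edict reducing the land tax in flood-affected districts.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)   # one vector per chunk

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

context = retrieve("What evidence is there of famine or food shortage?")
prompt = "Answer using only the passages below, and cite them.\n\n" + "\n".join(context)
# response = generate(prompt)   # hypothetical LLM call; the answer stays auditable
#                               # because it can be traced back to the retrieved chunks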

RAG tools have in turn been extended with two related innovations. Vector similarity search is a semantic search tool that represents a document as a vector of abstract terms (like those identified in the discussion above of named entity identification and classification) (Mohoney et al. 2023). This further simplifies the task of querying the database for documents that are “about” one or more entities or events. A second valuable analytical tool is GraphRAG, which permits the construction of a network graph of the links among the elements in a document collection. Introduced by research scientists at Microsoft in 2024, GraphRAG was designed to permit analysis of global features of a large unstructured data collection. (See Larson and Truitt 2024, Edge et al. 2024a, and Edge et al. 2024b for technical descriptions of GraphRAG capabilities.) GraphRAG combines the data provided by RAG tools and connects these to LLM generative response systems. GraphRAG thus integrates indexing, retrieval, and generation. The key output of GraphRAG analysis of a database of text documents is a knowledge graph showing relationships among the various documents based on the content vectors associated with each document. (Experienced historians who make use of RAG and GraphRAG tools note that scaling up from moderate to large databases is challenging and computationally demanding.)
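
The following toy sketch is an analogue of GraphRAG's indexing idea rather than the Microsoft library itself: entities extracted from each document (by an earlier NER pass) become graph nodes, co-occurrence within a document becomes a weighted edge, and community detection then surfaces clusters of related entities and documents. The documents and entity lists are invented for illustration.

```python
# Toy entity co-occurrence graph, a rough stand-in for GraphRAG-style indexing.
import itertools
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

docs = {
    "doc1": ["Nanchang", "grain riot", "Li Wenzhong"],
    "doc2": ["Nanchang", "grain prices", "famine"],
    "doc3": ["secret society", "uprising", "militia"],
}

G = nx.Graph()
for entities in docs.values():
    for a, b in itertools.combinations(set(entities), 2):
        w = G.edges[a, b]["weight"] + 1 if G.has_edge(a, b) else 1
        G.add_edge(a, b, weight=w)   # edge weight = number of shared documents

# communities of densely connected entities suggest document clusters and themes
for community in greedy_modularity_communities(G, weight="weight"):
    print(sorted(community))
```

In a full GraphRAG-style system this graph (built and summarized with LLM assistance) is what supports "global" questions about the whole collection, not just lookups of individual chunks.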

Limitations of the tools for historians

These tools suggest research strategies for historians confronting very large digital collections of documents and images. They permit computational procedures that classify and index the materials in the data archive, enabling the historian to quickly identify items that are relevant to particular research questions -- the occurrence of famine, civil strife, dynastic unrest, or the transmission of ideas. And they permit natural-language queries of the target database that provide suggestive avenues of further investigation for the historian. Crucially, these tools provide the ability to "audit" the results of a query by returning to the specific documents on which a response is based. The problem of "hallucination" that is endemic to large language models by themselves is substantially reduced by tying responses to specific items in the database. And the algorithms of vector search allow the AI agent to quickly pull together the documents and "chunks" of text that are most relevant to the query.

These applications present powerful new opportunities for historians to make extensive use of very large databases of texts, but they also pose novel questions for the philosophy of history. In particular, they require that historians and philosophers develop new standards and methods for validating the computational methods that are chosen for various research tasks presented by the availability of large text collections. This means that we need to examine the strengths and limitations of each of these methods of analysis. Crucially, the designers and researchers of these tools are quite explicit in acknowledging that the tools are subject to error: the problem of hallucination is not fully removed, the content database itself may be error-prone, there may be flaws and limitations inherent in the training database in use, and any errors created during the information-extraction stage will be carried forward into the results. It is therefore incumbent upon the historian who uses such tools to validate and evaluate the information provided by searches and natural language queries. Nothing in the design of these tools suggests that they are highly reliable; rather, they are best viewed as exploratory tools permitting the historian to look more deeply into the collection of documents than traditional methods would permit. It will be necessary for historians to think critically about the quality and limitations of the information they extract from these forms of big-data analysis of historical databases.

References

Adnan, Kiran, and Rehan Akbar, 2019. “An analytical study of information extraction from unstructured and multidimensional big data,” Journal of Big Data, 6(1): 91. doi:10.1186/s40537-019-0254-8

Allen, Robert C., 2000. “Economic Structure and Agricultural Productivity in Europe, 1300–1800,” European Review of Economic History, 3: 1–25.

Allen, Robert C., Tommy Bengtsson, and Martin Dribe (eds.), 2005. Living standards in the past: New perspectives on well-being in Asia and Europe, Oxford; New York: Oxford University Press.

Bol, Peter, 2006. “Creating the China Historical Geographic Information System,” in History in the Digital Age Symposium, University of Nebraska-Lincoln, video lecture. Bol 2006 available online.

Bol, Peter. 2007. Creating the China Historical Geographic Information System (text and slides). (Digital History Project, University of Nebraska-Lincoln). http://digitalhistory.unl.edu/essays/bolessay.php.

Crafts, N.F.R., 1980. “National income estimates and the British standard of living debate: A reappraisal of 1801–1831,” Explorations in Economic History, 17: 176–88.

Edge, Darren, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, and Jonathan Larson, 2024a, “From Local to Global: A Graph RAG Approach to Query-Focused Summarization”, manuscript at arXiv.org.

Edge, Darren, Ha Trinh, Steven Truitt, and Jonathan Larson, 2024b, “GraphRAG: New Tool for Complex Data Discovery Now on GitHub”, blog post at Microsoft Research, 2 July 2024.

Eve, Martin Paul, 2022. The digital humanities and literary studies, first edition, Oxford: Oxford University Press.

Fogel, Robert William, and G. R. Elton, 1983. Which road to the past? Two views of history, New Haven: Yale University Press.

Goyal, Archana, Vishal Gupta, and Manish Kumar, 2018. “Recent named entity recognition and classification techniques: A systematic review,” Computer Science Review, 29: 21–43. Goyal, Gupta, & Kumar 2018 available online.

Graham, Shawn, Ian Milligan, Scott Weingart, and Kimberley Martin, 2022. Exploring big historical data: The historian’s macroscope, second edition, New Jersey: World Scientific.

Kamath, Uday, Kevin Keenan, Garrett Somers, and Sarah Sorenson, 2024. Large Language Models: A Deep Dive: Bridging Theory and Practice, first edition, Cham: Springer Nature Switzerland.

Keller, Wolfgang, Carol H. Shiue, and Sen Yan, 2024. “Mining Chinese historical sources at scale: A machine learning approach to Qing state capacity,” Working Paper 32982, National Bureau of Economic Research, Cambridge, MA. Keller, Shiue, & Yan 2024 available online.

Larson, Jonathan, and Steven Truitt, 2024, “GraphRAG: Unlocking LLM Discovery on Narrative Private Data”, blog post at Microsoft Research, 13 February 2024.

Lee, James Z., and Wang Feng, 1999. One quarter of humanity : Malthusian mythology and Chinese realities, 1700–2000. Cambridge, Mass.: Harvard University Press.

Leung, K. C., and C. H. Leung, 2010. “Recognition of handwritten Chinese characters by critical region analysis,” Pattern Recognition, 43(3): 949–961.

Lewis, Patrick, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela, 2021. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” manuscript available at arXiv.org.

Liu, Zhou, Hongsu Wang, and Peter K. Bol, 2023. “Automatic biographical information extraction from local gazetteers with Bi-LSTM-CRF model and BERT,” International Journal of Digital Humanities, 4: 195–212.

Liu, Cheng-Lin, Stefan Jaeger, and Masaki Nakagawa, 2004. “Online recognition of Chinese characters: The state-of-the-art,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 26: 198–213.

Martineau, Kim, 2024. “What is retrieval-augmented generation?”, IBM Research blog, accessed 23 November 2024. https://research.ibm.com/blog/retrieval-augmented-generation-RAG.

Mitchell, Melanie, Alessandro B. Palmarini, and Arseny Moskvichev, 2023. “Comparing humans, GPT-4, and GPT-4V on abstraction and reasoning tasks,” manuscript at arXiv.org.

Mohoney, Jason, Anil Pacaci, Shihabur Rahman Chowdhury, Ali Mousavi, Ihab F. Ilyas, Umar Farooq Minhas, Jeffrey Pound, and Theodoros Rekatsinas, 2023. “High-Throughput Vector Similarity Search in Knowledge Graphs,” manuscript at arXiv.org (arXiv:2304.01926).

Moretti, Franco, 2005. Graphs, maps, trees: Abstract models for a literary history, London; New York: Verso.

–––, 2013. Distant reading, London; New York: Verso.

Presner, Todd, and David Shepard, 2016. “Mapping the geospatial turn,” in A new companion to digital humanities, edited by Susan Schreibman, Raymond George Siemens and John Unsworth. Wiley/Blackwell.

Rawski, Thomas G. (ed.), 1996. Economics and the historian, Berkeley: University of California Press.

Rithani, M., R Prasanna Kumar, and Srinath Doss, 2023. “A review on big data based on deep neural network approaches,” Artificial Intelligence Review, 56(12): 14765–14801

Schreibman, Susan, Raymond George Siemens, and John Unsworth (eds.), 2004. A companion to digital humanities (Blackwell Companions to Literature and Culture), Malden, MA: Blackwell Publishing.

Schreibman, Susan, Raymond George Siemens, and John Unsworth (eds.), 2016. A new companion to digital humanities, Chichester, West Sussex, UK: Wiley/Blackwell.

Siebold, Anna, and Matteo Valleriani, 2022. “Digital perspectives in history,” Histories, 2(2): 170–177.

Skinner, G. William. 1977. "Regional Urbanization in Nineteenth-Century China." In The City in Late Imperial China, edited by G. William Skinner and Hugh D. R. Baker. Stanford, CA: Stanford University Press.

Skinner, G. William, Mark Henderson, and Yuan Jianhua. 2000. "China's Fertility Transition through Regional Space: Using GIS and Census Data for a Spatial Analysis of Historical Demography." Social Science History 24 (3): 613-652.

Thill, Jean-Claude (ed.), 2020. Innovations in urban and regional systems: Contributions from GIS&T, spatial analysis and location modeling, 1st edition, Cham: Springer International Publishing.

Wang, Dongbo, Chang Liu, Zhixiao Zhao, Si Shen, Liu Liu, Bin Li, Haotian Hu, Mengcheng Wu, Litao Lin, Xue Zhao, and Xiyu Wang, 2023. “GujiBERT and GujiGPT: Construction of Intelligent Information Processing Foundation Language Models for Ancient Texts,” manuscript at arXiv.org.

Zhao, Penghao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, Jie Jiang, and Bin Cui, 2024. "Retrieval-Augmented Generation for AI-Generated Content: A Survey," manuscript at arXiv.org.

Thursday, June 26, 2025

Stock ownership as system-wide exploitation?

 

A prior post made an effort to gain greater analytical clarity concerning the unfairness involved in the separation between the “one percent” economy and the rest of us. In what ways is the wealth owned by the super-billionaires an “unfair” extraction from the rest of US society? How can we account for the very rapid accumulation of wealth in the hands of the richest 1 percent of US wealth holders since 1980? The answer seems largely to turn on the rapid expansion in wealth represented by the US stock market over that period, and the fact that a very small number of wealth holders captured the lion’s share of these gains. The following graph shows a more than four-fold increase in the value of the US equity market in part of that time, from about $12 trillion in 1998 to $52 trillion in 2024. The wealth owned by the top 1% of households increased at about the same rate, which implies that this class rode the wave to wealth right along with the stock market in those years. “Corporate equities and mutual fund shares” are the largest component by far of the wealth portfolios of the top 0.1% and 1%, as reflected in the second chart below, produced by the Federal Reserve.

[Charts: growth in the total value of the US equity market, 1998–2024, and Federal Reserve data on the components of wealth held by the top 1% and 0.1% of households]

It was shown in the earlier post that the growth of the super-billionaires’ share of the nation’s wealth cannot be explained in normal “business profit” terms. (For reference, the top twenty billionaires in the US own 2.8 trillion dollars of wealth; link.) Rather, the bulk of the wealth now held by individuals like Mark Zuckerberg, Elon Musk, and Jeff Bezos represents the rapid appreciation of value in capital markets of the companies in which they have large ownership stakes. The companies themselves do not generate billions of dollars in dividends; rather, their total stock value has witnessed billions of dollars in gains over very short periods of time.

So why should we think this is in any way unfair? How is it exploitative? Is it not more like the fortunate visitor to “Antiques Road Show” who finds that the forgotten painting in the closet is in fact an early Picasso and is worth millions on the art market? This is good fortune for the owners of the canvas, but surely these facts don’t suggest “exploitation” of anyone else. Perhaps not in the case of the Antiques Road Show guest; but the majority owner of Amazon, Tesla, or Meta is in a different set of circumstances. Rather, the existence and continuing success of these companies depends on background conditions to which all sectors and components of the US economy contribute: a stable system of law and regulation, a robust education and research sector, a skilled workforce, an infrastructure of roads, ports, rail lines, fiber optic cables, and electricity providers. The value of US companies is at least in part a system effect: it is facilitated and constituted by a vast network of private and public stakeholders, all of whom contribute ultimately to the success of the company and the value it finds within the equity market. So the value of the US company is inseparable from the large and heterogeneous economic and political system in which it operates, and the increase in value over time of the US company reflects the continuing contribution expected by the investing public from the functioning of that system.

It will be said, of course, that the companies and their executives themselves contribute to the value that investors attribute to them: innovative products, good management systems, efficient decision-making, appropriate personnel practices, “entrepreneurship” and risk-taking. This is true. But it is also true that these contributions represent only a portion of the increase in value that the company experiences over time. The system effects described here represent an independent and important component of that substantial increase in value. So we might say that “system-created increase in value” is the uncompensated part of wealth creation in today’s economy. Companies pay little or nothing to cover the cost of these system-level inputs on which they depend; these are the inverse of “externalities”, in that they are benefits taken without compensation from the public (rather than harms imposed without compensation on the public). And these system-created increments in value constitute a very important part of the increase in value that they experience over time.

We might therefore look at “system-created increase in value” as the counterpart to “unpaid labor time” in the classic theory of exploitation. It is the source of wealth (profit) that the owners of wealth derive simply in virtue of their position in the property system and in their opportunity to benefit from the economic system upon which they depend. But now it does not derive from the “surplus value” contributed to profits by each worker, but rather from the synergies created by the socio-economic system as a whole.

It should also be noted that the ability of private companies to “extract” value from system-level inputs without compensation depends on their ability collectively to influence government policy. Therefore owners of private companies and stock wealth have strong incentives to shape the decision-making of elected officials, government policy makers, the fiscal system, and the regulatory process. This reinforces the arguments made by Thomas Volscho and Nathan Kelly in “The Rise of the Super-Rich: Power Resources, Taxes, Financial Markets, and the Dynamics of the Top 1 Percent, 1949 to 2008” (link). It follows, then, that achieving powerful influence on public policy and economic rule-making is not just a hobby for the oligarchy; it is an existential necessity.

This analysis of “system-input exploitation” has important consequences for distributive justice. If the whole of society contributes to the creation of the system-level properties that generate a significant fraction of the new wealth created in the past forty years, then surely fairness requires that all participants should receive some part of the gains. It would seem logical for the non-wealth-holding stakeholders — workers, farmers, and uncompensated contributors to social reproduction — to demand economic reforms that direct a fair share of that new wealth to the benefit of the whole population.

The previous post suggested one possible mechanism that would do this. The post discusses a hypothetical “public investment fund” that “would be automatically vested with ownership shares of businesses and corporations as they are created and grow, and that would function as a ‘wealth reserve’ for all citizens”. This would constitute a large and growing asset to be used for the benefit of the whole of society. In that discussion a distribution of gains resulting in public ownership of 1/3 of all capital was considered. Such a division would reduce (though not eliminate) the most extreme inequalities of wealth that currently exist, and would provide a financial basis for a more genuine “free community of equals” through the secure establishment of a high level of the resources most needed — healthcare, education and training, environmental protection, and provisioning of basic human needs for children, the disabled, the elderly, and the unemployed.

This idea of a public investment fund corresponding to the “systemic value creation” of the economy might go a long way toward securing the political values embodied in John Rawls’s concept of a “property-owning democracy” (link). Rawls argues that “the equal worth of liberty” is incompatible with a society in which political influence is proportional to wealth and where wealth is extremely unequally distributed. Wealth inequality of this magnitude means that the oligarch’s liberty and worth are magnified many times relative to the ordinary citizen’s situation. The creation of a substantial public investment fund representing the value created by our social, economic, and political system of cooperation would reduce the proportion of the economy’s total value that the multi-billionaire class is able to expropriate. It would create real property entitlements for the great majority of society, and it would redress the current horrendous inequality of political influence that exists between the super-rich and the ordinary citizen.

Sunday, June 22, 2025

A new form of exploitation

 

Much thinking about economic justice for working people has been framed by the nineteenth-century concept of “capitalism”: owners of enterprises constitute a minority of the population; they hire workers who represent the majority of the population; wages and profits define the distribution of income throughout the whole population. This picture still works well enough for a range of economic activities in the advanced capitalist economies when it comes to manufacturing, agriculture, and service industries. According to recent tabulations by the US Bureau of Labor Statistics (link), there were 158 million workers in wage and salary employment in 2023. Manufacturing represented 8.2%, retail and wholesale trade 13.7%, information 1.9%, financial services 5.8%, leisure and hospitality 10.5%, and federal and state government 14.4%. This adds up to 54.5% of the US labor force, and these workers and firms can be thought of in roughly the framework offered by the traditional idea of “capitalism”. Many of these workplaces are amenable to union representation (though relatively few are in fact unionized). But improving access to union rights and workplace consultation would significantly improve the conditions of life for this segment of the US population.

Marx’s view of the unfairness of capitalism, then, comes down to workplace exploitation — the capture of “surplus value” by the firm’s owner from the workers whom he or she employs. Profits derive solely from surplus value, so wealth accumulation is fundamentally limited by the size of an enterprise.

However, current realities seem to suggest that this classical Marxist account is no longer sufficient. To see this point it is crucial to look at the details of the distribution of wealth and income in the U.S. Consider the graph of median US income by quintile above, in constant 2018 dollars. Since 1989 only the top quintile of household income has demonstrated significant growth (in a timeframe of more than thirty years); and the top 5% of households shows the greatest increase of any group. 80% of US households are barely better off today than they were in 1967, whereas the top 5% of households have increased their incomes by almost 250% in real terms. Household incomes in the bottom 80% range from the poorest quintile, with an average household income of about $14,000, to the second-highest quintile (60%-80%), at about $102,000. But virtually all of these households — 80% of all households — earn their livings through wage and salary income, in “capitalist” workplaces.

Further, only a very small fraction of these households are in a position to accumulate significant savings or investments. As the second graph shows, the bottom 50% of households have only 2.6% of all U.S. wealth, and the 50%-90% segment owns only another 30.8%. The top 0.1% owns 13.9% of all wealth, and the remainder of the top 1% owns 16.6%. That amounts to 30.5% of all wealth held by 1% of households, an almost incomprehensible figure.

These two graphs have a very clear, unmistakable implication: that working people, including service workers, industrial workers, and most professionals have received a declining share of the economic product of the nation over the past 40 years. (Amazon warehouse workers fall in the 2nd-lowest quintile (poorest 21-40%).) Further, the vast majority of U.S. residents have only a tiny share of all property in the U.S. According to the Federal Reserve 2022 Survey of Consumer Finances, median household net worth in 2022 was $192,700, including private savings, retirement savings, and personal property and home value (link). And, of course, this implies that the median household net worth of the bottom 80% of the U.S. population is significantly lower.

It seems apparent, then, that capitalist exploitation is no longer the primary mechanism through which wealth is accumulated by the top 10%, 1%, and .1% of wealth holders. The top group gains income at a rapid rate and increases its share of the national wealth comparably; whereas the bottom 80% have almost stagnant incomes and negligible wealth. And this accumulation occurs almost entirely through rising value of the stock issued on behalf of private companies. The national economy generates all of this wealth; but the vast preponderance of the fruits of this production flow to the top 10% and 1% of wealth holders. This is a different kind of exploitation: not exploitation of a specific group of workers (employees of General Motors, for example); but exploitation of the whole of the U.S. economy for the benefit of a tiny minority of wealth holders.

Essentially it seems fair to say that the contemporary U.S. system involves two economies — one that includes 60%-80% of all people, who depend on wage and salary income to earn their livings; and a second economy, itself steeply stratified, involving only the top 10%-20% of households. This second economy includes highly paid professionals, executives, and individuals who derive a substantial income from investments, financial assets, and other capital assets. The distribution of income and wealth in this second economy depends on ownership of capital (including human capital) of increasing value in a “knowledge” economy.

It appears, then, that the dramatic growth of wealth inequalities in the past three decades has little to do with traditional “exploitation” – an unfavorable wage relationship between owners and workers. Instead, the sudden explosion of tech-oligarchy in the US seems to have to do with financial markets, the stock value of private companies, and the environment of business and tax policy in which they operate. The super-wealthy class in the US came into multi-billionaire status through the rapid acceleration of the market value of companies like Amazon, Tesla, and Facebook/Meta. And this process reflected a macro-level mechanism that we might describe as “exploitation of the US economy as a whole” rather than “exploitation of a specified group of workers employed by these companies.”

Thomas Volscho and Nathan Kelly provide a careful analysis of the dynamics of income inequality in the US economy over time in “The Rise of the Super-Rich: Power Resources, Taxes, Financial Markets, and the Dynamics of the Top 1 Percent, 1949 to 2008” (link). They note that there was considerable variation in the share of income flowing to the top one percent between 1900 and 2020, with a rapid rise beginning in about 1980. And they attribute much of this variation to facts about political power, public policy, and fiscal legislation. (This bundle of hypotheses is referred to as “Power Resources Theory”.) And a key finding in this literature is that the relative levels of political power and influence held by economic elites versus working people have a very large effect on the degree and direction of change in inequality at the top.

Consider the short history of Facebook. Mark Zuckerberg’s wealth increased from $1.5 billion in 2008 to $236 billion in 2025. The employee count of Facebook/Meta also grew enormously during that period, from 85 employees in 2008 to 76,800 employees in 2025. But Zuckerberg’s wealth does not reflect the “surplus value” created by these workers, but rather the perceived value of the company in the eyes of private and institutional investors. And critically, it is difficult to imagine institutional changes within Facebook/Meta that would lead to greater overall societal equity simply by providing the company’s workers more input into the management of the company. The median income for a Facebook/Meta worker is $257K – hardly an exploitative wage. It is the rest of society that is disadvantaged by Zuckerberg’s $236 billion, not the direct employees. The same seems to be true for Tesla and the wealth accumulated by Elon Musk, and for Amazon and the wealth of Jeff Bezos. Amazon’s business operations have many of the same features of domination and exploitation identified by Engels in Manchester; but these operations do not constitute the fundament of Bezos’s wealth, except perhaps for the “performative” of a company single-mindedly devoted to efficiency and speed of operations.

The experience of the reforms of the welfare state after WWII shows that capitalist exploitation can be reformed through measures that improve the public provision of some crucial services (education, healthcare, retirement income, unemployment insurance); improve the ability of workers to represent themselves effectively in the workplace (legislation ensuring unionization rights); and improve conditions of health and safety in the workplace (OSHA protection). These reforms are “redistributive” in the sense that they depend on taxation of income and profits of private individuals and corporations to fund public provisioning. But can reforms like these address the inequalities — economic and political — created by the two economies described here? Can the oligarchy economy be reined in? It would seem that the answer is “no”.

So we are forced to ask, what kinds of fiscal and tax reforms could effectively rein in the wealth inequalities created at the very top of the wealth distribution? The annual wealth taxes proposed by progressive Democrats extend to taxes in the range of 1%. But this would represent a negligible reduction in the oligarch’s portfolio, and would do essentially nothing to reduce the steepness of the distribution of wealth in America. A “confiscatory” tax of 33% would have a measurable effect by increasing available public funds for expenditure; but even after reducing Elon Musk’s wealth from $368 billion to $245 billion, a staggering inequality remains relative to 99% of US workers. And this still leaves the wealth-holder with a million-fold advantage in his/her political and media influence relative to almost all other US persons. (As mentioned above, the median household net worth in the US is currently about $192,000. It is of course striking that three of America’s largest tech-oligarchs privately own a media company: Zuckerberg (Facebook), Musk (X/Twitter), and Bezos (the Washington Post).)

It appears, then, that standard “New Deal” or “welfarist” approaches to greater economic equality have no prospect for success whatsoever when it comes to reducing the overwhelming inequalities of wealth that exist between the two US economies described here. A graduated income tax works to moderate income inequalities (when it works at all); but the rapid accumulation of wealth represented by the emergence of the “tech-oligarchy” and the graph of wealth distribution above do not derive from income inequalities. The richest 1% did not primarily gain their wealth through annual savings from their high salaries; rather, they gained their wealth through stock ownership in companies whose value appreciated exponentially during the time of their ownership. And taxing the holders of wealth on the income generated by their holdings does not materially affect the distribution of wealth across the population and across generations.

Suppose we viewed a national economy as an interconnected and highly complex form of “joint production”, in which the efforts of all parties are instrumental in the creation of the new wealth and prosperity of the economy. And suppose we believe that this system should be organized as a “fair system of cooperation” in which all parties benefit in a fair way. Can the workings of capital markets and financial systems be incorporated into our institutions in ways that would give the working public (the 80%) a fair share of the products of cooperation? Could we imagine a fiscal mechanism that would provide the public with a “fair share” of the U.S. economy as a whole, including the growth of the value of private companies (Caterpillar, General Motors, Krogers, Facebook/Meta, Microsoft, …)?

For example, would it be possible to imagine a public investment agency along the lines of CalPERS that would be automatically vested with ownership shares of businesses and corporations as they are created and grow, and that would function as a “wealth reserve” for all citizens? Suppose the hypothetical “public investment corporation” eventually possessed assets worth about 1/3 of the total value of the US stock market. (The value of stocks listed on the New York Stock Exchange is currently $28.3 trillion, so we are imagining a public wealth fund of about $10 trillion.) On this model, private owners and shareholders would own 2/3 of the capitalized economy, and the public would own 1/3. Would such a system be feasible? Could such a system redress the insupportable economic and material inequalities that exist in our country? Could it redress the gross inequalities of influence and power that exist between a tiny class of oligarchs and the vast majority of democratic citizens? Could the shareholder voting rights that correspond to the public shares created in this way serve to alter corporate priorities?

It seems clear that the photo below taken from Donald Trump’s inauguration represents a horrendous flaw in contemporary democracy. The “tech oligarchs” turned out in force for the new administration, and a group of wholly committed political partisans stand behind them to enact policies in the United States that serve their interests. If this is the best that our democracy can currently offer working people, then we need to work much harder at finding political and economic solutions that can elicit broad support from ordinary citizens, workers, farmers, and Uber drivers to push forward a better agenda for democratic equity.

Saturday, May 24, 2025

Republicanism and multicultural democracy


Philip Pettit’s writings about republicanism offer a valuable and distinctive perspective on individual freedom and the nature of a good society. He develops those ideas most fully in Republicanism: A Theory of Freedom and Government. Pettit’s core idea is that we should conceive of freedom as “non-domination” — that is, an individual is free when he or she is not subject to the arbitrary power of other individuals, groups, or institutions. He emphasizes that non-domination is a more demanding concept of freedom than either “negative” or “positive” freedom as characterized by writers as diverse as John Locke, John Stuart Mill, Isaiah Berlin, or Amartya Sen, because domination can occur even when crude coercion is absent. The threat of constraint or punishment for one’s “free” actions can make those actions and actors unfree, even if coercive means are never invoked. And most pertinently today, a structure of discrimination and negative stereotypes about members of a minority group can present barriers to the free and non-dominated choices and life trajectories of a despised minority — African-American men and women in the American South in the 1950s, Black South Africans during the apartheid period, women in a period of male domination and chauvinism, and Jewish men and women in pre-war Poland.

The other idea constitutive of republicanism is the view that the commonwealth, the civil society, or the republic has a value over and above the value of the individual activities of the citizens. Rousseau emphasizes this point in The Social Contract: entering into a unanimous and binding agreement with one’s fellow citizens is fundamentally transformative for each individual. Each is an expression and constituent of the “general will”, and the whole of the political collectivity is a moral presence for all the citizens. To be a citizen is to be civically motivated, to be concerned to bring about the good of the whole (and not solely one’s own particular good). And this civic membership is in turn constitutive of part of the value and satisfaction of the individual citizen. Citizens are morally invested in the wellbeing of their fellow citizens. This set of ideals, once again, is incompatible with a society that embodies persistent forms of social and legal domination of one group by another, because domination is incompatible with equality and dignity for all citizens.

The question of domination is central for Pettit. Here is how he explains this concept.

Domination, as I understand it here, is exemplified by the relationship of master to slave or master to servant. Such a relationship means, at the limit, that the dominating party can interfere on an arbitrary basis with the choices of the dominated: can interfere, in particular, on the basis of an interest or an opinion that need not be shared by the person affected. The dominating party can practise interference, then, at will and with impunity: they do not have to seek anyone’s leave and they do not have to incur any scrutiny or penalty. Without going further into the analysis of domination or indeed interference—we turn to that task in the next chapter—a little reflection should make clear that domination and interference are intuitively different evils. (22)

It is worth noticing how this conception of non-domination converges with Rousseau’s concept of “a free community of equals”. It is fundamental to Rousseau’s concept of a proper “republic” that no citizen is superior to another, none has dominion over another in virtue of property, status, religious authority, or other extraneous characteristic. In a free community of equals, no citizen is enabled to dominate another. This view is celebrated in The Social Contract and the Discourse on the Origin and Foundation of the Inequality Among Men.

Pettit especially praises the theory of republicanism because it provides a basis for articulating “grievances” at a wide range of causes. Whereas the “non-interference” theory of liberty does not provide a basis for articulating a grievance about engrained social hierarchy (squire to tenant), the “non-domination” perspective permits this grievance and many others: patriarchy and male supremacy, racial discrimination, a group’s exercise of unequal economic power over another group, and the many other ways in which entrenched patterns of thought and power give one group influence over the affairs and wellbeing of another group (Republicanism, p. 134).

In particular, Pettit argues that republicanism offers a strong moral basis for articulating the values of a multicultural democracy and the equal dignity of the diverse participants in such a democracy. And he forthrightly defends the idea that a democracy based on “non-domination” will require substantial programmatic efforts at reducing and eliminating the sources of domination that exist among groups in the existing society. He puts his view this way:

The challenge raised by that complaint [by members of minority groups within society] is whether the modern state can be given a rationale and a form that will enable it to serve the interests of those in minority cultures equally with the interests of those in the mainstream. The point that I want to make here, in defence of republicanism, is that if the modern state is orientated around the promotion of freedom as non-domination, then it will have a reason and a capacity to cater for the claims of those in minority cultures. … The lesson of this observation is that so far as membership in a minority culture is likely to be a badge of vulnerability to domination, the members of that culture, and the state that assumes concern for their fortunes, must address the needs of the culture in general. It is not going to be enough to claim to be concerned with individuals in the culture, without any particular reference to what binds them together. (Republicanism, 144, 145)

It might be noted that this conception of a multicultural democracy is itself somewhat limiting. It emphasizes the importance of “counting every voice” within a democracy — certainly an important value. But it doesn’t emphasize explicitly the positive value created by a multiracial, multi-ethnic, and multicultural society and the forms of learning and enhanced fulfillment that are enabled by full and equal relationships with members of groups other than one’s own. At its best a multicultural democracy is more than a social and political setting in which different groups can live peacefully together; it is one in which the lives of all members of society are enriched and enabled by the thoughts, experiences, and values of members of other groups.

Let’s now see how these ideas about republicanism intersect with the rationale for organizational programs enhancing Diversity, Equity, and Inclusion (DEI). In the past twenty years universities and businesses have introduced a variety of programs under this banner that are designed (in part anyway) to reduce and eliminate the barriers experienced by various groups in our society due to discrimination and negative stereotypes. Consider two specific examples: the well-known social phenomenon of “stereotype threat” (link) that Claude Steele and others have explored; and the persistent life disadvantages created for poor children and young adults by inadequate public schooling in urban centers (link).

Steele’s central finding as a social psychologist is that a widespread belief in society and in schools that the X group generally cannot perform as well as the Y group on a certain kind of task (a stereotype) causally brings about poor performance in members of the X group. Instead of the stereotype deriving from the facts, the facts of unequal performance derive from the stereotype. So what can be done? One possibility is to explicitly recognize the workings of this mechanism within schools, and to educate teachers about the harmful effects that follow from even veiled expressions of the stereotype. Suppose an engineering professor often begins the semester with a speech saying, “I see there are a few women students in the class. I know this will be a challenging class for you, and I want you to know that my teaching assistants and I are available to you to clarify things you don’t understand”. We may assume the professor’s intentions are good; but according to Steele’s research, the effects on some of the female students may be significant. So the professor would be well advised to learn to emphasize his or her availability in different terms, without reference to the gender of the students in the class. This suggests the value of programs in “hidden-bias” training for faculty and staff. And we might go a bit further: if the engineering college faculty is 90% male, the signal to female students seems to be that “engineering is not a profession for women”. So the college should make special efforts to recruit highly talented female faculty.

Now consider the second example: the barriers created for black and brown students who are heavily concentrated in urban neighborhoods with relatively less effective public schools, due to persistent residential segregation (link). How should selective universities address the fact that black and brown students from low-income families are persistently under-represented in their incoming classes? A program that has often been adopted is a university-funded supplementary instruction program for low-income districts in their state or region. The idea is that the university can help talented high school students close the attainment gap that exists between them and typical suburban high school students through intensive programs of this kind. This would have two positive effects: it would increase the preparation level of these low-income-neighborhood students so they are competitive for admission to selective universities; and it would potentially increase the confidence of a cohort of under-served students that the host university is indeed an attainable and attractive destination for them. These effects would increase the number of under-served students — black, brown, and white — who attend selective universities, and they would reduce the barrier that exists for residents of segregated neighborhoods and cities when it comes to college attendance. This would be one step in the direction of securing a more free and equal society, from the point of view of non-domination.

Now let’s return to “non-domination” and Pettit’s republicanism. These reparative policies, and many others as well, are urgently needed in order to eliminate the social barriers that have the effect of establishing relations of “domination” among specific groups in society. Moreover, policies and programs like these are not undertaken out of “charity” or noblesse oblige. Rather, measures like these are needed as a matter of reparative justice owed to all free and equal citizens — in the first case, in order to reduce the barriers created to female students’ ability to enter an engineering curriculum and to thrive in the profession; and in the second case, to begin to address the barriers to full educational development created by continuing racial segregation. In each case the policy is intelligently designed to reduce domination of one group in a democratic society by another group. And in this way the DEI policies currently under attack are specifically needed if we are to achieve the ideal of a “free community of equals” in a multicultural and multi-racial society.

Tuesday, May 20, 2025

The micro and the social


In his influential article "A definition of physicalism" (1993) Philip Pettit attempts to formulate a consistent and coherent account of physicalism as an ontology of the world.

I believe that we can define a possibly true, substantive doctrine which holds, roughly, that the empirical world 'contains just what a true complete physics would say it contains'. (213)

The resulting view is offered as an attempt to identify the "furniture of the empirical world". Can there be a corresponding theory of the "furniture of the empirical social world"?

His statement in "Microphysicalism without contingent micro-macro laws" (1994), a follow-on reply to criticisms of this essay, is helpful.

“Physicalism – better, perhaps, microphysicalism – is the doctrine that actually (but not necessarily) everything non-microphysical is composed out of microphysical entities and is governed by microphysical laws; and this, in a sense which means that the non-microphysical facts supervene contingently on the microphysical” (1994: 253).

What is the social-level analog of this doctrine, corresponding to the trivial fact that there are “no social structures without human beings embodying them”? Let's call the corresponding view “bare ontological dependence” (BOD). Here is a formulation of BOD constructed to be exactly parallel to Pettit's definition of microphysicalism:

“[Bare ontological dependence] … is the doctrine that actually (but not necessarily) everything [at the social level] is composed out of [existent human individuals with specified features of mentality, psychology, and cognitive capacity] and is governed by [psychological and neurophysiological] laws; and this, in a sense which means that the [social-level] facts supervene contingently on the [micro-individuals]”.

Is this a credible and defensible conception of the relation between individual human actors and "social entities"?

This formulation entails, apparently, that “the features of the social institution are constituted and governed by the micro-psychological characteristics of the individual actors who constitute it” and that the social “supervenes” upon those micro-individuals. This in turn implies the supervenience maxim: “no difference at the social-structure level without some difference at the micro-individual level”. It also specifies a clear sense in which “micro-particles” have primacy over ordinary physical objects and “individuals” have primacy over “social entities”; their properties “fix” the behavior of the macro-objects. The corresponding statement for “bare ontological dependence” is then that “actors” have primacy over ordinary “social” objects; their properties “fix” the behavior of the macro-social objects.
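For readers who find a formal statement useful, the supervenience maxim can be rendered more precisely along these lines (a minimal rendering in standard notation, not Pettit's own formulation):

\forall x \,\forall y \, [\, (\forall P \in M : P(x) \leftrightarrow P(y)) \rightarrow (\forall Q \in S : Q(x) \leftrightarrow Q(y)) \,]

where x and y range over possible social situations, M is the set of micro-individual properties, and S is the set of social-structural properties. Read: any two situations that are indiscernible with respect to all of their micro-individual properties are indiscernible with respect to all of their social-structural properties.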

Pettit's application of these formulae to the world of physics, chemistry, and planetary motion is reasonable enough. And part of the plausibility of the view in the case of physics is that the microphysical particles can be said to have properties that are fixed independently of the macro-level ensembles that they constitute. However, this is not the case in the situation of "socially situated individuals" who constitute "ongoing social structures and practices". Rather, there are reciprocal causal relations up and down, or back and forth, across the levels, and these reciprocal relations make the ideas of "constitution" and "fixing" no longer compelling. And this implies in turn that there are no trans-historical, universal "regularities" of human behavior that might constitute the bridge between individual actions and social entities.

Further, because of the contingency of some historical sequences (for example, the invention of monotheism or the discovery of the heavy plow), and given the path dependence of some structural or institutional outcomes, there is no reason to expect that a common beginning point of unsocialized or pre-socialized individuals (a state of nature) would gradually develop, by behavioral laws that are perhaps unknown, into the establishment and articulation of specific social structures. And if we accept the point that it is sometimes the case that "different structures elicit different kinds of human mentality", then we are brought face to face with contingency all the way down: contingent structures and contingent social individuals.

So Pettit's line of thought in defining physicalism is quite implausible when applied to the social world. It is true that "the Ford Motor Company" could not exist if there were not actual human beings occupying roles within and outside the company. It is not true, however, that "the 'social' world is governed by forces or regularities that [empirical psychology] is best equipped to describe" (the analog to Pettit's statement about micro-particles). The mental characteristics and processes of the actors involved in a social entity or set of social arrangements are themselves in fact shaped by past social arrangements to which the actors have been exposed. So if the foundational body of empirical knowledge is "empirical psychology" (as J. S. Mill indeed believed), then we must reject the view.

One of Pettit's claims about the relation between microphysical things and macrophysical things is fundamentally a reassertion of the supervenience relation between levels: "No macrophysical difference without a microphysical one" (216). How does this proposition fare when applied to "macro-social entities" and "micro-individual states"? It corresponds to this assertion: "No macro-social difference without a micro-individual one". Unfortunately, this seems to be a trivial statement when applied to the social realm. Any two social states differ at the micro-level, for the most trivial of reasons: they are different states, with different individuals, and different individuals have different action-plans and beliefs. So whether S1 and S2 are "different" or "the same" in their macro-descriptions, it is trivially true that they will differ in their micro-composition. Consider these three facts: "The price of soybeans on the Chicago Board of Trade on May 1, 2025, is 1,060.50"; "The price of soybeans on the Chicago Board of Trade on May 1, 2024, is 1,000.00"; "The price of soybeans on the Chicago Board of Trade on May 1, 2023, is 1,060.50". Each sentence describes a structural fact: the circumstances of supply and demand on the specified day led to an equilibrium price as quoted. But none of the structural facts described here corresponds to a single set of individual actors doing the same things for similar reasons. The actors have changed, their motivations have changed, their habits have changed, and their styles of dress have changed. The pathways that led to the same structural equilibrium in 2025 and 2023 were no doubt different in multiple ways; and likewise, the actors and the pathways that led to different equilibria in 2025 and 2024 were different as well. Even if we were to perform a massive experiment in "experimental economics" and assemble 1,000 traders on May 1, 2023, and then again on May 1, 2025; control the information to which they are exposed in the preceding twelve hours; and ask them to buy and sell as they normally would, there would still remain idiosyncratic differences between the series of thoughts and actions undertaken on the two days. And the same would be true of the experiment when we model 2024 and 2025: there would inevitably be vast numbers of individual-level differences. So the supervenience condition is vacuous. There are always differences across cases at the micro-individual level.
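To see how easily the same macro-level outcome can sit on top of entirely different micro-level configurations, here is a minimal simulation sketch (invented traders and invented numbers, not an analysis of actual Board of Trade data): two markets populated by completely different buyers and sellers, with different individual valuations, clear at approximately the same price.

# Minimal sketch with invented numbers: two markets made up of entirely
# different micro-level traders clear at approximately the same price.
import random

def clearing_price(buyer_values, seller_costs):
    # Crude market-clearing rule: match the highest-value buyers with the
    # lowest-cost sellers and take the midpoint at the marginal trade.
    # (Assumes at least one mutually beneficial trade exists.)
    buyers = sorted(buyer_values, reverse=True)
    sellers = sorted(seller_costs)
    q = 0
    while q < min(len(buyers), len(sellers)) and buyers[q] >= sellers[q]:
        q += 1
    return round((buyers[q - 1] + sellers[q - 1]) / 2, 2)

def random_traders(seed, center):
    # Different seeds yield completely different individual valuations.
    rng = random.Random(seed)
    buyers = [center + rng.uniform(-40, 40) for _ in range(500)]
    sellers = [center + rng.uniform(-40, 40) for _ in range(500)]
    return buyers, sellers

price_a = clearing_price(*random_traders(seed=2023, center=1060.5))
price_b = clearing_price(*random_traders(seed=2025, center=1060.5))
print(price_a, price_b)  # nearly identical macro outcomes from disjoint micro states

The macro-level description ("the market cleared at roughly 1,060") is the same in both runs, while no micro-level fact about any individual trader is shared between them; this is just the point about the soybean prices in 2023 and 2025.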

It is pertinent to observe that some physical processes are path dependent as well, which means that the initial states of the micro-particles by themselves are not sufficient to "determine" the outcome. Rather, the outcome depends in part on the process of transition from one state to another. Suppose a physicist observes two vessels of pure water, each over a Bunsen burner. One vessel is boiling vigorously, while the other, at the same observed temperature, is not boiling. The difference is that the first vessel was heated quickly while the second was heated slowly. The process made a difference in the outcome, even though the micro-constituents were indistinguishable. But this is the relevant point: there is a difference between the states of the two containers of water, even though there is currently no difference in the states of the micro-particles.

Pettit considers a possibility that he regards as inconsistent with his understanding of "physicalism":

Another [opponent of physicalism] will be the person, perhaps difficult to imagine, who accepts microphysical composition but thinks that the composition involved is not necessarily conservative: it allows, without further need of explanation, that two entities that are composed in the same way, and of the same materials, may yet differ intrinsically from one another. (217)

Pettit appears to think this is an absurd contention; how could these two macro-entities differ, without any difference in the composing microphysical parts?

Let's ask first what Pettit means by the phrase "composed in the same way". One natural reading is synchronic and structural. Two bars of iron may consist of precisely the same number of iron atoms, but the arrangement of the atoms is coherent in the first bar and incoherent in the second. In that case the first iron bar has a property that the second bar lacks; it is "magnetic". And yet the two bars consist of exactly the same kind of microphysical parts. The solution in this case is that the arrangement of the parts makes a difference; when the iron atoms are coherently aligned, their magnetic fields aggregate to a macro-scale magnetic field.
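The arithmetic of this aggregation can be shown with a deliberately idealized sketch (invented values, not a serious model of ferromagnetism): treat each atom as carrying a unit moment of +1 or -1, and compare the same number of atoms arranged coherently and incoherently.

# Idealized sketch: two "bars" built from the same number of identical atoms;
# each atom's moment points "up" (+1) or "down" (-1).
# Only the arrangement of the moments differs between the two bars.
import random

N = 100_000
aligned_bar = [1] * N                                      # coherent arrangement
rng = random.Random(0)
incoherent_bar = [rng.choice((1, -1)) for _ in range(N)]   # incoherent arrangement

def net_magnetization(bar):
    # Macro-level property: the average of the microscopic moments.
    return sum(bar) / len(bar)

print(net_magnetization(aligned_bar))     # 1.0  -> macroscopically magnetic
print(net_magnetization(incoherent_bar))  # ~0.0 -> macroscopically non-magnetic

Same parts, different arrangements, different macro-level property.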

The second possible meaning of "composed in the same way" is diachronic and historical. To be "composed in the same way" is to have undergone precisely the same set of processes of material transformation, heat transfer, application of pressure, etc. From the short explanation of path-dependent processes above, we know that differences in physical processes of material transformation can indeed lead to differences in macro-physical outcome for ensembles of precisely similar microphysical constituents; in the current case, one process leads to a magnetic bar of iron, while the second process leads to a non-magnetic bar.

Now return to the relation between individuals and social structures. Is it possible for two ensembles of individuals to be exactly similar in the current psychological characteristics of the individuals involved in the two cases but to nonetheless differ in some important way at the macro-social level? Much turns on how finely we expect to interpret "exactly similar" here. But suppose we assume an abstract conception of the individual's psychology along these lines: each individual wants a situation where he or she can satisfy as many preferences as possible, while avoiding catastrophic failures. And each individual has a set of "social emotions" that permit the emergence of social relationships based on trust and mutual solidarity. Now consider two thought experiments involving the emergence of a "wannabe" strong man dictator in the political system. In the first instance the potential dictator has the good fortune that his first efforts at taking power are generally unopposed because individuals recognize his intentions but find resistance to be too risky; as the dictator gains followers and successes this population becomes more passive; and after ten years the dictator is applauded and supported throughout much of the population. In the second instance the dictator has bad fortune. A few of his supporters are overly willing to use violence against dissidents and resisters, which stimulates a higher degree of alarm in ordinary citizens; a few of the ordinary citizens recognize the discontent present in others and form relations of trust. These "trust" circles expand over time and incidents of resistance become more frequent; the dictator and his supporters become more willing to turn to violent suppression; the dictator's violence tips more citizens into trust circles of their own; and after ten years the dictator's hold on power is precarious. His rallies are attended by his own militias and inner circle; but acts of resistance continue and proliferate.

The two scenarios begin and end with the same kinds of actors -- persons with their own interests and a capacity for forming social relationships. And yet the properties of the two regimes at the end of the experiment are quite different. This seems to correspond to the terms of Pettit's original reductio ad absurdum: different macro-arrangements constituted by the same kinds and arrangements of microparticles. What differentiates the two cases is the contingency, path dependence, and "triggering" of individual capacities that occurred during the processes involved in the two scenarios. In the first case, no events arose to trigger and encourage the emergence of trust networks; whereas in the second case, there were such events. The historical processes in the two cases were different, and the properties of the ensemble continued to evolve in different directions. The social capacities of the individuals were present in both scenarios, but they were only triggered in the second scenario.
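A toy agent-based sketch (all parameters invented for illustration) makes the path dependence concrete: two runs with identical populations and identical dispositions, differing only in whether an early incident of violent repression occurs, end up in very different macro-level states after ten "years", because only one history triggers the agents' capacity to form trust circles.

# Toy sketch with invented parameters: identical populations and dispositions,
# but only one run contains the early triggering event (violent repression),
# and only in that run do trust circles form and grow.
import random

def run_scenario(early_violence, years=10, n_agents=1000, seed=42):
    rng = random.Random(seed)   # same seed: same stream of chance events in both runs
    resisting = 0               # agents currently organized in trust circles
    for _ in range(years):
        # regime violence occurs early (by stipulation) or once resistance is visible
        violence = early_violence or resisting > 50
        if violence:
            # violence alarms citizens; existing trust circles make recruitment easier
            recruits = sum(1 for _ in range(n_agents - resisting)
                           if rng.random() < 0.02 + 0.0002 * resisting)
            resisting += recruits
        else:
            # with no triggering event, passivity spreads and a few resisters drop out
            resisting = max(0, resisting - 1)
    return resisting

print("quiet takeover:  ", run_scenario(early_violence=False))  # stays near zero
print("violent takeover:", run_scenario(early_violence=True))   # grows into the hundreds

Nothing about the agents differs between the two runs; what differs is the sequence of events, and the sequence determines which latent capacities are activated.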

What all of this suggests is fairly simple: the idea of "physicalism" as a fundamental model of ontology is not a suitable framework for thinking about social ontology. The individuals who "constitute" social arrangements are not analogous to the microparticles that Pettit considers; rather, they are actors whose states of agency are altered dynamically by events, processes, and structures that emerge historically, and the inherent contingency and path dependence of the social world guarantee that no version of foundationalism or individualism will suffice for social ontology.