Monday, March 27, 2023

ChatGPT makes stuff up

It is really important for people to understand that ChatGPT is not a valid source of academic content or references. And it is not just useless because it makes occasional "errors" -- it is unreliable all the way down as a source of knowledge. Remember how ChatGPT works. It is a Large Language Model, trained on a vast set of texts through 2021. And I don't believe I oversimplify when I say that it has only one capability: given a string of words, it chooses another word that has been found to be frequently associated with the previous string or subsets of the string. That's it. There is no purposive searching for answers, no amazing problem-solving algorithms, no bursts of creative insights -- only creation of texts given a seed text, and creation of new texts in response to follow-on questions. Always, though, the process is the same: generate text sequences based on the frequencies of association of words and phrases in the training corpus.

Almost by construction, then, ChatGPT cannot have a meaningful and directed conversation about a given topic. Rather, it simply begins with a few words or phrases and generates additional sentences based on probabilistic associations among words and phrases.
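To make the description above concrete, here is a minimal sketch of text generation by word association. It is a toy, not a picture of ChatGPT's internals: it assumes a simple table counting which word follows which in a tiny corpus, whereas a real large language model uses a neural network over tokens and a far longer context. The function names and the sample corpus are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram_counts(corpus: str) -> dict[str, Counter]:
    """Count how often each word is followed by each other word."""
    words = corpus.lower().split()
    counts: dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: dict[str, Counter], seed: str, length: int,
             rng: random.Random) -> list[str]:
    """Extend `seed` by up to `length` words, sampling each next word
    in proportion to how often it followed the previous word."""
    out = [seed]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word was never followed by anything in the corpus
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights, k=1)[0])
    return out


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat the cat ran"
counts = train_bigram_counts(corpus)
print(" ".join(generate(counts, "the", 6, random.Random(0))))
```

Nothing in this procedure checks whether the generated sentence is true. Every step consults only the frequency table, so a fluent but false sentence is produced exactly as readily as a fluent and true one.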

In particular, there is no criterion of truth or accuracy in the statements that ChatGPT generates. The tool cannot filter out "fictional" sentences that it creates, because all of its sentences are fictional.

Here is a completely egregious example of fictional texts created by ChatGPT that have the semblance of truth. Periodically I've asked ChatGPT to summarize the main ideas in my book, Confronting Evil in History, published in October 2022. Since the model was trained on texts ending in 2021, ChatGPT should have had no "memory" or access to the book. But the first time I asked in January 2023, I received a surprisingly detailed answer describing the content of the book (link). In subsequent weeks I got a non-response along these lines: 

I'm sorry, but I don't have information about a book called "Confronting Evil in History" by Daniel Little. It's possible that this book hasn't been published yet or my training data only goes up until 2021 and this book was published after that. Can you please give me more context or provide a different question?

Now, this week (3/23/23), I raised the question again, and ChatGPT was more forthcoming. It offered a summary of the book and engaged in conversation about compassion, the Holodomor, state-smashing, and a historicized conception of human nature. So far, so good. I then asked:

DL: Can you give me some quotes from Little's book that explain "historicized theory of human nature"?

Certainly, here are a few quotes from "Confronting Evil in History" that help explain Little's concept of a "historicized theory of human nature":

  • "Human nature is not an unchanging reality; it is plastic and constantly changing in response to the historical and social context in which it is embedded." (p. 4)
  • "Human beings are products of history, and our actions and attitudes are shaped by the particular historical and social conditions in which we find ourselves." (p. 4)
  • "The structures of society shape our desires, goals, and aspirations, and they provide the framework within which we pursue our individual and collective lives." (p. 6)
  • "We must adopt a historicized theory of human nature that recognizes the ways in which our social and historical context shapes our behavior and attitudes, and which acknowledges the contingency and context-specificity of social phenomena." (p. 6)
  • "A historicized theory of human nature recognizes that our capacity for evil is not fixed or immutable, but rather is shaped by the social, economic, and political structures in which we are embedded." (p. 97)

These are plausible sentences that might have occurred in the book. But there is a hitch: none of these quotes actually appears in the book. None of them is a genuine quotation. (In fact, the book is only 68 pages long, so the quote from p. 97 is doubly fictitious.) 

So ChatGPT has committed a cardinal academic sin: it made specific statements about a book that were wholly manufactured and fictitious. If it were a human author it would be guilty of serious academic fraud. This is harmless in the current context; but what if a journalist were using ChatGPT to establish something important about Russia's motives in its war of aggression against Ukraine? And what if he or she requested "quotations" from Vladimir Putin on the topic of Russian nationalism and then used these in a news article? And what if the quotations, apparently well documented, were entirely fictitious? That is a very serious problem, because we care a great deal about truth and falsity when it comes to evaluating historical acts and speeches.

Here is one other egregious example of ChatGPT's mendacity. I asked several questions about my 2020 book, A New Social Ontology of Government, and ChatGPT had "plausible" comments. But then I asked a more specific question:

DL: Are there book reviews of Little's "New Social Ontology of Government"?

Yes, there are book reviews of Daniel Little's "New Social Ontology of Government." Here are a few examples:

1. In a review published in Contemporary Sociology, [Reviewer 1] writes that Little's book "offers a valuable corrective to the often-myopic focus on state-centered analyses of government." [Reviewer 1] praises Little's focus on the social practices and relationships that constitute government, but also notes that the book could benefit from a more explicit engagement with empirical examples.

2. In a review published in Philosophy of the Social Sciences, [Reviewer 2] writes that Little's book is a "tour de force" that offers a "sophisticated and nuanced" account of the nature of government. [Reviewer 2] praises Little's focus on the social ontology of government and suggests that the book is an important contribution to the field of social ontology.

3. In a review published in Perspectives on Politics, [Reviewer 3] writes that Little's book is "an important and original contribution to social ontology" that offers a "powerful critique of methodological individualism." [Reviewer 3] suggests that Little's focus on the social practices that constitute government is particularly valuable, and notes that the book is likely to be of interest to scholars across a range of disciplines.

Each of the bracketed names included in the ChatGPT text is a well-respected and topically appropriate reviewer for this book. The journals cited are relevant as well. However, none of these reviews is genuine. ChatGPT has made them up out of whole cloth.

My final exchange with ChatGPT in that session is illuminating:

DL: These book reviews are all fictional. None of them exist.

I apologize for my previous response. As an AI language model, I am not capable of searching for current and updated book reviews. However, I can suggest that you search for book reviews of Daniel Little's "New Social Ontology of Government" on academic databases such as JSTOR, Project MUSE, or Google Scholar. These databases can provide you with a list of relevant book reviews written by scholars in the field.

What is the root problem here? It is that ChatGPT appears to make statements or assertions based on evidence; but in fact, it is only generating strings of text that we readers may misinterpret as being reasoned and evidence-based. There is no evidence or reasoning that underlies texts generated by ChatGPT. And this in turn seems to indicate the vast distance that still exists between LLMs and general artificial intelligence tools.


Anonymous said...

"(...) ChatGPT is not a valid source of academic content or references"

It is a source of references; however, you are right that they aren't necessarily relevant or valid. I would argue that asking it for references resembles googling with an imperfect query string: you might find several hits, but nothing guarantees that you will find good ones.

"(...) given a string of words, it chooses another word that has been found to be frequently associated with the previous string or subsets of the string."

Although I don't understand how LLMs work, I'd agree that in some sense it generates words probabilistically; however, I'm not sure anybody knows what the probabilities actually encode.

"(...) ChatGPT cannot have a meaningful and directed conversation about a given topic"

I would have to testify that I feel I have gotten meaningful prompts occasionally, and similarly I have witnessed on one or two occasions really masterful code snippets, at least as far as my understanding allows me to confess. Nevertheless, I agree that in terms of conversation, it is probably not very skillful yet.

"(...) there is no criterion of truth or accuracy in the statements that ChatGPT generates"

If you mean that there is no external criterion that instructs the output I think I would agree with you; however it seems to have some internal criterion which I would guess is related to its probabilistic nature; but I think you are correct if you imply that its internal criterion is separated by a significant delta on given occasions from conventional criteria of truth or accuracy.

"(...) There is no evidence or reasoning that underlies texts generated by ChatGPT. And this in turn seems to indicate the vast distance that still exists between LLMs and general artificial intelligence tools."

I would definitely agree that LLMs and AGI appear separated by several degrees still.

However, I can't help but notice that ChatGPT at least feels like it produces Google's top 10 hits in a more prosaic manner. Occasionally, you get a better deal while other times you are less fortunate.

My take is that at least for now traditional sources of knowledge remain uncontested.

You are correct to note that the major problem so far seems to be its 'ability' or 'willingness' to produce on occasion fictional reports or outright lies on actual issues.


Paul D. Van Pelt said...

And so, ChatGPT manufactures contextual reality, something people in all walks of life do every day. Their differences in belief mostly conform with those of their fellows. If I am remotely understanding what you have written about ChatGPT, its analysis of things is a probabilistic conformity, making any usefulness it may have more restrictive than human belief generators. Personally, I am not excited about this one way or another. I sort of get why it is people are leery of such developments, especially their application(s) towards untraceable plagiarism or other academic misrepresentation. Cheaters look for successful ways to cheat. I'm not sure this can be one of those ways. But, I just don't know. I suppose we don't get to know everything such devices and systems are capable of?

Paul D. Van Pelt said...

On a different, more personal note: the Dirty Harry films. Clint Eastwood, as Harry Callahan, battled corruption from both sides of the fence. Eastwood, the man, knew it went both ways. His history as an actor; his films carried the theme across generations. He is the last of the 'iron-assed' loners. Did he stop corruption? No. That is unlikely to happen. Well, art imitates life. Why would it be any other way?