Thursday, September 3, 2020

Analytic philosophy of meaning and smart AI bots


One of the impulses of the early exponents of analytic philosophy was to provide strict logical simplifications of hitherto vague or indefinite ideas. There was a strong priority placed on being clear about the meaning of philosophical concepts, and more generally, about "meaning" in language simpliciter.

Here are the opening paragraphs of Rudolf Carnap’s The Logical Structure of the World and Pseudoproblems in Philosophy:
The present investigations aim to establish a “constructional system”, that is, an epistemic-logical system of objects or concepts. The word “object” is here always used in its widest sense, namely, for anything about which a statement can be made. Thus, among objects we count not only things, but also properties and classes, relations in extension and intension, states and events, what is actual as well as what is not. Unlike other conceptual systems, a constructional system undertakes more than the division of concepts into various kinds and the investigation of the differences and mutual relations between these kinds. In addition, it attempts a step-by-step derivation or “construction” of all concepts from certain fundamental concepts, so that a genealogy of concepts results in which each one has its definite place. It is the main thesis of construction theory that all concepts can in this way be derived from a few fundamental concepts, and it is in this respect that it differs from most other ontologies. (Carnap 1928 [1967]: 5)
But the idea of absolute, fundamental clarity about the meanings of words and concepts has proven to be unattainable. Perhaps more strikingly, it is ill-conceived. Meanings are not molecules that can be analyzed into their unchanging components. Consider Wittgenstein's critique of the project of providing a "constructional system" of the meaning of language in the Philosophical Investigations:
12. It is like looking into the cabin of a locomotive. There are handles there, all looking more or less alike. (This stands to reason, since they are all supposed to be handled.) But one is the handle of a crank, which can be moved continuously (it regulates the opening of a valve); another is the handle of a switch, which has only two operative positions: it is either off or on; a third is the handle of a brakelever, the harder one pulls on it, the harder the braking; a fourth, the handle of a pump: it has an effect only so long as it is moved to and fro.
Here Wittgenstein's point, roughly, is that it is a profound philosophical error to expect a single answer to the question, how does language work? His metaphor of the locomotive cabin suggests that language works in many ways -- to describe, to denote, to command, to praise, or to wail and moan; and it is an error to imagine that all of this diverse set of uses should be reducible to a single thing.

Or consider Paul Grice's theory of meaning in terms of intentions and conversational implicatures. His theory of meaning considers language in use: what is the point of an utterance, and what presuppositions does it make? If a host says to a late-staying dinner guest, "You have a long drive home", he or she might be understood to be making a Google-maps kind of factual statement about the distance between "your current location" and "home". But the astute listener will hear a different message: "It's late, I'm sleepy, there's a lot of cleaning up to do, it's time to call it an evening." There is an implicature in the utterance that depends upon the context, the normal rules of courtesy ("Don't ask your guests to leave peremptorily!"), and the logic of indirection. The meaning of the utterance is: "I'm asking you courteously to leave." Richard Grandy and Richard Warner provide a nice description of Grice's theory of "meaning as use" in the Stanford Encyclopedia of Philosophy (link).

This approach to meaning invites a distinction between "literal" meaning and "figurative" or contextual meaning, and it suggests that algorithmic translation is unlikely to succeed for many important purposes. On Grice's approach, we must also understand the "subtext".

Hilary Putnam confronted the question of linguistic meaning (semantics) directly in 1975 in his essay "The meaning of 'meaning'" (link). Putnam questions whether "meaning" is a feature of the psychological state of an individual user of language -- whether meanings are "mental" entities; and he argues that they are not. Rather, meanings depend upon a "social division of labor" in which the background knowledge required to explicate and apply a term is distributed over a group of experts and quasi-experts.
A socio-linguistic hypothesis. The last two examples depend upon a fact about language that seems, surprisingly, never to have been pointed out: that there is division of linguistic labor. We could hardly use such words as "elm" and "aluminum" if no one possessed a way of recognizing elm trees and aluminum metal; but not everyone to whom the distinction is important has to be able to make the distinction. (144)
Putnam links his argument to the philosophical concepts of sense and reference. The reference (or extension) of a term is the set of objects to which the term refers; and the sense of the term is the set of mental features accessible to the individual that permits him or her to identify the referent of the term. But Putnam offers arguments about hypothetical situations that are designed to show that two individuals may be in identical psychological states with respect to a concept X, but may nonetheless identify different referents or extensions of X. "We claim that it is possible for two speakers to be in exactly the same psychological state (in the narrow sense), even though the extension of the term A in the idiolect of the one is different from the extension of the term A in the idiolect of the other. Extension is not determined by psychological state" (139).

A second idea that Putnam develops here is independent from this point about the socially distributed knowledge needed to identify the extension of a concept. This is his suggestion that we might try to understand the meaning of a noun as being the "stereotype" that competent language users have about that kind of thing.
In ordinary parlance a "stereotype" is a conventional (frequently malicious) idea (which may be wildly inaccurate) of what an X looks like or acts like or is. Obviously, I am trading on some features of the ordinary parlance. I am not concerned with malicious stereotypes (save where the language itself is malicious); but I am concerned with conventional ideas, which may be inaccurate. I am suggesting that just such a conventional idea is associated with "tiger," with "gold," etc., and, moreover, that this is the sole element of truth in the "concept" theory. (169)
Here we might summarize the idea of a thing-stereotype as a cluster of beliefs about the thing that permits conversation to get started. "I'm going to tell you about glooples..." "I'm sorry, what do you mean by "gloople"?" "You know, that powdery stuff that you put in rice to make it turn yellow and give it a citrous taste." Now we have an idea of what we're talking about; a gloople is a bit of ground saffron. But of course this particular ensemble of features might characterize several different spices -- cumin as well as saffron, say -- in which case we do not actually know what is meant by "gloople" for the speaker. This is true; there is room for ambiguity, misunderstanding, and misidentification in the kitchen -- but we have a place to start the conversation about the gloople needed for making the evening's curry. And, as Putnam emphasizes in this essay and many other places, we are aided by the fact that there are "natural kinds" in the world -- kinds of thing that share a fixed inner nature and that can be reidentified in different settings. This is where Putnam's realism intersects with his theory of meaning.
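One way to make the stereotype idea concrete is to treat it as a small bundle of conventional features and to let identification be a fallible matching process. What follows is only a minimal sketch; the feature lists and the matching function are illustrative inventions, not anything Putnam proposes.

```python
# A minimal sketch of Putnam-style stereotypes as bundles of conventional features.
# The kinds and features below are illustrative inventions, not drawn from Putnam.

stereotypes = {
    "saffron": {"powdery", "turns rice yellow", "citrous taste"},
    "cumin":   {"powdery", "earthy taste", "used in curry"},
    "gold":    {"yellow", "metal", "precious"},
}

def candidate_kinds(description):
    """Return the kinds whose stereotype best overlaps the speaker's description.
    Ties show where ambiguity and misidentification remain possible."""
    scores = {kind: len(features & description) for kind, features in stereotypes.items()}
    best = max(scores.values())
    return [kind for kind, score in scores.items() if score == best and score > 0]

# "You know, that powdery stuff that you put in rice to make it turn yellow..."
gloople = {"powdery", "turns rice yellow", "citrous taste"}
print(candidate_kinds(gloople))   # ['saffron'] -- enough to get the conversation started
```

The point is not that the match settles every ambiguity -- a sparser description might leave saffron tied with cumin -- but that a partial and revisable bundle of features is enough to fix a topic and let the conversation begin.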

What is interesting about this idea about the meaning of a concept term is that it makes the meaning of a concept or term inherently incomplete and corrigible. We do not offer "necessary and sufficient conditions" for applying the concept of gloople, and we are open to discussion about whether the characteristic taste is really "citrous" or rather more like vinegar. This line of thought -- a more pragmatic approach to concept meaning -- seems more realistic and more true to actual communicative practice than the sparse logical neatness of the first generation of logical positivists and analytic philosophers.

Here is how Putnam summarizes his analysis in "The Meaning of 'Meaning'":
Briefly, my proposal is to define "meaning" not by picking out an object which will be identified with the meaning (although that might be done in the usual set-theoretic style if one insists), but by specifying a normal form (or, rather, a type of normal form) for the description of meaning. If we know what a "normal form description" of the meaning of a word should be, then, as far as I am concerned, we know what meaning is in any scientifically interesting sense.
My proposal is that the normal form description of the meaning of a word should be a finite sequence, or "vector," whose components should certainly include the following (it might be desirable to have other types of components as well): (1) the syntactic markers that apply to the word, e.g., "noun"; (2) the semantic markers that apply to the word, e.g., "animal," "period of time"; (3) a description of the additional features of the stereotype, if any; (4) a description of the extension. (190)
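Putnam's "normal form description" reads very naturally as a data structure. Here is a minimal sketch in Python; the field names and the sample entry are my own glosses on the four components he lists, not anything Putnam himself specifies.

```python
from dataclasses import dataclass

@dataclass
class MeaningVector:
    """A rough rendering of Putnam's four-component "normal form" for word meaning.
    Field names and the example are interpretive glosses, not Putnam's own notation."""
    word: str
    syntactic_markers: list      # e.g., "noun"
    semantic_markers: list       # e.g., "animal", "period of time"
    stereotype: list             # conventional, possibly inaccurate features
    extension_description: str   # a description of the extension, not the set itself

tiger = MeaningVector(
    word="tiger",
    syntactic_markers=["noun", "count noun"],
    semantic_markers=["animal"],
    stereotype=["large cat", "striped", "dangerous"],
    extension_description="the natural kind Panthera tigris",
)
```

Notice that the extension appears only under a description rather than as an enumerated set; this is where the division of linguistic labor re-enters, since ordinary speakers defer to experts for that description.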
Rereading this essay after quite a few years, what is striking is that it seems to offer three rather different theories of meaning: the "social division of labor" theory, the stereotype theory, and the generative semantics theory. Are they consistent? Or are they alternative approaches that philosophers and linguists can take in their efforts to understand ordinary human use of language?

There is a great deal of diversity of approach, then, in the ways that analytical philosophers have undertaken to explicate the question of the meaning of language. And the topic -- perhaps unlike many in philosophy -- has some very important implications and applications. In particular, there is an intersection between artificial general intelligence research and the philosophy of language: if we want our personal assistant bots to be able to engage in extended and informative conversations with us, AI designers will need to have usable theories of the representation of meaning. And those representations cannot be wholly sequential (Markov chain) systems. If Alexa is to be a good conversationalist, she will need to be able to decode complex paragraphs like this, and create a meaningful "to-do" list of topics that need to be addressed in her reply.
Alexa, I was thinking about my trip to Milan last January, where I left my umbrella. Will I be going back to Milan soon? Will it rain this afternoon? Have I been to Lombardy in the past year? Do I owe my hosts at the university a follow-up letter on the discussions we had? Did I think I might encounter rain in my travels to Europe early in the year?
Alexa will have a tough time with this barrage of thoughts. She can handle the question about today's weather. But how should her algorithms handle the question about what I thought about the possibility of rain during my travels last January? I had mentioned forgetting my umbrella in Milan; that implies I had taken an umbrella; and that implies that I thought there was a possibility of rain. But Alexa is not good at working out background assumptions and logical relationships between sentences. Or in Gricean terms, Alexa doesn't get conversational implicatures.
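To see what is being asked of Alexa here, it helps to spell out the chain of background inferences explicitly. The sketch below is a toy: the facts and rules are hand-coded for this one example and have nothing to do with Alexa's actual architecture -- which is precisely the point, since nothing in purely sequential text prediction supplies them.

```python
# Hand-coded facts and forward-chaining rules for the umbrella example.
# A toy illustration of the background inferences a Gricean listener makes,
# not a model of any real assistant's architecture.

facts = {"left umbrella in Milan last January", "was in Milan last January"}

rules = [
    ({"left umbrella in Milan last January"}, "brought umbrella to Milan"),
    ({"brought umbrella to Milan"}, "expected possible rain in Europe last January"),
    ({"was in Milan last January"}, "was in Lombardy in the past year"),
]

# Apply rules until no new conclusions can be added.
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print("expected possible rain in Europe last January" in facts)   # True
```

The inference loop itself is trivial; the hard, unsolved part is getting from free-form speech to structured premises and rules like these.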

Luca Gasparri and Diego Marconi provide a very interesting article on "Word Meaning" in the Stanford Encyclopedia of Philosophy (link) that allows the reader to see where theories of meaning have gone in philosophy, linguistics, and cognitive science since the 1970s. For example, linguists have developed a compositional theory of word meaning:
The basic idea of the Natural Semantic Metalanguage approach (henceforth, NSM; Wierzbicka 1972, 1996; Goddard & Wierzbicka 2002) is that word meaning is best described through the combination of a small set of elementary conceptual particles, known as semantic primes. Semantic primes are primitive (i.e., not decomposable into further conceptual parts), innate (i.e., not learned), and universal (i.e., explicitly lexicalized in all natural languages, whether in the form of a word, a morpheme, a phraseme, and so forth). According to NSM, the meaning of any word in any natural language can be defined by appropriately combining these fundamental conceptual particles. (36)
This approach is strikingly similar to that described in the Carnap passage above, in that it attempts to decompose meanings of complex concepts into "atoms" of meaning.
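Here is a toy rendering of the NSM idea in code. The prime inventory is abridged and the sample explication of "happy" is only loosely modeled on published NSM explications; it is not quoted from Wierzbicka or Goddard.

```python
# A toy rendering of the NSM idea: complex word meanings are explicated by
# combining a small, fixed inventory of semantic primes. The inventory is
# abridged and the explication of "happy" is loosely modeled on NSM work.

SEMANTIC_PRIMES = {
    "I", "YOU", "SOMEONE", "SOMETHING", "GOOD", "BAD", "BIG", "SMALL",
    "THINK", "KNOW", "WANT", "FEEL", "HAPPEN", "DO", "BECAUSE", "NOT", "NOW",
}

def is_wellformed_explication(lines):
    """Check that an explication uses only primes (upper-case tokens);
    lower-case tokens are treated as grammatical glue."""
    return all(
        token in SEMANTIC_PRIMES
        for line in lines
        for token in line.split()
        if token.isupper()
    )

happy = [
    "SOMEONE FEEL SOMETHING GOOD NOW",
    "SOMETHING GOOD HAPPEN before NOW",
    "this SOMEONE WANT SOMETHING like this to HAPPEN",
]

print(is_wellformed_explication(happy))   # True
```

Whether any such fixed inventory of primes could exhaust the meanings of a natural language is, of course, exactly the kind of claim Wittgenstein's locomotive cabin is meant to call into question.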

Some research on language in computer science and artificial intelligence has been striking and surprising. For example, OpenAI's GPT-3 natural language text generator, grounded in massive machine learning on billions of pages of text, creates text that appears to be conversational and intelligent -- without a milligram of meaning (link). Here is a short (three-paragraph) piece of philosophy text about freedom of speech, generated purely mechanically by Arram Sabeti as a demonstration of GPT-3's capabilities (link):
There’s a concept that comes up often in debates about free speech called “shouting fire in a crowded theater”. The basic idea is that you can’t just say whatever you want whenever you want – especially if it’s likely to cause panic. And the reason for this is obvious: panic leads to violence, and violence hurts people. If you are causing panic, then your right to free speech ends where the other person’s right to not get hurt begins. 
But this isn’t really true. If shouting fire in a crowded theater were illegal, then there wouldn’t be any actual fires in theaters, would there? How could there be? If everyone knew that shouting fire would get them thrown in jail, then they’d all keep quiet when they smelled smoke – and thus there would never be any fires at all. The only way someone could shout fire is if they were lying – but lying is also illegal. So if shouting fire in a crowded theater were illegal, then theaters would never burn down, and no one would ever die from a fire.
The way I see it, the phrase “shouting fire in a crowded theater” should be replaced with “yelling ‘there’s an angry mob coming with torches and pitchforks’ in a theater”. This doesn’t sound quite as catchy, but it has the advantage of being true. And this matters because the whole concept of free speech is based on people getting the facts right. If you’re going to shout fire in a crowded theater, then you need to actually have seen a fire – not just heard about one from someone else. You need to have checked that there really is a fire before you warn everyone else that there’s a fire. Otherwise you’re just causing panic for no reason – and that’s not free speech, that’s just yelling “boo” in a crowded theater.
The passage is no argument at all -- no guiding intelligence, and obvious non-sequiturs from one sentence to another. The first four sentences make sense. But the next several sentences are nonsensical and illogical. In sentence 5 -- what "isn't really true"? Sentence 6 is flatly illogical. In fact, it is as illogical as Trump's insistence that if we had less testing then there would be less COVID in the United States. And the statement "... but lying is also illegal" -- no, it's not. The bot is misinformed about the law. Or more precisely: these are just words and phrases strung together algorithmically, with no logical construction or understanding guiding the statements. The remaining paragraphs have the same features. It is kind of entertaining to see the logical flaws of the text; but maybe there is an important underlying discovery as well: machine learning cannot create or discover rules of logic that allow for argument and deduction. There is an analogy with Chomsky's argument, at the beginning of the generative linguistics revolution, that syntax cannot be understood as a Markov-chain process. The passage is analogous to Noam Chomsky's example of a syntactically correct but semantically meaningless sentence, "Colorless green ideas sleep furiously". This GPT-3 text is syntactically correct from phrase to phrase, but lacks the conceptual or logical coherence of a meaningful set of thoughts. And it seems pretty clear that the underlying approach is a dead end when it comes to the problem of natural language comprehension.
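The Markov-chain point can be made concrete with a toy bigram generator: each word is chosen only on the basis of the word immediately before it, so the output can be locally fluent while carrying no overall argument. (GPT-3 itself is a transformer, not a Markov chain; the sketch below, with its tiny made-up corpus, only illustrates the "local coherence without logic" phenomenon described above.)

```python
import random
from collections import defaultdict

# A toy bigram (Markov-chain) text generator over a tiny made-up corpus.
# Each next word depends only on the previous word, so the output is locally
# plausible but has no global logical structure.

corpus = (
    "shouting fire in a crowded theater is not free speech because "
    "free speech ends where panic begins and panic leads to violence "
    "and violence hurts people in a crowded theater"
).split()

bigrams = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev].append(nxt)

def generate(start, length=15, seed=0):
    """Walk the bigram table from a start word, choosing followers at random."""
    random.seed(seed)
    words = [start]
    for _ in range(length):
        followers = bigrams.get(words[-1])
        if not followers:
            break
        words.append(random.choice(followers))
    return " ".join(words)

print(generate("free"))
```

What the generated string lacks is just what the passage above lacks: any representation of what is being claimed, or of why one sentence should follow from another.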

1 comment:

jed said...

Overall an interesting post. I share most of your criticisms of analytic theories of language. However your analysis of GPT-3 (and similar systems) falls short of the mark. Chomsky was using examples including "Colorless green ideas sleep furiously" to argue that utterances could be well formed but nonsensical -- and GPT-3 certainly proves his case! On the other hand Chomsky has always argued that people must have language specific mechanisms in their minds, with a specific sort of recursive structure (what structure exactly has changed over the years). GPT-3 clearly disproves that claim, and as such should be a major epistemic issue for Chomsky-aligned linguists, but I have not seen any of them respond to that aspect of GPT-3 -- either to accept the results or to dispute them.

The point about semantics deserves a little more discussion. Clearly GPT-3 has some sort of "surface semantics" -- it puts words together much more coherently than in Chomsky's example. It can manage local semantic agreement in "Winograd sentences" where the resolution of pronouns depends on some "world knowledge" (or we can now see more accurately "how people talk about the world knowledge").

But GPT-3 doesn't have -- and wasn't claimed to have or designed to have -- any deep knowledge about how we do things, how things work, what things look like, or even how words sound (so it isn't good with rhymes or puns). Those kinds of knowledge are somewhat modular and in humans depend on learning by other parts of the brain. We can reasonably expect that some language engine like GPT-3 can be linked to similar engines that have learned to hear and produce speech, use hands, recognize objects, etc. (All things that other deep learning systems do reasonably well.)

Probably such linkages would give the language produced by GPT-3 the deep semantics of those additional domains. Of course it would not have all the deep semantics humans have -- surgeons, carpenters, ethnomethodologists and so forth all have their own deep semantics learned by interacting with their special domains.