Recently much has been made of LLMs apparently not knowing how many ‘r’s there are in strawberry, which makes them look stupid. But the whole affair actually shows something different and much more profound.
First of all, the incident. Ask an LLM how many ‘r’s there are in strawberry and, across the board, it will answer that there are two. Obviously there are three ‘r’s, which makes people laugh at the LLMs. But what no one seems to realise is that the LLM answer is correct, not erroneous. There are indeed two ‘r’s in strawberry (and one, and three), in the strictly logical sense that a word containing three ‘r’s also contains two. Why it feels wrong tells us more about human communication than about the function or malfunction of the LLM. The reason is that humans do not communicate by giving true answers to questions. We communicate by giving the most relevant answer.
If you are human you can easily see that what we wanted to know was the TOTAL number of ‘r’s in “strawberry”, but the machine has no concept of relevance. From the machine's point of view the question was actually the wrong one, given the answer we expected. The right question would have been: what is the total number of ‘r’s?
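The two readings of the question can be made concrete in a few lines of Python. This is only a sketch of the logic of the argument: the function names are illustrative inventions, and it says nothing about how an LLM actually processes the question. Under the literal reading, “there are n ‘r’s” is true whenever the word contains at least n of them; under the relevant reading, it is true only when n is the total.

```python
def count_letter(word: str, letter: str) -> int:
    """Return the total number of times `letter` occurs in `word`, ignoring case."""
    return word.lower().count(letter.lower())


def literally_true(word: str, letter: str, n: int) -> bool:
    """Literal ('at least n') reading: the word contains n occurrences of the letter."""
    return count_letter(word, letter) >= n


def relevant_answer(word: str, letter: str, n: int) -> bool:
    """Relevant ('exactly n') reading: n is the total number of occurrences."""
    return count_letter(word, letter) == n


if __name__ == "__main__":
    # Compare both readings for each candidate answer.
    for n in (1, 2, 3, 4):
        print(n, literally_true("strawberry", "r", n), relevant_answer("strawberry", "r", n))
```

The answers 1, 2 and 3 all pass the literal test, but only 3 passes the relevant one, which is the answer a cooperative human speaker would give.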
For much of the history of linguistics, language was thought of as a way of coding and decoding meaning that refers to something in the world. This was the paradigm of the Swiss linguist Ferdinand de Saussure, in which language was a system through which expressions could be connected to what they signified. This paradigm was the most influential until the middle of the previous century.
Slowly it began falling apart, in particular with the advent of pragmatic linguistic theory and philosophers like J.L. Austin and H.P. Grice, who created awareness that language is performative and rests on implications not explicitly stated. Communication, according to Grice, does not involve coding and decoding, or a logically determined framework for carrying meaning, but intentions and beliefs.
“A uttered x with the intention of inducing a belief by means of the recognition of this intention” (Grice 1989: 219)
To communicate is to do something with the intention of creating a belief or understanding in another person. According to Grice, an utterance achieves its meaning only through implicature and the cooperative principle, together with the Gricean maxims.
Implicature concerns what the speaker implicates by an utterance. What is meant by an utterance goes beyond what is actually said; it has to be inferred from non-linguistic features of the conversation together with the principles of cooperation. The cooperative principle is understood thus:
“Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.” (Grice 1989: 26)
When we speak, we do it by continuously adapting our utterances to the talk and the context of the current communication. We understand the other person to be trying to do the same, and therefore we take a basic cooperative approach. We want to cooperate in order to communicate and understand, and so we use the context to “guess” what the other person's utterance could have been intended to mean.
LLMs are notoriously bad at using contextual information and have no concept of implicature or a cooperative principle, which is why they fail to grasp the meaning of “how many ‘r’s are there in ‘strawberry’?” For humans, the principle of cooperation leads us to assume that the speaker probably wants to know the total number of ‘r’s, not just a random logically correct answer (of which there are three: 1, 2, and 3). That leads a human to conclude that what is being asked for is the total number of ‘r’s.
The strawberry affair shows more about human expectations than about LLM capabilities. We expect the LLMs to be able to gauge our intention and the implicatures of what we say. But that is never going to happen, because the LLMs are not human: they do not have an intention, and they do not have a principle of cooperation. If they answer as we expect, it is because of statistical regularities in what people usually say rather than a command of the implicatures and intentions behind the utterance, that is, a command of real communication.
Building successful human and AI solutions will require setting expectations and also understanding better how humans actually work. Philosophers like Grice can remind us that communication is not code that is transmitted and executed in the receiver (even if that is the case for LLMs), and that this holds even when it is put into code in the form of an alphabet. Communication is a constantly moving window of context and of guessing the intentions of communicators, something so self-evident to us that we never really notice it.
Sources:
Grice, H.P. (1989). Studies in the Way of Words. Harvard University Press.
https://x.com/RobDenBleyker/status/1828157720736002527?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1828157720736002527%7Ctwgr%5E1a525f5c88e2ae9cd19933a2dff2c5e0e39356ba%7Ctwcon%5Es1_c10&ref_url=https%3A%2F%2Ftechcrunch.com%2F2024%2F08%2F27%2Fwhy-ai-cant-spell-strawberry%2F
Photo by Oliver Hale on Unsplash
