We appear to have crossed another great divide in artificial intelligence. It is not just the constant shuffle of driverless cars in my Silicon Valley neighborhood on their endless driving lessons. Nor is it the machine learning gains in diagnostic accuracy that now exceed those of experts trained in radiology and dermatology. Those are visual advances in machine learning. This time it’s language.
Steven Johnson, in a marvelously well-done article in the New York Times Magazine, sets out what machine learning is making of writing. It is the driverless car equivalent of the keyboard. Just feed in your destination and it takes you there. The article’s only flaw, however, may be its title: “A.I. Is Mastering Language. Should We Trust What It Says?” The less attention-grabbing but more important subtitle question is, “What Can We Learn from What It Does?” Our learning is no small thing when it comes to machine learning. We don’t typically get in on the learning. The machine goes straight for the answer, without pausing to explain its reasoning.
Now, the divide that has been crossed with language, placing the machine well beyond the Turing test, is the result of running a “large language model” on a neural network program. A leading instance is run by OpenAI (“to advance digital intelligence in the way that is most likely to benefit humanity as a whole”) with a program known as Generative Pre-Trained Transformer 3 (GPT-3). The program draws its language lessons from an inordinately large corpus – “700 gigabytes of data drawn from across the web, including Wikipedia, supplemented with a large collection of text from digitized books” – using “one of the most powerful supercomputers on the planet.”
The results are staggering. When Johnson asks GPT-3 to “write an essay discussing the role of metafiction in the work of Italo Calvino,” it immediately offers a response, flawless in tone and detail, that opens right on cue:
Italian author Italo Calvino is considered a master of metafiction, a genre of writing in which the author breaks the fourth wall to discuss the act of writing itself. For Calvino, metafiction is a way of exploring the nature of reality and the ways in which stories can shape our perceptions of the world. His novels often incorporate playful, labyrinthine structures that play with the boundaries between reality and fiction…
Johnson uses Google to establish that “not one of the sentences in the Calvino essay has ever been written before.” GPT-3 has learned how we write, when we write about Calvino. For Johnson, “simply through playing ‘predict the next word’ a trillion times, the software is now clearly capable of writing complex sentences and presenting arguments in a technically proficient manner.”
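The “predict the next word” game can be made concrete with a toy sketch. What follows is nothing like GPT-3’s transformer architecture; it is a simple bigram model over a made-up scrap of text (the corpus, the function names, and the starting word are all illustrative assumptions), yet it plays the same game: given a word, emit the most likely next word, and repeat.

```python
from collections import Counter, defaultdict

# A tiny stand-in corpus (illustrative only -- GPT-3 trained on roughly
# 700 gigabytes of web and book text, not two sentences).
corpus = (
    "the author breaks the fourth wall to discuss the act of writing itself "
    "the author plays with the boundaries between reality and fiction"
).split()

# Count how often each word follows each other word: a bigram model,
# vastly simpler than a neural network, but the same prediction game.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation of `word` in the corpus."""
    return following[word].most_common(1)[0][0]

# Generate a short continuation by repeatedly predicting the next word.
word = "the"
text = [word]
for _ in range(5):
    word = predict_next(word)
    text.append(word)
print(" ".join(text))
```

Run on a trillion examples instead of two sentences, and with a model that weighs whole contexts rather than single preceding words, this humble loop is the kernel of what produces the Calvino essay.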
Among the many things on which this achievement has a bearing is our concept of intellectual property. Here the question is not who holds the copyright on this Calvino piece. No one does. Rather, the use of GPT-3 offers insight into our regard for the expression of ideas as intellectual property, and how we tend to think of the intelligence and thinking behind that expression. It is a point that I fear a number of the experts whom Johnson cites miss by attributing what appears to be a metaphysical quality to our ability to write.
Johnson himself asks “whether GPT-3 is actually generating its own ideas or merely paraphrasing the syntax of language it has scanned from the servers of Wikipedia, or Oberlin College, or The New York Review of Books?” As I see it, GPT-3 calls into question the concept of “its own ideas” by so effectively generating coherent thoughts by “paraphrasing the syntax” of others. Tulsee Doshi, who leads Google’s Responsible A.I. and M.L. Fairness team, states that “it’s very easy to personify the [GPT-3] model — we talk about it ‘having understanding’ or ‘having knowledge’ or ‘knowing things.’” Yet maybe it’s easy to personify the model because we fit the model. That is, when we talk about “understanding” something, it amounts to what we can say about it based on the language that we have accumulated from others (if nowhere near as extensively and systematically as the GPT-3 program). What we say may be “original,” but then Johnson establishes that GPT-3 generates original text each time it responds, even to the same prompt.
Johnson comes close to my take on this when he ruminates that “maybe predicting the next word is just part of what thinking is.” But still, I think we owe it to ourselves to ask what part of thinking, when it comes to writing, doesn’t have to do with making something out of the language we possess. Certainly, I revise again and again (now aided by Google Docs’ version of machine-learning prompts as I write). GPT-3 does not revise. Yet to revise is only to refine my calculations.
If it can be said that GPT-3 simulates how we compose a simple email or a complex piece like Johnson’s article, then it throws our common concepts of intellectual property into more disarray than I can deal with in a column like this. At the very least, it suggests a new appreciation of – and perhaps new measures for – what it means for (human) writers to go beyond the many predictive possibilities of language and thereby truly alter language’s corpus of probabilities, much as Italo Calvino may well have done with his metafiction If on a winter’s night a traveler.