With OpenAI’s release of ChatGPT on November 30th, 2022, it very quickly became clear that the innocent-sounding Large Language Models (LLMs) had crossed a historic threshold in the intelligence exhibited by artificial intelligence. Many were seeing, for the first time, a computer responding to their questions and prompts with well-informed and well-formed prose that, apart from the occasional “hallucination,” spoke directly to what was asked or prompted.
The immediate legal question was whether the resulting text, while certainly exhibiting intellectual properties, constituted the sort of intellectual property that the law was intended to encourage and protect. In previous columns for this magazine, I have shared the quick decisions of scholarly publishers and the U.S. Copyright Office to exclude work produced by AI from having any claim on authorship (here and here). And I continue to see the good sense of this, at least as a starting point, while we learn more about how AI can contribute to scholarly inquiry.
In this column, I want to share what I find to be an instance of the needed clarity when it comes to drawing a legal line around LLMs. On September 20th, the New York Times reported that the Authors Guild, representing the authors John Grisham, Jonathan Franzen, Elin Hilderbrand, and others, filed a lawsuit against OpenAI for infringement of these authors’ intellectual property. As a result of ingesting their books, the suit holds, ChatGPT is capable of producing “derivative works” that will harm the market for these books. I’m sure that the Writers’ Union of Canada is attending closely to this case. Such a suit can, indeed, help distill the intellectual property issues, if somewhat inadvertently in this case.
In reporting on the story in the Times, Alexandra Alter and Elizabeth A. Harris turn to the novelist Douglas Preston, former president of the Authors Guild, for comment. Preston was frankly stunned by how ChatGPT could comment on the role of even his books’ minor characters, which, he pointed out, were not described in book reviews or Wikipedia. “That’s when I looked at this and said, ‘My God, ChatGPT has read my books, how many of my books has it read?’” Preston said. “It knew everything, and that’s when I got a bad feeling.” Now, as someone who has been called out for only reading the book reviews, as well as, on other occasions, for knowing even less than that when I’ve actually read the book in question, I can imagine Preston’s shock (if not envy, in my case).
Yet at the same time, Preston’s candid response to ChatGPT clears the air. He manages to illustrate exactly how ChatGPT’s mastery of his and other novelists’ work is not an infringement of intellectual property rights. In the first instance, Large Language Models generate rather than regurgitate. They do not plagiarize others’ work, nor their own, for that matter. But more importantly, what Preston relays about his ChatGPT conversation pretty much reflects a literary economy in which authors of Preston’s ilk have long thrived. After all, book reviewers do not pay for copies of the books they review. Their reviews are critical to sales of the book (in the spirit of “any review is a good review”), even as many of us subscribe to book review outlets and are inspired, on occasion, to purchase books. And to have a source available to you, whether a reviewer, teacher, librarian, friend, or an LLM, who or that knows everything about a book, can hardly harm a book’s market. Just the opposite.
Now, book reviewers (and friends) have long been scrupulously courteous about not revealing plot twists or resolutions. But for some time, Wikipedia has overlooked that nicety in its “Plot Summary” for books, without even a “spoiler alert.” All of which is to say that ChatGPT fits very tidily into this market. In fact, to hold that its mastery of detail poses a threat surely insults an author’s literary talents.
With this infringement question settled, to admittedly get ahead of myself, we might consider how LLMs can gain access to important sources of knowledge through copyright’s fair dealing or fair use clauses. For if LLMs hallucinate sources on occasion, it is because, like a wanderer in the desert spying an oasis, they are starved of research. What is bound to lead to far richer and more reliable knowledge discovery and translation in the treatment of disease is to give LLMs access to the entire body of biomedical research, only a portion of which is currently open (despite my many Slaw columns on this theme).
Let me conclude with the ever-present philosophical question posed by LLMs, which has crept into my other columns. It pops up with Preston’s “My God… I got a bad feeling.” This is not about infringement, I hold. It has about it the air of existential dread. ChatGPT could, of course, produce competing novels, but what can really unsettle a novelist is seeing a probability network so readily master one’s work down to the minor characters. It is as if the driverless car of his mind had pulled up, revealing the empty driver’s seat. That bad feeling, Mr. Preston, may well come from glimpsing how we all may be just so much calculating.