January 19th, 2023

Copyright and Generative AI

by Alan Macek

Training Data

AI systems are typically ‘trained’ on examples of existing items, such as images for an image generator. The source of the training data can implicate copyright if it is obtained without a clear license or the approval of the copyright owner. For example, if data has been scraped from internet websites, the website owners may not have authorized the use of the data for commercial purposes. Building a training set will almost invariably involve some copying and reproducing of the works that form part of the training set.

Some jurisdictions have fair use exceptions that may apply to using copyright-protected works to train AI systems. In Canada, “fair dealing” must be for a listed purpose, such as research, private study, education, parody or satire. Even if an AI system is completely different from the original works used in training and does not compete with them, the copying may not be considered fair dealing if the act did not also fall within one of the listed categories of fair dealing.

As a result, some AI systems may be vulnerable to copyright infringement allegations if copyright owners learn that their rights were infringed when their works were incorporated into a training set used for an AI system. In practice, it may be difficult to identify which works were included in the training set because it may be impossible to discern the training data from the output of the AI system.

In some cases, the output of AI systems may resemble or reproduce significant portions of some of the items in the training data. In such a case, the copyright owner's argument that the operator of the AI system has infringed copyright may be stronger, and the copyright owner may even have a claim against any users of AI-generated output that incorporates a substantial portion of the copyright owner’s works.

Outputs

When a generative AI system ‘creates’ its output, is that output covered by copyright? Who is the author? To be protected by copyright, the courts have held that there must be some ‘skill and judgment’ exercised by an author in creating an original work. This was summarized by the Supreme Court in CCH Canadian Ltd. v. Law Society of Upper Canada, 2004 SCC 13 at paragraph 16:

For a work to be “original” within the meaning of the Copyright Act, it must be more than a mere copy of another work. At the same time, it need not be creative, in the sense of being novel or unique. What is required to attract copyright protection in the expression of an idea is an exercise of skill and judgment. By skill, I mean the use of one’s knowledge, developed aptitude or practised ability in producing the work. By judgment, I mean the use of one’s capacity for discernment or ability to form an opinion or evaluation by comparing different possible options in producing the work. This exercise of skill and judgment will necessarily involve intellectual effort. The exercise of skill and judgment required to produce the work must not be so trivial that it could be characterized as a purely mechanical exercise.

For the output of generative AI, whether text or a graphic, is there skill and judgment in its creation and, if so, by whom? For some, the selection of specific prompts by the user is the necessary skill and judgment for the resulting output to be a work protected by copyright and for the user to be considered the author. In the UK, legislation has gone this route in section 9 of its Copyright, Designs and Patents Act, which provides that,

In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.

On the other hand, the selection of a prompt may be considered insufficient, perhaps merely an idea, which is not protected by copyright. In such a case, perhaps the software developer or the proprietor of the technology could be considered the author. In such a case, the skill and judgment is directed more to the creation and training of the tool than to the specific output. Typically, the creator or manufacturer of a tool is not considered the author of works created using the tool, so there are limits to how far the argument that the developer of the AI is the author of the resulting output can be taken.

If there is no author, then there may not be any copyright protection in the output of the generative AI and the output would be considered to be in the public domain. If the output is used commercially by one person, such as being put on the front of a t-shirt, anyone else could reproduce the design in competition with that person.

What happens next?

Canada conducted a consultation, entitled “A Consultation on a Modern Copyright Framework for Artificial Intelligence and the Internet of Things”, on how to handle AI-generated works, among other issues. The consultation did not lead to recommendations on the ownership of AI-generated works but did recommend amendments on data gathering and fair dealing. At this time there are no proposed amendments to the Copyright Act to handle these types of systems and their outputs.

Until there are amendments that clarify these issues, the copyright implications will likely depend on the specifics of the system, its training data, and how it generates its output. Factors such as the involvement of the user and developer and any skill and judgment they contribute, as well as the similarity of the output to any of the training data, will be important. With generative AI tools being used in more applications, these issues are likely to continue to be debated and litigated, and although legislative amendments may lead to more certainty, this likely won’t happen for several years.

Aspects of this column were inspired by the debate I participated in at the Intellectual Property Institute of Canada (IPIC) conference in September 2022, along with Jenene Roberts, Jessica Zagar and Roch Ripley, and moderated by Tamara Ramsey.
