Caught Between a Rock and a Hard Place: Research Libraries, AI Research and Contract Override
Research libraries are integral to scholarship, scientific discovery and economic innovation. A foundational element of this support is providing access to extensive collections of scholarly content (peer-reviewed journals and monographs, databases, archives and primary source materials) that enable current research methodologies. Of increasing concern is how libraries can fulfill these obligations when much of the essential research corpus is accessed digitally and governed by contracts that often explicitly or implicitly prohibit certain uses, including AI-driven research methodologies.
AI, cloud computing and increased processing power have accelerated the growth of computational research methodologies, and scholars from diverse academic disciplines are increasingly turning to advanced forms of analysis such as text and data mining (TDM). This automated process uses AI, machine learning and natural language processing to extract structured information from vast amounts of unstructured text or numerical data. It enables researchers to analyze massive datasets and identify patterns, correlations and insights without human interaction or hours of analysis. Scholars at post-secondary institutions typically want to apply TDM to find, read and analyze information in academic journals and other content in their research library’s collection. However, library licenses frequently do not permit these activities.
Contractual override, a longstanding concern for libraries, refers to situations where the terms of a contract take precedence over, or limit, rights provided by law. Licensing agreements for digital content often restrict uses that would otherwise be legally permitted under Canadian copyright law, including user rights and exceptions such as fair dealing. The challenges are well documented and discussed. In Canada, the Canadian Association of Research Libraries (CARL) and the Canadian Federation of Library Associations (CFLA) have spoken out about these concerns. The International Federation of Library Associations and Institutions (IFLA) issued a statement on the impact of contract override, and the Association of Research Libraries (ARL) in the United States published a guidebook for libraries on how to understand and address licensing constraints. ARL, a member of the Library Copyright Alliance, sponsored a public symposium on “Protecting Copyright User Rights from Contractual Override” at American University’s Washington College of Law in May 2023 that brought together intellectual property scholars from around the world, including Canada.
While it has been argued that contracts may not override legal exceptions, the lack of jurisprudence and clarity in Canadian copyright legislation, combined with inferred or explicit risks of license contravention, can have real implications. As summarized by UC Berkeley, non-compliance with restrictive licensing terms for text and data mining may have repercussions for the institution, including loss of access to scholarly content as well as penalties for the researcher. In one particularly concerning case, an article was retracted because researchers conducted TDM in violation of the vendor agreement. This puts libraries in the difficult position of having to negotiate for robust research rights, while also keeping researchers and vendors informed about what is at stake when those negotiations are less than successful.
While contract override may be an overarching issue, it is only one element of a problematic licensing environment. AI-assisted research highlights various related licensing and technical restrictions experienced by libraries and their researchers. Vendors can exert control and impede access and use through restrictive contract and licensing terms and the implementation of technological protection measures (TPMs) or digital locks. The impact of TPMs is significant, and the option to purchase an add-on application for TDM, generally at substantial additional cost, is not an equitable solution for libraries. This means that a provision in Canadian copyright law that contracts cannot override exceptions and limitations already codified in the legislation would only be a partial solution. In addition to a contract override provision, legislation would need to explicitly permit computational research methodologies, such as text and data mining, and permit the circumvention of technological protection measures for non-infringing purposes.
As demonstrated in other jurisdictions, such as the United Kingdom (UK), which has exceptions permitting TDM for non-commercial purposes and prohibiting contract override in its copyright legislation, technological protection measures still pose significant challenges. Similarly, Singapore has statutes that authorize commercial and non-commercial TDM and limit contractual override, but anti-circumvention rules allow technological restrictions on computational research. Notably, a 2018 report commissioned by the European Parliament’s Policy Department for Citizens’ Rights and Constitutional Affairs at the request of the Committee on Legal Affairs (JURI-Committee) acknowledged that protections against both contractual and technological override should be extended to TDM. However, the EU Copyright Directive (Directive 2019/790 on Copyright in the Digital Single Market) includes only provisions that enable TDM for scientific research and render unenforceable any contractual provisions that attempt to override the exception. It does not limit the impact of TPMs. Canada and others can learn from international copyright precedents set by the UK, Singapore, Japan, France and Germany for both contract override and text and data mining, as well as from the recommendations of associated organizations.
Libraries and library consortia that manage licenses are working to negotiate reasonable AI-related terms. This process is often challenging due to the lack of consistency in terminology and clauses from vendors, non-disclosure agreement requirements and an imbalance of power between libraries and vendors. However, successful negotiations have occurred, particularly in U.S. research libraries; for example, Colorado State recently renegotiated a prominent agreement to ensure licensed content can be used with any LLM tool.
When research libraries are unable to meet the needs of their community due to restrictive vendor agreements, there is an opportunity to raise awareness of how complicated the licensing environment is. TDM and related methods are essential for researchers, yet differing legal exceptions mean that colleagues in other countries may use the same content without limitation, making cross-border collaboration difficult. Researchers need to understand why these differences exist across jurisdictions. Equally important are an awareness of the broader issues and an understanding of the potential publication risks.
While research libraries contend with negotiating reasonable terms for AI-related research methodologies and national library associations advocate for legislative provisions that may smooth the way, researcher backing is valuable. For example, the University of California Academic Council issued a statement supporting UC Libraries’ negotiating position with respect to licenses. Such collective, public support may not always be feasible, but individual researchers can also signal their solidarity with libraries by keeping these licensing issues in mind while pursuing publication and dissemination.
It is clear that computational research methodologies are here to stay, and libraries and vendors are having to reconceptualize their collections and services in response. In a recent interview with Library Journal, Xuemao Wang, Dean of Libraries at Northwestern University, summarized the new reality well: “AI is changing expectations. Our collections are no longer just materials to read; they are becoming data to compute, analyze, model, and interact with conversationally.” Research libraries and their representative organizations are still contending with the ramifications of this profound change, and vendor agreements are just one facet of this much broader recalibration towards collections as data. While libraries are committed to supporting research endeavours at their institutions and are working to improve restrictive and prohibitive licenses from both a legislative and a contractual negotiation perspective, a future-proof research sector depends on all stakeholders understanding the current issues and engaging in solutions.

