

Legal Sidebar
Generative Artificial Intelligence and
Copyright Law
Updated September 29, 2023
Innovations in artificial intelligence (AI) are raising new questions about how copyright law principles
such as authorship, infringement, and fair use will apply to content created or used by AI. So-called
“generative AI” computer programs—such as OpenAI’s DALL-E and ChatGPT programs, Stability AI’s
Stable Diffusion program, and Midjourney’s self-titled program—are able to generate new images, texts,
and other content (or “outputs”) in response to a user’s textual prompts (or “inputs”). These generative AI
programs are trained to generate such outputs partly by exposing them to large quantities of existing
works such as writings, photos, paintings, and other artworks. This Legal Sidebar explores questions that
courts and the U.S. Copyright Office have begun to confront regarding whether generative AI outputs
may be copyrighted and how generative AI might infringe copyrights in other works.
Copyright in Works Created with Generative AI
The widespread use of generative AI programs raises the question of who, if anyone, may hold the
copyright to content created using these programs.
Do AI Outputs Enjoy Copyright Protection?
The question of whether copyright protection may be afforded to AI outputs—such as images
created by DALL-E or texts created by ChatGPT—likely hinges at least partly on the concept of
“authorship.” The U.S. Constitution authorizes Congress to “secur[e] for limited Times to Authors . . . the
exclusive Right to their . . . Writings.” Based on this authority, the Copyright Act affords copyright
protection to “original works of authorship.” Although the Constitution and Copyright Act do not
explicitly define who (or what) may be an “author,” the U.S. Copyright Office recognizes copyright only
in works “created by a human being.” Courts have likewise declined to extend copyright protection to
nonhuman authors, holding that a monkey who took a series of photos lacked standing to sue under the
Copyright Act; that some human creativity was required to copyright a book purportedly inspired by
celestial beings; and that a living garden could not be copyrighted as it lacked a human author.
A recent lawsuit challenged the human-authorship requirement in the context of works purportedly
“authored” by AI. In June 2022, Stephen Thaler sued the Copyright Office for denying his application to
register a visual artwork that he claims was authored “autonomously” by an AI program called the
Creativity Machine. Dr. Thaler argued that human authorship is not required by the Copyright Act. On
August 18, 2023, a federal district court granted summary judgment in favor of the Copyright Office. The
court held that “human authorship is an essential part of a valid copyright claim,” reasoning that only
human authors need copyright as an incentive to create works. Dr. Thaler has stated that he plans to
appeal the decision.
Assuming that a copyrightable work requires a human author, works created by humans using generative
AI could still be entitled to copyright protection, depending on the nature of human involvement in the
creative process. However, a recent copyright proceeding and subsequent Copyright Registration
Guidance indicate that the Copyright Office is unlikely to find the requisite human authorship where an
AI program generates works in response to text prompts. In September 2022, Kris Kashtanova registered
a copyright for a graphic novel illustrated with images that Midjourney generated in response to text
inputs. In October 2022, the Copyright Office initiated cancellation proceedings, noting that Kashtanova
had not disclosed the use of AI. Kashtanova responded by arguing that the images were made via “a
creative, iterative process.” On February 21, 2023, the Copyright Office determined that the images were
not copyrightable, deciding that Midjourney, rather than Kashtanova, authored the “visual material.” In
March 2023, the Copyright Office released guidance stating that, when AI “determines the expressive
elements of its output, the generated material is not the product of human authorship.”
Some commentators assert that some AI-generated works should receive copyright protection, arguing
that AI programs are like other tools that human beings have used to create copyrighted works. For
example, the Supreme Court has held since the 1884 case Burrow-Giles Lithographic Co. v. Sarony that
photographs can be entitled to copyright protection where the photographer makes decisions regarding
creative elements such as composition, arrangement, and lighting. Generative AI programs might be seen
as a new tool analogous to the camera, as Kashtanova argued.
Other commentators and the Copyright Office dispute the photography analogy and question whether AI
users exercise sufficient creative control for AI to be considered merely a tool. In Kashtanova’s case, the
Copyright Office reasoned that Midjourney was not “a tool that [] Kashtanova controlled and guided to
reach [their] desired image” because it “generates images in an unpredictable way.” The Copyright Office
instead compared the AI user to “a client who hires an artist” and gives that artist only “general
directions.” The office’s March 2023 guidance similarly claims that “users do not exercise ultimate
creative control over how [generative AI] systems interpret prompts and generate materials.” One of
Kashtanova’s lawyers, on the other hand, argues that the Copyright Act does not require such exacting
creative control, noting that certain photographs and modern art incorporate a degree of happenstance.
Some commentators argue that the Copyright Act’s distinction between copyrightable “works” and
noncopyrightable “ideas” supplies another reason that copyright should not protect AI-generated works.
One law professor has suggested that the human user who enters a text prompt into an AI program—for
instance, asking DALL-E “to produce a painting of hedgehogs having a tea party on the beach”—has
“contributed nothing more than an idea” to the finished work. According to this argument, the output
image lacks a human author and cannot be copyrighted.
While the Copyright Office’s actions indicate that it may be challenging to obtain copyright protection for
AI-generated works, the issue remains unsettled. Applicants may file suit in U.S. district court to
challenge the Copyright Office’s final decisions to refuse to register a copyright (as Dr. Thaler did), and it
remains to be seen whether federal courts will agree with all of the office’s decisions. While the
Copyright Office notes that courts sometimes give weight to the office’s experience and expertise in this
field, courts will not necessarily adopt the office’s interpretations of the Copyright Act.
In addition, the Copyright Office’s guidance accepts that works “containing” AI-generated material may
be copyrighted under some circumstances, such as “sufficiently creative” human arrangements or
modifications of AI-generated material or works that combine AI-generated and human-authored material.
The office states that authors may claim copyright protection only “for their own contributions” to such
works and must identify and disclaim the AI-generated parts of the work when applying to register their
copyright. In September 2023, for instance, the Copyright Office Review Board affirmed the office’s
refusal to register a copyright for an artwork that was generated by Midjourney and then modified in
various ways by the applicant, since the applicant did not disclaim the AI-generated material.
Who Owns the Copyright to Generative AI Outputs?
Assuming some AI-created works may be eligible for copyright protection, who owns that copyright? In
general, the Copyright Act vests ownership “initially in the author or authors of the work.” Given the lack
of judicial or Copyright Office decisions recognizing copyright in AI-created works to date, however, no
clear rule has emerged identifying who the “author or authors” of these works could be. Returning to the
photography analogy, the AI’s creator might be compared to the camera maker, while the AI user who
prompts the creation of a specific work might be compared to the photographer who uses that camera to
capture a specific image. On this view, the AI user would be considered the author and, therefore, the
initial copyright owner. The creative choices involved in coding and training the AI, on the other hand,
might give an AI’s creator a stronger claim to some form of authorship than the manufacturer of a camera.
Companies that provide AI software may attempt to allocate the respective ownership rights of the
company and its users via contract, such as the company’s terms of service. OpenAI’s Terms of Use, for
example, appear to assign any copyright to the user: “OpenAI hereby assigns to you all its right, title and
interest in and to Output.” A previous version, by contrast, purported to give OpenAI such rights. As one
scholar commented, OpenAI appears to “bypass most copyright questions through contract.”
Copyright Infringement by Generative AI
Generative AI also raises questions about copyright infringement. Commentators and courts have begun
to address whether generative AI programs may infringe copyright in existing works, either by making
copies of existing works to train the AI or by generating outputs that resemble those existing works.
Does the AI Training Process Infringe Copyright in Other Works?
AI systems are “trained” to create literary, visual, and other artistic works by exposing the program to
large amounts of data, which may include text, images, and other works downloaded from the internet.
This training process involves making digital copies of existing works. As the U.S. Patent and Trademark
Office has described, this process “will almost by definition involve the reproduction of entire works or
substantial portions thereof.” OpenAI, for example, acknowledges that its programs are trained on “large,
publicly available datasets that include copyrighted works” and that this process “involves first making
copies of the data to be analyzed” (although it now offers an option to remove images from training future
image generation models). Creating such copies without permission may infringe the copyright holders’
exclusive right to make reproductions of their work.
AI companies may argue that their training processes constitute fair use and are therefore noninfringing.
Whether or not copying constitutes fair use depends on four statutory factors under 17 U.S.C. § 107:
1. the purpose and character of the use, including whether such use is of a commercial
nature or is for nonprofit educational purposes;
2. the nature of the copyrighted work;
3. the amount and substantiality of the portion used in relation to the copyrighted work as a
whole; and
4. the effect of the use upon the potential market for or value of the copyrighted work.
Some stakeholders argue that the use of copyrighted works to train AI programs should be considered a
fair use under these factors. Regarding the first factor, OpenAI argues its purpose is “transformative” as
opposed to “expressive” because the training process creates “a useful generative AI system.” OpenAI
also contends that the third factor supports fair use because the copies are not made available to the public
but are used only to train the program. For support, OpenAI cites The Authors Guild, Inc. v. Google, Inc.,
in which the U.S. Court of Appeals for the Second Circuit held that Google’s copying of entire books to
create a searchable database that displayed excerpts of those books constituted fair use.
Regarding the fourth fair use factor, some applications of generative AI have raised concerns that training
AI programs on copyrighted works enables them to generate similar works that compete with the originals.
For example, an AI-generated song called “Heart on My Sleeve,” made to sound like the artists Drake and
The Weeknd, was heard millions of times on streaming services. Universal Music Group, which has deals
with both artists, argues that AI companies violate copyright by using these artists’ songs in training data.
OpenAI states that its visual art program DALL-E 3 “is designed to decline requests that ask for an image
in the style of a living artist.”
Plaintiffs have filed multiple lawsuits claiming the training process for AI programs infringed their
copyrights in written and visual works. These include lawsuits by the Authors Guild and authors Paul
Tremblay, Michael Chabon, Sarah Silverman, and others against OpenAI; separate lawsuits by Michael
Chabon, Sarah Silverman, and others against Meta Platforms; proposed class action lawsuits against
Alphabet Inc. and against Stability AI and Midjourney; and a lawsuit by Getty Images against Stability AI. The
Getty Images lawsuit, for instance, alleges that “Stability AI has copied at least 12 million copyrighted
images from Getty Images’ websites . . . in order to train its Stable Diffusion model.” This lawsuit appears
to dispute any characterization of fair use, arguing that Stable Diffusion is a commercial product,
weighing against fair use under the first statutory factor, and that the program undermines the market for
the original works, weighing against fair use under the fourth factor.
In September 2023, a U.S. district court ruled that a jury trial would be needed to determine whether it
was fair use for an AI company to copy case summaries from Westlaw, a legal research platform, to train
an AI program to quote pertinent passages from legal opinions in response to questions from a user. The
court found that, while the defendant’s use was “undoubtedly commercial,” a jury would need to resolve
factual disputes concerning whether the use was “transformative” (factor 1), to what extent the nature of
the plaintiff’s work favored fair use (factor 2), whether the defendant copied more than needed to train the
AI program (factor 3), and whether the AI program would constitute a “market substitute” for Westlaw
(factor 4). While the AI program at issue might not be considered “generative” AI, the same kinds of facts
might be relevant to a court’s fair-use analysis of making copies to train generative AI models.
Do AI Outputs Infringe Copyrights in Other Works?
AI programs might also infringe copyright by generating outputs that resemble existing works. Under
U.S. case law, copyright owners may be able to show that such outputs infringe their copyrights if the AI
program both (1) had access to their works and (2) created “substantially similar” outputs.
First, to establish copyright infringement, a plaintiff must prove the infringer “actually copied” the
underlying work. This is sometimes proven circumstantially by evidence that the infringer “had access to
the work.” For AI outputs, access might be shown by evidence that the AI program was trained using the
underlying work. For instance, the underlying work might be part of a publicly accessible internet site that
was downloaded or “scraped” to train the AI program.
Second, a plaintiff must prove the new work is “substantially similar” to the underlying work to establish
infringement. The substantial similarity test is difficult to define and varies across U.S. courts. Courts
have variously described the test as requiring, for example, that the works have “a substantially similar
total concept and feel” or “overall look and feel” or that “the ordinary reasonable person would fail to
differentiate between the two works.” Leading cases have also stated that this determination considers
both “the qualitative and quantitative significance of the copied portion in relation to the plaintiff’s work
as a whole.” For AI-generated outputs, no less than traditional works, the “substantial similarity” analysis
may require courts to make these kinds of comparisons between the AI output and the underlying work.
There is significant disagreement as to how likely it is that generative AI programs will copy existing
works in their outputs. OpenAI argues that “[w]ell-constructed AI systems generally do not regenerate, in
any nontrivial portion, unaltered data from any particular work in their training corpus.” Thus, OpenAI
states, infringement “is an unlikely accidental outcome.” By contrast, the Getty Images lawsuit alleges
that “Stable Diffusion at times produces images that are highly similar to and derivative of the Getty
Images.” One study has found “a significant amount of copying” in less than 2% of the images created by
Stable Diffusion, but the authors claimed that their methodology “likely underestimates the true rate” of
copying.
Two kinds of AI outputs may raise special concerns. First, some AI programs may be used to create works
involving existing fictional characters. These works may run a heightened risk of copyright infringement
insofar as characters sometimes enjoy copyright protection in and of themselves. Second, some AI
programs may be prompted to create artistic or literary works “in the style of” a particular artist or author,
although—as noted above—some AI programs may now be designed to “decline” such prompts. These
outputs are not necessarily infringing, as copyright law generally prohibits the copying of specific works
rather than an artist’s overall style. Regarding the AI-generated song “Heart on My Sleeve,” for instance,
one commentator notes that the imitation of Drake’s voice appears not to violate copyright law, although
it may raise concerns under state right-of-publicity laws. Nevertheless, some artists are concerned that AI
programs are uniquely capable of mass-producing works that copy their style, potentially undercutting the
value of their work. Plaintiffs in one lawsuit against Stable Diffusion, for example, claim that few human
artists can successfully mimic another artist’s style, whereas “AI Image Products do so with ease.”
A final question is who is (or should be) liable if generative AI outputs do infringe copyrights in existing
works. Under current doctrines, both the AI user and the AI company could potentially be liable. For
instance, even if a user were directly liable for infringement, the AI company could potentially face
liability under the doctrine of “vicarious infringement,” which applies to defendants who have “the right
and ability to supervise the infringing activity” and “a direct financial interest in such activities.” The
lawsuit against Stable Diffusion, for instance, claims that the defendant AI companies are vicariously
liable for copyright infringement. One complication of AI programs is that the user might not be aware
of—or have access to—a work that was copied in response to the user’s prompt. Under current law, this
may make it challenging to analyze whether the user is liable for copyright infringement.
Considerations for Congress
Congress may consider whether any of the copyright law questions raised by generative AI programs
require amendments to the Copyright Act or other legislation. Congress may, for example, consider
legislation clarifying whether AI-generated works are copyrightable, who should be considered the author
of such works, or when the process of training generative AI programs constitutes fair use. Given how
little opportunity the courts and Copyright Office have had to address these issues, Congress may adopt a
wait-and-see approach. As the courts gain experience handling cases involving generative AI, they may be
able to provide greater guidance and predictability in this area through judicial opinions. Based on the
outcomes of these cases, Congress may reassess whether legislative action is needed.
Author Information
Christopher T. Zirpoli
Legislative Attorney
Disclaimer
This document was prepared by the Congressional Research Service (CRS). CRS serves as nonpartisan shared staff
to congressional committees and Members of Congress. It operates solely at the behest of and under the direction of
Congress. Information in a CRS Report should not be relied upon for purposes other than public understanding of
information that has been provided by CRS to Members of Congress in connection with CRS’s institutional role.
CRS Reports, as a work of the United States Government, are not subject to copyright protection in the United
States. Any CRS Report may be reproduced and distributed in its entirety without permission from CRS. However,
as a CRS Report may include copyrighted images or material from a third party, you may need to obtain the
permission of the copyright holder if you wish to copy or otherwise use copyrighted material.
LSB10922 · VERSION 5 · UPDATED