Who owns the copyright on AI-assisted writing?

AI writing tools like ChatGPT and Claude get more popular every day. As they become more capable, they can take on more writing tasks and will only become more helpful to writers.

However, using generative AI for writing blog posts, papers and even entire novels raises the question: who owns the copyright to the final work? Can an author use ChatGPT to help write a book and still claim the copyright?

While no copyright case is ever straightforward, AI-assisted writing is definitely an area where the law hasn’t quite caught up yet. However, we can still use some existing legal opinions and precedent to make a “best guess” at how the law might evolve in the future.

It’s important to note here that I’m not a lawyer and this is definitely not legal advice. That said, I’ll make my best attempt to give you the current lay of the land when it comes to generative AI and copyright.

Of course, I’d be remiss if I didn’t thank Claude for help with some of the legal research and initial outlining on this piece.

How does copyright work?

Copyright law in the United States is designed to protect original works of authorship fixed in a “tangible medium of expression” (17 U.S.C. § 102). In other words, the work must be recorded or stored in some physical or digital format. Beyond being fixed, a work must also be original to be eligible for copyright protection. The threshold for originality is relatively low, requiring only a minimal degree of creativity, as established in Feist Publications v. Rural Telephone Service Co. (1991). Even so, the Court in that case held that an alphabetical directory of telephone numbers did not clear that bar and was not copyrightable.

However, not all elements of a work are protected by copyright. Copyright law distinguishes between protectable expression and unprotectable ideas, facts, and methods of operation (17 U.S.C. § 102(b)). While the specific way an idea is expressed can be protected, the underlying idea itself cannot be copyrighted.

When it comes to writing generated with AI, though, these lines get a bit blurry.

When is a piece “human” enough for a copyright?

James Grimmelmann (2016) argues that, for an AI-generated work to be copyrightable, it must involve creative input or arrangement from a human author. This means that merely providing prompts to an AI system, without further modification or curation of the output, may not meet the threshold of originality required for copyright protection. Separately, because the internal processes of AI systems are often opaque and complex, it can be difficult to prove whether an AI-generated output directly copies or derives from the copyrighted works used in its training data.

This is why it’s important for a human to be in the loop on any and all AI-generated writing. It leads to higher-quality output, and it may just protect your copyright too. This involvement may take many forms, such as generating multiple AI outputs and selecting the most promising ones, combining and rearranging AI-generated text in new ways, and substantially editing and refining the AI-assisted content.

For example, if a writer uses ChatGPT to generate a rough draft of a story based on a prompt, but then substantially edits, rewrites, and expands upon that draft, the final story would likely be a copyrightable work of human authorship. The AI system, in this case, is merely a tool that assists in the creative process, rather than the author of the work.

Copyright law protects original works of authorship that involve human creativity and expression. And while AI-generated content can be a valuable tool for creators, it is the human input and direction that ultimately determines whether the resulting work is eligible for copyright protection.

What about the input to LLMs?

When discussing the copyright implications of AI-generated works, it is important to consider the vast amounts of copyrighted material that may be used to train AI systems like ChatGPT and Claude. These training datasets often include books, articles, websites, and other creative works that are themselves protected by copyright law.

Does creating something new based on these works constitute a violation of that copyright or is it a fair use?

What is Fair Use?

Fair use is a legal doctrine that allows for the use of copyrighted material without permission from the copyright owner under certain circumstances. In the United States, fair use is codified in 17 U.S.C. § 107, which outlines four factors that must be considered when determining whether a particular use is fair:

  - The purpose and character of the use, including whether it is commercial or non-profit educational in nature;
  - The nature of the copyrighted work;
  - The amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
  - The effect of the use upon the potential market for or value of the copyrighted work.

As Benjamin L. W. Sobel (2017) suggests, the purpose of using copyrighted works in AI training is not to reproduce or replace the original works, but rather to create a new tool that can generate novel outputs. This transformative purpose, combined with the fact that AI systems do not typically store or reproduce verbatim copies of the training data, could weigh in favor of a finding of fair use.

Several media outlets, including The New York Times, are currently in a legal battle with OpenAI over this very issue: whether training tools like ChatGPT is fair use or copyright infringement.

What’s the current opinion?

Despite the complexities involved, many legal experts believe that the use of copyrighted works in AI training data is likely to be considered a fair use, given the transformative nature of the use and the societal benefits of advancing AI technology.

Disclosing and attributing AI usage

Given that the legal opinion on AI-assisted writing is still being written, let’s look at some of the ways you can protect yourself. One of those ways, and a great way to be up front with your readers, is to acknowledge that some of your writing process was assisted by AI. This is also helpful in addressing the concept of plagiarism when using these tools.

First, it is important to recognize that plagiarism is not a legal concept, but rather an ethical and professional one. Plagiarism involves presenting someone else’s work or ideas as your own without proper attribution. In the context of AI-generated content, plagiarism may occur if an author presents AI-generated text as their own work without acknowledging the role of the AI system.

While there are no universally accepted standards for attributing AI-generated content, some best practices are emerging.

You might consider a simple disclaimer or note indicating that the work was created with the assistance of an AI system. For example, an author might include a statement like: “This novel was written with the help of ChatGPT, an AI writing assistant developed by OpenAI.”

However, the expectations around these sorts of disclosures depend on the kind of writing you’re creating. In academic writing, for example, there may be a higher expectation of detailed attribution and transparency, while in creative writing, a general disclaimer may be sufficient.

Ultimately, the key to avoiding plagiarism and ensuring proper attribution when using AI writing tools is transparency and honesty. By being upfront about the role of AI in their creative process and providing appropriate attribution where necessary, authors can use these powerful tools in an ethical and responsible manner.

What do the AI vendors (OpenAI, Anthropic, Google) say?

Now that we’ve covered copyright law and possible plagiarism concerns, it’s also important to look at what the AI tools themselves say in their terms of service. This is because, whether or not your usage is a legal issue, these vendors can block you from using their tools if you violate their terms of use.

OpenAI (ChatGPT, GPT-4, etc.)

OpenAI doesn’t provide much information about ChatGPT specifically, instead focusing on copyright and terms of use of their API. It’s still an open question whether these same rules apply to ChatGPT.

In their Policy section, OpenAI notes that they “will not claim copyright over content generated by the API for you or your end users.”

More generally, they specify that “As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output.”

So, even outside of the API vs ChatGPT distinction, it would appear that as far as OpenAI is concerned, you own your output from any of these tools.

OpenAI also points to their Sharing and Publication Policy, which states the following.

Creators who wish to publish their first-party written content (e.g., a book, compendium of short stories) created in part with the OpenAI API are permitted to do so under the following conditions:

  - The published content is attributed to your name or company.
  - The role of AI in formulating the content is clearly disclosed in a way that no reader could possibly miss, and that a typical reader would find sufficiently easy to understand.
  - Topics of the content do not violate OpenAI’s Content Policy or Terms of Use, e.g., are not related to adult content, spam, hateful content, content that incites violence, or other uses that may cause social harm.
  - We kindly ask that you refrain from sharing outputs that may offend others.

These rules and policies are just more reasons to stick with the attribution guidelines we discussed earlier.

Anthropic (Claude)

Anthropic’s Commercial Terms of Service and Consumer Terms of Service are relatively straightforward, noting that, “As between you and Anthropic, and to the extent permitted by applicable law, you retain any right, title, and interest that you have in the Prompts you submit. Subject to your compliance with our Terms, we assign to you all of our right, title, and interest—if any—in Outputs.” This means that you own any content generated through Anthropic services, whether you’re a paying customer or not, as long as you comply with their terms.

Interestingly, Anthropic will also help defend users who are sued for copyright infringement, unless the user “knows or reasonably should know that they are infringing copyright”.

Google (Gemini)

It would appear that Google’s Terms are the simplest of all three major providers, specifying that “As required by the API Terms, you’ll comply with applicable law in using generated content, which may require the provision of attribution to your users when returned as part of an API call.”

This is yet another reason that attribution is key when using AI content: it keeps you compliant legally and in the right ethically, and it also helps you comply with many vendors’ terms of service.

Practical tips for writing with AI

I’m the first to suggest that using AI tools in your writing process can be a very powerful way to write more quickly and more efficiently. However, there are a few things you should be sure to do to stay on the correct side of the legal (copyright) and ethical (plagiarism) lines:

  - Use original, unique prompts: When using AI writing tools like ChatGPT or Claude, it is important to start with original, unique prompts that reflect your own creative ideas. Avoid using prompts that are too similar to existing works, as this may increase the risk of the AI system generating content that is substantially similar to those works. Instead, focus on using these tools to build on your own ideas.
  - Substantially modify and curate AI outputs: While AI writing tools can generate impressive and coherent text, it is important to remember that this text is ultimately the result of an algorithm, not a human author. To create a truly original work, you should substantially modify, edit, and curate the AI-generated content. This may involve rewriting sections, adding new material, or rearranging the structure of the text to better fit your creative vision.
  - Maintain records of your creative process: As we talked about earlier, the copyright status of AI-generated works depends largely on the level of human creativity and input involved. To help show the human authorship of a work that incorporates AI-generated content, keep a record of your creative process. This might include notes, outlines, drafts, and other documentation that shows how you developed and refined the work over time.
  - Attribute and disclose AI usage: You should be transparent about the use of AI tools in your creative process. This might involve a general disclaimer indicating that the work was created with the assistance of an AI system, or a more specific attribution for particular passages or sections that rely heavily on AI-generated content. By being upfront about the role of AI in your work, you can build trust with your audience and avoid accusations of plagiarism.
  - Consult with legal experts for specific situations: If you have other questions or are looking for more specific advice, please consult with a lawyer, preferably one who has experience in the AI field or has dealt with copyright cases before.

Go forth and create

AI generation tools are only going to get more powerful. Just as they moved from typewriters to modern word processors, writers would be remiss not to try these tools in their writing process. While there are definite outstanding legal questions, staying involved in the writing process and using AI as a tool, rather than as a completely outsourced digital creator whose work you put your name on, is the best way to avoid legal issues.

With that in mind, I can’t wait to see what you write with the help of AI tools like ChatGPT and Claude! If you’re up to it, send me a copy of your first draft or finished product at keanan@floorboardai.com.