What you need to know about copyright “small claims”

Today the U.S. Copyright Office released the Copyright Claims Board website. The site provides information about the copyright small claims process and will eventually include an electronic case management system once the Office begins to accept claims. This blog post explains what the copyright small claims process is, what UC Berkeley community members should know if they receive a copyright small claims notice, and where they can get help or additional information.

— — — — —

In 2020, Congress passed a law called the “Copyright Alternative in Small-Claims Enforcement Act of 2020,” known as the “CASE Act.” The CASE Act mandated the formation of the Copyright Claims Board (“CCB”), a tribunal operating through the U.S. Copyright Office instead of the federal judicial branch, for the purpose of deciding “small claims” copyright infringement actions via a quicker, less expensive process—that is, without all of the procedural requirements of a normal federal court case. Damages are capped at $30,000 for CCB cases.

This page is for UC Berkeley faculty, staff, students, and scholars who might one day find themselves in receipt of a notice that a CCB action has been filed against them. The University of California also has a systemwide information page for UC-affiliated scholars, students, and employees.

Please note that the U.S. Copyright Office is still creating the rules that implement this new law, so the information on this page will evolve. And as with all information on this Office of Scholarly Communication Services website, our office cannot provide you with legal advice. However, we can help you understand how the law works. If you have further questions, contact us at schol-comm@berkeley.edu.

If you receive a claim notice

What will a notice look like?

If you live in California, then a genuine CCB claim notice is required to be “served” on you either in person (i.e., handed to you) or by U.S. mail. If you have received only an email, you should be wary of its contents because email is not considered valid “service of process” in California.

A genuine CCB case notice will include a docket number and other information yet to be determined. The notice will have a link to the CCB website, where you can enter the docket number on your notice, view information about the particular claim filed against you, and take various actions.

What does it mean?

A claim filed against you in the CCB means that a purported copyright owner is asserting that you have infringed their copyright through something you have uploaded, reproduced, published, created, distributed, performed, or displayed.

The notice you receive signifies that the claimant has alleged copyright infringement, but the notice does not mean you have actually infringed or that the CCB will ultimately determine you have infringed.

Indeed, there are many reasons why your use of a copyrighted work may not be an infringement. For instance, there are key exceptions to copyright law that support teaching, scholarship, and research—most notably, fair use. These exceptions provide complete defenses to claims of infringement or, in some instances, permit a significant reduction of damages. Further, not everything is actually protected by copyright. Claimants may believe they hold copyright in materials that are not subject to copyright (e.g., because the materials reflect only facts or ideas) or are no longer protected by copyright (e.g., because the copyright in the materials has expired). Claimants may also believe that they hold copyright to materials for which copyright is actually held by a third party.

If you believe one of these situations applies to you—that is, that your use of the material is protected by an exception or that the allegations in the claim are not valid—you may wish to dispute the claim or opt out of the CCB proceeding entirely. We explain your options below. Regardless, we recommend you seek legal counsel as soon as possible after receipt of a CCB case notice.

What are your options?

If you receive a properly served notice, do not ignore it. If you do nothing, the case will proceed in the CCB, and a default judgment can be entered against you. This means that the CCB can enter a judgment holding you responsible for all the damages claimed in the notice (up to $30,000), regardless of whether the assertions are true or whether you could have raised any defenses.

To avoid a default judgment, you will need to respond in the time prescribed by the notice. You can choose to respond in one of two ways:

  • Proceed within the CCB tribunal. If you proceed, the case will be heard by the CCB. The CCB predicts that most cases will be handled completely online, so you will not need to travel to Washington, D.C. (where the U.S. Copyright Office is physically located). You will be bound by the CCB’s decision. If the claimant wins, you may have to pay up to $15,000 for each infringed work, with a maximum cap of $30,000. CCB determinations are final. There are only limited circumstances (such as fraud, corruption, and misrepresentation) in which a CCB determination can be reviewed by a federal court or the Copyright Office.
  • Opt out of the CCB proceeding. It’s important to understand that, if you opt out, the copyright claimant cannot restart the same claim against you in front of the CCB. So, if you opt out of the CCB, the claimant can either stop pursuing the matter entirely or decide to file suit against you in federal court (assuming they meet all of the federal court filing requirements). Federal court is more expensive and complex than the CCB’s small claims process, so many small claimants may not want to incur the expense or may feel that their allegations will not survive scrutiny in federal court. Also, UC employees likely have broader protections in federal court than in the CCB, so a timely opt-out may be a good option.
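The damages figures in the first option can be read as simple arithmetic: a per-work ceiling combined with an overall ceiling. The sketch below is only an illustration of that upper bound, using the two dollar figures stated above; the function name `max_ccb_exposure` is hypothetical, and the amount actually awarded in any case is for the CCB to determine.

```python
# Figures taken from the text above; everything else is illustrative.
PER_WORK_CAP = 15_000  # maximum per infringed work
TOTAL_CAP = 30_000     # maximum per CCB case


def max_ccb_exposure(works_claimed: int) -> int:
    """Return the upper bound on damages for one CCB case (hypothetical helper)."""
    if works_claimed < 0:
        raise ValueError("works_claimed must be non-negative")
    # The per-work cap applies to each work, but the total never exceeds the case cap.
    return min(works_claimed * PER_WORK_CAP, TOTAL_CAP)


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(f"{n} work(s) claimed: at most ${max_ccb_exposure(n):,}")
```

In other words, once a claim covers two or more works, the overall cap rather than the per-work cap sets the ceiling.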

If you decide to opt out, you must mail the paper opt-out form provided with your notice, or complete an online opt-out form on the CCB website, within 60 days of service. Note that in California, additional time may be added to the deadline for your response if service of the notice to you was made by mail, pursuant to California rules for service of process.
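As a rough illustration of the 60-day window, the sketch below adds 60 calendar days to a date of service. The service date is made up, and the `extra_days` parameter is a hypothetical placeholder for any additional time that might apply to service by mail; the text above does not say how much time that is, so it defaults to zero. How days are counted in practice is also an assumption here, so rely on the deadline stated in the notice itself.

```python
from datetime import date, timedelta

OPT_OUT_WINDOW_DAYS = 60  # from the text above


def opt_out_deadline(service_date: date, extra_days: int = 0) -> date:
    """Estimate the last day to opt out (hypothetical helper, simple calendar-day count)."""
    if extra_days < 0:
        raise ValueError("extra_days must be non-negative")
    return service_date + timedelta(days=OPT_OUT_WINDOW_DAYS + extra_days)


if __name__ == "__main__":
    served = date(2022, 7, 1)  # made-up example date
    print("Served:", served.isoformat())
    print("Estimated opt-out deadline:", opt_out_deadline(served).isoformat())
```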

Note that if you decide to opt out, your decision applies only in response to that particular claim you received. As an individual (as opposed to certain organizations), you cannot opt out prospectively from all future CCB claims.

Where can you get help or more information?

If you’re a UC Berkeley student, staff, or faculty member, and the claim is related to what you do at UC, contact the UC Berkeley Office of Legal Affairs or UC’s Office of General Counsel promptly.

The UC Berkeley Office of Scholarly Communication Services can also answer questions about how the law works, but cannot dispense legal advice to you. You can contact us with questions at schol-comm@berkeley.edu.

The U.S. Copyright Office provides additional information on its Copyright Claims Board Frequently Asked Questions page.


Upcoming workshop reminder: Copyright & Fair Use for Digital Projects

We just wrapped up three publishing workshops last week, but there’s more in store. Check out the details below and sign up for the next one offered by the Office of Scholarly Communication Services. See you there!

Copyright and Fair Use for Digital Projects

November 10, 2021
11:00am–12:30pm
RSVP

This online training will help you navigate the copyright, fair use, and usage rights of including third-party content in your digital project. Whether you seek to embed video from other sources for analysis, post material you scanned from a visit to the archives, add images, upload documents, or more, understanding the basics of copyright and discovering a workflow for answering copyright-related digital scholarship questions will make you more confident in your project. We will also provide an overview of your intellectual property rights as a creator and ways to license your own work.


Reminder: 3 grad student publishing workshops coming up soon

The Office of Scholarly Communication Services is again offering a slate of workshops aimed to help graduate students understand copyright in the context of their dissertation or thesis, demystify the book publishing process, and manage their scholarly profile. Click the links below to sign up and get the Zoom details.

Copyright and Your Dissertation

October 25, 2021
1:00pm–2:30pm
RSVP

This workshop will provide you with practical guidance for navigating copyright questions and other legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use our tips and workflow to figure out what you can use, what rights you have as an author, and what it means to share your dissertation online.

From Dissertation to Book: Navigating the Publication Process

October 26, 2021
1:00pm–2:30pm
RSVP

Hear from a panel of experts—an acquisitions editor, a first-time book author, and an author rights expert—about the process of turning your dissertation into a book. You’ll come away from this panel discussion with practical advice about revising your dissertation, writing a book proposal, approaching editors, signing your first contract, and navigating the peer review and publication process.

Managing and Maximizing Your Scholarly Impact

October 28, 2021
1:00pm–2:30pm
RSVP

This workshop will provide you with practical strategies and tips for promoting your scholarship, increasing your citations, and monitoring your success. You’ll also learn how to understand metrics, use scholarly networking tools, evaluate journals and publishing options, and take advantage of funding opportunities for Open Access scholarship.


Back in action with your scholarship

Photo by Chris Montgomery on Unsplash

As the school year restarts in Berkeley, we know the pandemic is not over. But the Office of Scholarly Communication Services is here to help UC Berkeley faculty, students, and staff understand copyright and scholarly publishing with online resources, Zoom workshops, and virtual consultations.

If you’re interested in a recap of our progress and achievements over the last year, check out our 2020-21 annual report.

Here’s what’s coming up this semester.

Upcoming Workshops

Publish Digital Books and Open Educational Resources with Pressbooks

September 14, 2021
11:00am–12:30pm
RSVP

If you’re looking to self-publish work of any length and want an easy-to-use tool that offers a high degree of customization, allows flexibility with publishing formats (EPUB, PDF), and provides web-hosting options, Pressbooks may be great for you. Pressbooks is often the tool of choice for academics creating digital books, open textbooks, and open educational resources, since you can license your materials for reuse however you desire. Learn why and how to use Pressbooks for publishing your original books or course materials. You’ll leave the workshop with a project already under way! Sign up at the link below and the Zoom login details will be emailed to you.

Copyright and Your Dissertation

October 25, 2021
1:00pm–2:30pm
RSVP

This workshop will provide you with practical guidance for navigating copyright questions and other legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use our tips and workflow to figure out what you can use, what rights you have as an author, and what it means to share your dissertation online.

From Dissertation to Book: Navigating the Publication Process

October 26, 2021
1:00pm–2:30pm
RSVP

Hear from a panel of experts—an acquisitions editor, a first-time book author, and an author rights expert—about the process of turning your dissertation into a book. You’ll come away from this panel discussion with practical advice about revising your dissertation, writing a book proposal, approaching editors, signing your first contract, and navigating the peer review and publication process.

Managing and Maximizing Your Scholarly Impact

October 28, 2021
1:00pm–2:30pm
RSVP

This workshop will provide you with practical strategies and tips for promoting your scholarship, increasing your citations, and monitoring your success. You’ll also learn how to understand metrics, use scholarly networking tools, evaluate journals and publishing options, and take advantage of funding opportunities for Open Access scholarship.

Copyright and Fair Use for Digital Projects

November 10, 2021
11:00am–12:30pm
RSVP

This training will help you navigate the copyright, fair use, and usage rights of including third-party content in your digital project. Whether you seek to embed video from other sources for analysis, post material you scanned from a visit to the archives, add images, upload documents, or more, understanding the basics of copyright and discovering a workflow for answering copyright-related digital scholarship questions will make you more confident in your project. We will also provide an overview of your intellectual property rights as a creator and ways to license your own work.

Other ways we can help

We’re here to help answer a variety of questions you might have on intellectual property, digital publishing, and information policy.

Want help or more information? Send us an email at schol-comm@berkeley.edu. We can provide individualized support and personal consultations, online class instruction, and customized support and training for departments.


UC’s new copyright ownership policy: What does it mean for campus?

The University of California has released an updated copyright ownership policy (and accompanying FAQ). This is the policy across all of the UC campuses that governs who (as between the University and the author) owns copyright in scholarly and aesthetic works created by faculty, staff, and students. The copyright ownership policy was last revised in 1992. The Library submitted comments on the draft policy in December 2019, and many of our proposed suggestions and clarifications have been included.

But what does the updated copyright policy mean for different groups and individuals around campus? This post will break down some of the changes in the new policy. 

What types of works are eligible for copyright ownership? 

Quick Answer: Any copyrightable works created by Academic Authors within the scope of their employment as part of their teaching, research, or scholarship are eligible for copyright ownership. This includes (but is not limited to) journal articles, textbooks, course materials, and more. And now, Academic Authors are eligible to own copyright in software they create. 

Read More: The revised copyright ownership policy includes a new definition: Scholarly & Aesthetic Works. These are “copyrightable works authored by Academic Authors within the scope of their employment as part of or in connection with their teaching, research, or scholarship.” Academic Authors hold copyright in these works when they are created without direct assignment or supervision by the University. The policy provides a non-exhaustive list of works that will be considered Scholarly & Aesthetic Works, such as journal articles, books, case examples, course materials, and visual works of art. Importantly, the updated policy includes software as a category in which Academic Authors may hold copyright (although the UC continues to own the patent rights created in software). 

When do authors own their copyrights? 

Quick Answer: Now, any employee who has a “general obligation to create copyrightable scholarly or aesthetic works” gets to keep their copyright in those works. Note that students always hold copyright in anything they create while at the UC, unless certain conditions apply (e.g., the student created a copyrightable work in the scope of their employment, it was part of a sponsored grant project, etc.).

Read More: Under U.S. copyright law, the copyright in a work prepared by an employee in the course of their employment typically resides with the employer. The revised copyright ownership policy continues the practice whereby the University transfers any copyrights it may own in Scholarly & Aesthetic Works to the Academic Authors who prepared those works.

The updated copyright ownership policy expands the definition of “Academic Authors” eligible to own copyrights. Now, Academic Authors means “employees who have a general obligation to create copyrightable scholarly or aesthetic works.” Sometimes it’s relatively clear which employees have a general obligation to create copyrightable works, such as Senate faculty when they conduct and publish research. And represented librarians bear a “general obligation” to produce scholarly works because, under their labor contract and review criteria, creating scholarship is a factor affecting career advancement. 

But what about other campus employees whose job descriptions are less clear and who have no clarifying labor contract? To help you figure out whether you have a “general obligation” to create scholarly works, you can consult copyright ownership policy FAQ #9, which basically says that there is likely some document in writing—whether a contract or written job description or documentation—that would suggest that creating scholarly works is part of your job. In instances where it’s not clear from that documentation, employees may want to consult with their supervisors to discuss the matter.

One other note on collective bargaining agreements. The revised copyright ownership policy contains a provision that if there’s a conflict between it and a union agreement governing copyright ownership by represented employees, then the union agreement prevails.

Note that students (as opposed to employees) always hold copyright in anything they create while studying at the UC, unless certain conditions apply (e.g., the student created a work in the scope of their employment, it was part of a sponsored grant project, etc.). The new copyright ownership policy also clarifies that works created by graduate students (such as theses, dissertations, etc.) are considered “Student Works,” so under most circumstances the copyright resides with the graduate student.

When does the University (rather than the author) own copyrights? 

Quick Answer: The University holds the copyright for works created by employees who do not have a “general obligation” to create scholarly works, for most grant-sponsored or commissioned works, and for works that require “Significant University Resources” in the creation of the work.

Read More: There are several common situations in which the University owns the copyright in works created by employees:

  1. When the work is created by an employee who does not have a “general obligation to create scholarly and aesthetic works”: This remains largely unchanged from the previous policy.
  2. When the work is a sponsored or commissioned work: Sponsored or commissioned works are typically situations in which the University has entered into a separate written agreement with the employee, such as for certain grants or special projects. But the definition of “sponsored works” has been clarified in the new policy so that it’s clear that Academic Authors retain copyright in scholarly articles or other works created based on the underlying findings or deliverables of the sponsored project.
  3. When the work was created with significant university resources: Under the old policy, the University could claim copyright ownership to the works produced by Academic Authors if the author leveraged any “University Resources” while creating the material. The revised copyright ownership policy further limits the cases where this would apply. Now, the University may not claim copyright ownership unless “Significant University Resources” were used to facilitate creation of the work. For something to be considered significant, it must be “beyond the usual support provided by the University and generally available to similarly situated Academic Authors.” And the policy clarifies that support such as customary administrative assistance, library facilities, office space, personal computers, network access, and salary are considered usual support and will not be deemed “significant.” FAQs #14-17 provide more information on “Significant University Resources.” 
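The three situations above can be read as a short series of checks. The sketch below is a deliberately simplified illustration of that reading, not a statement of the policy itself: the `Work` fields and the function name are hypothetical, and the sketch ignores the nuances described in this post (for example, scholarly articles based on a sponsored project’s findings, student works, and collective bargaining agreements that take precedence).

```python
from dataclasses import dataclass


@dataclass
class Work:
    """Hypothetical summary of the facts about one employee-created work."""
    general_obligation: bool                # creator has a "general obligation" to create scholarly works
    sponsored_or_commissioned: bool         # work is itself a sponsored or commissioned work
    significant_university_resources: bool  # support beyond the usual, generally available support


def likely_copyright_owner(work: Work) -> str:
    """Return a rough first guess at who owns copyright (simplified illustration only)."""
    if not work.general_obligation:
        return "University"       # situation 1
    if work.sponsored_or_commissioned:
        return "University"       # situation 2
    if work.significant_university_resources:
        return "University"       # situation 3
    return "Academic Author"      # none of the three situations applies


if __name__ == "__main__":
    article = Work(general_obligation=True,
                   sponsored_or_commissioned=False,
                   significant_university_resources=False)
    print(likely_copyright_owner(article))
```

A result from a sketch like this is only a starting point for reading the policy and its FAQ.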

In sum, the new copyright ownership policy provides some useful clarity for authors to understand when they hold the copyright. And in a few important ways, it expands and strengthens the ability of these authors to hold copyright in works they produce at the UC.

If you have any questions, please contact schol-comm@berkeley.edu.


Upcoming workshop on how to share and publish data

On December 1, 2020, from 12:30pm–2:00pm, the Library is teaming up with the Research Data Management Program to host a workshop, “How to Share and Publish Data: Resources, Law, and Policy.” Sign up here.

Are you unsure about how you can use or reuse other people’s data in your teaching or research, and what the terms and conditions are? Do you want to share your data with other researchers or license it for reuse but are wondering how and if that’s allowed? Do you have questions about university or granting agency data ownership and sharing policies, rights, and obligations? We will provide clear guidance on all of these questions and more in this interactive webinar on the ins-and-outs of data sharing and publishing.

Join the Library’s Office of Scholarly Communication Services and the Research Data Management Program as we:

  • Explore venues and platforms for sharing and publishing data
  • Unpack the terms of contracts and licenses affecting data reuse, sharing, and publishing
  • Help you understand how copyright does (and does not) affect what you can do with the data you create or wish to use from other people
  • Consider how to license your data for maximum downstream impact and reuse
  • Demystify data ownership and publishing rights and obligations under university and grant policies

Intended audiences include faculty, grad students, post-docs, instructors, and academic support staff, but anyone interested is welcome to attend.


Fall workshops on copyright and publishing

Photo by Simon Abrams on Unsplash

Welcome back to a strange semester. While we can’t meet up together on campus, the Office of Scholarly Communication Services will continue to offer a full slate of online workshops to help students and early career researchers confidently steer their way through the waters of copyright and publishing. Here is what’s in store for the coming few months.  

Upcoming Workshops

Publish Digital Books and Open Educational Resources with Pressbooks
September 15, 2020
10:00–11:30am

If you’re looking to self-publish work of any length and want an easy-to-use tool that offers a high degree of customization, allows flexibility with publishing formats (EPUB, MOBI, PDF), and provides web-hosting options, Pressbooks may be great for you. Pressbooks is often the tool of choice for academics creating digital books, open textbooks, and open educational resources, since you can license your materials for reuse however you desire. Learn why and how to use Pressbooks for publishing your original books or course materials. You’ll leave the workshop with a project already under way! Sign up at the link below and the Zoom login details will be emailed to you.

Copyright and Your Dissertation
October 19, 2020
1:00–2:30pm

This workshop will provide you with practical guidance for navigating copyright questions and other legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use our tips and workflow to figure out what you can use, what rights you have as an author, and what it means to share your dissertation online.

Managing and Maximizing Your Scholarly Impact
October 20, 2020
1:00–2:30pm

This workshop will provide you with practical strategies and tips for promoting your scholarship, increasing your citations, and monitoring your success. You’ll also learn how to understand metrics, use scholarly networking tools, evaluate journals and publishing options, and take advantage of funding opportunities for Open Access scholarship.

From Dissertation to Book: Navigating the Publication Process
October 22, 2020
1:00–2:30pm

Hear from a panel of experts—an acquisitions editor, a first-time book author, and an author rights expert—about the process of turning your dissertation into a book. You’ll come away from this panel discussion with practical advice about revising your dissertation, writing a book proposal, approaching editors, signing your first contract, and navigating the peer review and publication process.

Copyright and Fair Use for Digital Projects
November 10, 2020
11:00am–12:30pm

This training will help you navigate the copyright, fair use, and usage rights of including third-party content in your digital project. Whether you seek to embed video from other sources for analysis, post material you scanned from a visit to the archives, add images, upload documents, or more, understanding the basics of copyright and discovering a workflow for answering copyright-related digital scholarship questions will make you more confident in your publication. We will also provide an overview of your intellectual property rights as a creator and ways to license your own work.

 

Archived Recordings

We hosted a few workshops over the summer that might be of interest to you. 

Copyright in Course Design & Digital Learning Environments
Video Recording
Slides

If you’re wondering what you can or can’t upload and distribute in your online courses, we’re here to help with answers and best practices. We will cover copyright, fair use, and contractual issues that emerge in online course design. The goal of the webinar is for attendees to gain a deeper understanding of the legal considerations in creating digital courses, and to feel more confident in their content design decisions to support student learning. This webinar is appropriate both for instructors and staff supporting online courses.

Can We Digitize This? Understanding Law, Policy, & Ethics in Bringing our Collections to Digital Life
Video Recording
Slides

As part of the Digital Lifecycle Program, the UC Berkeley Library aims to digitize 200 million items from its special collections (rare books, manuscripts, photographs, archives, and ephemera) for the world to discover and use. But before we can digitize and publish them online for worldwide access, we have to sort out legal and ethical questions. We’ve created and released “responsible access workflows” that will benefit not only our Library’s digitization efforts, but also those of cultural heritage institutions such as museums, archives, and libraries throughout the nation.

Building Legal Literacies for Text Data Mining Institute
Video Recordings
Transcripts + Slides

In June, we welcomed 32 digital humanities (DH) researchers and professionals to the Building Legal Literacies for Text Data Mining (Building LLTDM) Institute. Our goal was to empower DH researchers, librarians, and professional staff to confidently navigate law, policy, ethics, and risk within digital humanities text data mining (TDM) projects—so they can more easily engage in this type of research and contribute to the further advancement of knowledge.

Other ways we can help

In addition to the workshops, we’re here to help answer a variety of questions you might have on intellectual property, digital publishing, and information policy.  

Want help or more information? Send us an email. We can provide individualized support and personal consultations, online class instruction, presentations and workshops for small or large groups & classes, and customized support and training for departments and disciplines.


Workshop: Copyright in Course Design and Digital Learning Environments

The Library’s Office of Scholarly Communication Services is hosting an online workshop on July 9, from 10:00 to 11:30, on copyright, fair use, and contractual issues that arise in online course development.

Copyright in Course Design and Digital Learning Environments

If you’re wondering what you can or can’t upload and distribute in your online courses, we’re here to help with answers and best practices. We will cover copyright, fair use, and contractual issues that emerge in online course design. The goal of the webinar is for attendees to gain a deeper understanding of the legal considerations in creating digital courses, and to feel more confident in their content design decisions to support student learning. This webinar is appropriate both for instructors and staff supporting online courses.

Publish your scholarship like a pro!

Photograph by Women of Color in Tech, CC-BY 2.0.

We’re more than a month into the fall semester, and if you’re a graduate student or postdoc you’ve probably been thinking about some of the milestones on your horizon, from filing your thesis or dissertation to pitching your first book project or looking for a job.

While we can’t write your dissertation or submit your job application for you, the Library can help in other ways! We are collaborating with GradPro to offer a series of professional development workshops for grad students, postdocs, and other early career scholars to guide you through important decisions and tasks in the research and publishing process, from preparing your dissertation to building a global audience for your work.

  • October 22: Copyright and Your Dissertation
  • October 23: From Dissertation to Book: Navigating the Publication Process
  • October 25: Managing and Maximizing Your Scholarly Impact

These sessions are focused on helping early career researchers develop real-world scholarly publishing skills and apply this expertise to a more open, networked, and interdisciplinary publishing environment.

These workshops are also taking place during Open Access Week 2019, an annual global effort to bring attention to Open Access around the world and highlight how the free, immediate, online availability of scholarship can remove barriers to information, support emerging scholarship, and foster the spread of knowledge and innovation.

Below is the list of next week’s workshop offerings. Join us for one workshop or all three! Each session will take place at the Graduate Professional Development Center, 309 Sproul Hall. Please RSVP at the links below.

Light refreshments will be served at all workshops.

If you have any questions about these workshops, please get in touch with schol-comm@berkeley.edu. And if you can’t make it to a workshop but still need help with your publishing, we are always here for you!

Copyright and Your Dissertation

Workshop | October 22 | 1-2:30 p.m. | 309 Sproul Hall

This workshop will provide you with a practical workflow for navigating copyright questions and legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use this workflow to figure out what you can use, what rights you have, and what it means to share your dissertation online.

RSVP (Copyright)

 

From Dissertation to Book: Navigating the Publication Process

Panel Discussion | October 23 | 3-4:30 p.m. | 309 Sproul Hall

Hear from a panel of experts – an acquisitions editor, a first-time book author, and an author rights expert – about the process of turning your dissertation into a book. You’ll come away from this panel discussion with practical advice about revising your dissertation, writing a book proposal, approaching editors, signing your first contract, and navigating the peer review and publication process.

RSVP (Book)

 

Managing and Maximizing Your Scholarly Impact

Workshop | October 25 | 1-2:30 p.m. | 309 Sproul Hall

This workshop will provide you with practical strategies and tips for promoting your scholarship, increasing your citations, and monitoring your success. You’ll also learn how to understand metrics, use scholarly networking tools, evaluate journals and publishing options, and take advantage of funding opportunities for Open Access scholarship.

RSVP (Impact)


Team Awarded Grant to Help Digital Humanities Scholars Navigate Legal Issues of Text Data Mining

We are thrilled to share that the National Endowment for the Humanities (NEH) has awarded a $165,000 grant to a UC Berkeley-led team of legal experts, librarians, and scholars who will help humanities researchers and staff navigate complex legal questions in cutting-edge digital research.

What is this grant all about?

If you were to crack open some popular English-language novels written in the 1850s (say, ones from Brontë, Hawthorne, Dickens, and Melville), you would find they describe men and women in very different terms. While a male character might be said to “get” something, a female character is more likely to have “felt” it. Whereas the word “mind” might be used when describing a man, the word “heart” is more likely to be used about a woman. Yet, as the 19th century became the 20th, these descriptive differences between genders actually diminished. How do we know all this? We confess we have not actually read every novel ever written between the 19th and 21st centuries (though we’d love to envision a world in which we could). Instead, we can make this assertion because researchers (including David Bamman, of UC Berkeley’s School of Information) used automated techniques to extract information from the novels, and analyzed these word usage trends at scale. They crafted algorithms to turn the language of those novels into data about the novels.
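To give a feel for what “turning language into data” can mean, here is a toy sketch in Python. It is not the researchers’ actual method: the function name, the sample sentences, and the simple rule of counting whichever word directly follows “he” or “she” are all assumptions made up for illustration. Real studies rely on far larger corpora and much more careful linguistic analysis.

```python
import re
from collections import Counter


def words_after_pronouns(text):
    """Count the word that immediately follows 'he' or 'she' (hypothetical helper)."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {"he": Counter(), "she": Counter()}
    # Walk through adjacent word pairs and tally the word after each pronoun.
    for current, following in zip(tokens, tokens[1:]):
        if current in counts:
            counts[current][following] += 1
    return counts


# Invented sample text, standing in for the full text of many novels.
sample = "He got the letter. She felt a chill. He got up, and she felt the room grow cold."
counts = words_after_pronouns(sample)
print(counts["he"].most_common(1))   # [('got', 2)]
print(counts["she"].most_common(1))  # [('felt', 2)]
```

Run over thousands of novels instead of two sentences, tallies like these become a data set that can be compared across decades.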

In fields of inquiry like the digital humanities, the application of such automated techniques and methods for identifying, extracting, and analyzing patterns, trends, and relationships across large volumes of unstructured or thinly structured digital content is called “text data mining.” (You may also see it referred to as “text and data mining” or “computational text analysis.”) Text data mining provides humanists and social scientists with invaluable frameworks for sifting, organizing, and analyzing vast amounts of material. For instance, these methods make it possible to:

The Problem

Until now, humanities researchers conducting text data mining have had to navigate a thicket of legal issues without much guidance or assistance. For instance, imagine the researchers need to scrape content about Egyptian artifacts from online sites or databases, or download videos about Egyptian tomb excavations, in order to conduct their automated analysis. And then imagine the researchers also want to share these content-rich data sets with others to encourage research reproducibility or enable other researchers to query the data sets with new questions. This kind of work can raise issues of copyright, contract, and privacy law, not to mention ethics if there are issues of, say, indigenous knowledge or cultural heritage materials plausibly at risk. Indeed, in a recent study of humanities scholars’ text analysis needs, participants noted that access to and use of copyright-protected texts was a “frequent obstacle” in their ability to select appropriate texts for text data mining.

Potential legal hurdles do not just deter text data mining research; they also bias it toward particular topics and sources of data. In response to confusion over copyright, website terms of use, and other perceived legal roadblocks, some digital humanities researchers have gravitated to low-friction research questions and texts to avoid decision-making about rights-protected data. They use texts that have entered into the public domain or use materials that have been flexibly licensed through initiatives such as Creative Commons or Open Data Commons. When researchers limit themselves to such sources, their research is inevitably skewed, leaving important questions unanswered and rendering the resulting findings less broadly applicable. A growing body of research also demonstrates how race, gender, and other biases found in openly available texts have contributed to and exacerbated bias in developing artificial intelligence tools.

The Solution

The good news is that the NEH has agreed to support an Institute for Advanced Topics in the Digital Humanities to help key stakeholders learn to better navigate legal issues in text data mining. Thanks to the NEH’s $165,000 grant, Rachael Samberg of UC Berkeley Library’s Office of Scholarly Communication Services will be leading a national team (identified below) from more than a dozen institutions and organizations to teach humanities researchers, librarians, and research staff how to confidently navigate the major legal issues that arise in text data mining research.

Our institute is aptly called Building Legal Literacies for Text Data Mining (Building LLTDM), and will run from June 23-26, 2020 in Berkeley, California. Institute instructors are legal experts, humanities scholars, and librarians immersed in text data mining research services, who will co-lead experiential meeting sessions empowering participants to put the curriculum’s concepts into action.

In October, we will issue a call for participants, who will receive stipends to support their attendance. We will also be publishing all of our training materials in an openly available online book for researchers and librarians around the globe to help build academic communities that extend these skills.

Building LLTDM team member Matthew Sag, a law professor at Loyola University Chicago School of Law and leading expert on copyright issues in the digital humanities, said he is “excited to have the chance to help the next generation of text data mining researchers open up new horizons in knowledge discovery. We have learned so much in the past ten years working on HathiTrust [a text-minable digital library] and related issues. I’m looking forward to sharing that knowledge and learning from others in the text data mining community.” 

Team member Brandon Butler, a copyright lawyer and library policy expert at the University of Virginia, said, “In my experience there’s a lot of interest in these research methods among graduate students and early-career scholars, a population that may not feel empowered to engage in ‘risky’ research. I’ve also seen that digital humanities practitioners have a strong commitment to equity, and they are working to build technical literacies outside the walls of elite institutions. Building legal literacies helps ease the burden of uncertainty and smooth the way toward wider, more equitable engagement with these research methods.”

Kyle K. Courtney of Harvard University serves as Copyright Advisor at Harvard Library’s Office for Scholarly Communication, and is also a Building LLTDM team member. Courtney added, “We are seeing more and more questions from scholars of all disciplines around these text data mining issues. The wealth of full-text online materials and new research tools provide scholars the opportunity to analyze large sets of data, but they also bring new challenges having to do with the use and sharing not only of the data but also of the technological tools researchers develop to study them. I am excited to join the Building LLTDM team and help clarify these issues and empower humanities scholars and librarians working in this field.”

Megan Senseney, Head of the Office of Digital Innovation and Stewardship at the University of Arizona Libraries, reflected on the opportunities for ongoing library engagement that extend beyond the initial institute. Senseney said, “Establishing a shared understanding of the legal landscape for TDM is vital to supporting research in the digital humanities and developing a new suite of library services in digital scholarship. I’m honored to work and learn alongside a team of legal experts, librarians, and researchers to create this institute, and I look forward to integrating these materials into instruction and outreach initiatives at our respective universities.”

Next Steps

The Building LLTDM team is excited to begin supporting humanities researchers, staff, and librarians en route to important knowledge creation. Stay tuned if you are interested in participating in the institute. 

In the meantime, please join us in congratulating all the members of the project team:

  • Rachael G. Samberg (University of California, Berkeley) (Project Director)
  • Scott Althaus (University of Illinois, Urbana-Champaign)
  • David Bamman (University of California, Berkeley)
  • Sara Benson (University of Illinois, Urbana-Champaign)
  • Brandon Butler (University of Virginia)
  • Beth Cate (Indiana University, Bloomington)
  • Kyle K. Courtney (Harvard University)
  • Maria Gould (California Digital Library)
  • Cody Hennesy (University of Minnesota, Twin Cities)
  • Eleanor Koehl (University of Michigan)
  • Thomas Padilla (University of Nevada, Las Vegas; OCLC Research)
  • Stacy Reardon (University of California, Berkeley)
  • Matthew Sag (Loyola University Chicago)
  • Brianna Schofield (Authors Alliance)
  • Megan Senseney (University of Arizona)
  • Glen Worthey (Stanford University)