UC Berkeley Library and Internet Archive co-directing project to help text data mining researchers navigate cross-border legal and ethical issues

We are excited to announce that the National Endowment for the Humanities (NEH) has awarded nearly $50,000 to UC Berkeley Library and Internet Archive to study legal and ethical issues in cross-border text data mining. The funding was made possible through NEH’s Digital Humanities Advancement Grant program.

NEH funding for the project, entitled Legal Literacies for Text Data Mining – Cross Border (“LLTDM-X”), will support research and analysis to address law and policy issues faced by U.S. digital humanities practitioners whose text data mining research and practice intersects with foreign-held or -licensed content, or involves international research collaborations. 

LLTDM-X builds upon the highly successful Building Legal Literacies for Text Data Mining Institute (Building LLTDM), previously funded by the NEH in 2019. UC Berkeley Library directed Building LLTDM in June 2020, bringing together expert faculty from across the country to train 32 digital humanities researchers on how to navigate law, policy, ethics, and risk within text data mining projects. (All of the results and impacts are summarized in the white paper here.) 

In Building LLTDM’s instructional sessions and post-workshop evaluations, participants identified cross-border research collaborations as an ongoing and critical legal and policy problem, and they also noted that foreign law and ethics issues pervaded their research. UC Berkeley Library’s Office of Scholarly Communication Services partnered with Internet Archive to begin to address these essential needs, and LLTDM-X sprang to life.

Why is LLTDM-X needed?

Text data mining, or TDM, is an increasingly essential and widespread research approach. TDM relies on automated techniques and algorithms to extract revelatory information from large sets of unstructured or thinly-structured digital content. These methodologies allow scholars to identify and analyze critical social, scientific, and literary patterns, trends, and relationships across volumes of data that would otherwise be impossible to sift through.

While TDM methodologies offer great potential, they also present scholars with nettlesome law and policy challenges that can prevent them from understanding how to move forward with their research. Building LLTDM trained TDM researchers and professionals on essential principles of copyright, licensing, and privacy law, as well as ethics—thereby helping them move forward with impactful digital humanities research.

As Building LLTDM revealed, United States digital humanities scholars do not conduct text data mining research only in or about the U.S. Further, digital humanities research in particular is marked by collaboration across institutions and geographical boundaries. Yet, U.S. practitioners encounter expanding and increasingly complex cross-border problems. 

For example, U.S. contract law may supersede rights under copyright, such that a U.S. database license agreement may prohibit text data mining and other fair uses, whereas UK licenses cannot. Therefore U.S. TDM practitioners collaborating with UK-based colleagues face impactful choices about which agreements to apply, as this may determine whether text data mining is permitted. In the U.S., “breaking” technological protection measures to conduct text data mining is now authorized within certain parameters, yet other jurisdictions prohibit such work or apply different conditions. U.S. text data mining researchers must accordingly consider how they work with internationally-held or -licensed materials or collaborators. 

There are at least three scenarios that scholars must parse: (i) the materials they want to mine are housed in a foreign jurisdiction, or are otherwise subject to foreign database licensing or laws; (ii) the human subjects they are studying, or who created the underlying content, reside in another country; or (iii) the colleagues with whom they are collaborating reside abroad, yielding uncertainty about which country’s laws, agreements, and policies apply. These may collectively be considered the “cross-border” TDM scenarios.

U.S. researchers are uncertain about how to navigate each of these scenarios. As evidenced in an informal survey that we conducted with digital humanities scholars, 70% of respondents reported cross-border copyright questions, 72% reported uncertainty about cross-border licensing terms, 52% noted privacy issues, and 48% identified ethical concerns. This confusion greatly impacted their TDM research. Twenty-eight percent (28%) of respondents confirmed that these cross-border copyright, licensing, privacy, or ethical issues impeded or prevented their project entirely. Of equal concern is that 40% of responding practitioners reported hesitation to share their workflows, methodology, or sources because of possible cross-border LLTDM issues. Without transparency, findings are deemed unreliable and scholarship may be rejected for publication. These problems will only mount given the increasing collaborativeness of research and the substantial amount of cross-border research occurring.

How will LLTDM-X help the world? 

Our long-term goal is to design instructional materials and institutes to support digital humanities TDM scholars facing cross-border issues, but our first step with LLTDM-X is getting a better handle on the specific law and policy challenges they face.

Through a series of virtual roundtable discussions and accompanying legal research and analysis, LLTDM-X will surface these cross-border issues and begin to distill preliminary guidance to help scholars navigate them.

The first roundtable will engage U.S. digital humanities text data mining practitioners in sharing their cross-border TDM experiences. U.S. and global law and ethics experts will help guide the roundtable discussion to elicit the contours of practitioner experiences. During two subsequent roundtables—one focusing on cross-border copyright and licensing, and another on cross-border privacy and ethics—the experts will discuss practitioners’ hurdles in depth, and begin to develop customized guidance. 

After the roundtables, we will work with the law and ethics experts to create instructive case studies that reflect the types of cross-border TDM issues practitioners encountered. These case studies will incorporate recommendations to help a broad audience of U.S. digital humanities text data mining practitioners navigate LLTDM-X concerns. Case studies, guidance, and recommendations will be widely disseminated via an open access report to be published at the completion of the project. Most importantly, they will be used to inform our future educational offerings.

An experienced team

The team for LLTDM-X (introduced below) is eager to get started. The project is co-directed by Thomas Padilla, Deputy Director, Archiving and Data Services at Internet Archive, who notes:

“LLTDM-X responds strategically to a pervasive challenge that needlessly complicates, inhibits, and weakens the fullest potential of research. This work paves a critical path toward building future training institutes that address cross-border legal issues in TDM. At Internet Archive we’re committed to supporting universal access to all knowledge—LLTDM-X couldn’t be more clearly aligned with what we hope to achieve. We look forward to working with our partners at UC Berkeley Library and the wider community to advance this work.”

Rachael Samberg, who leads UC Berkeley Library’s Office of Scholarly Communication Services and oversaw Building LLTDM, joins Thomas as co-director and explains that: 

“We are ready to begin analyzing and sorting out the complex legal challenges for digital humanities TDM researchers. We’ve already secured an incredible group of international legal and ethics experts to conduct the analyses, and will share more on that soon. In the meantime, we are gearing up to build out an even larger group of participating scholars whose experiences will help us create case studies.”

On behalf of the entire project team, we would like to thank NEH’s Office of Digital Humanities again for funding this important work. We invite you to contact us with any questions you may have. 

Thomas Padilla (Project Director): Thomas is Deputy Director, Archiving and Data Services at Internet Archive, and has deep experience cultivating library, archive, and museum ability to support TDM research. He has previously served as Principal Investigator of the Andrew W. Mellon-supported Collections as Data: Part to Whole and the Institute of Museum and Library Services-supported Always Already Computational: Collections as Data, and as author of the library community research agenda Responsible Operations: Data Science, Machine Learning, and AI in Libraries. In addition, Padilla was an expert faculty member for Building LLTDM, the precursor to LLTDM-X.

Rachael Samberg (Project Co-Director): Rachael is Scholarly Communication Officer & Program Director of the University of California, Berkeley Library’s Office of Scholarly Communication Services. She served as Project Director and legal expert for Building LLTDM. A Duke Law graduate, Rachael practiced intellectual property litigation at Fenwick & West LLP for seven years before spending six years at Stanford Law School’s library, where she was Head of Reference & Instructional Services and a Lecturer in Law. Rachael speaks throughout the country about copyright and TDM issues, about which she is widely published. Her chapter, Law & Literacy in Non-Consumptive Text Mining, was published in Copyright Conversations (ALA, 2019).

Stacy Reardon (Project Team Member): Stacy Reardon is Literatures and Digital Humanities Librarian at the University of California, Berkeley Library, where she provides guidance and instruction on digital humanities projects and methods. Stacy served as a library expert on the Project Team for the NEH-funded Building Legal Literacies for Text Data Mining. She is co-chair of UC Berkeley’s Digital Humanities Working Group, and received her Ph.D. in literature from the University of Massachusetts, Amherst.

Timothy Vollmer (Project Manager): Timothy Vollmer is Scholarly Communication and Copyright Librarian at UC Berkeley Library. He served as Project Manager for the NEH-funded Building Legal Literacies for Text Data Mining. Tim worked as a senior public policy manager for Creative Commons, and contributed to writing and advocacy on the text data mining exceptions in the EU’s Directive on Copyright in the Digital Single Market. He formerly was the Assistant Director to the Program on Public Access to Information at the American Library Association.


Back in action with your scholarship

Photo by Chris Montgomery on Unsplash

As the school year restarts in Berkeley, we know the pandemic is not over. But the Office of Scholarly Communication Services is here to help UC Berkeley faculty, students, and staff understand copyright and scholarly publishing with online resources, Zoom workshops, and virtual consultations.

If you’re interested in a recap of our progress and achievements over the last year, check out our 2020-21 annual report.

Here’s what’s coming up this semester.

Upcoming Workshops

Publish Digital Books and Open Educational Resources with Pressbooks

September 14, 2021
11:00am–12:30pm
RSVP

If you’re looking to self-publish work of any length and want an easy-to-use tool that offers a high degree of customization, allows flexibility with publishing formats (EPUB, PDF), and provides web-hosting options, Pressbooks may be great for you. Pressbooks is often the tool of choice for academics creating digital books, open textbooks, and open educational resources, since you can license your materials for reuse however you desire. Learn why and how to use Pressbooks for publishing your original books or course materials. You’ll leave the workshop with a project already under way! Sign up at the link below and the Zoom login details will be emailed to you.

Copyright and Your Dissertation

October 25, 2021
1:00pm–2:30pm
RSVP

This workshop will provide you with practical guidance for navigating copyright questions and other legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use our tips and workflow to figure out what you can use, what rights you have as an author, and what it means to share your dissertation online.

From Dissertation to Book: Navigating the Publication Process

October 26, 2021
1:00pm–2:30pm
RSVP

Hear from a panel of experts—an acquisitions editor, a first-time book author, and an author rights expert—about the process of turning your dissertation into a book. You’ll come away from this panel discussion with practical advice about revising your dissertation, writing a book proposal, approaching editors, signing your first contract, and navigating the peer review and publication process.

Managing and Maximizing Your Scholarly Impact

October 28, 2021
1:00pm–2:30pm
RSVP

This workshop will provide you with practical strategies and tips for promoting your scholarship, increasing your citations, and monitoring your success. You’ll also learn how to understand metrics, use scholarly networking tools, evaluate journals and publishing options, and take advantage of funding opportunities for Open Access scholarship.

Copyright and Fair Use for Digital Projects

November 10, 2021
11:00am–12:30pm
RSVP

This training will help you navigate the copyright, fair use, and usage rights of including third-party content in your digital project. Whether you seek to embed video from other sources for analysis, post material you scanned from a visit to the archives, add images, upload documents, or more, understanding the basics of copyright and discovering a workflow for answering copyright-related digital scholarship questions will make you more confident in your project. We will also provide an overview of your intellectual property rights as a creator and ways to license your own work.

Other ways we can help

We’re here to help answer a variety of questions you might have on intellectual property, digital publishing, and information policy.

Want help or more information? Send us an email at schol-comm@berkeley.edu. We can provide individualized support and personal consultations, online class instruction, and customized support and training for departments.


Now available: Open educational resource of Building Legal Literacies for Text Data Mining

Last summer we hosted the Building Legal Literacies for Text Data Mining institute. We welcomed 32 digital humanities researchers and professionals to the weeklong virtual training, with the goal of empowering them to confidently navigate law, policy, ethics, and risk within digital humanities text data mining (TDM) projects. Building Legal Literacies for Text Data Mining (Building LLTDM) was made possible through a grant from the National Endowment for the Humanities.

Following the remote institute in June 2020, the participants and project team reconvened in February 2021 to discuss how participants had been thinking about, performing, or supporting TDM in their home institutions and projects with the law and policy literacies in mind.

To maximize the reach and impact of Building LLTDM, we have now published a comprehensive open educational resource (OER) of the contents of the institute. The OER covers copyright (both U.S. and international law), technological protection measures, privacy, and ethical considerations. It also helps other digital humanities professionals and researchers run their own similar institutes by describing in detail how we developed and delivered programming (including our pedagogical reflections and take-aways), and includes ideas for hosting shorter literacy teaching sessions. The resource (available as a web-book or in downloadable formats including PDF and EPUB) is in the public domain under the CC0 Public Domain Dedication, meaning it can be accessed, reused, and repurposed without restriction. 

In addition to the OER, we’ve also published a white paper that describes the institute’s origins and goals, project overview and activities, and reflections and possible follow-on actions. 

Thank you to the National Endowment for the Humanities, the project team, institute participants, and staff at the UC Berkeley Library for making Building LLTDM a success.

[Note: this content is cross-posted on the LLTDM blog.]

 


Upcoming workshop on how to share and publish data

The image is a slide with the title of the workshop, data, and presenters

On December 1, 2020, from 12:30pm–2:00pm, the Library is teaming up with Research Data Management to host a workshop, How to Share and Publish Data: Resources, Law, and Policy. Sign up here.

Are you unsure about how you can use or reuse other people’s data in your teaching or research, and what the terms and conditions are? Do you want to share your data with other researchers or license it for reuse but are wondering how and if that’s allowed? Do you have questions about university or granting agency data ownership and sharing policies, rights, and obligations? We will provide clear guidance on all of these questions and more in this interactive webinar on the ins-and-outs of data sharing and publishing.

Join the Library’s Office of Scholarly Communication Services and the Research Data Management Program as we:

  • Explore venues and platforms for sharing and publishing data
  • Unpack the terms of contracts and licenses affecting data reuse, sharing, and publishing
  • Help you understand how copyright does (and does not) affect what you can do with the data you create or wish to use from other people
  • Consider how to license your data for maximum downstream impact and reuse
  • Demystify data ownership and publishing rights and obligations under university and grant policies

Intended audiences include faculty, grad students, post-docs, instructors, and academic support staff, but anyone interested is welcome to attend.


Fall workshops on copyright and publishing

Person sitting in front of a computer screen with sunset in the background.
Photo by Simon Abrams on Unsplash

Welcome back to a strange semester. While we can’t meet up together on campus, the Office of Scholarly Communication Services will continue to offer a full slate of online workshops to help students and early career researchers confidently steer their way through the waters of copyright and publishing. Here is what’s in store for the coming few months.  

Upcoming Workshops

Publish Digital Books and Open Educational Resources with Pressbooks
September 15, 2020
10:00–11:30am

If you’re looking to self-publish work of any length and want an easy-to-use tool that offers a high degree of customization, allows flexibility with publishing formats (EPUB, MOBI, PDF), and provides web-hosting options, Pressbooks may be great for you. Pressbooks is often the tool of choice for academics creating digital books, open textbooks, and open educational resources, since you can license your materials for reuse however you desire. Learn why and how to use Pressbooks for publishing your original books or course materials. You’ll leave the workshop with a project already under way! Sign up at the link below and the Zoom login details will be emailed to you.

Copyright and Your Dissertation
October 19, 2020
1:00–2:30pm

This workshop will provide you with practical guidance for navigating copyright questions and other legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use our tips and workflow to figure out what you can use, what rights you have as an author, and what it means to share your dissertation online.

Managing and Maximizing Your Scholarly Impact
October 20, 2020
1:00–2:30pm

This workshop will provide you with practical strategies and tips for promoting your scholarship, increasing your citations, and monitoring your success. You’ll also learn how to understand metrics, use scholarly networking tools, evaluate journals and publishing options, and take advantage of funding opportunities for Open Access scholarship.

From Dissertation to Book: Navigating the Publication Process
October 22, 2020
1:00–2:30pm

Hear from a panel of experts—an acquisitions editor, a first-time book author, and an author rights expert—about the process of turning your dissertation into a book. You’ll come away from this panel discussion with practical advice about revising your dissertation, writing a book proposal, approaching editors, signing your first contract, and navigating the peer review and publication process.

Copyright and Fair Use for Digital Projects
November 10, 2020
11:00am–12:30pm

This training will help you navigate the copyright, fair use, and usage rights of including third-party content in your digital project. Whether you seek to embed video from other sources for analysis, post material you scanned from a visit to the archives, add images, upload documents, or more, understanding the basics of copyright and discovering a workflow for answering copyright-related digital scholarship questions will make you more confident in your publication. We will also provide an overview of your intellectual property rights as a creator and ways to license your own work.

 

Archived Recordings

We hosted a few workshops over the summer that might be of interest to you. 

Copyright in Course Design & Digital Learning Environments
Video Recording
Slides

If you’re wondering what you can or can’t upload and distribute in your online courses, we’re here to help with answers and best practices. We will cover copyright, fair use, and contractual issues that emerge in online course design. The goal of the webinar is for attendees to gain a deeper understanding of the legal considerations in creating digital courses, and to feel more confident in their content design decisions to support student learning. This webinar is appropriate both for instructors and staff supporting online courses.

Can We Digitize This? Understanding Law, Policy, & Ethics in Bringing our Collections to Digital Life
Video Recording
Slides

As part of the Digital Lifecycle Program, the UC Berkeley Library aims to digitize 200 million items from its special collections (rare books, manuscripts, photographs, archives, and ephemera) for the world to discover and use. But before we can digitize and publish them online for worldwide access, we have to sort out legal and ethical questions. We’ve created and released “responsible access workflows” that will benefit not only our Library’s digitization efforts, but also those of cultural heritage institutions such as museums, archives, and libraries throughout the nation.

Building Legal Literacies for Text Data Mining Institute
Video Recordings
Transcripts + Slides

In June, we welcomed 32 digital humanities (DH) researchers and professionals to the Building Legal Literacies for Text Data Mining (Building LLTDM) Institute. Our goal was to empower DH researchers, librarians, and professional staff to confidently navigate law, policy, ethics, and risk within digital humanities text data mining (TDM) projects—so they can more easily engage in this type of research and contribute to the further advancement of knowledge.

Other ways we can help

In addition to the workshops, we’re here to help answer a variety of questions you might have on intellectual property, digital publishing, and information policy.  

Want help or more information? Send us an email. We can provide individualized support and personal consultations, online class instruction, presentations and workshops for small or large groups & classes, and customized support and training for departments and disciplines.

 

 


What happened at the Building LLTDM Institute

This is a logo of the Building LLTDM Institute

This update is cross-posted from the Building LLTDM blog

On June 23-26, we welcomed 32 digital humanities (DH) researchers and professionals to the Building Legal Literacies for Text Data Mining (Building LLTDM) Institute. Our goal was to empower DH researchers, librarians, and professional staff to confidently navigate law, policy, ethics, and risk within digital humanities text data mining (TDM) projects—so they can more easily engage in this type of research and contribute to the further advancement of knowledge. We were joined by a stellar group of faculty to teach and mentor participants. Building LLTDM is supported by a grant from the National Endowment for the Humanities.

Why was the Institute needed?

Until now, humanities researchers conducting text data mining in the U.S. have had to maneuver through a thicket of legal issues without much guidance or assistance. As an example, take a researcher scraping content about Egyptian artifacts from online sites or databases, or downloading videos about Egyptian tomb excavations, in order to conduct automated analysis about religion or philosophy. The researcher then shares these content-rich data sets with others to encourage research reproducibility or enable other researchers to query the data sets with new questions. This kind of work can raise issues of copyright, contract, and privacy law. It can also raise concerns around ethics, for example, if there are plausible risks of exploitation of people, natural or cultural resources, or indigenous knowledge.

Potential law and policy hurdles do not just deter text data mining research: They also bias it toward particular topics and sources of data. In response to confusion over copyright, website terms of use, and other perceived legal roadblocks, some digital humanities researchers have gravitated to low-friction research questions and texts to avoid making decisions about rights-protected data. When researchers limit their research to such sources, it is inevitably skewed, leaving important questions unanswered, and rendering resulting findings less broadly applicable.

Moving an interactive, design-thinking Institute online

After months of preparation, we had been looking forward to working and learning together at UC Berkeley, but the world had other plans for our Institute. Due to the global health crisis, we had to transform our planned in-person, intensive workshop into an interactive and relevant remote experience. 

How did we do this? The pandemic meant we had to transition everything online, which of course presents challenges for a design-thinking framework. We are thrilled that our approach to interactive remote pedagogy was successful! (You can check out the schedule and framework in our Participant Packet.) The substantive content was pre-recorded and delivered in a flipped classroom model. Faculty created a series of short videos, and shared readings relevant to the legal literacies. We also provided the video transcripts and slides to participants to promote accessibility and accommodate multiple learning styles. 

We used Zoom to meet synchronously for discussion in groups of various sizes. We used Slack for asynchronous communication, and interactive tools such as Mural for design thinking exercises like journey mapping so that everyone could live edit and collaborate. We capped each day with a “happy half hour” on Zoom as an informal way to get to know each other a little better, even from afar. 

We also relied on an institute moderator and daily writing exercises to reinforce the design-thinking stages and learning outcomes. Each night, we reviewed the participants’ free-writes and began the next morning by reflecting back to the participants the themes from what they had shared.

A collection of themes from our morning plenary reflections.

Reflections on goals: social justice & effective empowerment

One of our priorities for the Institute was to invite a diverse pool of participants, including those involved in social justice research, in order to maximize the public value impact of Building LLTDM. We looked for demonstrated commitments to diversity and equity but could hardly have imagined the breadth and depth of experiences that applicants were willing to share. The selected participants research everything from “place” data drawn from community histories of historic African American settlements, to the development of AIDS activist networks in communities of color, to portrayals of autism in literature, and more. Others demonstrated a commitment to bringing back the skills they learned to expand TDM opportunities for students and communities who have traditionally been marginalized or under-resourced. Participants also represented a variety of institution types, research advising and support experiences, professional roles, levels of experience with TDM, career stages, and disciplinary perspectives.

We are also moved by the participants’ own reflections on the experience. One of the last interactive exercises we hosted during the online Institute was a collective week-in-review discussion and gratitude wall. We asked the participants to share what they were thankful for, highlighting other participants where possible. So many of the participants wrote about how valuable the learning experience was and how thoughtfully it was put together and delivered.

Digital stickies from our week-in-review and gratitude wall.

We can’t express the transformational impact of the week better than the participants, themselves. In Institute evaluation forms, they shared feelings like: 

  • “This is by far the best organized event that I have ever attended. The content was by far the most substantive. The faculty were by far the most engaged. A+ across the board.” 
  • “I am so grateful to have had the opportunity to engage with a diverse group of scholars (researchers and professionals)… The deliberately thought through breakdown and mix fostered incredibly valuable discussions and I would hope this kind of framework is used as a best practice for future DH institutes of all kinds going forward. Also, thank you for such an amazing virtual experience which I can only imagine took a tremendous amount of work to coordinate and plan with limited time to shift to an entirely different format–I was overjoyed to critically engage with complex subjects…” 
  • “This has been phenomenal. I don’t want to qualify it (by adding something like “…for having to be moved online”), because it’s been so, so good: well organized, thoughtful, and human throughout.” 
  • “There was clearly so much thought, care, and planning that went into the preparation of this institute, and it was an amazing opportunity to learn from a group of people — organizers, faculty, and participants — who all have such deep expertise. The video and readings lists alone are a huge resource, but to be able to process and reflect on that material together with a diverse group of people was really wonderful.”

Next steps, and our own gratitude

What’s next for Building LLTDM? The “Institute” is not over yet; only the 1-week training is complete. The cohort will be meeting again virtually in February 2021 to discuss how implementing the literacies in their local communities and practices has gone. In the meantime, as the participants bring back the law and policy literacies they’ve learned to their home institutions, we are excited to see several cohort members already organizing their own post-Institute research subgroups, such as those whose TDM work relies heavily on social media content, and others who are exploring how to disseminate the Building LLTDM literacies within other instructional formats and frameworks.

As part of the grant, the project team will also be aggregating the resources from the Institute and developing supplementary material for an Open Educational Resource (OER). We know there is a large community of TDM researchers and professionals who may be interested in or who can benefit from these materials, and the OER will be made available for broad reuse in the public domain.

Thank you to all the participants for their insights and contributions, willingness to share, and flexibility in transitioning to a fully-remote Institute. Thank you to all the faculty for their unmatched legal and policy expertise, ongoing commitment to mentorship, and adaptability in content creation and delivery. And thank you again to the NEH for making such a meaningful experience possible.


Workshop: Copyright in Course Design and Digital Learning Environments

The Library’s Office of Scholarly Communication Services is hosting an online workshop on July 9, from 10-11:30, on copyright, fair use, and contract issues that arise in online course development.

Copyright in Course Design and Digital Learning Environments

If you’re wondering what you can or can’t upload and distribute in your online courses, we’re here to help with answers and best practices. We will cover copyright, fair use, and contractual issues that emerge in online course design. The goal of the webinar is for attendees to gain a deeper understanding of the legal considerations in creating digital courses, and to feel more confident in their content design decisions to support student learning. This webinar is appropriate both for instructors and staff supporting online courses.

Reporting back on Publish or Perish panel discussion at Morrison Library

Picture showing some of the speakers at the event.
Photograph by Rachael Samberg, CC BY 4.0.

On Friday, January 31, early career faculty, graduate students, librarians, and others joined us for Publish or Perish Reframed: Navigating the New Landscape of Scholarly Publishing. The event, hosted by the Office of Scholarly Communication Services, aimed to help everyone understand the behind-the-scenes workings of scholarly publishing, especially for the early career researchers and students interested in publishing. 

Why are we concerned about the state of scholarly publishing? Things are looking rather sunny for UC authors, who publish nearly 10% of all scholarly literature in the United States. However, there are actually a lot of tensions in the scholarly publishing ecosystem today, and the landscape can be confusing or murky. One of the tensions has to do with access to research: 85% of journal articles published each year are still stuck behind paywalls, which slows scientific discovery because only people who have subscriptions can access and read them. Subscription prices of commercial scholarly journals continue to increase, while university library collections budgets continue to shrink–further constricting access to knowledge. Another challenge is ongoing publishing expectations: PhD students, post-docs, and young faculty are under ongoing pressure to publish in the most prestigious journals available in order to receive promotion and tenure, even though many of these publishing venues continue to be the most closed and expensive to which libraries subscribe. 

The publishing lifecycle, stakeholder power, and library budgets

Rachael Samberg and Timothy Vollmer from the Office of Scholarly Communication Services kicked off the event by taking a closer look at the publishing lifecycle today. While this process can vary somewhat based on the nature of the research, there are some common aspects, such as (1) reading the works of others and then forming your own research, (2) creating a new knowledge product such as a written article, (3) submitting that work to a publisher which coordinates a peer review process, (4) publishing the work in a scholarly journal, (5) distributing the work via library subscriptions or open access, and (6) preserving the work. 

The image is a graphic of the scholarly publishing lifecycle.

The publishing lifecycle involves many different players, and power is not distributed equally amongst these various participants. For instance:

  • The reading public or scholars at other institutions have an interest in reading the outputs of scholarship, but little power in demanding how it’s made available. They can’t vote with their feet and decide not to read a journal article they need for their research, the way you could walk away from a car that was too expensive when an equivalent car is available from another manufacturer for less. 
  • The author has an interest in producing good quality work, but is in some ways beholden to needing it to be selected by reputable journals to build reputation and achieve career advancement. 
  • Universities are interested in recruiting and retaining high profile scholars and students (and also grants and donor funds), and the reception of scholarship created at the institutions affects their ability to do this. The more prestigious the publications, the better this reflects on the universities, so there is some pressure universities can exert over authors about the journals in which their authors should publish.
  • Funders like federal agencies and philanthropic foundations want the research they support to make a societal and global impact, and are therefore interested in how that research is disseminated. Funders can require dissemination of work product, but they can’t necessarily interfere with academic freedom about where to publish.
  • Libraries want to purchase or license access to the content to provide it to the readers at their universities or institutions. But they’re not the ones creating the content, so they can’t use that role to control costs. Further, if they refuse to purchase or license content, their authors and universities will be affected.
  • Scholarly societies are interested in putting out high quality scholarship, but they may also wish to generate enough money to fund not only their publishing efforts but also other society operations like conferences or education, so this limits how “low they can go,” so to speak, in terms of the price point for what they publish.
  • Most of the market power—at least on the surface—lies with large third party commercial publishers, who stand poised to generate substantial profits in exchange for the opportunity to publish in or read their valued journals. 

Image of scholarly publishing stakeholders.

Many institutions aren’t lucky enough to have the millions of dollars needed to spend on getting subscriptions to high priced journal content. And if they don’t have that money and can’t subscribe, then the people affiliated with that institution can’t read the latest scholarship. In turn, if the institution’s scholars don’t have access to it, then their ability to use it to help them come up with new ideas and insights in their own scholarship is severely limited. So, scientific progress is hampered.

Open access publishing approaches

In order to understand how any stakeholders can encourage an open outcome, we’ve first got to understand what types of open access financial strategies exist. How is OA funded? If we replace the subscription system with OA end products, why would publishers stay in the game? If publishers are going to invest time and effort in publishing, how do they recover costs in an OA universe? 

Before we dive into OA funding approaches, one important thing to keep in mind is that publishing a scholarly article or book open access does not mean forgoing peer review or any of the other stringent editorial processes that ensure high quality scholarship. In fact, peer review can even be carried out in more cost effective ways for OA journals. At its core, open access is just an outcome: Scholarship is published online in a way that can be read and used by anyone, and without any financial, legal, or technical barriers other than gaining access to the Internet itself. 

Okay, so on to who gets paid and how. One approach to achieving OA is “green open access.” This “flavor” of OA means that authors or institutions make works that would otherwise only be available via a subscription freely available by depositing certain versions of their scholarship into online repositories, typically institutional repositories run by a researcher’s university (like the UC’s eScholarship), or even a funder repository like the NIH’s PubMed Central.

The version that can be deposited depends in part on the specific terms of the publication agreement the author signs with the publisher. You might be wondering, why on earth does it matter what the publication agreement says? Well, in exchange for agreeing to publish a journal article, publishers often demand that authors relinquish some or all relevant rights to share or reuse the work. So, in order to publish in most commercial journals, the author must transfer their copyright to the journal. And unless their publication agreement reserves certain rights for the author, the copyright transfer means the author will no longer retain the necessary rights to publicly share the final article, even on their own course website or institutional repository. 

So, if authors assign all their rights to publishers, why are they permitted to deposit certain versions of their work in a repository? There are two reasons. First, many publishers’ agreements now provide authors with permission to self-archive what’s called the “post-print”: the final peer-reviewed article, but without the publisher’s final copy-editing and formatting. Second, institutional OA policies preemptively secure the rights for universities to host works notwithstanding the language of author publication agreements. These policies can attach to articles before an author ever signs a publication agreement. 

This is what the UC’s Open Access Policy does. As a UC author, you have a right to deposit the post-print of your article into UC’s institutional repository, called eScholarship, at the time of publication. The UC takes a license to display the peer-reviewed version of your work, such that any publication agreement you later sign is subject to the UC’s pre-existing right. 

There’s also “gold open access.” Gold OA means that the publisher makes the final, publisher version of the article freely available on its website immediately upon publication, whether in print or online. Typically these articles are shared under a Creative Commons license. Some gold OA publishers recoup production costs via charges for authors to publish (“article processing charges” or “book processing charges”) rather than having readers (or libraries) pay to access and read the work. In general, gold OA is a system in which the author pays, rather than the reader paying. At the same time, the fees to be paid for publishing don’t actually have to be paid by the author. They can be covered by various sources, such as: research accounts, research grants, the university, the monies the libraries previously were spending on subscriptions to that journal, scholarly societies, and consortia. (Read on for the program the UC Berkeley Library runs to cover these fees.)

There’s also a type of gold open access that does not involve APCs. Here, the publisher provides permanent and free access to readers with neither author fees nor reader fees. Typically a society, organization, government, or endowment would be necessary to cover the cost of publication. 

Empowering universities and authors

We explored what UC Berkeley is doing to leverage these open access models to make scholarship more available. The UC is pursuing a wide array of strategies to improve access to research, including many outlined in the Pathways to Open Access toolkit. To be sure, UC authors have been publishing their articles open access for years, and UC was one of the early institutions with a post-print, green open access policy. But Pathways to Open Access analyzed a panoply of additional funding strategies, and made recommendations for a plurality of approaches.

One example of a new strategy being pursued is negotiating transformative agreements. These types of arrangements have been supported by the UC systemwide faculty senate library committee, who pushed ahead the goal of replacing subscription-based publishing with open access by releasing a declaration of rights and principles to transform scholarly communication. These principles are now guiding the UC libraries in pursuing transformative publishing agreements. 

The UC’s goal with transformative agreements is both to change subscription agreements into agreements that enable open access publishing, and to reduce how much we are spending on the publishing enterprise to begin with. It’s important to emphasize that transformative agreements are just one of the ways the UC campuses and the UC Berkeley Library are pursuing open access. 

The University of California has been exploring different types of flexible models for transformative agreements. For instance, the agreement the UC has pursued with Cambridge University Press is a multi-payer model, where both libraries and authors (if they have grant funds) contribute to the open access publishing fee.

From the author’s perspective, the Cambridge transformative workflow attempts to minimize intrusion into the publishing process, while still working to incorporate authors into the payment process in some form so they understand the costs of publishing in a new landscape. Authors still choose their journal, submit their manuscript to the journal, and pass through the peer review process as normal; we’re not asking them to change how they do any of those things, especially how they select a journal. Once a manuscript is accepted by a journal with a publisher with whom we have a transformative agreement, then the author is asked to choose whether to publish open access or to opt out of the agreement and publish closed access. Of course, the Library prefers for authors to publish open access, and our intention is to make that the default option, but we don’t require this.

Assuming an author chooses to publish open access, they will be asked to coordinate payment of an APC. In general, this APC is discounted from the list price that the publisher may currently be charging. The library commits to paying a portion of every open access fee. We then ask the author whether they have research funding available which may be used to pay for publication. If they do, then the author pays the remainder of the OA fee. If they do not have research funding available for this purpose, then the library pays the remainder of the fee on the author’s behalf. In this way, authors engage with the payment process, and they contribute a portion of the cost if they have funding available to do so. However, authors without research funding are not disadvantaged, and we never ask an author to reach into their own pockets to make a payment.
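To make the multi-payer split concrete, here is a minimal sketch in Python of the payment logic described above. It is an illustration only: the function name `split_apc`, the `ApcSplit` type, and the idea that the library's guaranteed portion is a fixed dollar amount are all assumptions. The post does not say how the library's portion is calculated, and the numbers in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApcSplit:
    """How one discounted APC is divided between the library and an author's grant."""
    library_pays: float
    grant_pays: float


def split_apc(discounted_apc: float, library_share: float,
              has_research_funding: bool) -> ApcSplit:
    """Divide an APC along the lines of the multi-payer workflow.

    library_share is a hypothetical fixed amount the library commits to every
    article; the real agreement may compute this portion differently.
    """
    if discounted_apc < 0 or library_share < 0:
        raise ValueError("amounts must be non-negative")

    # The library always pays its portion, capped at the total fee.
    base = min(library_share, discounted_apc)
    remainder = discounted_apc - base

    if has_research_funding:
        # Grant funds cover whatever is left after the library's portion.
        return ApcSplit(library_pays=base, grant_pays=remainder)

    # No research funding: the library covers the remainder as well,
    # so the author never pays out of pocket.
    return ApcSplit(library_pays=discounted_apc, grant_pays=0.0)


if __name__ == "__main__":
    # Illustrative figures only, not actual fees from any agreement.
    print(split_apc(2000, 1000, has_research_funding=True))
    print(split_apc(2000, 1000, has_research_funding=False))
```

In both branches the two amounts add up to the discounted APC, and there is deliberately no field for a personal payment by the author, which mirrors the commitment that authors are never asked to reach into their own pockets.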

Even if the UC hasn’t entered into a transformative agreement with a publisher, there are many other opportunities for authors to get involved in impactful OA decision-making. We discussed that one thing UC Berkeley authors can do right now is to take part in the existing UC open access publishing mechanisms, such as by depositing post-prints in eScholarship. We also mentioned the UC Berkeley Library program called the Berkeley Research Impact Initiative (BRII) that covers up to $2500 of an APC an author is charged for publishing in a fully open access journal.

Another way authors can empower themselves as scholars is by retaining various rights in the publishing process. Making smart decisions about copyright can help scholarly authors maximize the impact of their research by promoting greater readership and reuse. In most cases, the author of an article is the copyright holder, and authors maintain their copyright to the scholarship until they transfer all or certain rights to a publisher. Now, the publisher might ask for a full transfer of copyright. But as an author you don’t necessarily have to just sign the agreement a publisher presents to you. You can ask for alternative wording, and sometimes they immediately just send you their alternate agreement with that change already baked in. Some publishers take a different approach through which authors keep their copyright and instead agree to share their work under an open license. For example, copyright in all Public Library of Science articles stays with the authors, but the authors agree to share the work under an open license, in this case the Creative Commons Attribution (CC BY 4.0) license. This license permits unrestricted use and sharing provided the original author and source are credited. In the end, authors have choices to make, whether in managing their rights through the author agreement or in pursuing fully open access journals that leverage open licensing. 

Challenges in Publishing for Promotion and Tenure

Benjamin Hermalin, Vice Provost for the Faculty, discussed some of the tensions within scholarly publishing as they relate to promotion and tenure, and provided some advice to new authors in making their way through the publication process. While the Office of the Vice Provost for the Faculty reviews all outside letters in each tenure and promotion submission, he said there’s still some conservatism in how tenure and review committees assess a scholar’s publishing outputs and impact. Hermalin advised young researchers to take a measured approach by understanding the particular requirements and publishing practices for their specific field, and to aim to publish several papers in high quality journals relevant to their area. 

Of course, the question keeps coming up: How does a researcher get published in the top journals? No one knows the complete answer to this, but authors need to be systematic and diligent. Hermalin advocated that it’s more important to work toward becoming a major contributor to one or two areas than to be a minor contributor to several fields of research. 

Hermalin also talked about some of the challenges researchers face in determining when to publish. He noted that while an author shouldn’t send something out before they’re ready, they also shouldn’t let the perfect be the enemy of the good: getting the research off your desk in a timely fashion is best for your academic profile and chances for tenure down the road. He also suggested that authors not split their scholarly output too thinly: it’s better to publish a few substantial, in-depth papers on a particular topic than several separate publications that individually cover too little of the research endeavor. 

Open publishing: A View from the Faculty 

Philip Stark, Professor of Statistics and Associate Dean of the Division of Mathematical and Physical Sciences, provided an on-the-ground perspective of open science, OA publishing, and how he deals with copyright and publishing contracts with commercial publishers. Stark showed several examples of how he marks up publishing agreements, since he no longer gives any publisher an exclusive right to publish. He also showed how to strike language and amend it to retain copyright and other publishing rights, and said that, in his experience, most publishers have accepted these changes. 

Professor Stark discussed one paper that he published open access with help from BRII. The paper analyzed gender bias in student evaluations, and Stark and his co-authors wanted it to be open access. But Stark was concerned that if they published it in an open access publication, his co-authors, who were a junior faculty member and a PhD student at the time, might not get as much recognition or impact from the paper as they would by publishing in one of the journals considered to be “high impact” under certain standards. However, the initial fears about publishing on ScienceOpen were unfounded, as the paper has since been widely accessed, cited, and freely downloaded over 70,000 times. Stark said he achieved a much bigger impact publishing open access there than if it’d been published in a commercial journal. 

Finally, Professor Stark discussed academic freedom in relation to faculty publishing choices. While many think the concept of academic freedom means that researchers are privileged with the ability to work on what they find interesting and important without outside pressure from the university administration, the reality is that faculty, especially early career researchers, are under ongoing pressure to publish in journals that will secure them tenure, or to obtain grants to support their (or their students’) research. In this sense, faculty publishing decisions are driven more by economic forces than by the principle of academic freedom. Stark said that this temptation to publish in the most prestigious journal to advance your career is a persistent moral hazard because it challenges the more noble perceptions we have about academic pursuits and how the work of academics benefits science and the public interest. 

Certainly, no one had all the answers for simplifying the complexities of scholarly publishing, but by understanding the driving forces and power dynamics, early career authors can make informed choices that will carry their scholarship far both in impact and in their professional advancement.


Event: Publish or Perish Reframed: Navigating the New Landscape of Scholarly Publishing

University of California authors published about 50,000 scholarly articles last year alone—comprising nearly 10% of all research in the United States. Despite this tremendous productivity, UC scholars continue to experience a tension between publishing their research in ways that ensure readership or access, and perceptions about the effect of certain outlets and publishing choices on their research impact or career advancement.

In this panel, we’ll unpack the landscape of modern scholarly publishing by exploring economics and stakeholder power structures, and what the University of California is doing to address these issues through recent publisher negotiations.

We will also learn from publishing experts about how to maximize research dissemination, access, and impact through the decisions we make about open access, copyright transfer, and publication choices. Faculty will share publishing advice and guidance for early career researchers as they navigate their academic careers. They will also discuss how tenure and promotion practices are being adjusted to better reflect diversity in publishing outputs and venues. There will be a Q&A session at the end of the discussion.

Speakers will include:

  • Benjamin Hermalin, Vice Provost for the Faculty; Professor of Finance and Professor of Economics, UC Berkeley
  • Philip B. Stark, Professor of Statistics, Associate Dean, Division of Mathematical and Physical Sciences, Regional Associate Dean (Interim), College of Chemistry and Division of Mathematical and Physical Sciences, UC Berkeley
  • Rachael Samberg, Scholarly Communication Officer, UC Berkeley Library
  • Timothy Vollmer, Scholarly Communication & Copyright Librarian, UC Berkeley Library

RSVP to join us for this timely conversation on current scholarly publishing issues.

Image of the poster for the Publish or Perish event

 


Join us January 31 for Publish or Perish Reframed: Navigating the New Landscape of Scholarly Publishing

Image of the poster for the Publish or Perish event

University of California authors published about 50,000 scholarly articles last year alone—comprising nearly 10% of all research in the United States. Despite this tremendous productivity, UC scholars continue to experience a tension between publishing their research in ways that ensure readership or access, and perceptions about the effect of certain outlets and publishing choices on their research impact or career advancement.

Event details:
Friday, January 31, 2020
4:00 pm – 5:30 pm
Morrison Library
Refreshments provided
RSVP now!
