Tag: digital humanities
Digital Archives and the DH Working Group on Nov. 4
To my delight, I can now announce that the next Digital Humanities Working Group at UC Berkeley is November 4 at 1pm in Doe Library, Room 223.
For the workshop, we have two amazing speakers for lightning talks. They are:
Danny Benett, MA Student in Folklore, will discuss the Berkeley folklore archive, which is making ~500,000 folklore items digitally accessible.
Adrienne Serra, Digital Projects Archivist at The Bancroft Library, will demo an interactive map in ArcGIS that allows users to explore digital collections about the Spanish and Mexican land grants in California.
We hope to see you there! Please consider signing up (link), as we order pizza and would like a rough headcount.
The UC Berkeley Digital Humanities Working Group is a research community founded to facilitate interdisciplinary conversations in digital humanities and cultural analytics. It is a welcoming and supportive community for all things digital humanities.
The event is co-sponsored by the D-Lab and Data & Digital Scholarship Services.
A&H Data: What even is data in the Arts & Humanities?
This is the first of a multi-part series exploring the idea and use of data in the Arts & Humanities. For more information, check out the UC Berkeley Library’s Data and Digital Scholarship page.
Arts & Humanities researchers work with data constantly. But what is it?
Part of the trick in talking about “data” with regard to the humanities is that we are already working with it. The books and letters one reads (including the letter below) are data, as are the pictures we look at and the videos we watch. In short, arts and humanities researchers are already analyzing data for the essays, articles, and books that they write. Furthermore, the resulting scholarship is data.
For example, the letter below from Bancroft Library’s 1906 San Francisco Earthquake and Fire Digital Collection on Calisphere is data.
George Cooper Pardee, “Aid for San Francisco: Letter from the Mayor in Oregon,”
April 24, 1906, UC Berkeley, Bancroft Library on Calisphere.
One ends up with the question “what isn’t data?”
The broad nature of what “data” is means that instead of asking if something is data, it can be more useful to think about what kind of data one is working with. After all, scholars work with geographic information; metadata (i.e., data about data); publishing statistics; and photographs differently.
Another helpful question is how structured the data is. In particular, you should pay attention to whether the data is:
- unstructured
- semi-structured
- structured
The level of structure tells us how to treat the data before we analyze it. If, for example, you have hundreds of images you want to work with, it’s likely you’ll have to do a significant amount of work before you can analyze your data because most photographs are unstructured.
In contrast, the letter toward the top of this post is semi-structured. It is laid out in a typical, physical letter style with information about who, where, when, and what was involved. Each piece of information, in turn, is placed in standardized locations for easy consumption and analysis. Still, to work with the letter and its fellows online, one would likely want to create a structured counterpart.
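As a minimal sketch of what such a structured counterpart might look like, the citation details given for the letter above can be written as a single record with named fields and then saved as CSV. The field names (`creator`, `title`, `date`, `repository`) and the helper `records_to_csv` are assumptions chosen for illustration, not a Calisphere or Bancroft Library schema.

```python
import csv
import io

# One letter described as a structured record. The values come from the
# citation in this post; the field names are illustrative assumptions.
letter = {
    "creator": "George Cooper Pardee",  # as credited in the citation
    "title": "Aid for San Francisco: Letter from the Mayor in Oregon",
    "date": "1906-04-24",
    "repository": "UC Berkeley, Bancroft Library",
}

def records_to_csv(records):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

csv_text = records_to_csv([letter])
print(csv_text)
```

Once every letter in a collection is described with the same fields, the whole set can be sorted, filtered, or counted by date, creator, or repository.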
Finally, structured data is usually highly organized and, when online, often in machine-readable chart form. Here, for example, are two pages from the Polk San Francisco City Directory from 1955-1956 with a screenshot of the machine-readable chart from a CSV (comma-separated values) file below it. This data is clearly structured in both forms. One could argue that it must be, as the entire point of a directory is ease of information access and reading. The latter, however, is the one that we can use in different programs on our computers.
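To illustrate why the machine-readable form matters, here is a minimal sketch that reads a few directory-style rows with Python's standard `csv` module and counts the values in one column. The column names and the three rows are invented for illustration; they are not taken from the Polk directory file.

```python
import csv
import io
from collections import Counter

# Invented sample rows standing in for a directory CSV (not real data).
SAMPLE_CSV = """name,occupation,street
Example Alice,clerk,Market St
Example Bob,printer,Mission St
Example Carol,clerk,Market St
"""

def count_by_column(csv_text, column):
    """Count how often each value appears in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)

occupation_counts = count_by_column(SAMPLE_CSV, "occupation")
print(occupation_counts.most_common())
```

The same question asked of the printed pages would require reading and tallying every entry by hand.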
Internet Archive. | Public Domain.
This post has provided a quick look at what data is for the Arts & Humanities.
The next post will look at what we can do with machine-readable, structured data sets like the publisher’s information. Stay tuned! The post should be up in two weeks.
Correspondance complète de Rousseau ONLINE
In partnership with the Voltaire Foundation, the Correspondance complète de Rousseau ONLINE makes Ralph Leigh’s 52-volume critical edition available in the original French as an ebook collection for the first time. The digital corpus gathers together all 8,000 letters written to and by one of the most important figures of eighteenth-century intellectual history, as well as the correspondence between third parties relating to the writer and his time. Drafts and copies have been collated against the original manuscripts and all variants reproduced. The extensive annotations identify individuals, events and places, explain the linguistic usages of the eighteenth century, give bibliographical information and clarify obscure allusions.
This library purchase was made possible with generous support from the Archie & Harriett Maclean Endowed Fund for French Culture.
Oral History Project Wins Autry Public History Prize
The UC Berkeley Oral History Center (OHC) is thrilled to announce that OHC historian Todd Holmes and project partner Emi Kuboyama from Stanford University have won the 2023 Autry Public History Prize for their digital project, Redress: An Oral History. The award is given by the Western History Association for the best project in public history. Released to the public in 2022, the project documents the history of Japanese American Redress through oral histories and a documentary film, which are featured with related historical resources on a dedicated educational website.
Holmes and Kuboyama began the project in 2018 with the initial goal of documenting the history of the Office of Redress Administration (ORA), the little-known agency charged with administering redress by the Civil Liberties Act of 1988. Emi Kuboyama, the principal creator of the project, had a direct link to the agency and its work. As a native of Hawaii, she was no stranger to the history of Japanese American incarceration or the impact that dark period still held in Japanese American communities. She also began her legal career with the agency in 1994, an experience that had a profound impact on her personally and professionally.
In 2017, Kuboyama attended the OHC’s Advanced Oral History Institute to explore how oral history could help document the historic redress program and the work of the ORA. There she met OHC historian Todd Holmes and the two agreed to partner on the project. With the support of a Japanese American Confinement Sites grant from the National Park Service, they conducted over a dozen interviews with former ORA staff, as well as community leaders affiliated with the program. The recordings and transcripts of those interviews are now housed at the Densho Digital Repository. Upon the completion of the oral history interviews, Holmes and Kuboyama recognized the need to put the history of the ORA into conversation with the experience of the Japanese American community in its forty-six-year journey from internment to redress. With the generous support of the Henri and Tomoye Takahashi Foundation, they enlisted the help of filmmaker Jon Ayon. The collaboration resulted in the film, Redress, which offers the first in-depth look at the history of Japanese American redress as told by the community members who took part in the program, and the government professionals who administered it.
The last part of this digital project was to create a website that would serve not only as a home for the oral histories and film, but also as an educational space for students and the public to learn more about the history of redress. Created by Todd Holmes and Heidi Holmes, the website features two historical pages that supplement the film and oral histories, as well as a resources page that points visitors to related historical material such as books, films, and oral history collections. Since the project’s release in fall 2022, the website has received over 43,000 visitors.
The prize was awarded to Holmes and Kuboyama in October 2023 at the annual Western History Association Conference. In the awards program, the Autry Committee praised the Redress project as “an excellent model of professional public history practice that documents a moment in Western American History that has particular significance for today’s conversations about reparations within other marginalized groups.” The committee also applauded how the project “showcases the power of the medium of oral history.”
The Oral History Center congratulates Todd Holmes, Emi Kuboyama, and their partners on an outstanding project and contribution. For more on the history of Japanese American Redress, visit the project website. And to learn more about the Japanese American experience and the legacy of WWII, see the new oral histories of the OHC’s Japanese American Intergenerational Narratives project, which are featured in the newest season of The Berkeley Remix podcast.
Resources
Redress: An Oral History website
Oral History Center’s Japanese American Intergenerational Narratives Oral History Project
The Berkeley Remix podcast: Season 8: “‘From Generation to Generation’: The Legacy of Japanese American Incarceration”
Now available: Open educational resource of Building Legal Literacies for Text Data Mining
Last summer we hosted the Building Legal Literacies for Text Data Mining institute. We welcomed 32 digital humanities researchers and professionals to the weeklong virtual training, with the goal of empowering them to confidently navigate law, policy, ethics, and risk within digital humanities text data mining (TDM) projects. Building Legal Literacies for Text Data Mining (Building LLTDM) was made possible through a grant from the National Endowment for the Humanities.
Following the remote institute in June 2020, the participants and project team reconvened in February 2021 to discuss how participants had been thinking about, performing, or supporting TDM in their home institutions and projects with the law and policy literacies in mind.
To maximize the reach and impact of Building LLTDM, we have now published a comprehensive open educational resource (OER) of the contents of the institute. The OER covers copyright (both U.S. and international law), technological protection measures, privacy, and ethical considerations. It also helps other digital humanities professionals and researchers run their own similar institutes by describing in detail how we developed and delivered programming (including our pedagogical reflections and take-aways), and includes ideas for hosting shorter literacy teaching sessions. The resource (available as a web-book or in downloadable formats including PDF and EPUB) is in the public domain under the CC0 Public Domain Dedication, meaning it can be accessed, reused, and repurposed without restriction.
In addition to the OER, we’ve also published a white paper that describes the institute’s origins and goals, project overview and activities, and reflections and possible follow-on actions.
Thank you to the National Endowment for the Humanities, the project team, institute participants, and staff at the UC Berkeley Library for making Building LLTDM a success.
[Note: this content is cross-posted on the LLTDM blog.]
New publication by Nick Paige from the French Department
Check out this new book by Department of French faculty member Nicholas Paige, available in print and as an ebook through the online catalog.
From the introduction:
“This book is about the evolution of French and to a lesser degree English novels – by which I mean French- and English-language novels – from 1601 to 1830. And while evolution is very much at the center of my preoccupations, I do not offer a “story” about that evolution. There is no plot, as we might want if we thought of the novel moving forward, perhaps from birth, episode by episode, toward a resolution, some happy state of stability – as if, in other words, the novel’s own history could be made into a kind of novel.”
“In lieu of a story, Technologies of the Novel offers a quantitative account of the ceaseless yet patterned flux of the novel system over these twenty-three decades.”
“Technologies of the Novel is, then, digital and distant; but it is most certainly not antianalogue or anticlose.”
What happened at the Building LLTDM Institute
This update is cross-posted from the Building LLTDM blog.
On June 23-26, we welcomed 32 digital humanities (DH) researchers and professionals to the Building Legal Literacies for Text Data Mining (Building LLTDM) Institute. Our goal was to empower DH researchers, librarians, and professional staff to confidently navigate law, policy, ethics, and risk within digital humanities text data mining (TDM) projects—so they can more easily engage in this type of research and contribute to the further advancement of knowledge. We were joined by a stellar group of faculty to teach and mentor participants. Building LLTDM is supported by a grant from the National Endowment for the Humanities.
Why was the Institute needed?
Until now, humanities researchers conducting text data mining in the U.S. have had to maneuver through a thicket of legal issues without much guidance or assistance. As an example, take a researcher scraping content about Egyptian artifacts from online sites or databases, or downloading videos about Egyptian tomb excavations, in order to conduct automated analysis about religion or philosophy. The researcher then shares these content-rich data sets with others to encourage research reproducibility or enable other researchers to query the data sets with new questions. This kind of work can raise issues of copyright, contract, and privacy law. It can also raise concerns around ethics, for example, if there are plausible risks of exploitation of people, natural or cultural resources, or indigenous knowledge.
Potential law and policy hurdles do not just deter text data mining research: They also bias it toward particular topics and sources of data. In response to confusion over copyright, website terms of use, and other perceived legal roadblocks, some digital humanities researchers have gravitated to low-friction research questions and texts to avoid making decisions about rights-protected data. When researchers limit their research to such sources, it is inevitably skewed, leaving important questions unanswered, and rendering resulting findings less broadly applicable.
Moving an interactive, design-thinking Institute online
After months of preparation, we had been looking forward to working and learning together at UC Berkeley, but the world had other plans for our Institute. Due to the global health crisis, we had to transform our planned in-person, intensive workshop into an interactive and relevant remote experience.
How did we do this? The pandemic meant we had to transition everything online, which of course presents challenges for a design-thinking framework. We are thrilled that our approach to interactive remote pedagogy was successful! (You can check out the schedule and framework in our Participant Packet.) The substantive content was pre-recorded and delivered in a flipped classroom model. Faculty created a series of short videos, and shared readings relevant to the legal literacies. We also provided the video transcripts and slides to participants to promote accessibility and accommodate multiple learning styles.
We used Zoom to meet synchronously for discussion in groups of various sizes. We used Slack for asynchronous communication, and interactive tools such as Mural for design thinking exercises like journey mapping so that everyone could live edit and collaborate. We capped each day with a “happy half hour” on Zoom as an informal way to get to know each other a little better, even from afar.
We also relied on an institute moderator and daily writing exercises to reinforce the design-thinking stages and learning outcomes. Each night, we reviewed the participants’ free-writes and began the next morning by reflecting back to the participants the themes from what they had shared.
Reflections on goals: social justice & effective empowerment
One of our priorities for the Institute was to invite a diverse pool of participants, including those involved in social justice research, in order to maximize the public value impact of Building LLTDM. We looked for demonstrated commitments to diversity and equity but could hardly have imagined the breadth and depth of experiences that applicants were willing to share. The selected participants’ research spans everything from understanding “place” data from community histories of historic African American settlements, to the development of AIDS activist networks in communities of color, to portrayals of autism in literature, and more. Others demonstrated a commitment to bringing back the skills they learned to expand TDM opportunities for students and communities who have traditionally been marginalized or under-resourced. Participants also represented a variety of institution types, research advising and support experience, professional roles, levels of experience with TDM, career stages, and disciplinary perspectives.
We are also moved by the participants’ own reflections on the experience. One of the last interactive exercises we hosted during the online Institute was a collective week-in-review discussion and gratitude wall. We asked the participants to share what they were thankful for, highlighting other participants where possible. So many of the participants wrote about how valuable the learning experience was and how thoughtfully it was put together and delivered.
We can’t express the transformational impact of the week better than the participants themselves. In Institute evaluation forms, they shared feelings like:
- “This is by far the best organized event that I have ever attended. The content was by far the most substantive. The faculty were by far the most engaged. A+ across the board.”
- “I am so grateful to have had the opportunity to engage with a diverse group of scholars (researchers and professionals)… The deliberately thought through breakdown and mix fostered incredibly valuable discussions and I would hope this kind of framework is used as a best practice for future DH institutes of all kinds going forward. Also, thank you for such an amazing virtual experience which I can only imagine took a tremendous amount of work to coordinate and plan with limited time to shift to an entirely different format–I was overjoyed to critically engage with complex subjects…”
- “This has been phenomenal. I don’t want to qualify it (by adding something like “…for having to be moved online”), because it’s been so, so good: well organized, thoughtful, and human throughout.”
- “There was clearly so much thought, care, and planning that went into the preparation of this institute, and it was an amazing opportunity to learn from a group of people — organizers, faculty, and participants — who all have such deep expertise. The video and readings lists alone are a huge resource, but to be able to process and reflect on that material together with a diverse group of people was really wonderful.”
Next steps, and our own gratitude
What’s next for Building LLTDM? The “Institute” is not over yet; only the 1-week training is complete. The cohort will be meeting again virtually in February 2021 to discuss how implementation of the literacies into our local communities and practices has gone. In the meantime, as the participants bring back the law and policy literacies they’ve learned to their home institutions, we are excited to see several cohort members already organizing their own post-Institute research subgroups, such as those whose TDM work relies heavily on social media content, and others who are exploring how to disseminate the Building LLTDM literacies within other instructional formats and frameworks.
As part of the grant, the project team will also be aggregating the resources from the Institute and developing supplementary material for an Open Educational Resource (OER). We know there is a large community of TDM researchers and professionals who may be interested in or who can benefit from these materials, and the OER will be made available for broad reuse in the public domain.
Thank you to all the participants for their insights and contributions, willingness to share, and flexibility in transitioning to a fully-remote Institute. Thank you to all the faculty for their unmatched legal and policy expertise, ongoing commitment to mentorship, and adaptability in content creation and delivery. And thank you again to the NEH for making such a meaningful experience possible.
Team Awarded Grant to Help Digital Humanities Scholars Navigate Legal Issues of Text Data Mining
We are thrilled to share that the National Endowment for the Humanities (NEH) has awarded a $165,000 grant to a UC Berkeley-led team of legal experts, librarians, and scholars who will help humanities researchers and staff navigate complex legal questions in cutting-edge digital research.
What is this grant all about?
If you were to crack open some popular English-language novels written in the 1850s–say, ones from Brontë, Hawthorne, Dickens, and Melville–you would find they describe men and women in very different terms. While a male character might be said to “get” something, a female character is more likely to have “felt” it. Whereas the word “mind” might be used when describing a man, the word “heart” is more likely to be used about a woman. Yet, as the 19th Century became the 20th, these descriptive differences between genders actually diminished. How do we know all this? We confess we have not actually read every novel ever written between the 19th and 21st Centuries (though we’d love to envision a world in which we could). Instead, we can make this assertion because researchers (including David Bamman, of UC Berkeley’s School of Information) used automated techniques to extract information from the novels, and analyzed these word usage trends at scale. They crafted algorithms to turn the language of those novels into data about the novels.
In fields of inquiry like the digital humanities, the application of such automated techniques and methods for identifying, extracting, and analyzing patterns, trends, and relationships across large volumes of unstructured or thinly-structured digital content is called “text data mining.” (You may also see it referred to as “text and data mining” or “computational text analysis.”) Text data mining provides humanists and social scientists with invaluable frameworks for sifting, organizing, and analyzing vast amounts of material. For instance, these methods make it possible to:
- Detect racial disparity by evaluating language from police body camera footage;
- Develop new tools to enable large-scale analysis of television series and photographs; and
- Capture and design new physical representations of naturally occurring laughter.
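As a toy illustration of the kind of word-usage comparison described above, the sketch below counts which words immediately follow “he” and “she” in a short invented passage. It is a deliberate simplification resting on stated assumptions (an invented sample text, a hypothetical helper named `words_after_pronouns`), and it is not the method used in the research mentioned in this post; real text data mining projects work with far larger corpora and more careful linguistic processing.

```python
import re
from collections import Counter, defaultdict

# Invented sample text for illustration only (not from any novel).
SAMPLE_TEXT = (
    "He got the letter. She felt the cold. He got a horse. "
    "She felt uneasy. She got the key."
)

def words_after_pronouns(text, pronouns=("he", "she")):
    """Count the word that directly follows each pronoun within a sentence."""
    counts = defaultdict(Counter)
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z']+", sentence)
        # Pair every token with the token that comes right after it.
        for current, following in zip(tokens, tokens[1:]):
            if current in pronouns:
                counts[current][following] += 1
    return counts

usage = words_after_pronouns(SAMPLE_TEXT)
for pronoun in ("he", "she"):
    print(pronoun, usage[pronoun].most_common())
```

Running the same count over thousands of digitized novels, grouped by decade, is what turns the language of the novels into data about the novels.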
The Problem
Until now, humanities researchers conducting text data mining have had to navigate a thicket of legal issues without much guidance or assistance. For instance, imagine the researchers needed to scrape content about Egyptian artifacts from online sites or databases, or download videos about Egyptian tomb excavations, in order to conduct their automated analysis. And then imagine the researchers also want to share these content-rich data sets with others to encourage research reproducibility or enable other researchers to query the data sets with new questions. This kind of work can raise issues of copyright, contract, and privacy law, not to mention ethics if there are issues of, say, indigenous knowledge or cultural heritage materials plausibly at risk. Indeed, in a recent study of humanities scholars’ text analysis needs, participants noted that access to and use of copyright-protected texts was a “frequent obstacle” in their ability to select appropriate texts for text data mining.
Potential legal hurdles do not just deter text data mining research; they also bias it toward particular topics and sources of data. In response to confusion over copyright, website terms of use, and other perceived legal roadblocks, some digital humanities researchers have gravitated to low-friction research questions and texts to avoid decision-making about rights-protected data. They use texts that have entered into the public domain or use materials that have been flexibly licensed through initiatives such as Creative Commons or Open Data Commons. When researchers limit their research to such sources, it is inevitably skewed, leaving important questions unanswered, and rendering resulting findings less broadly applicable. A growing body of research also demonstrates how race, gender, and other biases found in openly available texts have contributed to and exacerbated bias in developing artificial intelligence tools.
The Solution
The good news is that the NEH has agreed to support an Institute for Advanced Topics in the Digital Humanities to help key stakeholders to learn to better navigate legal issues in text data mining. Thanks to the NEH’s $165,000 grant, Rachael Samberg of UC Berkeley Library’s Office of Scholarly Communication Services will be leading a national team (identified below) from more than a dozen institutions and organizations to teach humanities researchers, librarians, and research staff how to confidently navigate the major legal issues that arise in text data mining research.
Our institute is aptly called Building Legal Literacies for Text Data Mining (Building LLTDM), and will run from June 23-26, 2020 in Berkeley, California. Institute instructors are legal experts, humanities scholars, and librarians immersed in text data mining research services, who will co-lead experiential meeting sessions empowering participants to put the curriculum’s concepts into action.
In October, we will issue a call for participants, who will receive stipends to support their attendance. We will also be publishing all of our training materials in an openly-available online book for researchers and librarians around the globe to help build academic communities that extend these skills.
Building LLTDM team member Matthew Sag, a law professor at Loyola University Chicago School of Law and leading expert on copyright issues in the digital humanities, said he is “excited to have the chance to help the next generation of text data mining researchers open up new horizons in knowledge discovery. We have learned so much in the past ten years working on HathiTrust [a text-minable digital library] and related issues. I’m looking forward to sharing that knowledge and learning from others in the text data mining community.”
Team member Brandon Butler, a copyright lawyer and library policy expert at the University of Virginia, said, “In my experience there’s a lot of interest in these research methods among graduate students and early-career scholars, a population that may not feel empowered to engage in “risky” research. I’ve also seen that digital humanities practitioners have a strong commitment to equity, and they are working to build technical literacies outside the walls of elite institutions. Building legal literacies helps ease the burden of uncertainty and smooth the way toward wider, more equitable engagement with these research methods.”
Kyle K. Courtney of Harvard University serves as Copyright Advisor at Harvard Library’s Office for Scholarly Communication, and is also a Building LLTDM team member. Courtney added, “We are seeing more and more questions from scholars of all disciplines around these text data mining issues. The wealth of full-text online materials and new research tools provide scholars the opportunity to analyze large sets of data, but they also bring new challenges having to do with the use and sharing not only of the data but also of the technological tools researchers develop to study them. I am excited to join the Building LLTDM team and help clarify these issues and empower humanities scholars and librarians working in this field.”
Megan Senseney, Head of the Office of Digital Innovation and Stewardship at the University of Arizona Libraries, reflected on the opportunities for ongoing library engagement that extend beyond the initial institute. Senseney said, “Establishing a shared understanding of the legal landscape for TDM is vital to supporting research in the digital humanities and developing a new suite of library services in digital scholarship. I’m honored to work and learn alongside a team of legal experts, librarians, and researchers to create this institute, and I look forward to integrating these materials into instruction and outreach initiatives at our respective universities.”
Next Steps
The Building LLTDM team is excited to begin supporting humanities researchers, staff, and librarians en route to important knowledge creation. Stay tuned if you are interested in participating in the institute.
In the meantime, please join us in congratulating all the members of the project team:
- Rachael G. Samberg (University of California, Berkeley) (Project Director)
- Scott Althaus (University of Illinois, Urbana-Champaign)
- David Bamman (University of California, Berkeley)
- Sara Benson (University of Illinois, Urbana-Champaign)
- Brandon Butler (University of Virginia)
- Beth Cate (Indiana University, Bloomington)
- Kyle K. Courtney (Harvard University)
- Maria Gould (California Digital Library)
- Cody Hennesy (University of Minnesota, Twin Cities)
- Eleanor Koehl (University of Michigan)
- Thomas Padilla (University of Nevada, Las Vegas; OCLC Research)
- Stacy Reardon (University of California, Berkeley)
- Matthew Sag (Loyola University Chicago)
- Brianna Schofield (Authors Alliance)
- Megan Senseney (University of Arizona)
- Glen Worthey (Stanford University)
Workshop: Publish Digital Books & Open Educational Resources with Pressbooks
Monday, May 6, 11:10am-12:30pm
Academic Innovation Studio, Dwinelle Hall 117 (Level D)
If you’re looking to self-publish work of any length and want an easy-to-use tool that offers a high degree of customization, allows flexibility with publishing formats (EPUB, MOBI, PDF), and provides web-hosting options, Pressbooks may be great for you. Pressbooks is often the tool of choice for academics creating digital books, open textbooks, and open educational resources, since you can license your materials for reuse however you desire. Learn why and how to use Pressbooks for publishing your original books or course materials. You’ll leave the workshop with a project already under way! Register at bit.ly/dp-berk
Workshop: By Design: Graphics & Images Basics
Monday, April 22, 4:10-5:00pm
D-Lab, 350 Barrows Hall
In this hands-on workshop, we will learn how to create web graphics for your digital publishing projects and websites. We will cover topics such as: image editing tools in Photoshop; image resolution for the web; sources for free public domain and Creative Commons images; and image upload to publishing tools such as WordPress. If possible, please bring a laptop with Photoshop installed. All UCB faculty and students can receive a free Adobe Creative Suite license. Register at bit.ly/dp-berk
Upcoming Workshops in this Series 2018-2019:
- Publish Digital Books & Open Educational Resources with Pressbooks
Please see bit.ly/dp-berk for details.