Coming Soon: Love Your Data, from Editathons to Containers!

UC Berkeley has been loving its data for a long time, and has been part of the international movement which is Love Data Week (LDW) since at least 2016, even during the pandemic!  This year is no exception—the UC Berkeley Libraries and our campus partners are offering some fantastic workshops (four of which are led by our very own librarians) as part of the University of California-wide observance.

Love Data Week 2023 is happening next month, February 13-17 (it’s always during the week of Valentine’s Day)!

University of California 2023 Love Data Week calendar with UC Berkeley offerings

UC Berkeley Love Data Week offerings for 2023 include:

GIS & Mapping: Where to Start

Wikipedia Edit-a-thon (you can also dip into Wikidata at other LDW events)

Introduction to Containers

Textual Analysis with Archival Materials

Getting Started with Qualitative Data Analysis

All members of the UC community are welcome—we hope you will join us!  Registration links for our offerings are above, and the full UC-wide calendar is here.   If you are interested in learning more about what the library is doing with data, check out our new Data + Digital Scholarship Services page.  And, feel free to email us at librarydataservices@berkeley.edu.   Looking forward to data bonding next month!


Love data? Join us for Love Data Week 2022, Feb. 14-18!

Once again, UC Libraries are collaborating on a UC-wide Love Data Week series of talks, presentations, and workshops Feb. 14-18, 2022. With over 30 presentations and workshops, there’s plenty to choose from, with topics such as:

  • How to write effective data management plans
  • Text analysis with Python
  • How and where to share your research data
  • Geospatial analysis with R and with Jupyter Notebooks
  • Data ethics & justice
  • Cleaning and coding data for qualitative analysis
  • Software management for researchers
  • An introduction to databases for newspapers and social science data
  • 3-D data, visualization, and mapping

All members of the UC community are invited to attend these events to gain hands-on experience, learn about resources, and engage in discussions about data needs throughout the research process. To register for workshops during this week and see what other sessions will be offered UC-wide, visit the UC Love Data Week 2022 website.


Event: Workshops on working with qualitative and textual data

The Library Data Services Program is offering a series of workshops on working with qualitative and textual data. Each workshop is designed to help novice learners get started with cleaning, organizing, analyzing, and presenting qualitative or textual data. Sessions include cleaning and coding qualitative data in MaxQDA and the open-source Taguette program, organizing and writing up research projects in Scrivener, and archiving qualitative data once a project has been completed. Each workshop is designed to act as a starting point for learning concepts and will familiarize attendees with additional resources for getting help.

  1. Archiving data with the Qualitative Data Repository (QDR)

Wednesday, January 26th: 10:00 – 11:00 AM

  2. What do I do with all of this text? Cleaning and coding data for qualitative analysis

Tuesday, February 15th: 10:00 AM – 12:00 PM

  3. Getting Started with MaxQDA

Monday, March 14th: 1:00 – 3:00 PM

  4. Introduction to Scrivener

Monday, April 18th: 1:00 – 3:00 PM


A Library Research Journey (Pandemic Edition)

Screenshot of team members
Association of College and Research Libraries conference poster–screenshot of recorded talk

Even beyond those who believe that librarians sit around and read books all day (which would be delightful but is most definitely not our reality), many are surprised to learn that librarians double as active researchers. This is especially true in settings where librarians are members of the faculty, but even where that isn’t the case, such as at Berkeley, librarians are born investigators, and that instinct carries over into wanting to learn about, and add to the knowledge of, our own settings.

What does it look like to conduct library research?  Glad you asked! In our case, it started with a conversation and an idea.  Natalia Estrada (now Berkeley’s Political Science and Public Policy Librarian, then the Social Sciences Collection and Reference Assistant and in library school) and I were talking about how much we admired the work of Kaetrena Davis Kendrick.  Kendrick wrote a foundational work in the study of librarian workplace morale, The Low Morale Experience of Academic Librarians: A Phenomenological Study, and it sparked many more studies on this topic.  But, where were the studies of library staff experiences?  We wanted to find out!

We were lucky to recruit two colleagues who added so much to the team: Bonita Dyess, Circulation/Reserves Supervisor at the Earth Sciences/Map Library, and Celia Emmelhainz, Berkeley’s Anthropology & Qualitative Research Librarian.  First we applied for (and eventually got) funding for the research from LAUC (the Librarians Association of the University of California).  This meant we could pay for transcribing our interviews, give the participants gift cards, and buy qualitative data analysis software.  Then we applied for (and got) approval from the IRB (Institutional Review Board), making sure we were complying with processes for research with human subjects.

Here’s where the “pandemic edition” part comes in. All this planning and applying, starting in November 2019, took time, so by the time we were actually ready to recruit participants, it was April 2020. We were sheltering in place, and not sure how this all would work (although it was probably better than having to go virtual in mid-stream)! Nevertheless, we hurled out information about and invitations to be part of the study to every listserv, association, and friendly librarian we could think of, nationwide. We ended up doing 34 interviews with academic library staff from a range of locations and institution types (purposefully excluding the UC system) during a three-week period in May-June 2020. Due to COVID, these were all online, either by phone or Google Meet (sort of like Zoom), and we asked a structured list of questions, with room for branching into other topics or diving deeply. Celia trained a wonderful student to transcribe the interviews, and once we had those transcripts and had stripped identifying information from them, we were off: coding away (using MAXQDA software), and drawing themes, quotes, recommendations, and other findings from the surprisingly rich information we’d collected.

Next—we had to start getting the information out into the world!  Our eventual goal is to write a paper, or several, for publication.  There are a number of library and information science journals out there that we are considering… but that takes time as well, and we wanted to start presenting our findings sooner.  So, we did an “initial findings” presentation to the UC Berkeley Library Research Working Group, and then stepped into the big time with acceptance to present a poster at the 2021 Association of College and Research Libraries online conference (our poster got almost 600 views), and with a webinar we did for the Pennsylvania Library Association (both the poster and the webinar slides are available through the UC’s eScholarship portal).  All our work to get to this point is hopefully now helping others.

Screenshot of title slide of PA Library Association webinar

And, a word about connecting with our participants.  We were bowled over by their generosity with us and by all they had to say: much that we didn’t expect, and much that they were grateful someone was even asking about.  It ended up that we had captured one of the last opportunities to get a snapshot of pre-COVID library staff life; people were still in limbo, and talked about their regular jobs before any lockdowns, for the most part. At that point most expected to be back in their libraries and all to be normal by the end of the summer 2020.  We know now that that didn’t happen, and we know that library re-openings and staff roles in them have been challenging and sometimes contentious; we wish we’d known to ask for permission to re-interview our participants—even if only to check in with them.  But how could we have known?  We wonder how they are.

So now, we have papers to write, and thinking to do about how to take our questions into new avenues of research—because it’s a never-ending, and completely exciting process, and, we suspect, will be very different (easier? or not?) in the post-COVID landscape.  Do you have ideas for us?  We’d love to hear them!  Or want to hear more about our morale study? Please get in touch with us at librarystaffmorale@berkeley.edu!


Love Data? Join Us During Love Data Week 2021, Feb 8-12!

Love Data Week 2021

Since our Love Data Week invitation post last year, the COVID pandemic has created a new world, along with amazing new opportunities and challenges related to data. Just a peek at data.berkeley.edu (the portal for Berkeley’s Computing, Data Science, and Society Division) shows that data-related research during this past pandemic year, even with its intense and difficult challenges, has revealed new insights. Check out “Pandemic provides real-time experiment for diagnosing, treating misinformation, disinformation”.*

So, it’s fitting that Love Data Week 2021 at Berkeley, hosted by the UC Berkeley Library in partnership with Berkeley’s Research IT department, is focused on the kinds of issues we are confronted with in a wholly-online research environment.  Join us on Tuesday for a session on ethical considerations in data, most definitely a concern with many of Berkeley’s researchers looking at issues related to COVID; on Wednesday for a talk on cybersecurity (aimed at graduate researchers but all are welcome); on Thursday for another security-related workshop, “Getting Started with LastPass & Veracrypt”; and on Friday for an introduction to Savio, Berkeley’s high performance computing cluster.  Please click on this link for information on these, and registration links!

Questions? E-mail LDW 2021 at researchdata@berkeley.edu. And, if we’ve whetted your appetite for data and more data, take a look at the University of California-wide Love Data Week offerings. If you’ve ever wondered what an API is, or want a quick intro to SQL, or even just want to know what the acronyms stand for, there are sessions on these topics and more!

*  The same page makes it clear that data is for everyone; check out “I Am a Data Scientist”, about a student who came to Berkeley as an English major and discovered how data can “shed light on larger-scale questions”, and “Translating Numbers Into Words: The Art of Writing About Data Science”, featuring three Berkeleyites who are getting the word out about data.

 


Upcoming workshop on how to share and publish data

The image is a slide with the title of the workshop, date, and presenters

On December 1, 2020, from 12:30pm–2:00pm, the Library is teaming up with Research Data Management to host a workshop, How to Share and Publish Data: Resources, Law, and Policy. Sign up here.

Are you unsure about how you can use or reuse other people’s data in your teaching or research, and what the terms and conditions are? Do you want to share your data with other researchers or license it for reuse but are wondering how and if that’s allowed? Do you have questions about university or granting agency data ownership and sharing policies, rights, and obligations? We will provide clear guidance on all of these questions and more in this interactive webinar on the ins-and-outs of data sharing and publishing.

Join the Library’s Office of Scholarly Communication Services and the Research Data Management Program as we:

  • Explore venues and platforms for sharing and publishing data
  • Unpack the terms of contracts and licenses affecting data reuse, sharing, and publishing
  • Help you understand how copyright does (and does not) affect what you can do with the data you create or wish to use from other people
  • Consider how to license your data for maximum downstream impact and reuse
  • Demystify data ownership and publishing rights and obligations under university and grant policies

Intended audiences include faculty, grad students, post-docs, instructors, and academic support staff, but anyone interested is welcome to attend.


“Checking the Boxes” – A panel on race, ethnicity, and the Census

Although we don’t always think of it that way, one federal government program that affects each of us in the United States is the decennial census. And among the many kinds of challenges that a pandemic has brought us, its effect on gathering good-quality census data is high on the list.

Earlier this year, the Library hosted a well-attended (physical) exhibit related to the census, Power and the People: The US Census and Who Counts (which can still be experienced online). Related to the exhibit, we had planned to host a panel of campus experts on the contested race and ethnicity questions in the census, and how they’ve shifted over time… until March 17, when the Bay Area went under a shelter-in-place order and the program had to be postponed. But last month, thanks to a persistent team, generous panelists, and the wonders of Zoom, we were thrilled to be able to present the panel at last, online!

The program, titled Checking the Boxes: Race(ism), Latinx and the Census, featured three UC Berkeley experts on racial and ethnic categorizations in the census.  Cristina Mora (Associate Professor of Sociology and Chicano/Latino studies), Tina Sacks (Assistant Professor, School of Social Welfare), and Victoria Robinson (Lecturer and American Cultures Program Director, Department of Ethnic Studies) were joined by our moderator, librarian Jesse Silva, for presentations and a lively Q&A.

Professor Mora started the program off by observing that “ethnic and race categories are political constructs… They are not set-in-stone scientific markers of identity or genetic composition.” She noted that since census counts are directly related to funding, communities have a vested interest in getting accurate and complete counts, but this can be very difficult for groups and areas that are designated Hard to Count. Professor Sacks continued by emphasizing the ways in which census-driven funding allocations can affect people in poverty and those in social safety net programs. She also noted the intersections shown by census data between race and place, such as areas with a substantial number of incarcerated people. Finally, Professor Robinson added background and context by discussing the site racebox.org, which shows the history of the race questions on the census from 1790 onwards, and which illuminates the changes in the cultural and social conceptions of what race is and how it can be measured.

The program concluded with an animated question and answer period, which included Professor Mora’s elaborating on the differences between racial and ethnic categories, Professor Sacks (who has actually been a census enumerator) discussing the challenges of counting the homeless population, and Professor Robinson revisiting the question of incarceration and the Attica problem: “[Incarcerated people’s] residence is considered to be a prison. That’s not their home, and the relationship then to the power…in the communities that they [aren’t from], that’s the Attica problem.”

Of course, this summary doesn’t do justice to the range and depth of the issues discussed.  If you missed this program, or would like to see it again, check it out on the UC Berkeley Library’s YouTube channel!

Census panel speaker photos
(Clockwise from top left) Jesse Silva, Cristina Mora, Victoria Robinson, Tina Sacks

Data Publishing with Dryad Digital Repository 

(Photo by J. Pierre Carrillo for the UC Berkeley Library)

The California Digital Library (CDL) recently partnered with Dryad to provide enhanced data publishing and curation support for researchers. Dryad is a free service that enables researchers to archive and make publicly available their research data for the long term. Dryad replaces Dash, which was the data repository previously available to the university. 

Datasets published in Dryad receive a Digital Object Identifier (DOI) and a citation, which together give the data a persistent location and identifier and make the data citable in future use. Additionally, Dryad fulfills many of the data sharing requirements stipulated by funders and publishers, many of whom may require that data be made freely and openly available at the end of a project or upon publication.
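For readers who like to see the idea in code, here is a minimal Python sketch of how a DOI becomes a persistent link and how a dataset citation might be assembled from it. It is an illustration only, not Dryad's own tooling: the `Dataset` class, the `cite` layout, and the DOI `10.1234/example.abcd` are all hypothetical placeholders. The one real piece is the `https://doi.org/` resolver, which redirects a DOI to wherever the data currently lives.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class Dataset:
    """Minimal, hypothetical description of a published dataset."""
    authors: list[str]
    year: int
    title: str
    repository: str
    doi: str  # bare DOI; the value used below is a placeholder, not a real record


def doi_url(doi: str) -> str:
    """Turn a DOI into a persistent link via the doi.org resolver."""
    doi = doi.strip()
    # Accept DOIs pasted with a "doi:" prefix or as a full resolver URL.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Percent-encode unusual characters but keep the slash between prefix and suffix.
    return "https://doi.org/" + quote(doi, safe="/")


def cite(ds: Dataset) -> str:
    """Format a simple dataset citation; this layout is illustrative only."""
    authors = "; ".join(ds.authors)
    return f"{authors} ({ds.year}). {ds.title} [Dataset]. {ds.repository}. {doi_url(ds.doi)}"


if __name__ == "__main__":
    example = Dataset(
        authors=["Doe, J.", "Roe, R."],
        year=2020,
        title="Sample survey data",
        repository="Example Repository",
        doi="doi:10.1234/example.abcd",  # hypothetical DOI
    )
    print(cite(example))
```

Because the link is built from the DOI rather than from a web address on a particular server, the citation keeps working even if the repository reorganizes its site.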

Publishing data to Dryad is relatively quick and easy. As a UC Berkeley researcher, begin the upload process by signing in to Dryad using your ORCID iD. A curator then reviews and enriches the data so that it is Findable, Accessible, Interoperable, and Reusable, or FAIR. By making your data FAIR, others in your area of expertise will be able to locate, understand, and potentially reuse the data you generated. Data that is made easily findable and publicly available contributes to raising the quality of scholarly output by making the process of data production transparent. Funders require data publishing to better leverage research dollars, and publishers require data publishing to enhance the quality of scholarly literature.

Please visit datadryad.org to explore published datasets. If you have any questions about preparing your data for publication or using Dryad, please contact researchdata@berkeley.edu.


Event: Love Data Week workshops

Please join us for a series of events on February 11th-15th during Love Data Week.

This nationwide campaign is designed to raise awareness about data management, security, sharing, and preservation. Students, researchers, librarians, and data specialists are invited to attend these events to gain hands-on experience, learn about resources, and engage in discussion around data needs throughout the research process.

To register for these events and find out more, please visit: https://guides.lib.berkeley.edu/ldw2019

MONDAY, FEBRUARY 11
Intro to Savio workshop
3:30-5:00 pm, Dwinelle 117 (Academic Innovation Studio)
Berkeley Research Computing is offering an introductory training session on using Savio, the campus Linux high-performance computing cluster. We’ll give an overview of how the cluster is set up, different ways you can get access to the cluster, logging in, transferring files, accessing software, and submitting and monitoring jobs. New, prospective, and current users are invited.

TUESDAY, FEBRUARY 12
Code Ocean lunch & learn
12:00-1:00 pm, Doe Library, Room 190 (BIDS)
Join us for a demonstration and Q&A session on the Code Ocean platform! Code Ocean is a cloud-based computational reproducibility platform that provides researchers and developers an easy way to share, discover, and run code published in academic journals and conferences.

TUESDAY, FEBRUARY 12
Preparing your data and code for reproducible publication
2:00-4:00 pm, Doe Library, Room 190 (BIDS)
This is a step-by-step, practical workshop to prepare your research code and data for computationally reproducible publication. The workshop starts with some brief introductory information about computational reproducibility, but the bulk of the workshop is guided work with code and data. We cover the basic best practices for publishing code and data.

WEDNESDAY, FEBRUARY 13
Shaping Clouds: Scaling Infrastructure for Research and Instruction at Berkeley
1:00-2:00 pm, Doe Library, Room 190 (BIDS)
There are many great resources for research and instruction across campus, but it can be difficult to determine what is available and where to find it. Join us for a showcase and community discussion about two cutting-edge cloud platforms, Analytic Environments on Demand (AEoD) and JupyterHub, and how best to provide a holistic ecosystem of these and other tools.

THURSDAY, FEBRUARY 14
Data Security: I just called to say I love you
1:00-2:00 pm, Dwinelle 117 (Academic Innovation Studio)
Learn what love the Information Security & Policy office shows campus and why a day without ISP would break the University’s heart. We will also talk about simple ways you can protect your identity and show your data love.

Sponsored by the University Library, Research IT, Berkeley Institute for Data Science, Information Security and Policy, and CITRIS.


Library Carpentry Sprint May 10th and 11th

The UC Berkeley Library is hosting the 2018 Library Carpentry Sprint on May 10th and 11th. This sprint is part of the larger 2018 Mozilla Global Sprint, and will take place in the Berkeley Institute for Data Science (BIDS), 190 Doe Library, from 2-5pm on Thursday, May 10th and from 1-5pm on Friday, May 11th. All are welcome, and no experience with Library Carpentry or participating in a sprint is required. Come help us update the existing Library Carpentry curriculum or just come to see what Library Carpentry is all about. If you wish to sign up in advance, simply add your name to the Library Carpentry sprint etherpad under the UC Berkeley section. More information about Library Carpentry can be found here.

What

Library Carpentry Sprint is an international campaign that is a part of the larger Mozilla Global Sprint 2018. The goal of this Library Carpentry sprint is to improve/extend Library Carpentry lessons. Participants can contribute code or content, proofread writing, help with visual design and graphic art, do QA (quality assurance) on prototype tools, or advise or comment on project ideas or plans. All skill levels are welcome!

When

You can drop by anytime on May 10th from 2-5pm or May 11th from 1-5pm

Where

Berkeley Institute for Data Science (BIDS), 190 Doe Memorial Library

Questions

Contact Scott Peterson, speterso@berkeley.edu