Wikipedia has become so central to our lives that we count on it to represent reality, and solid fact. When we encounter a new phenomenon, we check out our trusty online friend for more information. So, it was fascinating to me recently to see the lines blur between fiction and reality, when Wikipedia was used as a visual and social cue in the movie Tár, starring Cate Blanchett, about a famed female conductor. In the movie, one of the clues to the coming turbulence in Lydia Tár’s life is a screen capture of a mystery editor changing items on the conductor’s Wikipedia entry. It looked and felt so real, the filming and Blanchett’s performance so rivetingly vivid, that many people believed the film was a biopic of a real person. As Brooke LaMantia wrote in her article, No, Lydia Tar is Not Real,
“When I left the theater after watching Tár for two hours and 38 minutes, I immediately fumbled for my phone. I couldn’t wait to see actual footage of the story I had just seen and was so ready for my Wikipedia deep dive to sate me during my ride home. But when I frantically typed “Lydia Tar?” into Google as I waited for my train, I was greeted with a confusing and upsetting realization: Lydia Tár is not real…the film’s description on Letterboxd — “set in the international world of classical music, centers on Lydia Tár, widely considered one of the greatest living composer/conductors and first-ever female chief conductor of a major German orchestra” — is enough to make you believe Tár is based on a true story. The description was later added to a Wikipedia page dedicated to “Lydia Tár,” but ahead of the film’s October 28 wide release, that page has now been placed under a broader page for the movie as a whole. Was this some sort of marketing sleight of hand or just a mistake I stumbled upon? Am I the only one who noticed this? I couldn’t be, right? I thought other people had to be stuck in that same cycle of questioning: Wait, this has to be real. Or is it? She’s not a real person?
Wikipedia is central to LaMantia’s questioning! While it’s easy to understand people’s confusion in general, the Tár Wikipedia page, created by editors like you and like me, is very clear that this is a film, at least as of today’s access date, January 20, 2023… On the other hand, did you know you can click on the “View History” link on the page, and see every edit that has been made to it, since it was created, and who made that edit? If you look at the page resulting from one of the edits from October 27, 2022, you can see that it does look like Tár is a real person, and in fact, a person who later went on to edit this entry to make it clearer wrote, “Reading as it was, it is not clear if Lydia actually exists.” Maybe I should write to LaMantia and let her know.
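For readers who would rather pull that edit history with a script than click through it, here is a minimal sketch using the public MediaWiki API with only the Python standard library. It is an illustration, not an official recipe: the function names (`history_url`, `summarize`, `fetch_history`), the User-Agent string, and the assumption that the film's article is titled "Tár" are all ours.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def history_url(title, limit=10):
    """Build a MediaWiki API URL that lists the most recent revisions of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",  # when, who, and the edit summary
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,  # version 2 returns "pages" as a list
    }
    return API_URL + "?" + urlencode(params)


def summarize(response):
    """Turn the decoded JSON response into (timestamp, user, comment) tuples."""
    page = response["query"]["pages"][0]
    return [
        (rev["timestamp"], rev.get("user", ""), rev.get("comment", ""))
        for rev in page.get("revisions", [])
    ]


def fetch_history(title, limit=10):
    """Download and summarize a page's recent revisions (needs network access)."""
    # The User-Agent below is a placeholder; use one that identifies your own script.
    request = Request(
        history_url(title, limit),
        headers={"User-Agent": "ldw-history-sketch/0.1 (example)"},
    )
    with urlopen(request) as reply:
        return summarize(json.load(reply))
```

Calling `fetch_history("Tár")` should return the same kind of information the "View History" tab shows: the time of each edit, the editor, and the edit summary they left.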
I tell this story to show that clearly, Wikipedia is a phenomenon, and a globally central one, which makes it all the more amazing that it is created continuously, edit by edit, editor by editor. There are many ways in which our own and your own edits can create change, lead to social justice, correct misinformation and more. While it’s easy to get lost in the weeds of minute changes to esoteric entries, it’s also possible to improve pages on important figures in real-life history and bring them into our modern narrative and consciousness. And it’s easy to do!
If you are interested in learning more, and being part of this central resource, we warmly welcome you and invite you to join us on Wednesday, February 15, from 1-2:30 for our 2023 Wikipedia Editathon, part of the University of California-wide 2023 Love Data Week. No experience is required; we will teach you all you need to know about editing! (If you want to edit with us in real time, though, please create a Wikipedia account before the workshop.) The link to register is here, and you can contact any of the workshop leaders (listed on the registration page) with questions. We look forward to editing with you!
UC Berkeley has been loving its data for a long time, and has been part of the international movement which is Love Data Week (LDW) since at least 2016, even during the pandemic! This year is no exception—the UC Berkeley Libraries and our campus partners are offering some fantastic workshops (four of which are led by our very own librarians) as part of the University of California-wide observance.
Love Data Week 2023 is happening next month, February 13-17 (it’s always during the week of Valentine’s Day)!
UC Berkeley Love Data Week offerings for 2023 include:
Wikipedia Edit-a-thon (you can also dip into Wikidata at other LDW events)
All members of the UC community are welcome—we hope you will join us! Registration links for our offerings are above, and the full UC-wide calendar is here. If you are interested in learning more about what the library is doing with data, check out our new Data + Digital Scholarship Services page. And, feel free to email us at firstname.lastname@example.org. Looking forward to data bonding next month!
Once again, UC Libraries are collaborating on a UC-wide Love Data Week series of talks, presentations, and workshops Feb. 14-18, 2022. With over 30 presentations and workshops, there’s plenty to choose from, with topics such as:
- How to write effective data management plans
- Text analysis with Python
- How and where to share your research data
- Geospatial analysis with R and with Jupyter Notebooks
- Data ethics & justice
- Cleaning and coding data for qualitative analysis
- Software management for researchers
- An introduction to databases for newspapers and social science data
- 3-D data, visualization, and mapping
All members of the UC community are invited to attend these events to gain hands-on experience, learn about resources, and engage in discussions about data needs throughout the research process. To register for workshops during this week and see what other sessions will be offered UC-wide, visit the UC Love Data Week 2022 website.
Since our Love Data Week invitation post last year, the COVID pandemic has created a new world, along with amazing new opportunities and challenges related to data. Just a peek at data.berkeley.edu (the portal for Berkeley’s Computing, Data Science, and Society Division) shows that data-related research during this past pandemic year, even with its intense and difficult challenges, has revealed new insights. Check out “Pandemic provides real-time experiment for diagnosing, treating misinformation, disinformation”.*
So, it’s fitting that Love Data Week 2021 at Berkeley, hosted by the UC Berkeley Library in partnership with Berkeley’s Research IT department, is focused on the kinds of issues we are confronted with in a wholly-online research environment. Join us on Tuesday for a session on ethical considerations in data, most definitely a concern with many of Berkeley’s researchers looking at issues related to COVID; on Wednesday for a talk on cybersecurity (aimed at graduate researchers but all are welcome); on Thursday for another security-related workshop, “Getting Started with LastPass & Veracrypt”; and on Friday for an introduction to Savio, Berkeley’s high performance computing cluster. Please click on this link for information on these, and registration links!
Questions? E-mail LDW 2021 at email@example.com. And, if we’ve whetted your appetite for data and more data, take a look at the University of California-wide Love Data Week offerings. If you’ve ever wondered what an API is, or want a quick intro to SQL, or even just want to know what the acronyms stand for, there are these sessions and more!
* The same page makes it clear that data is for everyone; check out “I Am a Data Scientist”, about a student who came to Berkeley as an English major and discovered how data can “shed light on larger-scale questions”, and “Translating Numbers Into Words: The Art of Writing About Data Science”, featuring three Berkeleyites who are getting the word out about data.
— Yasmina Anwar (@yasmina_anwar) February 13, 2018
Last week, the University Library, the Berkeley Institute for Data Science (BIDS), and the Research Data Management program were delighted to host Love Data Week (LDW) 2018 at UC Berkeley. Love Data Week is a nationwide campaign designed to raise awareness about data visualization, management, sharing, and preservation. The theme of this year’s campaign was data stories: how data is being used in meaningful ways to shape the world around us.
At UC Berkeley, we hosted a series of events designed to help researchers, data specialists, and librarians to better address and plan for research data needs. The events covered issues related to collecting, managing, publishing, and visualizing data. The audiences gained hands-on experience with using APIs, learned about resources that the campus provides for managing and publishing research data, and engaged in discussions around researchers’ data needs at different stages of their research process.
Participants from many campus groups (e.g., LBNL, CSS-IT) were eager to continue the stimulating conversation around data management. Check out the full program and information about the presented topics.
Photographs by Yasmin AlNoamany for the University Library and BIDS.
LDW at UC Berkeley kicked off with a walkthrough and demos of the Scopus APIs (Application Programming Interfaces), led by Eric Livingston of the publishing company Elsevier. Elsevier provides a set of APIs that allow users to access the content of journals and books published by Elsevier.
In the first part of the session, Eric provided a quick introduction to APIs and an overview of Elsevier’s APIs. He illustrated the purposes of the different APIs that Elsevier provides, such as the ScienceDirect APIs, SciVal API, Engineering Village API, Embase APIs, and Scopus APIs. As mentioned by Eric, anyone can get free access to Elsevier APIs, and the content published by Elsevier under Open Access licenses is fully available. Eric explained that Scopus APIs allow users to access curated abstracts and citation data from all scholarly journals indexed by Scopus, Elsevier’s abstract and citation database. He detailed several popular Scopus APIs, such as the Search API, Abstract Retrieval API, Citation Count API, Citation Overview API, and Serial Title API. Eric also gave an overview of the amount of data that the Scopus database holds.
In the second half of the workshop, Eric explained how Scopus APIs work and how to get a key to Scopus APIs, and showed different authentication methods. He walked the group through live queries, showed them how to extract data from the API, and demonstrated how to debug queries using the advanced search. He talked about the limitations of the APIs and provided tips and tricks for working with Scopus APIs.
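To give a flavor of what such a query looks like, here is a minimal sketch of a Scopus Search API request in Python. It is our own illustration, not the code from the session: the function names and the example query are assumptions, and you would need to supply your own API key from Elsevier's developer portal. It passes the key in the `X-ELS-APIKey` request header, which is one of the authentication options.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def build_search_request(query, api_key, count=5):
    """Prepare (but do not send) a Scopus Search API request."""
    params = urlencode({"query": query, "count": count})
    headers = {
        "X-ELS-APIKey": api_key,       # your personal key goes here
        "Accept": "application/json",  # ask for JSON rather than XML
    }
    return Request(SEARCH_URL + "?" + params, headers=headers)


def extract_titles(payload):
    """Pull document titles out of a decoded JSON search response."""
    entries = payload.get("search-results", {}).get("entry", [])
    return [entry.get("dc:title", "") for entry in entries]


def search_scopus(query, api_key, count=5):
    """Send the request and return the titles found (needs network access and a valid key)."""
    with urlopen(build_search_request(query, api_key, count)) as reply:
        return extract_titles(json.load(reply))
```

With a valid key, a call such as `search_scopus("TITLE-ABS-KEY(data management)", "YOUR-API-KEY")` would return a short list of matching titles. Splitting the request building from the sending makes it easy to inspect the URL and headers first, which helps when debugging a query.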
Eric left the attendees with actionable, working code and scripts to retrieve data from Scopus APIs.
On the second day, we hosted a Data Stories and Visualization Panel, featuring Claudia von Vacano (D-Lab), Garret S. Christensen (BIDS and BITSS), Orianna DeMasi (Computer Science and BIDS), and Rita Lucarelli (Department of Near Eastern Studies). The talks and discussions centered upon how data is being used in creative and compelling ways to tell stories, in addition to rewards and challenges of supporting groundbreaking research when the underlying research data is restricted.
Claudia von Vacano, the Director of D-Lab, discussed the Online Hate Index (OHI), a joint initiative of D-Lab and the Anti-Defamation League’s (ADL) Center for Technology and Society that uses crowd-sourcing and machine learning to develop scalable detection of the growing amount of hate speech within social media. In its recently completed initial phase, the project focused on training a model based on an unbiased dataset collected from Reddit. Claudia explained the process, from identifying the problem, defining hate speech, and establishing rules for human coding, through building, training, and deploying the machine learning model. Going forward, the project team plans to improve the accuracy of the model and extend it to include other social media platforms.
Next, Garret S. Christensen, BIDS and BITSS fellow, talked about his experience with research data. He started by providing a background about his research, then discussed the challenges he faced in collecting his research data. The main research questions that Garret investigated are: How are people responding to military deaths? Do large numbers of, or high-profile, deaths affect people’s decision to enlist in the military?
Garret discussed the challenges of obtaining and working with Department of Defense data, acquired through a Freedom of Information Act request, for the purpose of researching war deaths and military recruitment. Despite all the challenges that Garret faced and the time he spent on getting the data, he succeeded in putting the data together into a public repository. Now the information on deaths in the US Military from January 1, 1990 to November 11, 2010 that was obtained through the Freedom of Information Act request is available on Dataverse. At the end, Garret showed that deaths and recruits have a negative relationship.
Orianna DeMasi, a graduate student of Computer Science and BIDS Fellow, shared her story of working with human subjects data. The focus of Orianna’s research is on building tools to improve mental healthcare. Orianna framed her story about collecting and working with human subject data as a fairy tale story. She indicated that working with human data makes security and privacy essential. She has learned that it’s easy to get blocked “waiting for data” rather than advancing the project in parallel to collecting or accessing data. At the end, Orianna advised the attendees that “we need to keep our eyes on the big problems and data is only the start.”
Rita Lucarelli, of the Department of Near Eastern Studies, discussed the Book of the Dead in 3D project, which shows how photogrammetry can help the visualization and study of different sets of data within their own physical context. According to Rita, the “Book of the Dead in 3D” project aims in particular to create a database of “annotated” models of the ancient Egyptian coffins of the Hearst Museum. This is radically changing the scholarly approach to and study of these inscribed objects, while at the same time posing a challenge in relation to data sharing and the publication of the artifacts. Rita indicated that metadata is growing and digital data and digitization are challenging.
It was fascinating to hear about Egyptology and how to visualize 3D ancient objects!
We closed out LDW 2018 at UC Berkeley with a session about Research Data Management Planning and Publishing. In the session, Daniella Lowenberg (University of California Curation Center) started by discussing the reasons to manage, publish, and share research data on both practical and theoretical levels.
Daniella shared practical tips about why, where, and how to manage research data and prepare it for publishing. She discussed relevant data repositories that UC Berkeley and other entities offer. Daniella also illustrated how to make data reusable, and highlighted the importance of citing research data and how this maximizes the benefit of research.
At the end, Daniella presented a live demo on using Dash for publishing research data and encouraged UC Berkeley workshop participants to contact her with any questions about data publishing. In a lively debate, researchers shared with Daniella their experiences of managing research data and highlighted what has worked and what has proved difficult.
We have received overwhelmingly positive feedback from the attendees. Attendees also expressed their interest in having similar workshops to understand the broader perspectives and skills needed to help researchers manage their data.
I would like to thank BIDS and the University Library for sponsoring the events.
The University Library, Research IT, and Berkeley Institute for Data Science will host a series of events on February 12th-16th during Love Data Week 2018. Love Data Week is a nationwide campaign designed to raise awareness about data visualization, management, sharing, and preservation.
Please join us to learn about multiple data services that the campus provides and discover options for managing and publishing your data. Graduate students, researchers, librarians and data specialists are invited to attend these events to gain hands-on experience, learn about resources, and engage in discussion around researchers’ data needs at different stages in their research process.
To register for these events and find out more, please visit: http://guides.lib.berkeley.edu/ldw2018guide
Intro to Scopus APIs – Learn about working with APIs and how to use the Scopus APIs for text mining.
01:00 – 03:00 p.m., Tuesday, February 13, Doe Library, Room 190 (BIDS)
Refreshments will be provided.
Data stories and Visualization Panel – Learn how data is being used in creative and compelling ways to tell stories. Researchers across disciplines will talk about their successes and failures in dealing with data.
01:00 – 02:45 p.m., Wednesday, February 14, Doe Library, Room 190 (BIDS)
Refreshments will be provided.
Planning for & Publishing your Research Data – Learn why and how to manage and publish your research data as well as how to prepare a data management plan for your research project.
02:00 – 03:00 p.m., Thursday, February 15, Doe Library, Room 190 (BIDS)
Hope to see you there!