Library Leaders Forum 2016

The aerial shot of the group at the Library Leaders Forum 2016 by Brad Shirakawa

On October 26-28, I had the honor of attending the Library Leaders Forum 2016, which was held at the Internet Archive (IA). This year’s meeting was geared towards envisioning the library of 2020. October 26th was also IA’s 20th anniversary. I joined my Web Science and Digital Libraries (WS-DL) Research Group in celebrating IA’s 20 years of preservation by contributing a blog post with my own personal story, which highlights the importance of Web preservation for the Egyptian Revolution. More personal stories about Web archiving can be found on the WS-DL blog.

Brewster Kahle opens the Library Leaders Forum 2016

In the Great Room at the Internet Archive, Brewster Kahle, the Internet Archive’s Founder, kicked off the first day by welcoming the attendees. He began by highlighting the importance of openness, sharing, and collaboration for the next generation. During his speech, he raised an important question: “How do we support datasets, the software that come with it, and open access materials?” According to Kahle, the advancement of digital libraries requires collaboration.

IA’s Golden Floppy

After Brewster Kahle’s brief introduction, Wendy Hanamura, the Internet Archive’s Director of Partnership, highlighted parts of the schedule and presented the rules of engagement and communication:

  • The rule of 1 – Ask one question, answer one question.
  • The rule of n – If you are in a group of n people, speak 1/n of the time.

Before giving the microphone to the attendees for their introductions, Hanamura gave a piece of advice: “be honest and bold and take risks”. She then informed the audience that “The Golden Floppy” award would be given to attendees who shared bold or honest statements.

Next was our chance to get to know each other through self-introductions. We were supposed to say who we are, where we are from, and, finally, what we want from this meeting or from life itself. The challenge was to do this in four words.

“Partnership is important for advancing the library system”, Sylvain Belanger.

After the introductions, Sylvain Belanger, the Director of Preservation at Library and Archives Canada, talked about where his organization will be heading in 2020. He described the physical side of the work they do in Canada to show the challenges they experience. They store, preserve, and circulate over 20 million books, 3 million maps, 90,000 films, and 500 sheets of music.

“We cannot do this alone!” Belanger exclaimed. He emphasized how important partnership is to advancing the library field. He mentioned that Library and Archives Canada is looking to enhance preservation and access and is also looking for partnerships. They would also like to introduce the idea of innovation into the mindset of their employees. According to Belanger, the Archives’ vision for the year 2020 includes consolidating their expertise as much as they can and learning how other people do their digitization and Web archiving work.

After Belanger’s talk, we split up into groups of three to meet people we didn’t know and exchange knowledge about what we do and where we came from. Then pairs of groups joined to form groups of six that exchanged their visions, challenges, and opportunities. Most of the attendees agreed on the need for growth and accessibility of digitized materials. Some of the challenges were funding, ego, power, and culture.

Chris Edwards from the Getty Research Institute.

Chris Edwards, the Head of Digital Services at the Getty Research Institute, talked about what they are doing, where they are going, and the impact of their partnership with the IA. Edwards mentioned that the uploads by the IA are harvested by HathiTrust and the Defense Logistics Agency (DLA). This allows them to distribute their materials. Their vision for 2020 is to continue working with the IA, expand the Getty research portal, and digitize everything they have and make it available to everyone, anywhere, all the time. They also intend to automate metadata generation (OCR, image recognition, object recognition, etc.), make archival collections accessible, and do 3D digitization of architectural models. They will then join forces with the International Image Interoperability Framework (IIIF) community to develop the capability to represent these objects. He also added that they want to help people who do not have the ability to do this on their own.

Wendy Hanamura is presenting the IA’s strategic plan for 2015-2020

After lunch, Wendy Hanamura walked us quickly through the Archive’s strategic plan for 2015-2020 and IA’s tools and projects. Some of these plans are:

  • Next generation Wayback Machine
    • Test pilot with Mozilla to suggest archived pages for 404 errors
    • Wikimedia link rot
  • Building libraries together
  • The 20 million books
    • Table top scribe
    • Open library and discovery tool
    • Digitization supercenter
    • Collaborative circulation system
  • Television Archive — Political ads
  • Software and emulation
  • Proprietary code
  • Scientific data and Journals – Sharing data
  • Music — 78’s

“No book should be digitized twice!” This is how Wendy Hanamura ended her talk.

Then we had a chance to get hands-on with new tools from the IA and its partners at multiple makers’ space stations. There were plenty of interesting projects, but I focused on the International Research Data Commons by Karissa McKelvey and Max Ogden from the Dat Project. Dat is a grant-funded project that introduces open source tools to manage, share, publish, browse, and download research datasets. Dat supports a peer-to-peer distribution system (e.g., BitTorrent). Ogden mentioned that their goal is to build a data management tool that is as easy to use as Dropbox and also has a version control system like Git.

After a break, Jeffrey MacKie-Mason, the University Librarian of UC Berkeley, interviewed Brewster Kahle about the future of libraries and online knowledge. The discussion covered many interesting issues related to digitization and preservation, such as copyright, prioritization of archiving materials, the cost of preservation, avoiding duplication, accessibility and scale, and IA’s plans to improve the Wayback Machine. At the end of the interview, Kahle announced a white paper he wrote, entitled “Transforming Our Libraries into Digital Libraries”, and solicited feedback and suggestions from the audience.

https://twitter.com/tripofmice/status/791790807736946688

https://twitter.com/tripofmice/status/791786514497671168

The photographer Brad Shirakawa taking an aerial shot in the Great Room.

At the end of the day, we had an unusual and creative group photo taken by the great photographer Brad Shirakawa, who climbed out on a narrow plank high above the crowd to take our picture.

On day two, the first session I attended was a keynote address by Brewster Kahle about his vision for the Internet Archive’s Library of 2020 and what that might mean for all libraries.

Heather Christenson from HathiTrust.

Heather Christenson, the Program Officer for HathiTrust, talked about where HathiTrust is heading in 2020. Christenson started by briefly explaining what HathiTrust is and why it is important for libraries. She said that HathiTrust’s primary mission is preserving print and digital collections, improving discovery and access by offering text search and bibliographic data APIs, and building a comprehensive collection of US federal documents. Christenson mentioned that they surveyed their membership and found that people want them to focus on books, videos, and text materials.

A panel discussion about the Legal Strategies and Practices for libraries.

Our next session was a panel discussion about Legal Strategies and Practices for libraries by Michelle Wu, the Associate Dean for Library Services and Professor of Law at the Georgetown University Law Center, and Lila Bailey, the Internet Archive’s Outside Legal Counsel. Both speakers shared real-world examples and practices. They mentioned that the law has never been clearer and that digitizing has never been safer; the open question is access. They advised libraries to know the practical steps before going to their institutional counsel. “Do your homework before you go. Show the usefulness of your work, and have a plan for why you will digitize, how you will distribute, and what you will do with the takedown request.”

Tom Rieger talks about the LOC’s 2020 strategic plan.

After the panel, Tom Rieger, the Manager of the Digitization Services Section at the Library of Congress (LOC), discussed the LOC’s 2020 vision and strategic plan. He mentioned that their primary mission is to serve the members of Congress, the people of the USA, and researchers all over the world by providing access to collections and information that can assist them in decision making. To achieve this mission, the LOC plans to collect and preserve born-digital materials and provide access to them, as well as services that help people use them. They will also migrate all formats to an easily manageable system, actively collaborate with many different institutions to empower the library system, and adopt new methods for fulfilling their mission.


In the evening, there were different workshops about tools and APIs that the IA and its partners provided. I was interested in the Research Data Management (RDM) workshop by Max Ogden and Roger Macdonald. I wanted to explore the ways we can support and integrate this project into the UC Berkeley system. I learned more about how the Dat Project works through a live demo by Ogden. We also learned about the partnership between the Dat Project and the Internet Archive to start storing scientific data and journals at scale.

Notes from “Long-Term Storage for Research Data Management” session.

We then formed small groups around different topics in our field to discuss the challenges we face and generate a roadmap for the future. I joined the “Long-Term Storage for Research Data Management” group to discuss the challenges and visions of storing research data and what libraries and archives should do to make research data more useful. We started by introducing ourselves. We had Jefferson Bailey from the Internet Archive, Max Ogden and Karissa McKelvey from the Dat Project, Drew Winget from Stanford Libraries, Polina Ilieva from the University of California San Francisco (UCSF), and myself, Yasmin AlNoamany.

Some of the issues and big-picture questions that were addressed during our meeting:

  • The long-term storage for the data and what preservation means to researchers.
  • What is the threshold for reproducibility?
  • What do researchers think about preservation? Does it mean 5 years, 15 years, etc.?
  • What is considered a dataset? Harvard considers anything/any file that can be interpreted as a dataset.
  • Do librarians have to understand the data to be able to preserve it?
  • What is the difference between storage and preservation? Data can be stored, but long-term preservation needs metadata.
  • Do we have to preserve everything? If we open it to the public to deposit their huge datasets, this may result in noise. For huge datasets, what should be preserved and what should not?
  • Privacy and legal issues about the data.

Principles of solutions

  • We need to teach researchers how to generate metadata and the metadata should be simple and standardized.
  • Everything that is related to research reproducibility is important to be preserved.
  • Assigning DOIs to datasets is important.
  • Secondary research – taking two datasets and combining them to produce something new. In digital humanities, many researchers use old datasets.
  • There is a need to fix the 404 links for datasets.
  • There should be an easy way to share data between different institutions.
  • Archives should have rules for the metadata that describe the dataset the researchers share.
  • The network should be neutral.
  • Everyone should be able to host data.
  • Versioning is important.

Notes from the other Listening posts:

Polina Ilieva from UCSF wrapped up the meeting.

At the end of the day, Polina Ilieva, the Head of Archives and Special Collections at UCSF, wrapped up the meeting by giving her insight and advice. She mentioned that accomplishing their 2020 goals and vision requires collaborating and working together. Ilieva said that the collections should be available and accessible to researchers and everyone else, but that it is a challenge to assess who is using these collections and to quantify the benefits of making them available. She announced that they would donate all their microfilms to the Internet Archive! “Let us all work together to build a digital library, serve users, and attract consumers. Library is not only the engine for search, but also an engine for change, let us move forward!” This is how Ilieva ended her speech.

It was an amazing experience to hear about the 2020 vision of the libraries and to be among all of the esteemed library leaders I met. I returned with inspiration and enthusiasm for being a part of this mission, as well as ideas for collaboration to advance the library mission and serve more people.

–Yasmin AlNoamany


Primary Sources: Farm Security Administration – Office of War Information Photograph Collection

“The photographs in the Farm Security Administration – Office of War Information Photograph Collection form an extensive pictorial record of American life between 1935 and 1944. This U.S. government photography project was headed for most of its existence by Roy E. Stryker, who guided the effort in a succession of government agencies: the Resettlement Administration (1935-1937), the Farm Security Administration (1937-1942), and the Office of War Information (1942-1944). The collection also includes photographs acquired from other governmental and non-governmental sources, including the News Bureau at the Offices of Emergency Management (OEM), various branches of the military, and industrial corporations.”

These photographs were originally intended to document the need for agricultural assistance and to record how the FSA addressed that need. However, the scope of the collection far exceeded these parameters and the collection encompasses pictures that depict everyday life of Americans, the effects of the Great Depression and the Dust Bowl, the migration West or to industrial cities of displaced people, and America’s mobilization for World War II. Represented in the collection are works of well-known photographers of the period, including Dorothea Lange, Walker Evans, Jack Delano, and Esther Bubley.

The images from Black & White negatives have been digitized and can be viewed at the Library of Congress site. A different page provides access to the approximately 1600 color photographs.

Not all of the images were printed, but even so the number of printed images became difficult to manage. The archivist Paul Vanderbilt was hired to arrange them and, recognizing that researchers would approach the collection with different needs, he devised two organizational schemes. He first organized sets of prints into “stories,” generally consisting of images with the same subject matter or from a specific geographic region. These were called LOTs (examples of which can be found in Documenting America: Photographers on Assignment). The LOTs were microfilmed by the Library of Congress.

The LOTs were then dismantled and the collection was reorganized geographically, and then according to subject classification numbers. The images online can be browsed through a subject index.

This organization scheme is also reflected in the microfiche collection of the printed photographs, called America 1935-1946: the photographs of the U.S. Department of Agriculture, Farm Security Administration, and the U.S. Office of War Information. The Library’s copy of this collection is housed in the Newspapers & Microforms Library, located in 40 Doe Library.

Since the unprinted photographs did not have this organizational scheme applied to them, they are not as easily accessible through the online catalog search. To find them, conduct a search, go to the description for any FSA/OWI image, and select the “Browse neighboring items by call number” link. The Library of Congress continues its efforts to add metadata to these records so they will become increasingly easy to locate.