Van de Sompel's Research: Archiving Online Scholarly Material

Herbert Van de Sompel, a scientist at Los Alamos National Laboratory (LANL) and a research affiliate at the New Mexico Consortium (NMC), is concerned with the problem of how to archive scholarly materials. This has become an issue because of the flood of scholarly articles and artifacts published online every day, and because the nature of scholarly communication is changing. Researchers are communicating in different ways and on different platforms than in the past. They now often use online portals, many of them general purpose rather than dedicated to scholarly work, that support different aspects of research such as assessment, discovery, analysis, writing, publication and outreach.

While traditional scholarly resources like journal articles do have archival efforts in place, the broad variety of other web-based scholarly objects is largely neglected where archiving is concerned. These scholarly materials can include software, workflows, presentations, video recordings of experiments, blog posts that dig deep into specific aspects of research, project websites, etc. Van de Sompel and colleagues call these neglected online materials the “scholarly orphans”.

These scholarly orphans are regularly referenced in scientific papers. But because the web platforms that host them may disappear, and because the materials are not archived, looking up the referenced materials becomes impossible over time. So the question is how to archive them.

Van de Sompel is collaborating with Michael L. Nelson of Old Dominion University, Martin Klein of LANL and Harihar Shankar of both LANL and the NMC on a new project, funded by the Andrew W. Mellon Foundation, to explore how these scholarly orphans can be archived.

In this research, they are exploring the feasibility of an approach that automatically monitors a researcher's accounts in various web portals and returns information about newly added or updated artifacts. The web location (URL) of each such artifact is then passed on to a web crawler that captures the artifact and its associated descriptive metadata, and deposits them in a web archive. The approach assumes knowledge of a researcher's identity in the various web portals and an institutional process that operates the monitoring, capturing, and archiving processes. Building on the Memento and Robust Links work that Van de Sompel and colleagues did previously, access to artifacts referenced in scientific papers can then be routed automatically to copies in web archives.
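
The monitor, capture, and archive steps described above can be illustrated with a short Python sketch. This is not the project's actual software. The names Artifact, find_new_or_updated and run_cycle, the example.org addresses, and the idea of passing the crawler in as a plain function are all assumptions made for illustration. The sketch only shows the general idea: compare what a portal currently lists against what was seen before, and hand the URLs of anything new or changed to a capture step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record of one item listed under a researcher's portal account."""
    url: str       # web location of the artifact
    modified: str  # ISO 8601 UTC timestamp, e.g. "2018-03-01T12:00:00Z"


def find_new_or_updated(listing: Iterable[Artifact],
                        seen: Dict[str, str]) -> List[Artifact]:
    """Return artifacts that were never seen or whose timestamp is newer.

    Timestamps are compared as strings, which is only valid because this
    sketch assumes they all share the same ISO 8601 UTC format.
    """
    changed = []
    for artifact in listing:
        previous = seen.get(artifact.url)
        if previous is None or artifact.modified > previous:
            changed.append(artifact)
    return changed


def run_cycle(listing: Iterable[Artifact],
              seen: Dict[str, str],
              capture: Callable[[str], None]) -> List[str]:
    """One monitoring pass: detect changes, pass each URL on, update state."""
    captured = []
    for artifact in find_new_or_updated(listing, seen):
        capture(artifact.url)                  # stand-in for crawler + archive deposit
        seen[artifact.url] = artifact.modified # remember what was captured
        captured.append(artifact.url)
    return captured


# Example with made-up data: one unchanged, one updated, one new artifact.
state = {
    "https://example.org/slides/1": "2018-01-10T00:00:00Z",
    "https://example.org/code/7": "2018-01-15T00:00:00Z",
}
portal_listing = [
    Artifact("https://example.org/slides/1", "2018-01-10T00:00:00Z"),
    Artifact("https://example.org/code/7", "2018-02-01T00:00:00Z"),
    Artifact("https://example.org/blog/42", "2018-02-03T00:00:00Z"),
]
queue: List[str] = []
run_cycle(portal_listing, state, queue.append)
print(queue)
```

In this sketch the capture step simply appends URLs to a list. In an institutional setting, that function would be replaced by whatever crawler and web archive the institution operates.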

To learn more, read Van de Sompel’s article: Discovering Scholarly Orphans using ORCID

© 2018 New Mexico Consortium