Conference proceeding
InterPlanetary Wayback: The Permanent Web Archive
2016 IEEE/ACM JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), v 2016-, 7559620
01 Jan 2016
Abstract
To facilitate permanence and collaboration in web archives, we built InterPlanetary Wayback to disseminate the contents of WARC files into the IPFS network. IPFS is a peer-to-peer content-addressable file system that inherently allows deduplication and facilitates opt-in replication. To leverage this deduplication, we split the header and payload of each WARC response record before disseminating them into IPFS, build a CDXJ index, and recombine the two parts at replay time. From a 1.0 GB sample Archive-It collection of WARCs containing 21,994 mementos, we found that, on average, 570 files can be indexed and disseminated into IPFS per minute. We also found that, in our naive prototype implementation, replay took 370 milliseconds per request on average.
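The split, index, and replay steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `ContentStore` is a hypothetical in-memory stand-in for IPFS (keys are SHA-256 digests, not real IPFS hashes), the header/payload boundary is assumed to be the first blank line of the stored HTTP response, and the index lines are only CDXJ-style (key, timestamp, JSON), not a faithful rendering of the CDXJ format.

```python
import hashlib
import json


class ContentStore:
    """Hypothetical in-memory stand-in for a content-addressable store such as IPFS."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        # The key is derived from the content, so identical bytes are stored once.
        key = hashlib.sha256(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]

    def __len__(self) -> int:
        return len(self._blocks)


def split_response(record_body: bytes):
    """Split a stored HTTP response at the first blank line (assumed boundary)."""
    header, sep, payload = record_body.partition(b"\r\n\r\n")
    return header + sep, payload


def index_record(store, index, url_key, timestamp, record_body):
    """Store header and payload separately and append a CDXJ-style index line."""
    header, payload = split_response(record_body)
    entry = {"header_key": store.add(header), "payload_key": store.add(payload)}
    index.append(f"{url_key} {timestamp} {json.dumps(entry)}")


def replay(store, index, url_key, timestamp):
    """Look up the index line, fetch both parts, and recombine them."""
    prefix = f"{url_key} {timestamp} "
    for line in index:
        if line.startswith(prefix):
            entry = json.loads(line[len(prefix):])
            return store.get(entry["header_key"]) + store.get(entry["payload_key"])
    raise KeyError(f"no capture for {url_key} at {timestamp}")
```

In this sketch, two captures of the same page whose headers differ (for example, in the `Date` field) but whose payloads are byte-identical end up sharing a single stored payload, which is the deduplication benefit the split is meant to obtain.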
Details
- Title
- InterPlanetary Wayback: The Permanent Web Archive
- Creators
- Sawood Alam - Old Dominion University; Mat Kelly - Old Dominion University; Michael L. Nelson - Old Dominion University
- Publication Details
- 2016 IEEE/ACM JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), v 2016-, 7559620
- Series
- ACM-IEEE Joint Conference on Digital Libraries JCDL
- Publisher
- IEEE
- Number of pages
- 2
- Grant note
- 1526700 / National Science Foundation (NSF), Directorate for Computer & Information Science & Engineering (CISE)
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science
- Web of Science ID
- WOS:000389502300066
- Scopus ID
- 2-s2.0-84989831095
- Other Identifier
- 991021786467304721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Web of Science research areas
- Computer Science, Interdisciplinary Applications