InterPlanetary Wayback: The Permanent Web Archive
Conference proceeding

Sawood Alam, Mat Kelly and Michael L. Nelson
2016 IEEE/ACM JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), v 2016-, 7559620
01 Jan 2016

Abstract

Computer Science; Computer Science, Interdisciplinary Applications; Science & Technology; Technology
To facilitate permanence and collaboration in web archives, we built InterPlanetary Wayback to disseminate the contents of WARC files into the IPFS network. IPFS is a peer-to-peer content-addressable file system that inherently allows deduplication and facilitates opt-in replication. We split the header and payload of WARC response records before disseminating them into IPFS to leverage deduplication, build a CDXJ index, and recombine the header and payload at replay time. From a 1.0 GB sample Archive-It collection of WARCs containing 21,994 mementos, we found that, on average, 570 files can be indexed and disseminated into IPFS per minute. We also found that in our naive prototype implementation, replay took an average of 370 milliseconds per request.
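
The split, index, and replay steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the in-memory `ContentStore` is a hypothetical stand-in for IPFS that uses SHA-256 digests as content addresses, and the index line only loosely follows the CDXJ layout (lookup key, timestamp, JSON block). The names `split_response`, `index_record`, and `replay`, and the JSON field names, are assumptions made for illustration.

```python
import hashlib
import json


class ContentStore:
    """Hypothetical in-memory stand-in for a content-addressable store such as IPFS."""

    def __init__(self):
        self.blocks = {}

    def add(self, data: bytes) -> str:
        # The address is derived from the content, so identical bytes
        # always map to the same key and are stored only once.
        digest = hashlib.sha256(data).hexdigest()
        self.blocks[digest] = data
        return digest

    def cat(self, digest: str) -> bytes:
        return self.blocks[digest]


def split_response(http_response: bytes):
    """Split a captured HTTP response into header (with blank line) and payload."""
    header, sep, payload = http_response.partition(b"\r\n\r\n")
    return header + sep, payload


def index_record(store, key, timestamp, http_response):
    """Store header and payload separately and return a CDXJ-like index line."""
    header, payload = split_response(http_response)
    # Field names are illustrative assumptions, not a fixed schema.
    entry = {"header": store.add(header), "payload": store.add(payload)}
    return f"{key} {timestamp} {json.dumps(entry, sort_keys=True)}"


def replay(store, index_line):
    """Look up both parts named in an index line and recombine them."""
    _key, _timestamp, fields = index_line.split(" ", 2)
    entry = json.loads(fields)
    return store.cat(entry["header"]) + store.cat(entry["payload"])


if __name__ == "__main__":
    store = ContentStore()
    body = b"<html>hello</html>"
    first = b"HTTP/1.1 200 OK\r\nDate: Mon, 01 Feb 2016 00:00:00 GMT\r\n\r\n" + body
    second = b"HTTP/1.1 200 OK\r\nDate: Tue, 02 Feb 2016 00:00:00 GMT\r\n\r\n" + body
    line1 = index_record(store, "com,example)/", "20160201000000", first)
    line2 = index_record(store, "com,example)/", "20160202000000", second)
    print(line1)
    print(len(store.blocks), "blocks stored for 2 captures")
    print(replay(store, line2) == second)
```

Because the two captures differ only in their `Date` header, the store holds two header blocks but a single shared payload block. This is why splitting the record before dissemination helps: a volatile header would otherwise change the content address of an otherwise identical response.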

Metrics

13 Record Views
20 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas
Computer Science, Interdisciplinary Applications