WARCreate: create wayback-consumable WARC files from any webpage
Conference proceeding   Open access

Mat Kelly and Michele Weigle
Proceedings of the 12th ACM/IEEE-CS joint conference on digital libraries, pp 437-438
10 Jun 2012
URL: https://digitalcommons.odu.edu/cgi/viewcontent.cgi?article=1154&context=computerscience_fac_pubs

Abstract

Keywords: browser; Internet Archive; personal web archiving; WARC; Wayback Machine
The Internet Archive's Wayback Machine is the most common way that typical users interact with web archives. The Internet Archive uses the Heritrix web crawler to transform pages on the publicly available web into Web ARChive (WARC) files, which can then be accessed using the Wayback Machine. Because Heritrix can only access the publicly available web, many personal pages (e.g. password-protected pages, social media pages) cannot be easily archived into the standard WARC format. We have created a Google Chrome extension, WARCreate, that allows a user to create a WARC file from any webpage. Using this tool, content that might have been otherwise lost in time can be archived in a standard format by any user. This tool provides a way for casual users to easily create archives of personal online content. This is one of the first steps in resolving issues of "long term storage, maintenance, and access of personal digital assets that have emotional, intellectual, and historical value to individuals".
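
To make concrete what "a WARC file" means here, the following is a minimal sketch of how a single WARC/1.0 response record could be serialized. It is not WARCreate's implementation (the abstract describes WARCreate as a Google Chrome extension); the function name build_warc_response_record, the example URI, and the sample payload are all hypothetical and chosen only for illustration. A complete WARC file typically also begins with a warcinfo record, which this sketch omits.

```python
import uuid
from datetime import datetime, timezone


def build_warc_response_record(target_uri: str, http_payload: bytes) -> bytes:
    """Serialize one WARC/1.0 'response' record (hypothetical helper).

    http_payload is the raw HTTP response (status line, headers, body)
    as it was received for target_uri.
    """
    # WARC-Date is a UTC timestamp in the form YYYY-MM-DDThh:mm:ssZ.
    warc_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {warc_date}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        "Content-Type: application/http; msgtype=response",
        # Content-Length counts the bytes of the record block only.
        f"Content-Length: {len(http_payload)}",
    ]
    # Header lines are CRLF-separated; a blank line ends the header.
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    # Each record is terminated by two CRLF pairs after the block.
    return head + http_payload + b"\r\n\r\n"


# Assumed sample input: a tiny HTTP response for an illustrative URI.
body = b"<html><body>Hello, archive</body></html>"
http_payload = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n" + body
)
record = build_warc_response_record("http://example.com/", http_payload)
```

Writing record to a file opened in binary mode would give a single-record WARC file; whether a given Wayback deployment replays it depends on its indexing setup, which is outside the scope of this sketch.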

Metrics

29 Record Views
12 citations in Scopus
17 readers on Mendeley
