From Analog Records to Computational Research Data: Building the AI-Ready Lab Notebook
Conference proceeding

Joel Pepper, Zach Siapano, Jacob Furst, Fernando Uribe-Romo, David Breen and Jane Greenberg
IEEE International Conference on Big Data, pp 6018-6023
08 Dec 2025

Abstract

Keywords: AI-ready data; Buildings; Computational archival science; Digital collections; Digital representation; Documentation; Error analysis; Lab notebooks; Optical character recognition; Pipelines; Refining; Soft sensors; Data mining; History
Scientific laboratory notebooks, particularly those in analog, handwritten form, represent a significant yet underutilized data source for computational studies. This paper reports on our research to further develop a pipeline for transforming analog lab notebooks into AI-ready digital archives. The research is conducted within the framework of Computational Archival Science (CAS), extending CAS principles and drawing from archival practice and computational thinking. We provide background context on laboratory notebook history and current-day use, explore CAS as a framework for study, and then present our research goals and methods. Automated extraction of table records found in the notebooks achieves an error rate under 5% on a per-cell basis. The framework, methods, and findings seek to advance pipelines for making analog records, both historical and current, accessible and curated for computational research. The findings underscore both the accelerating pace of extraction technologies and the importance of more structured, consistent analog documentation practices to support computational transformation and AI-readiness. The conclusion summarizes results and identifies next steps.
