Flexible metadata pipelines are crucial for supporting the FAIR data
principles. Despite this need, researchers seldom report their approaches for
identifying metadata standards and protocols that support optimal flexibility.
This paper reports on an initiative targeting the development of a flexible
metadata pipeline for a collection containing over 300,000 digital fish
specimen images, harvested from multiple data repositories and fish
collections. The images and their associated metadata are being used for
AI-related scientific research involving automated species identification,
segmentation and trait extraction. The paper provides contextual background,
followed by the presentation of a four-phased approach involving: 1. Assessment
of the Problem, 2. Investigation of Solutions, 3. Implementation, and 4.
Refinement. The work is part of the NSF Harnessing the Data Revolution, Biology
Guided Neural Networks (NSF/HDR-BGNN) project and the HDR Imageomics Institute.
An RDF graph prototype pipeline is presented, followed by a discussion of
research implications and a conclusion summarizing the results.
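
To make the idea of an RDF graph pipeline concrete, the following is a minimal sketch of how one harvested image record might be mapped to RDF triples and serialized as N-Triples. It is not the project's implementation. The record fields, the `http://example.org/fish-image/` namespace, the sample values, and the choice of Dublin Core and Darwin Core terms are all assumptions made for illustration.

```python
# Illustrative sketch only: the record fields, namespace, sample values, and
# vocabulary choices are assumptions, not the project's actual schema.

DCTERMS = "http://purl.org/dc/terms/"      # Dublin Core terms
DWC = "http://rs.tdwg.org/dwc/terms/"      # Darwin Core terms
BASE = "http://example.org/fish-image/"    # hypothetical namespace


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_triples(record):
    """Map one harvested image record (a dict) to (subject, predicate, object) triples."""
    subject = f"<{BASE}{record['id']}>"
    field_to_predicate = {
        "id": DCTERMS + "identifier",
        "source": DCTERMS + "source",
        "scientific_name": DWC + "scientificName",
        "institution_code": DWC + "institutionCode",
    }
    triples = []
    for field, predicate in field_to_predicate.items():
        value = record.get(field)
        if value:  # sources differ in which fields they supply, so skip gaps
            triples.append((subject, f"<{predicate}>", f'"{escape_literal(value)}"'))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical sample record; note that it has no institution code.
record = {
    "id": "img-0001",
    "source": "hypothetical-repository",
    "scientific_name": "Lepomis cyanellus",
}
print(to_ntriples(record_to_triples(record)))
```

Because each field becomes an independent statement, a record that lacks a field simply yields fewer triples, and a new field can be supported by adding one mapping entry. This is one way a graph model can accommodate metadata harvested from heterogeneous sources.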
Details
Title
Toward a Flexible Metadata Pipeline for Fish Specimen Images
Creators
Dom Jebbia
Xiaojun Wang
Yasin Bakis
Henry L Bart Jr
Jane Greenberg
Resource Type
Preprint
Language
English
Academic Unit
Information Science (Informatics)
Other Identifier
991020532099404721