Modular, focused data science education improves biomedical learners' abilities: A study of the Data and Analytics for Research Training (DART) program
Journal article   Peer reviewed

Rose Hartman, K Joy Payton, Rose Franzen, Meredith Lee, Elizabeth Drellich, Ali Shokoufandeh and Jeffrey Pennington
PLoS computational biology, v 21(7), e1013249
17 Jul 2025
PMID: 40674413
URL: https://doi.org/10.1371/journal.pcbi.1013249
Published, Version of Record (VoR)

Abstract

The increasing availability of big data and adoption of sophisticated computational techniques in biomedical research have exciting implications for our scientific understanding of human health. However, researchers report struggling to find data science education that meets their needs, despite the fact that many training programs and online resources exist. There is a lack of evidence on the strengths and weaknesses of various training options, making selecting an educational path daunting. We created a new data science training program focused on rigorous, reproducible methods for biomedical research, making use of tightly scoped modular content that can be flexibly arranged to provide a curriculum tailored to a researcher's specific needs and skill level. Moreover, we ran a study testing the program's effectiveness, providing not only another option for data science training but also a model for collecting and sharing relevant data on data science education programs. We ran two waves of research participants, adjusting our materials in between to improve both the training program and our research design. For both waves, we pre-registered hypotheses that learners' self-reported data science ability and level of agreement with important tenets of open science would increase over the course of the program. Indeed, learners showed significant improvement in data science ability (Wave 1: t(47) = 10.18, p < .001; Wave 2: t(238) = 17.12, p < .001) and greater agreement with open science values (Wave 1: t(47) = 3.56, p < .001; Wave 2: t(238) = 7.95, p < .001). Follow-up analyses underscore the robustness of improvement in data science ability. The improvement in open science values was more moderate and was significant only in some of our pre-registered hypothesis tests, likely due to a ceiling effect, as most learners reported high agreement with open science values at pretest.

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being
#16 Peace, Justice and Strong Institutions

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration
Web of Science research areas: Biochemical Research Methods; Mathematical & Computational Biology