Are Aggregated Electronic Health Record Datasets Good for Research?
Journal article   Open access   Peer reviewed

Neal Goldstein, Brianne L Olivieri-Mui and Igor Burstyn
Journal of general internal medicine : JGIM, v 40, pp 3743-3749
12 Aug 2025
PMID: 40794368
Featured in Collection: Research Supported by Drexel Libraries' OA Programs
https://doi.org/10.1007/s11606-025-09808-9
Published, Version of Record (VoR) Open Access via Drexel Libraries Read and Publish Program 2025 Open CC BY V4.0

Abstract

Keywords: data aggregation; validity; quantitative bias analysis; Data Management or Analysis (Medical); Health Information Technology; Electronic Health Records
There has been a proliferation of large-scale electronic health record (EHR) data platforms that pool across multiple healthcare organizations, such as the National Institutes of Health’s All of Us in the federal space and TriNetX and Epic Cosmos in the commercial space. There are unique issues that occur when EHR data are aggregated across disparate healthcare systems beyond the general—and more well known—concerns about secondary analysis of EHR data from a single entity. In this article, we define aggregated EHR data, contrasting it to other real-world data sources, highlight benefits and challenges when working with aggregated EHR data, offer several “good practices” to address these challenges, and conclude by discussing whether it is appropriate to pool these data together or not.

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration
Web of Science research areas: Health Care Sciences & Services