Applying machine learning models in multi-institutional studies can generate bias

Rebeckah K. Fussell; Meagan Sundstrom; Sabrina McDowell; N. G. Holmes

doi:10.1119/perc.2024.pr.Fussell

Conference proceeding

2024 PHYSICS EDUCATION RESEARCH CONFERENCE, PERC, pp 144-149

01 Jan 2024

DOI: https://doi.org/10.1119/perc.2024.pr.Fussell

Featured in Collection: UN Sustainable Development Goals @ Drexel

Files and links (1)

https://doi.org/10.1119/perc.2024.pr.Fussell

Published, Version of Record (VoR) Open

Abstract

Education & Educational Research

Education, Scientific Disciplines

Social Sciences

There is increasing interest in deploying machine learning models at scale for multi-institutional studies in physics education research. Here we investigate the efficacy of applying machine learning models to institutions outside of their training set, using natural language processing to code open-ended survey responses. We find that, in general, changing institutional contexts can affect machine learning estimates of code frequencies: either previously documented sources of uncertainty increase in magnitude, new unknown sources of uncertainty emerge, or both. We also find an example where uncertainties do not change between the institution used in the training data and an institution not in the training data. Results suggest that attention to uncertainty is critical, especially when making measurements of student writing across multi-institutional data sets.
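One way to see how a change in context can shift an estimated code frequency is a standard misclassification identity, sketched below. This is an illustration only, not the authors' method: the function names and every number are hypothetical, and the sketch assumes a binary code applied by a classifier whose sensitivity and false positive rate are known for each institution.

```python
def estimated_frequency(true_frequency, sensitivity, false_positive_rate):
    """Expected fraction of responses the classifier labels with the code.

    True positives contribute sensitivity * true_frequency; false positives
    contribute false_positive_rate * (1 - true_frequency).
    """
    return (sensitivity * true_frequency
            + false_positive_rate * (1.0 - true_frequency))


def frequency_bias(true_frequency, sensitivity, false_positive_rate):
    """Systematic error: estimated code frequency minus true code frequency."""
    return (estimated_frequency(true_frequency, sensitivity, false_positive_rate)
            - true_frequency)


# Hypothetical values: the same true code frequency at both institutions,
# but the classifier performs worse on writing from the institution that
# was not in its training set.
bias_training = frequency_bias(0.40, sensitivity=0.90, false_positive_rate=0.05)
bias_new = frequency_bias(0.40, sensitivity=0.75, false_positive_rate=0.10)

print(f"bias at training institution: {bias_training:+.3f}")
print(f"bias at new institution:      {bias_new:+.3f}")
```

Under these assumed numbers the estimate is pulled further from the true frequency at the new institution even though the true frequency is unchanged, which is one simple mechanism by which a previously documented source of uncertainty could increase in magnitude.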

Details

Title: Applying machine learning models in multi-institutional studies can generate bias
Creators: Rebeckah K. Fussell - Cornell University; Meagan Sundstrom - Cornell University; Sabrina McDowell - Cornell University; N. G. Holmes - Cornell University
Contributors: Q X Ryan (Editor); A Pawl (Editor); J P Zwolak (Editor)
Publication Details: 2024 PHYSICS EDUCATION RESEARCH CONFERENCE, PERC, pp 144-149
Series: Physics Education Research Conference
Publisher: Amer Assoc Physics Teachers
Number of pages: 6
Grant note: DGE-2139899 / NSF GRFP; National Science Foundation (NSF); NSF - Office of the Director (OD). DUE-1836617 / NSF; National Science Foundation (NSF).
Resource Type: Conference proceeding
Language: English
Academic Unit: Physics
Web of Science ID: WOS:001324921500023
Scopus ID: 2-s2.0-85206263942
Other Identifier: 991022032066504721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas: Education & Educational Research; Education, Scientific Disciplines
