Physics education research (PER) has long struggled to balance the desire to use qualitative, text-based data against the proportionately higher cost of collecting and analyzing this type of data. We present several methods and use cases for applying modern natural language processing techniques, which leverage emergent machine learning tools, to analyze large numbers of short responses from physicists. The methods we employ are split along two separate research aims, both intended to better understand the current community of physicists. First, we investigate how physics professors felt about their transition to online learning during the COVID-19 pandemic. We use sentiment analysis to determine whether responses were positive or negative in the Spring of 2020, at the beginning of the pandemic, and in the Fall of 2020, when professors had gained some experience with online learning. A Bayesian t-test for the difference in sentiment from Spring to Fall shows no clear change in the average sentiment score, and the sentiment scores are non-normally distributed at both survey times. We further investigate the responses of physics professors by conducting Latent Dirichlet Allocation, a method of thematic analysis, to identify the primary themes of the instructor responses in both Spring and Fall. The second avenue of natural language processing aims to generate a program for identifying the motivational factors that bring women into physics. Using training data built from a separate project on the self-reported motivation of 2127 women attendees of the Conference of Undergraduate Women in Physics in 2015 and 2019, we create an automated coding program whose effectiveness ranges from excellent to poor depending on the specific motivational code. In the Spring of 2024, we included our motivational prompt on the American Institute of Physics (AIP) bachelor's graduation survey.
This survey went to all graduating physics majors in the United States, and we received a total of 438 responses, including 301 from men and 117 from women. We applied our coding program to these responses to determine whether training a natural language processing tool on responses from one gender intrinsically biases the tool; we find that it does (p < 0.01). Hand coding the 438 responses from the AIP survey revealed a significant difference in the code distribution for men and women graduating in 2024 (p < 0.01), suggesting that what motivates men and women to study physics is different, and/or that there is a filtering effect in which men and women who have the same motivation drop out of physics at disproportionate rates.
Details
Title
Improving our understanding of physicists through natural language processing
Creators
Colin Green
Contributors
Eric T. Brewe (Advisor)
Awarding Institution
Drexel University
Degree Awarded
Doctor of Philosophy (Ph.D.)
Publisher
Drexel University; Philadelphia, Pennsylvania
Number of pages
xi, 116 pages
Resource Type
Dissertation
Language
English
Academic Unit
College of Arts and Sciences; Physics; Drexel University
Other Identifier
991022057738404721