Journal article
Beyond crosswalks: reliability of exposure assessment following automated coding of free-text job descriptions for occupational epidemiology
The Annals of occupational hygiene, v 58(4), pp 482-492
May 2014
PMID: 24504175
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
Epidemiologists typically collect narrative descriptions of occupational histories because these are less prone than self-reported exposures to recall bias of exposure to a specific hazard. However, the task of coding these narratives can be daunting and prohibitively time-consuming in some settings. The aim of this manuscript is to evaluate the performance of a computer algorithm to translate the narrative description of occupational codes into a standard classification of jobs (2010 Standard Occupational Classification) in an epidemiological context. The fundamental question we address is whether exposure assignment resulting from manual (presumed gold standard) coding of the narratives is materially different from that arising from the application of automated coding. We pursued our work through three motivating examples: assessment of physical demands in the Women's Health Initiative observational study, evaluation of predictors of exposure to coal tar pitch volatiles in the US Occupational Safety and Health Administration's (OSHA) Integrated Management Information System, and assessment of exposure to agents known to cause occupational asthma in a pregnancy cohort. In these diverse settings, we demonstrate that automated coding of occupations results in assignment of exposures that are in reasonable agreement with results that can be obtained through manual coding. The correlation between physical demand scores based on manual and automated job classification schemes was reasonable (r = 0.5). The agreement between the predictive probability of exceeding OSHA's permissible exposure level for polycyclic aromatic hydrocarbons, using coal tar pitch volatiles as a surrogate, based on manual and automated coding of jobs was modest (Kendall rank correlation = 0.29). In the case of binary assignment of exposure to asthmagens, we observed that fair to excellent agreement in classifications can be reached, depending on the presence of ambiguity in the assigned job classification (κ = 0.5-0.8). Thus, the success of automated coding appears to depend on the setting and type of exposure that is being assessed. Our overall recommendation is that automated translation of short narrative descriptions of jobs for exposure assessment is feasible in some settings and essential for large cohorts, especially if combined with manual coding to both assess reliability of coding and to further refine the coding algorithm.
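For readers unfamiliar with the κ statistic quoted above, the following is a minimal sketch of how Cohen's kappa could be computed for binary exposure assignments derived from manual versus automated job coding. It is not the authors' code. The function name `cohen_kappa` and the two label lists are hypothetical illustrations, and the data are invented for the example rather than taken from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters.

    labels_a and labels_b are equal-length sequences of category labels
    (here: 1 = exposed, 0 = unexposed) assigned to the same subjects.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)

    # Observed agreement: share of subjects given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement under independence, from each rater's marginal counts.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented example: exposure to asthmagens (1) or not (0) for ten subjects,
# once from manually coded jobs and once from automatically coded jobs.
manual = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
automated = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]

kappa = cohen_kappa(manual, automated)
print(f"kappa = {kappa:.2f}")
```

In this invented example the two codings agree on 8 of 10 subjects (observed agreement 0.80) while chance agreement is 0.52, so kappa is 0.28 / 0.48, roughly 0.58.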
Details
- Title
- Beyond crosswalks: reliability of exposure assessment following automated coding of free-text job descriptions for occupational epidemiology
- Creators
- Igor Burstyn (Department of Environmental and Occupational Health, School of Public Health, Drexel University, Philadelphia, PA, USA); Anton Slutsky; Derrick G Lee; Alison B Singer; Yuan An; Yvonne L Michael
- Publication Details
- The Annals of occupational hygiene, v 58(4), pp 482-492
- Publisher
- Oxford University Press; England
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Information Science; Epidemiology and Biostatistics; Environmental and Occupational Health
- Web of Science ID
- WOS:000335000600007
- Scopus ID
- 2-s2.0-84898877378
- Other Identifier
- 991014877779004721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Public, Environmental & Occupational Health
- Toxicology