Improving propensity score weighting using machine learning
Journal article   Open access   Peer reviewed

Brian K Lee, Justin Lessler and Elizabeth A Stuart
Statistics in Medicine, vol. 29(3), pp. 337-346
10 Feb 2010
PMID: 19960510
DOI: https://doi.org/10.1002/sim.3782
Published, Version of Record (VoR) Open

Abstract

Keywords: boosting, simulation, data mining, ensemble methods, machine learning, weighting, propensity score, CART
Machine learning techniques such as classification and regression trees (CART) have been suggested as promising alternatives to logistic regression for the estimation of propensity scores. The authors examined the performance of various CART-based propensity score models using simulated data. Hypothetical studies of varying sample sizes (n=500, 1000, 2000) with a binary exposure, continuous outcome, and ten covariates were simulated under seven scenarios differing by degree of non-linear and non-additive associations between covariates and the exposure. Propensity score weights were estimated using logistic regression (all main effects), CART, pruned CART, and the ensemble methods of bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, percent absolute bias, and 95% confidence interval coverage. All methods displayed generally acceptable performance under conditions of either non-linearity or non-additivity alone. However, under conditions of both moderate non-additivity and moderate non-linearity, logistic regression had subpar performance, while ensemble methods provided substantially better bias reduction and more consistent 95% CI coverage. The results suggest that ensemble methods, especially boosted CART, may be useful for propensity score weighting.
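
As a rough illustration of the weighting step the abstract describes, the sketch below applies propensity score weights to a tiny hypothetical dataset and compares covariate balance and the effect estimate before and after weighting. It makes several assumptions that are not taken from the article: the abstract does not state the estimand, so inverse-probability weights for the average treatment effect are used here; the propensity scores are treated as known rather than estimated by logistic regression or a CART-based model; and the data, function names, and the true effect of 2 are invented for the example.

```python
def ate_weights(treated, propensity):
    """Inverse-probability weights: 1/e for treated units, 1/(1-e) for controls.

    Assumption: ATE-style weights; the abstract does not specify the estimand.
    """
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t == 1 else 1.0 / (1.0 - e))
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def group_difference(values, treated, weights):
    """Weighted mean among treated units minus weighted mean among controls."""
    def arm(flag):
        pairs = [(v, w) for v, t, w in zip(values, treated, weights) if t == flag]
        return weighted_mean([v for v, _ in pairs], [w for _, w in pairs])
    return arm(1) - arm(0)


# Hypothetical data: one binary covariate x that drives both exposure and outcome.
x = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
# Propensity scores assumed known here: 0.25 when x == 0, 0.75 when x == 1.
propensity = [0.25 if xi == 0 else 0.75 for xi in x]
# Outcome built with a true exposure effect of 2 and confounding through x.
outcome = [2.0 * t + 4.0 * xi for t, xi in zip(treated, x)]

unit_weights = [1.0] * len(x)
weights = ate_weights(treated, propensity)

# Covariate balance: difference in mean x between exposure groups.
raw_imbalance = group_difference(x, treated, unit_weights)
weighted_imbalance = group_difference(x, treated, weights)

# Effect estimates: unweighted (confounded) versus weighted.
naive_effect = group_difference(outcome, treated, unit_weights)
weighted_effect = group_difference(outcome, treated, weights)

print("imbalance in x:", raw_imbalance, "->", weighted_imbalance)
print("effect estimate:", naive_effect, "->", weighted_effect)
```

In this toy setup the unweighted comparison gives 4, twice the built-in effect of 2, because exposed units are more likely to have x = 1. Weighting removes the imbalance in x and recovers the built-in effect up to floating-point rounding. In practice the propensity scores would be estimated, and the quality of that estimate is what the comparison of logistic regression and CART-based models in the abstract concerns.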

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
Web of Science research areas
Mathematical & Computational Biology
Medical Informatics
Medicine, Research & Experimental
Public, Environmental & Occupational Health
Statistics & Probability