The chronic illness risk and resiliency profile project: utilizing machine learning to build a multivariate correlational model of disease
Dissertation   Open access

Meghan Marie Colosimo
Doctor of Philosophy (Ph.D.), Drexel University
Jun 2020
DOI: https://doi.org/10.17918/00001036

Abstract

Subjects: Medicine and psychology; Chronic diseases--Psychological aspects; Machine Learning; Public Health
Health-related behavior change is the core premise of behavioral medicine research and intervention, yet it remains unattainable and/or unsustainable for the majority of patients. Significant systemic, environmental, and personal barriers limit access to biopsychoeducational information, medical providers, and evidence-based treatments that have the power to motivate health-promoting behaviors and reduce risk for lifetime development of chronic illnesses such as diabetes and hypertension. Unfortunately, the impact of the field of health psychology is often methodologically limited by underpowered, unrepresentative samples and by comparatively less powerful statistical methods/models for analyzing the complexities of chronic illness development and prognosis. However, this sphere of influence can easily expand through integration of advances in big data analytics and the application of more robust, predictively powerful machine learning (ML) models to traditional behavioral medicine research objectives. The process of improving health and wellbeing en masse begins with knowledge acquisition through dissemination of personally relevant, data-driven, health-related information to the general public. The present project sought to develop a comprehensive, multifactorial correlational model that classifies and predicts risk for lifetime development of three chronic diseases known to be responsive to health-related behavior change (diabetes, hypertension, and heart disease/heart attack) based on the intersection of demographic, health behavior, and health-related quality of life variables. Three ML models were used: a) logistic regression, b) decision tree, and c) random forest, in conjunction with the Synthetic Minority Over-sampling Technique - Nominal Continuous (SMOTE-NC) to address imbalanced diagnostic groups.
Experiments in dimensionality reduction, including Factor Analysis of Mixed Data (FAMD) and permutation feature importance, were also performed to enhance model efficiency and performance. Results supported the application of ML techniques to the classification of chronic illness, achieving the target accuracy of above 90% for two of the three diagnostic outcomes modeled: diabetes (92.1%) and heart disease/attack (94.4%). Hypertension (77.7%) was the only outcome that did not achieve target accuracy, which may be due in part to the relative complexity of risk and resiliency for this disease. Additionally, secondary analyses supported the classification of U.S. Veterans as a specialized subpopulation characterized by chronic illness disparity, particularly in regard to heart disease/attack. Future iterations of this project, conducted with more expansive, feature-rich data sources and with optimizations in accuracy, may allow for the eventual development of a web-based application that generates a personalized Chronic Illness Risk and Resiliency Profile for users. The profile will include an adjusted predicted risk, that is, the reduction in chronic illness risk that can be achieved through a combination of recommended, user-specific health-related behavior changes. This research is designed to increase awareness of the role of behavior in health in order to empower the general public and motivate feasible increases in health-promoting behaviors across the United States.
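
For readers unfamiliar with SMOTE-NC, the oversampling step named in the abstract can be illustrated with a minimal, simplified sketch. This is not the dissertation's implementation. The function name `smote_nc`, the feature columns (age, BMI, smoker) and all values are hypothetical and chosen only for illustration. The sketch follows the general idea of the technique: continuous features are interpolated between a minority-class row and one of its nearest minority neighbours, and categorical features take the most frequent value among those neighbours.

```python
import random
import statistics
from collections import Counter


def smote_nc(minority, cont_idx, cat_idx, n_new, k=3, seed=0):
    """Generate n_new synthetic minority rows (simplified SMOTE-NC sketch).

    minority: list of rows (lists), all from the minority class (>= 2 rows).
    cont_idx / cat_idx: column positions of continuous / categorical features.
    """
    rng = random.Random(seed)
    # Penalty for each categorical mismatch in the distance: the median of
    # the standard deviations of the continuous minority-class features.
    stds = [statistics.pstdev([row[j] for row in minority]) for j in cont_idx]
    penalty = statistics.median(stds) if stds else 1.0

    def dist(a, b):
        d = sum((a[j] - b[j]) ** 2 for j in cont_idx)
        d += sum(penalty ** 2 for j in cat_idx if a[j] != b[j])
        return d ** 0.5

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding itself).
        others = [r for r in minority if r is not base]
        neighbours = sorted(others, key=lambda r: dist(base, r))[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        new_row = list(base)
        for j in cont_idx:
            # Continuous: point on the segment between base and partner.
            new_row[j] = base[j] + gap * (partner[j] - base[j])
        for j in cat_idx:
            # Categorical: most frequent value among the k neighbours.
            counts = Counter(r[j] for r in neighbours)
            new_row[j] = counts.most_common(1)[0][0]
        synthetic.append(new_row)
    return synthetic


# Hypothetical minority-class rows: [age, BMI, smoker]
minority = [
    [52, 31.0, "yes"],
    [60, 29.5, "yes"],
    [47, 33.2, "no"],
    [58, 30.1, "yes"],
]
new_rows = smote_nc(minority, cont_idx=[0, 1], cat_idx=[2], n_new=5)
```

In practice, an established implementation such as the `SMOTENC` class in the imbalanced-learn library would normally be preferred over a hand-written version. Synthetic rows should be generated from the training split only, so that evaluation data contains real observations alone.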
