Machine learning models predict long COVID outcomes based on baseline clinical and immunologic factors
Journal article   Open access   Peer reviewed

Naresh Doni Jayavelu, Hady Samaha, Sonia Tandon Wimalasena, Annmarie Hoch, Jeremy P. Gygi, Gisela Gabernet, Al Ozonoff, Shanshan Liu, Carly E. Milliren, Ofer Levy, …
Communications medicine, v 6(1), 1
2026
PMID: 41484172
url
https://doi.org/10.1038/s43856-025-01230-w
Published, Version of Record (VoR) Open

Abstract

Subjects: General Medicine; Medicine & Public Health
Background: The post-acute sequelae of SARS-CoV-2 (PASC), also known as long COVID, remain a significant health issue that is incompletely understood. Predicting which acutely infected individuals will develop long COVID is challenging due to the absence of established biomarkers, clear disease mechanisms, or well-defined sub-phenotypes. Machine learning (ML) models may address this gap by leveraging clinical data to enhance diagnostic precision.

Methods: Clinical data, including antibody titers and viral load measurements collected at the time of hospital admission, are used to predict the likelihood of acute COVID-19 progressing to long COVID. Machine learning models are trained and evaluated for predictive performance. Feature importance analysis is performed to identify the most influential predictors.

Results: The machine learning models achieve median AUROC values ranging from 0.64 to 0.66 and AUPRC values between 0.51 and 0.54, demonstrating predictive capabilities. Low antibody titers and high viral loads at hospital admission emerge as the strongest predictors of long COVID outcomes. Comorbidities (such as chronic respiratory, cardiac, and neurologic diseases) and female sex are also identified as significant risk factors.

Conclusions: Machine learning models identify patients at risk for developing long COVID based on baseline clinical characteristics. These models guide early interventions, improve patient outcomes, and mitigate the long-term public health impacts of SARS-CoV-2.

Plain language summary: Long COVID, or post-acute sequelae of SARS-CoV-2, is a prolonged health condition that can occur after acute COVID-19 infection. However, the ability to predict who will develop long COVID remains limited due to the absence of clear tests or biomarkers. We looked at patients’ medical information, including the amount of virus in their body at hospital admission and how strong their immune response was. Using computer programs that can find hidden patterns in large sets of data, we discovered that people with a weaker immune response, higher amounts of virus, or certain long-term health problems, as well as women, are more likely to develop long COVID. This study highlights that computer-based tools could help doctors identify high-risk patients early and provide care that may prevent long-term complications.

Jayavelu, Samaha et al. apply machine learning models to hospital admission data, including antibody titers and viral load, to identify patients at high risk for long COVID. Low antibody levels, high viral loads, chronic diseases, and female sex are key predictors, supporting early, targeted interventions.
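For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how AUROC and AUPRC (computed here as average precision) can be obtained from predicted risk scores. It is not the authors' pipeline: the function names, `toy_labels`, and `toy_scores` are hypothetical and the data are invented purely for illustration. AUROC is the probability that a randomly chosen positive case is scored above a randomly chosen negative one, so 0.5 corresponds to chance; AUPRC is usually read against the proportion of positive cases in the cohort, which the abstract does not state.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Sums precision at each distinct score threshold, weighted by the
    increase in recall at that threshold. Tied scores share one threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    pairs = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        j = i
        group_pos = 0
        # Collect every case that shares the current score threshold.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            group_pos += pairs[j][1]
            j += 1
        tp += group_pos
        fp += (j - i) - group_pos
        ap += (group_pos / n_pos) * (tp / (tp + fp))
        i = j
    return ap


# Invented toy data (NOT from the study): 1 = developed long COVID, 0 = did not.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]

print(f"AUROC = {auroc(toy_labels, toy_scores):.3f}")
print(f"AUPRC = {average_precision(toy_labels, toy_scores):.3f}")
```

The pairwise AUROC loop is quadratic in cohort size and is written for clarity; a rank-based computation or an established library routine would be the usual choice for real data.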

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Industry collaboration
Domestic collaboration
International collaboration
Web of Science research areas
Medicine, Research & Experimental