Artificial intelligence-based identification of key risk factors for long COVID from early clinical data.

Publication date: Jan 01, 2026

Long COVID, a complex condition characterized by persistent symptoms following SARS-CoV-2 infection, has become a significant public health concern. Early identification of individuals at risk for long COVID is crucial for effective management. This study explores the potential of biochemical and clinical markers, combined with machine learning (ML) models, to predict long COVID development using data from the first 72 h post-admission. We analyzed clinical and laboratory data from 394 individuals diagnosed with SARS-CoV-2 infection. A predictive model for long COVID was developed using machine learning algorithms, particularly XGBoost, to identify key biomarkers associated with the development of long COVID symptoms at 3 months post-infection. The model hyperparameters were tuned using Bayesian optimization, and variable importance was assessed using SHAP values. The predictive model achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.732, indicating moderate discriminatory power. Key variables identified included hemoglobin levels, oxygen saturation, weight, C-reactive protein (CRP), activated partial thromboplastin time (APTT), sodium, type of pulmonary infiltrates, and sex. Hemoglobin level was the only variable that differed significantly (p = 0.015) between those who developed long COVID and those who did not. The model showed moderate overall accuracy (63.9%) but higher recall (78.6%), highlighting its potential in clinical settings for early identification of high-risk patients. Our study demonstrates the feasibility of using machine learning to predict long COVID based on early clinical and laboratory data. While individual biomarkers alone may have limited predictive value, their combination enhances risk assessment for long COVID.
Future studies should focus on refining the model and validating it in broader populations, including those with milder COVID-19 forms, to improve its accuracy and clinical utility.
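
The abstract reports three different performance figures (AUC-ROC, accuracy, recall), and it may help to see how they relate. The following is a minimal sketch in plain Python, not the study's code: the labels and scores are invented toy values rather than study data, and the helper names (classify, confusion_counts, accuracy, recall, auc_roc) are illustrative assumptions. AUC-ROC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half.

```python
from typing import List, Sequence, Tuple


def classify(scores: Sequence[float], threshold: float) -> List[int]:
    """Turn risk scores into 0/1 predictions at a given threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all cases classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true positive cases that the model flags (sensitivity)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = developed long COVID, 0 = did not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.65, 0.4, 0.35, 0.2, 0.1]

for threshold in (0.5, 0.25):
    y_pred = classify(scores, threshold)
    print(threshold, accuracy(y_true, y_pred), recall(y_true, y_pred))
print("AUC-ROC:", auc_roc(y_true, scores))
```

On this toy data, lowering the threshold from 0.5 to 0.25 raises recall from 0.75 to 1.0 while accuracy falls from 0.7 to 0.6, and the AUC-ROC stays the same because it does not depend on the threshold. A screening model tuned to miss few at-risk patients can show the same pattern of high recall with only moderate accuracy.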

Concepts: Biomarkers, Covid, Discriminatory, Future, Intelligence

Keywords: Adult, Aged, Artificial Intelligence, Bayes Theorem, Biomarkers, COVID-19, Female, Humans, Long COVID, Machine Learning, Male, Middle Aged, Predictive biomarkers, Risk Factors, ROC Curve, SARS-CoV-2

Semantics

Type     Source    Name
disease  MESH      long COVID
disease  MESH      SARS-CoV-2 infection
pathway  REACTOME  SARS-CoV-2 Infection
disease  MESH      infection
drug     DRUGBANK  Saquinavir
disease  MESH      included
drug     DRUGBANK  Oxygen
