At-admission prediction of mortality and pulmonary embolism in an international cohort of hospitalised patients with COVID-19 using statistical and machine learning methods.

Publication date: Jul 16, 2024

By September 2022, more than 600 million cases of SARS-CoV-2 infection had been reported globally, resulting in over 6.5 million deaths. However, COVID-19 mortality risk estimators are often developed with small, unrepresentative samples and suffer from methodological limitations. Pulmonary embolism (PE) is one of the most severe preventable complications of COVID-19, so predictive tools for PE in COVID-19 patients are highly important: early recognition can enable life-saving targeted anticoagulation therapy right at admission. Using a dataset of more than 800,000 COVID-19 patients from an international cohort, we propose a cost-sensitive gradient-boosted machine learning model that predicts the occurrence of PE and death at admission. Logistic regression, Cox proportional hazards models, and Shapley values were used to identify key predictors of PE and death. Our prediction model achieved test AUROCs of 75.9% and 74.2%, and sensitivities of 67.5% and 72.7%, for PE and all-cause mortality respectively on a highly diverse held-out test set. The PE prediction model was also evaluated separately on patients in the UK and Spain, with test results of 74.5% AUROC and 63.5% sensitivity, and 78.9% AUROC and 95.7% sensitivity, respectively. Age, sex, region of admission, comorbidities (chronic cardiac and pulmonary disease, dementia, diabetes, hypertension, cancer, obesity, smoking), and symptoms (any, confusion, chest pain, fatigue, headache, fever, muscle or joint pain, shortness of breath) were the most important clinical predictors at admission. Age, overall presence of symptoms, shortness of breath, and hypertension were found to be key predictors of PE using our extreme gradient boosted model. This analysis, based on the largest global dataset to date for this set of problems, can inform hospital prioritisation policy and guide long-term clinical research and decision-making for COVID-19 patients globally.
Our machine learning model developed from an international cohort can serve to better regulate hospital risk prioritisation of at-risk patients.
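
To make the reported metrics concrete, the following minimal sketch computes AUROC and sensitivity for a binary outcome, together with a negative-to-positive class ratio of the kind often used to make a classifier cost-sensitive for rare events such as PE. It is illustrative only: the labels, scores, and the 0.5 threshold are invented toy values, the function names are hypothetical, and the abstract does not state how the authors set their costs or thresholds. Gradient boosting libraries such as XGBoost accept a ratio like this through the `scale_pos_weight` parameter, but whether the study used that setting is an assumption.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity(y_true: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Share of true positive cases whose score reaches the threshold."""
    flagged = [s >= threshold for y, s in zip(y_true, scores) if y == 1]
    if not flagged:
        raise ValueError("need at least one positive case")
    return sum(flagged) / len(flagged)


def positive_class_weight(y_true: Sequence[int]) -> float:
    """Negatives per positive: a common heuristic weight for the rare class."""
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    return (len(y_true) - n_pos) / n_pos


# Toy data (hypothetical, not from the study): 3 events among 10 patients.
y = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
risk = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

print(f"AUROC: {auroc(y, risk):.3f}")
print(f"Sensitivity at 0.5: {sensitivity(y, risk, 0.5):.3f}")
print(f"Positive class weight: {positive_class_weight(y):.3f}")
```

In this toy example one event falls below the threshold, so sensitivity is lower than the AUROC suggests; lowering the threshold or raising the positive class weight trades false alarms for fewer missed events, which is the usual motivation for cost-sensitive training on rare outcomes.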

Concepts
Diabetes
Hospitalised
Sex

Keywords
Adult
Aged
Aged, 80 and over
Cohort Studies
COVID-19
Female
Hospitalization
Humans
Machine Learning
Male
Middle Aged
Pulmonary Embolism
Risk Factors
SARS-CoV-2
Spain
United Kingdom

Semantics

Type Source Name
disease MESH pulmonary embolism
disease MESH COVID-19
pathway REACTOME SARS-CoV-2 Infection
disease MESH complications
drug DRUGBANK Flunarizine
disease MESH death
disease VO age
disease MESH pulmonary disease
disease MESH dementia
disease MESH hypertension
disease MESH cancer
disease MESH obesity
disease MESH chest pain
disease MESH joint pain
drug DRUGBANK Tropicamide
drug DRUGBANK Coenzyme M
disease MESH pneumonia
disease VO population
disease MESH infection
disease MESH morbidity
disease MESH comorbidity
disease MESH inflammation
disease MESH thrombosis
disease MESH cardiovascular disease
disease MESH thromboembolism
disease IDO country
disease MESH asthma
pathway KEGG Asthma
disease MESH cardiac disease
disease MESH chronic kidney disease
disease MESH deep vein thrombosis
disease MESH AIDS
disease MESH Malnutrition
disease MESH Bleeding
disease MESH Conjunctivitis
disease MESH Lymphadenopathy
disease VO nose
disease MESH Seizures
disease MESH dehydration
disease MESH Sore throat
disease IDO blood
drug DRUGBANK Urea
drug DRUGBANK Nitrogen
drug DRUGBANK Oxygen
disease MESH chronic conditions
