A natural language processing pipeline for identifying pediatric long COVID symptoms and functional impacts in freeform clinical notes: a RECOVER study.

Publication date: Oct 01, 2025

The objective was to develop a natural language processing (NLP) pipeline for unstructured electronic health record (EHR) data to identify symptoms and functional impacts associated with Long COVID in children. We analyzed 48 287 outpatient progress notes from 10 618 pediatric patients from 12 institutions. We evaluated notes obtained 28 to 179 days after a COVID-19 diagnosis or positive test. Two samples were examined: patients with evidence of Long COVID and patients with acute COVID but no evidence of Long COVID based on diagnostic codes. The pipeline identified clinical concepts associated with 21 symptoms and 4 functional impact categories. Subject matter experts (SMEs) screened a sample of 4586 terms from the NLP output to assess pipeline accuracy. Prevalence and concordance of each of the 25 concepts were compared between the 2 patient samples. A binary assertion measure comparing SME and NLP assertions showed moderate accuracy (N = 4133; F1 = .80) and improved substantially when only high-confidence SME assertions were considered (N = 2043; F1 = .90). Overall, the 25 Long COVID concept categories were markedly more prevalent in the presumptive Long COVID cohort, and differences were noted between concepts identified in notes versus structured data. This preliminary analysis illustrates the additional insight into a syndrome such as Long COVID that can be gained by incorporating notes data to characterize symptoms and functional impacts. These data support incorporating NLP methodology, when possible, into the design of computable phenotypes and the accurate characterization of patients with Long COVID.
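The abstract reports an F1 score for binary SME versus NLP assertions and a note window of 28 to 179 days after the COVID-19 index event. The following is a minimal Python sketch of how those two quantities could be computed; it is not the study's code. The function names, the inclusive window bounds, and the choice of SME assertions as the reference standard are all assumptions made for illustration, and the labels in the usage example are toy values, not study data.

```python
from datetime import date


def in_followup_window(index_date, note_date, min_days=28, max_days=179):
    """Return True if the note falls min_days..max_days after the index date.

    Inclusive bounds are an assumption; the abstract says only
    "28 to 179 days after a COVID-19 diagnosis or positive test".
    """
    delta = (note_date - index_date).days
    return min_days <= delta <= max_days


def binary_f1(sme, nlp):
    """F1 for paired binary assertions (hypothetical helper).

    SME labels are treated as the reference here, which is an assumption.
    F1 is symmetric in the two raters, so swapping them gives the same value.
    """
    if len(sme) != len(nlp):
        raise ValueError("sme and nlp must have the same length")
    pairs = list(zip(sme, nlp))
    tp = sum(1 for s, n in pairs if s and n)      # both assert the concept
    fp = sum(1 for s, n in pairs if not s and n)  # NLP asserts, SME does not
    fn = sum(1 for s, n in pairs if s and not n)  # SME asserts, NLP does not
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (1 = concept asserted present).
toy_sme = [1, 1, 1, 0, 0, 1]
toy_nlp = [1, 1, 0, 0, 1, 1]
toy_f1 = binary_f1(toy_sme, toy_nlp)  # 3 TP, 1 FP, 1 FN
```

With these toy labels, precision and recall are both 3/4, so the F1 score is also 0.75. Restricting the input pairs to high-confidence SME assertions before calling `binary_f1` would mirror the second comparison described in the abstract.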

Concepts: Covid, Experts, Outpatient, Pediatric
Keywords: NLP, pediatrics, PEDSnet, RECOVER

Semantics

Type       Source    Name
disease    MESH      long COVID
disease    MESH      COVID-19
disease    MESH      syndrome
