Leveraging SNOMED CT for patient cohort identification over heterogeneous EHR data.

Publication date: Jun 12, 2025

SNOMED CT is extensively employed to standardize data across diverse patient datasets and support cohort identification, with studies revealing its benefits and challenges. In this work, we developed a SNOMED CT-driven cohort query system over a heterogeneous Optum de-identified COVID-19 Electronic Health Record dataset leveraging concept mappings between ICD-9-CM/ICD-10-CM and SNOMED CT. We evaluated the benefits and challenges of using SNOMED CT to perform cohort queries based on both query code sets and actual patients retrieved from the database, leveraging the original ICD-9-CM and ICD-10-CM as baselines. Manual review of 80 random cases revealed 65 cases containing 148 true positive codes and 25 cases containing 63 false positive codes. The manual evaluation also revealed issues in code naming, mappings, and hierarchical relations. Overall, our study indicates that while the SNOMED CT-driven query system holds considerable promise for comprehensive cohort queries, careful attention must be given to the challenges of falsely included codes and patients.
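The abstract does not describe the system's internals, so the following is only a minimal sketch of the general idea it names: expand a SNOMED CT concept through its is-a hierarchy, map the expanded concepts back to ICD codes, and compare the retrieved patients against an ICD-only baseline. Every identifier here (the `SCT_*` and `ICD_*` codes, the patient IDs, and the function names) is a hypothetical placeholder, not a real terminology code or part of the authors' implementation.

```python
from collections import defaultdict

# Toy is-a hierarchy: child concept -> set of parent concepts (hypothetical IDs).
IS_A = {
    "SCT_B": {"SCT_A"},
    "SCT_C": {"SCT_A"},
    "SCT_D": {"SCT_B"},
}

# Toy ICD -> SNOMED CT concept map (hypothetical codes and mappings).
ICD_TO_SNOMED = {
    "ICD_1": {"SCT_A"},
    "ICD_2": {"SCT_B"},
    "ICD_3": {"SCT_D"},
    "ICD_4": {"SCT_X"},  # maps to a concept outside the queried subtree
}

# Toy patient records: patient ID -> ICD codes recorded for that patient.
PATIENT_CODES = {
    "p1": {"ICD_1"},
    "p2": {"ICD_3"},
    "p3": {"ICD_4"},
    "p4": {"ICD_2", "ICD_4"},
}


def descendants_or_self(concept, is_a):
    """Return the concept plus every concept reachable below it via is-a links."""
    children = defaultdict(set)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].add(child)
    seen, stack = {concept}, [concept]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def icd_codes_for(concepts, icd_to_snomed):
    """Return the ICD codes that map to at least one of the given concepts."""
    return {icd for icd, targets in icd_to_snomed.items() if targets & concepts}


def cohort(icd_codes, patient_codes):
    """Return the patients with at least one of the given ICD codes."""
    return {pid for pid, codes in patient_codes.items() if codes & icd_codes}


# Baseline: query with the original ICD code only.
baseline_codes = {"ICD_1"}
baseline_cohort = cohort(baseline_codes, PATIENT_CODES)

# SNOMED CT-driven: expand the concept, then map back to ICD codes.
expanded_codes = icd_codes_for(descendants_or_self("SCT_A", IS_A), ICD_TO_SNOMED)
expanded_cohort = cohort(expanded_codes, PATIENT_CODES)

# Codes and patients gained by the expansion; these are the candidates
# a manual review would label as true or false positives.
extra_codes = expanded_codes - baseline_codes
extra_patients = expanded_cohort - baseline_cohort
```

In this toy setup the expansion adds two codes and two patients beyond the baseline. Whether such additions are correct depends on the quality of the mappings and hierarchy, which is the kind of question the manual review in the abstract addresses.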

Concepts: Covid, Ct, Informatics, Perform

Keywords: Benefits, Cm, Codes, Cohort, Ct, Driven, Heterogeneous, Icd, Leveraging, Mappings, Patient, Queries, Query, Snomed, System

Semantics

Type Source Name
disease MESH COVID-19
