Early detection of emerging SARS-CoV-2 variants from wastewater through genome sequencing and machine learning

Publication date: Jul 08, 2025

Genome sequencing from wastewater enables accurate and cost-effective identification of SARS-CoV-2 variants. However, existing computational pipelines have limitations in detecting emerging variants not yet characterized in humans. Here, we present an unsupervised learning approach that clusters co-varying and time-evolving mutation patterns to identify SARS-CoV-2 variants. To build our model, we sequence 3659 wastewater samples collected over two years from urban and rural locations in Southern Nevada. We then develop a multivariate independent component analysis (ICA)-based pipeline to transform mutation frequencies into independent sources. These data-driven time-evolving and co-varying sources are compared to 8810 SARS-CoV-2 clinical genomes from Nevadans. Our method accurately detects the Delta variant in late 2021, Omicron variants in 2022, and emerging recombinant XBB variants in 2023. Our approach also reveals the spatial and temporal dynamics of variants in both urban and rural regions; achieves earlier detection of most variants compared to other computational tools; and uncovers unique co-varying mutation patterns not associated with any known variant. The multivariate nature of our pipeline boosts statistical power and supports accurate early detection of SARS-CoV-2 variants. This feature offers a unique opportunity to detect emerging variants and pathogens, even in the absence of clinical testing.
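The idea of grouping co-varying, time-evolving mutations can be illustrated with a toy example. The sketch below is not the authors' ICA pipeline: it swaps in a much simpler technique, pairwise Pearson correlation followed by single-linkage grouping, purely to show what "co-varying mutation patterns" means in practice. The mutation names, frequency values, the 0.9 threshold, and the helper names `pearson` and `group_covarying` are all hypothetical and chosen for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no co-variation signal
    return cov / sqrt(vx * vy)


def group_covarying(freqs, threshold=0.9):
    """Group mutations whose frequency time series correlate at or above
    `threshold` (single linkage, i.e. connected components).

    This is a simplified stand-in for the ICA step described in the
    abstract, not a reimplementation of it.
    """
    names = list(freqs)
    parent = {m: m for m in names}

    def find(m):
        # Union-find root lookup with path halving.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(freqs[a], freqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for m in names:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: mutation frequencies at six sampling dates.
freqs = {
    "mut_A": [0.05, 0.20, 0.60, 0.90, 0.50, 0.10],  # rises, then falls
    "mut_B": [0.04, 0.22, 0.58, 0.88, 0.52, 0.12],  # tracks mut_A
    "mut_C": [0.00, 0.00, 0.05, 0.10, 0.55, 0.90],  # emerges later
    "mut_D": [0.00, 0.01, 0.04, 0.12, 0.50, 0.92],  # tracks mut_C
    "mut_E": [0.30, 0.28, 0.31, 0.29, 0.30, 0.31],  # flat background
}

groups = group_covarying(freqs)
print(groups)
```

On this toy input, mut_A and mut_B form one group, mut_C and mut_D form another, and mut_E stays on its own. Pairwise correlation treats each pair of mutations in isolation, whereas the abstract credits the multivariate nature of the ICA-based pipeline for its added statistical power, so this sketch should be read only as an intuition aid.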


Concepts: Nevadans; Pathogens; Recombinant; Rural
Keywords: COVID-19; Genome, Viral; Humans; Machine Learning; Mutation; SARS-CoV-2; Wastewater; Whole Genome Sequencing

Semantics

Type Source Name
disease MESH mutation frequencies
disease MESH COVID 19
drug DRUGBANK Tretamine
drug DRUGBANK Water
drug DRUGBANK Coenzyme M
disease MESH infections
disease IDO infection
disease IDO quality
disease IDO nucleic acid
disease MESH viral load
disease IDO process
drug DRUGBANK Fenamole
disease IDO algorithm
drug DRUGBANK Pentaerythritol tetranitrate
drug DRUGBANK Aspartame
drug DRUGBANK Saquinavir
drug DRUGBANK (S)-Des-Me-Ampa
disease IDO intervention
drug DRUGBANK Ademetionine
disease IDO cell
disease IDO replication
pathway REACTOME Reproduction

