Unsupervised detection of novel SARS-CoV-2 mutations and lineages in wastewater samples using long-read sequencing.

Publication date: Jan 29, 2025

The COVID-19 pandemic has underscored the importance of virus surveillance in public health, and wastewater-based epidemiology (WBE) has emerged as a non-invasive, cost-effective method for monitoring SARS-CoV-2 and its variants at the community level. However, current variant surveillance methods depend heavily on up-to-date genomic databases built from clinical samples, and they can become less sensitive and less representative as clinical testing and sequencing efforts decline. In this paper, we introduce HERCULES (High-throughput Epidemiological Reconstruction and Clustering for Uncovering Lineages from Environmental SARS-CoV-2), an unsupervised method that uses long-read sequencing of a single 1 kb fragment of the Spike gene. HERCULES identifies and quantifies mutations and lineages without requiring database-guided deconvolution, which enhances the detection of novel variants. We evaluated HERCULES on Norwegian wastewater samples collected from July 2022 to October 2023 as part of a national pilot on WBE of SARS-CoV-2. The prevalence of mutations and lineages in wastewater correlated strongly with the prevalence observed in clinical samples. Furthermore, SARS-CoV-2 trends appeared in wastewater samples one week earlier than in clinical data. Our results demonstrate HERCULES’ capability to identify new lineages before their detection in clinical samples, providing early warnings of potential outbreaks. The methodology described in this paper is easily adaptable to other pathogens, offering a versatile tool for environmental surveillance of emerging pathogens.
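The reported one-week lead of wastewater over clinical data can be illustrated with a lagged correlation. The sketch below is not the HERCULES pipeline and does not reproduce the paper's analysis; it only assumes that weekly lineage prevalence is available as two equally spaced series. The function names `pearson` and `best_lead`, the `max_lag` parameter, and all numbers are hypothetical and chosen purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def best_lead(wastewater, clinical, max_lag=4):
    """Return (lag, r) for the lag, in weeks, that maximizes correlation.

    A lag of k pairs the wastewater value of week t with the clinical
    value of week t + k, so a positive lag means wastewater leads.
    """
    best = None
    for lag in range(max_lag + 1):
        # Drop the last `lag` wastewater weeks and the first `lag`
        # clinical weeks so that the two series stay aligned.
        w = wastewater[: len(wastewater) - lag]
        c = clinical[lag:]
        r = pearson(w, c)
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# Synthetic weekly prevalence of one lineage (made-up values, not from
# the paper). The clinical series is the wastewater series delayed by
# one week.
wastewater = [0.05, 0.10, 0.25, 0.50, 0.70, 0.85, 0.90, 0.92]
clinical = [0.02, 0.05, 0.10, 0.25, 0.50, 0.70, 0.85, 0.90]

lag, r = best_lead(wastewater, clinical)
print(f"best lag: {lag} week(s), r = {r:.3f}")
```

On this synthetic input the highest correlation occurs at a lag of one week, because the clinical series was constructed as a delayed copy of the wastewater series. Real surveillance data would be noisier and would need handling of missing weeks and differing sampling dates before a comparison like this is meaningful.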


Concepts: Epidemiology; Hercules; July; Norwegian; Pilot

Keywords: COVID-19; High-Throughput Nucleotide Sequencing; Humans; Long-read sequencing; Metagenomics; Mutation; SARS-CoV-2; Spike Glycoprotein, Coronavirus; spike protein, SARS-CoV-2; Wastewater; Wastewater-Based Epidemiological Monitoring

Semantics

Type Source Name
disease MESH COVID-19 pandemic
pathway REACTOME Reproduction
disease MESH Infection
disease IDO pathogen
disease MESH viral load
drug DRUGBANK Water
disease IDO algorithm
disease IDO nucleic acid
drug DRUGBANK Ademetionine
drug DRUGBANK 5-amino-1,3,4-thiadiazole-2-thiol
drug DRUGBANK 7-Methyl-Gpppa
drug DRUGBANK Trihexyphenidyl
drug DRUGBANK Ranitidine
disease MESH point mutations
disease MESH uncertainty
disease IDO pathogen surveillance
drug DRUGBANK Ilex paraguariensis leaf
drug DRUGBANK Etoperidone
disease MESH Influenza
drug DRUGBANK Gold
disease IDO cell
drug DRUGBANK Trestolone
disease IDO infectivity
disease IDO protein

