Using big sequencing data to identify chronic SARS-Coronavirus-2 infections.

Publication date: Jan 20, 2024

The evolution of SARS-Coronavirus-2 (SARS-CoV-2) has been characterized by the periodic emergence of highly divergent variants. One leading hypothesis suggests these variants may have emerged during chronic infections of immunocompromised individuals, but limited data from these cases hinders comprehensive analyses. Here, we harnessed millions of SARS-CoV-2 genomes to identify potential chronic infections and used language models (LM) to infer chronic-associated mutations. First, we mined the SARS-CoV-2 phylogeny and identified chronic-like clades with identical metadata (location, age, and sex) spanning over 21 days, suggesting a prolonged infection. We inferred 271 chronic-like clades, which exhibited characteristics similar to confirmed chronic infections. Chronic-associated mutations were often high-fitness immune-evasive mutations located in the spike receptor-binding domain (RBD), yet a minority were unique to chronic infections and absent in global settings. The probability of observing high-fitness RBD mutations was 10-20 times higher in chronic infections than in global transmission chains. The majority of RBD mutations in BA.1/BA.2 chronic-like clades bore predictive value, i.e., went on to display global success. Finally, we used our LM to infer hundreds of additional chronic-like clades in the absence of metadata. Our approach allows mining extensive sequencing data and providing insights into future evolutionary patterns of SARS-CoV-2.
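
The metadata filter described in the abstract (identical location, age, and sex, with sampling dates spanning over 21 days) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it assumes that each sequence has already been assigned to a small clade of a precomputed phylogeny, and the `Sample` record, its field names, and the `chronic_like_clades` function are all hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sample:
    # Hypothetical record layout; field names are assumptions, not from the paper.
    clade_id: str    # id of a small clade in a precomputed phylogeny (assumed given)
    location: str
    age: int
    sex: str
    collected: date  # sample collection date


def chronic_like_clades(samples, min_span_days=21):
    """Return sorted ids of clades whose samples share identical metadata
    and whose collection dates span more than min_span_days."""
    by_clade = defaultdict(list)
    for s in samples:
        by_clade[s.clade_id].append(s)

    flagged = []
    for clade_id, members in by_clade.items():
        if len(members) < 2:
            continue  # a single sample cannot span any time
        metadata = {(m.location, m.age, m.sex) for m in members}
        if len(metadata) != 1:
            continue  # metadata differ, so likely different individuals
        dates = [m.collected for m in members]
        if (max(dates) - min(dates)).days > min_span_days:
            flagged.append(clade_id)
    return sorted(flagged)


samples = [
    Sample("A", "Israel", 54, "F", date(2022, 1, 1)),
    Sample("A", "Israel", 54, "F", date(2022, 2, 15)),  # 45 days later
    Sample("B", "Israel", 30, "M", date(2022, 1, 1)),
    Sample("B", "Israel", 30, "M", date(2022, 1, 11)),  # only 10 days later
    Sample("C", "Israel", 41, "M", date(2022, 1, 1)),
    Sample("C", "Israel", 41, "F", date(2022, 2, 10)),  # sex differs
]
print(chronic_like_clades(samples))  # → ['A']
```

Only clade A passes both conditions in this toy input. A real analysis would also need to handle missing or inconsistently recorded metadata, which the sketch ignores.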

Concepts: Coronavirus, Fitness, Genomes, Immunocompromised
Keywords: Associated, Chronic, Clades, Coronavirus, Cov, Global, Identify, Infections, Infer, Metadata, Mutations, Rbd, Sars, Sequencing, Variants

Semantics

Type Source Name
disease MESH infections
disease MESH chronic infections
disease IDO infection
drug DRUGBANK Spinosad
disease VO organization
drug DRUGBANK Safrazine
disease VO gene
disease MESH cancer
disease MESH AIDS
disease MESH long COVID
pathway KEGG Viral replication
disease VO population
disease VO volume
drug DRUGBANK Ademetionine
disease MESH COVID 19
disease VO time
disease VO frequency
drug DRUGBANK Pentaerythritol tetranitrate
drug DRUGBANK Aspartame
disease IDO quality
disease VO ANOVA
disease VO vaccination
disease IDO cell
disease MESH causality
disease IDO immune response
disease IDO pathogen
drug DRUGBANK Coenzyme M
drug DRUGBANK Nonoxynol-9
drug DRUGBANK Isoxaflutole
