The publicly-accessible RNA barcode segments based on the genetic tests of complete genome sequences for SARS-CoV-2 identification from HCoVs and SARSr-CoV-2 lineages.

Publication date: Jan 20, 2024

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pathogen responsible for coronavirus disease 2019 (COVID-19), continues to evolve, giving rise to new variants and reinfections worldwide. Previous research has demonstrated that barcode segments can identify specific species within closely related populations effectively and at low cost. In this study, we designed and tested RNA barcode segments based on genetic evolutionary relationships to enable efficient and accurate identification of SARS-CoV-2 among large numbers of virus samples, including human coronaviruses (HCoVs) and SARSr-CoV-2 lineages. Nucleotide sequences sourced from NCBI and GISAID were selected and curated to construct training sets comprising 1,733 complete genome sequences of HCoVs and SARSr-CoV-2 lineages. Through genetic-level species testing, we validated the accuracy and reliability of the barcode segments for identifying SARS-CoV-2. Subsequently, 75 main and subordinate species-specific barcode segments for SARS-CoV-2, located in the ORF1ab, S, E, ORF7a, and N coding sequences, were extracted and screened based on single-nucleotide polymorphism sites and weighted scores. In testing, these segments showed high recall (nearly 100%), specificity (almost 30% at the nucleotide level), and precision (100%) for identification. They were then visualized as combined one- and two-dimensional barcodes and deposited in an online database (http://virusbarcodedatabase.top/). The successful integration of barcoding technology into SARS-CoV-2 identification provides valuable insights for future studies involving polymorphism analysis of complete genome sequences. This cost-effective and efficient identification approach also provides a useful reference for future research on virus surveillance.
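As a rough illustration of how a barcode segment can be scored for identification, the following minimal Python sketch checks whether a candidate segment occurs in each genome and computes recall and precision from the resulting counts. It is not the authors' pipeline: the exact-substring matching rule, the function names (`contains_barcode`, `evaluate`, `recall`, `precision`), and the toy sequences are all assumptions made for illustration. The nucleotide-level specificity reported in the abstract is not reproduced here, because its definition is not given in this text.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for one barcode segment over a labelled sample set."""
    tp: int = 0  # target genomes that contain the barcode
    fp: int = 0  # non-target genomes that contain the barcode
    fn: int = 0  # target genomes that lack the barcode
    tn: int = 0  # non-target genomes that lack the barcode


def contains_barcode(genome: str, barcode: str) -> bool:
    """Assumed matching rule: exact, case-insensitive substring match."""
    return barcode.upper() in genome.upper()


def evaluate(samples, barcode: str) -> Counts:
    """Tally confusion counts; samples is an iterable of (genome, is_target)."""
    counts = Counts()
    for genome, is_target in samples:
        hit = contains_barcode(genome, barcode)
        if hit and is_target:
            counts.tp += 1
        elif hit and not is_target:
            counts.fp += 1
        elif not hit and is_target:
            counts.fn += 1
        else:
            counts.tn += 1
    return counts


def recall(c: Counts) -> float:
    """Fraction of target genomes that the barcode detects."""
    total = c.tp + c.fn
    return c.tp / total if total else 0.0


def precision(c: Counts) -> float:
    """Fraction of barcode hits that are true target genomes."""
    total = c.tp + c.fp
    return c.tp / total if total else 0.0


# Hypothetical toy data: short made-up sequences, not real viral genomes.
TOY_BARCODE = "ACGTTGCA"
TOY_SAMPLES = [
    ("GGACGTTGCATT", True),   # target, contains the barcode
    ("ttacgttgcagg", True),   # target, lowercase input
    ("GGACGATGCATT", False),  # non-target, one substitution breaks the match
    ("CCCCGGGG", False),      # non-target, unrelated sequence
]

toy_counts = evaluate(TOY_SAMPLES, TOY_BARCODE)
print(toy_counts, recall(toy_counts), precision(toy_counts))
```

Exact matching is the simplest possible rule; a real screening step would also need to handle ambiguous bases and sequencing errors, which this sketch ignores.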

Concepts                Keywords
Efficient               Complete genome sequences
Genetic                 Genetic tests
Global                  HCoVs
Virusbarcodedatabase    RNA barcode segments

Semantics

Type       Source    Name
disease    VO        Severe acute respiratory syndrome coronavirus 2
disease    IDO       pathogen
disease    MESH      coronavirus disease 2019
disease    MESH      reinfections
disease    VO        efficient
disease    VO        effective
