Sequence-similarity-based approach to SARS-CoV-2 genome sequence and lung cancer-related genes via multivariate feature extraction method.

Publication date: Jul 11, 2025

The COVID-19 pandemic has prompted genomic studies linking SARS-CoV-2 and lung cancer-related genes. This study explores sequence similarity and motif patterns to assess disease susceptibility. We applied a data mining approach to compare human and SARS-CoV-2 genomes, revealing high sequence identity (0.74-0.99%) with lung cancer-related genes. Low-entropy motifs were associated with higher genetic risk. We identified shared patterns of lengths 4, 5, and 10, selecting the most significant motifs. These findings support the hypothesis that sequence similarity and conserved motifs provide insights into gene function, evolutionary processes, and the genetic links between cancer and viral infections.
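As a rough illustration of the idea of shared fixed-length motifs ranked by entropy, the following minimal Python sketch finds length-k substrings common to two sequences and orders them by the Shannon entropy of their base composition. It is not the authors' multivariate feature extraction method, which the abstract does not describe in detail. The function names (`kmers`, `shannon_entropy`, `shared_motifs`) and the toy sequences are assumptions made up for this example.

```python
from collections import Counter
from math import log2


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shannon_entropy(motif):
    """Shannon entropy (bits per symbol) of the base composition of motif."""
    counts = Counter(motif)
    n = len(motif)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def shared_motifs(seq_a, seq_b, k):
    """Length-k motifs present in both sequences, lowest entropy first."""
    common = kmers(seq_a, k) & kmers(seq_b, k)
    # Ties in entropy are broken alphabetically so the order is deterministic.
    return sorted(common, key=lambda m: (shannon_entropy(m), m))


# Toy sequences invented for illustration only (not real genomic data).
viral = "ATGAAAATTTGCCGA"
human = "CCGAAAATTTGGATG"

# Pattern lengths 4, 5, and 10 are the ones mentioned in the abstract.
for k in (4, 5, 10):
    print(k, shared_motifs(viral, human, k)[:3])
```

Exact substring matching is the simplest possible notion of a shared pattern. A real analysis would more likely work from alignments and test motifs for statistical significance.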

Concepts: Cancer, Genomes, Mining, Pandemic
Keywords: Data mining, sequence motif, sequence similarity

Semantics

Type     Source  Name
disease  MESH    lung cancer
disease  MESH    COVID-19 pandemic
disease  IDO     susceptibility
disease  MESH    cancer
disease  MESH    viral infections

