Empirical Comparison and Analysis of Artificial Intelligence-Based Methods for Identifying Phosphorylation Sites of SARS-CoV-2 Infection.

Publication date: Dec 21, 2024

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly infectious and pathogenic member of the large coronavirus family and the pathogen responsible for the global pandemic of coronavirus disease 2019 (COVID-19). Phosphorylation is a major type of protein post-translational modification that plays an essential role in SARS-CoV-2-host interactions. Precisely identifying phosphorylation sites in host cells infected with SARS-CoV-2 is important for investigating potential antiviral responses and mechanisms and for finding novel targets for therapeutic development. Numerous computational tools have been developed on the basis of phosphoproteomic data generated by mass spectrometry-based experimental techniques; with these tools, phosphorylation sites can be accurately identified across the proteomes of SARS-CoV-2-infected cells. In this work, we comprehensively review the major aspects of how these predictors are constructed and made available, including benchmark dataset preparation, feature extraction and refinement methods, machine learning algorithms and deep learning architectures, model evaluation approaches and metrics, and publicly available web servers and packages. We highlight and compare the prediction performance of each tool on independent serine/threonine (S/T) and tyrosine (Y) phosphorylation datasets and discuss the overall limitations of existing predictors. In summary, this review provides insights to guide the development of new, more powerful phosphorylation site identification tools, to help select more suitable target molecules for experimental verification, and to contribute to the development of antiviral therapies.
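
As a rough illustration of the dataset preparation step mentioned above, sequence-based predictors commonly start by cutting a fixed-length window around each candidate S/T or Y residue. The sketch below is not taken from any tool covered by the review; the function name `candidate_windows`, the flank size, the padding character, and the example sequence are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def candidate_windows(
    sequence: str,
    residues: str = "ST",
    flank: int = 3,
    pad: str = "X",
) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for each candidate residue.

    Hypothetical helper: the window has length 2 * flank + 1 and is
    centred on the candidate residue; positions that run past either
    end of the protein are filled with the padding character.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            # Residue i sits at index i + flank in the padded string,
            # so the centred window is padded[i : i + 2 * flank + 1].
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


# Made-up peptide, used only to show the output shape.
st_sites = candidate_windows("MSTAYK", residues="ST", flank=2)
y_sites = candidate_windows("MSTAYK", residues="Y", flank=2)
```

Handling S/T and Y candidates in separate calls mirrors the separate S/T and Y datasets described in the abstract; each window would then be passed to whatever feature encoding a given predictor uses.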

Concepts: Coronavirus; Covid; Intelligence; Pathogenicity; Spectrometry

Keywords: Artificial Intelligence; computation tool; Computational Biology; COVID-19; deep learning; Humans; Machine Learning; Phosphoproteins; Phosphorylation; phosphorylation site; Protein Processing, Post-Translational; Proteome; SARS-CoV-2

Semantics

Type Source Name
disease MESH SARS-CoV-2 Infection
pathway REACTOME SARS-CoV-2 Infection
disease IDO infectivity
disease IDO primary pathogen
disease IDO protein
disease IDO role
disease IDO process
pathway REACTOME SARS-CoV-2-host interactions
disease IDO host
drug DRUGBANK Serine
drug DRUGBANK L-Threonine
drug DRUGBANK L-Tyrosine
disease IDO site
drug DRUGBANK Lauric Acid
drug DRUGBANK Troleandomycin
drug DRUGBANK Coenzyme M
disease MESH respiratory failure
disease MESH multiple organ failure
disease MESH death
drug DRUGBANK Phosphate ion
disease MESH infection
disease MESH viral infection
disease IDO quality
drug DRUGBANK Ranitidine
drug DRUGBANK Ademetionine
drug DRUGBANK Honey
drug DRUGBANK Abacavir
drug DRUGBANK Amino acids
drug DRUGBANK Potassium
drug DRUGBANK L-Aspartic Acid
drug DRUGBANK Dimercaprol
drug DRUGBANK Carboxyamidotriazole
drug DRUGBANK Aspartame
disease IDO algorithm
drug DRUGBANK Esomeprazole
drug DRUGBANK Flunarizine
drug DRUGBANK MCC
drug DRUGBANK Saquinavir
drug DRUGBANK Pentaerythritol tetranitrate
disease MESH inflammation
drug DRUGBANK Isoxaflutole
drug DRUGBANK Guanosine
drug DRUGBANK (S)-Des-Me-Ampa
disease MESH pneumonia
disease IDO cell
disease IDO replication
drug DRUGBANK Efavirenz
drug DRUGBANK Hexadecanal
drug DRUGBANK L-Lysine

