DISCERN: deep single-cell expression reconstruction for improved cell clustering and cell subtype and state detection.

Publication date: Sep 20, 2023

Single-cell sequencing provides detailed insights into biological processes including cell differentiation and identity. While providing deep cell-specific information, the method suffers from technical constraints, most notably a limited number of expressed genes detected per cell, which leads to suboptimal clustering and cell type identification. Here, we present DISCERN, a novel deep generative network that precisely reconstructs missing single-cell gene expression using a reference dataset. DISCERN outperforms competing algorithms in expression inference, resulting in greatly improved cell clustering, cell type and activity detection, and insights into the cellular regulation of disease. We show that DISCERN is robust against differences between batches while preserving biological differences between them, which imputation and batch correction algorithms commonly fail to do. We use DISCERN to detect two unseen COVID-19-associated T cell types, cytotoxic CD4 and CD8 Tc2 T helper cells, with a potential role in adverse disease outcome. We utilize T cell fraction information from patient blood, which can serve as a biomarker of disease stage, to classify mild versus severe COVID-19 with an AUROC of 80%. DISCERN can be easily integrated into existing single-cell sequencing workflows. Thus, DISCERN is a flexible tool for reconstructing missing single-cell gene expression using a reference dataset and can easily be applied to a variety of data sets, yielding novel insights, e.g., into disease mechanisms.
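For readers unfamiliar with the AUROC figure quoted in the abstract, the following is a minimal sketch of how such a score is computed from a single per-patient feature. It is not the classifier used in the paper, which the abstract does not describe; the function name `auroc`, the variable names, and the toy values are all hypothetical and chosen only for illustration.

```python
def auroc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive sample scores higher than a randomly chosen
    negative sample (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic, made-up values (NOT data from the paper): one hypothetical
# T cell fraction per patient and a severity label (1 = severe, 0 = mild).
fractions = [0.05, 0.10, 0.35, 0.30, 0.25, 0.40]
severe = [0, 0, 0, 1, 0, 1]

score = auroc(fractions, severe)
print(f"AUROC on toy data: {score:.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 80% means that a severe case would be ranked above a mild case in roughly four out of five random pairs.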

Concepts: Algorithms, Biomarker, Cd4, Outperforms, Stage

Keywords: Auto encoder, Batch effect correction, Cell clustering, Cell type identification, COVID-19, Deep Learning, Expression reconstruction, Machine Learning, Probabilistic modeling, Reference atlas mapping, RNA sequencing, Single-cell RNA-seq, T helper cell, Transcription factor analysis, Transfer learning


Type Source Name
disease IDO cell
disease MESH COVID-19
disease IDO blood
pathway REACTOME Reproduction
drug DRUGBANK Ilex paraguariensis leaf
disease VO gene
drug DRUGBANK Dichloroacetic Acid
disease VO effectiveness
drug DRUGBANK Pentaerythritol tetranitrate
disease IDO quality
drug DRUGBANK Hyaluronic acid
disease IDO algorithm
drug DRUGBANK Pidolic Acid
drug DRUGBANK Chromium
drug DRUGBANK Esomeprazole
disease VO organ
drug DRUGBANK Azelaic acid
drug DRUGBANK Coenzyme M
disease VO population
disease VO efficient
drug DRUGBANK Saquinavir
disease MESH fibrosis
drug DRUGBANK Flunarizine
disease IDO process
drug DRUGBANK Isoxaflutole
disease MESH rare diseases
disease VO data set
disease MESH inflammation
disease VO time
disease MESH pneumonia
disease MESH asthma
pathway KEGG Asthma
disease MESH death
disease IDO intervention
drug DRUGBANK Fevipiprant
drug DRUGBANK Methylergometrine
disease MESH infection
disease IDO site
drug DRUGBANK Dimercaprol
drug DRUGBANK Topiramate
disease IDO history
disease IDO production
drug DRUGBANK Prostaglandin D2
disease MESH chronic obstructive pulmonary disease
disease MESH tuberculosis
pathway KEGG Tuberculosis
disease VO frequency
drug DRUGBANK Somatostatin
disease MESH pancreatic acinar carcinoma
disease MESH autoimmune diseases
disease MESH tumor
disease MESH hepatocellular carcinoma
pathway KEGG Hepatocellular carcinoma
disease MESH liver cancer
disease MESH systemic lupus erythematosus
pathway KEGG Systemic lupus erythematosus
disease VO Tox
disease MESH breast cancer
pathway KEGG Breast cancer
drug DRUGBANK Gold
disease MESH repression
disease MESH acute myeloid leukemia
pathway KEGG Acute myeloid leukemia
disease MESH Allergy
drug DRUGBANK (S)-Des-Me-Ampa
disease MESH idiopathic membranous nephropathy
