2-AIN-506, 2-AIN-252: Seminar in Bioinformatics (2), (4)
Summer 2025
Abstract

Yanglan Gan, Yuhan Chen, Guangwei Xu, Wenjing Guo, Guobing Zou. Deep enhanced constraint clustering based on contrastive learning for scRNA-seq data. Briefings in Bioinformatics, 24(4), 2023.

Download preprint: not available

Download from publisher: https://doi.org/10.1093/bib/bbad222 PubMed

Related web page: not available

Bibliography entry: BibTeX

Abstract:

Single-cell RNA sequencing (scRNA-seq) measures transcriptome-wide gene 
expression at single-cell resolution. Clustering analysis of scRNA-seq data 
enables researchers to characterize cell types and states, shedding new light on 
cell-to-cell heterogeneity in complex tissues. Recently, self-supervised 
contrastive learning has become a prominent technique for underlying feature 
representation learning. However, for the noisy, high-dimensional and sparse 
scRNA-seq data, existing methods still encounter difficulties in capturing the 
intrinsic patterns and structures of cells, and seldom utilize prior knowledge, 
resulting in clusters that mismatch with the real situation. To this end, we 
propose scDECL, a novel deep enhanced constraint clustering algorithm for 
scRNA-seq data analysis based on contrastive learning and pairwise constraints. 
Specifically, a pre-training model based on interpolated contrastive learning is 
trained to learn the feature embedding, and clustering is then performed according 
to the constructed enhanced pairwise constraints. In the pre-training stage, a mixup 
data augmentation strategy and an interpolation loss are introduced to improve the 
diversity of the dataset and the robustness of the model. In the clustering 
stage, the prior information is converted into enhanced pairwise constraints to 
guide the clustering. To validate the performance of scDECL, we compare it with 
six state-of-the-art algorithms on six real scRNA-seq datasets. The experimental 
results demonstrate that the proposed algorithm outperforms the six competing methods. 
In addition, the ablation studies on each module of the algorithm indicate that 
these modules are complementary to each other and effective in improving the 
performance of the proposed algorithm. Our method scDECL is implemented in Python 
using the PyTorch machine-learning library, and it is freely available at 
https://github.com/DBLABDHU/scDECL.
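
The two ingredients named in the abstract, mixup augmentation and pairwise constraints, can be illustrated with a minimal sketch. This is not the scDECL implementation: the function names `mixup` and `pairwise_constraints`, the Beta(alpha, alpha) mixing weight, and the must-link / cannot-link construction from partial labels are assumptions based on the generic forms of these techniques. The abstract does not specify how scDECL "enhances" its constraints or how the interpolation loss is defined, so neither is shown here.

```python
import random
from itertools import combinations


def mixup(x_i, x_j, alpha=1.0, rng=random):
    """Generic mixup: interpolate two expression profiles.

    The weight lam is drawn from Beta(alpha, alpha); this is the usual
    mixup convention and an assumption here, not a detail from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    return mixed, lam


def pairwise_constraints(labels):
    """Convert partial prior labels into pairwise constraints.

    labels[i] is a known cell-type label or None when unknown.
    Cells sharing a label form a must-link pair; cells with different
    known labels form a cannot-link pair. Unlabeled cells yield no pairs.
    """
    known = [(idx, lab) for idx, lab in enumerate(labels) if lab is not None]
    must_link, cannot_link = [], []
    for (i, label_i), (j, label_j) in combinations(known, 2):
        if label_i == label_j:
            must_link.append((i, j))
        else:
            cannot_link.append((i, j))
    return must_link, cannot_link
```

For example, `pairwise_constraints(["T", None, "T", "B"])` yields one must-link pair between cells 0 and 2 and two cannot-link pairs, (0, 3) and (2, 3). In a constrained clustering model, such pairs would typically enter the training objective as a penalty on the predicted cluster assignments.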