Standard Dataset
scMASKGAN
- Submitted by: You Wu
- Last updated: Sun, 01/19/2025 - 03:26
- DOI: 10.21227/jpca-js42
Abstract
Single-cell RNA sequencing (scRNA-seq) enables high-resolution analysis of cellular heterogeneity, but dropout events, in which a gene's expression goes undetected in individual cells, present a significant challenge. We propose scMASKGAN, which recasts matrix imputation as a pixel restoration task to improve the recovery of missing gene expression data. Specifically, we integrate masking, convolutional neural networks (CNNs), attention mechanisms, and residual networks (ResNets) to address dropout events in scRNA-seq data. The masking mechanism preserves complete cellular information, while the convolution and attention mechanisms capture both local and global features. Residual networks strengthen feature representation and mitigate the risk of overfitting. Additionally, cell-type labels are incorporated as constraints to guide the model toward learning more accurate cellular features. Finally, multiple experiments were conducted to evaluate the method's performance using seven different data types and scRNA-seq data from ten neuroblastoma samples. The results demonstrate that data imputed by scMASKGAN not only perform well across various evaluation metrics but also significantly improve downstream analyses, enabling a more comprehensive exploration of the underlying biological information.
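As a rough illustration of how masking can be used to score an imputation method, the sketch below hides a random fraction of the observed (nonzero) entries of an expression matrix, imputes the corrupted matrix, and measures the error on the hidden entries only. This is not the scMASKGAN implementation: the function names (`make_holdout_mask`, `gene_mean_impute`, `masked_rmse`), the synthetic Poisson counts, and the gene-mean baseline are all assumptions made for this example, and any real imputer could take the place of the baseline.

```python
import numpy as np

def make_holdout_mask(expr, frac=0.1, seed=0):
    """Boolean mask selecting a random fraction of the nonzero entries."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(expr)
    n_hold = int(round(frac * rows.size))
    pick = rng.choice(rows.size, size=n_hold, replace=False)
    mask = np.zeros(expr.shape, dtype=bool)
    mask[rows[pick], cols[pick]] = True
    return mask

def gene_mean_impute(expr):
    """Hypothetical baseline: fill zeros with the mean of each gene's nonzero values."""
    out = expr.astype(float).copy()
    for g in range(out.shape[1]):
        col = out[:, g]          # view into `out`, so assignment edits `out`
        nonzero = col[col > 0]
        if nonzero.size:
            col[col == 0] = nonzero.mean()
    return out

def masked_rmse(truth, imputed, mask):
    """Root-mean-square error restricted to the held-out entries."""
    diff = truth[mask] - imputed[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic cells x genes count matrix (assumption: stands in for real scRNA-seq data).
rng = np.random.default_rng(42)
counts = rng.poisson(2.0, size=(50, 20)).astype(float)

mask = make_holdout_mask(counts, frac=0.1)
corrupted = counts.copy()
corrupted[mask] = 0.0            # simulate dropout on the held-out entries

imputed = gene_mean_impute(corrupted)
rmse = masked_rmse(counts, imputed, mask)
print(f"held-out entries: {mask.sum()}, masked RMSE: {rmse:.3f}")
```

Restricting the error to the held-out entries means the score reflects only values whose ground truth is known, which is why the mask is drawn from nonzero entries rather than from the whole matrix.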
The first dataset comprises seven samples, including human brain scRNA-seq data, ERCC spike-in RNA scRNA-seq data, mouse ESC scRNA-seq data (for the cell-cycle dynamics study and gene-gene correlation analysis), time-course scRNA-seq data (for trajectory analysis), and three scRNA-seq datasets from the sc_Drop-seq, sc_CEL-seq2, and sc_10X platforms.
The other dataset provides a total of 10 neuroblastoma samples: 5 high-risk neuroblastomas (NB) and 5 low-risk neuroblastomas, the latter consisting of 4 ganglioneuroblastomas (GNB) and 1 ganglioneuroma (GN).