multi-omics data

The ROSMAP data had already been preprocessed and dimensionally reduced in the original study, so we performed no further preprocessing on them. For the SCZ dataset, we first removed, from every omics set, features with more than 50% missing or zero expression values. Expression values were then log-transformed for normalization, and the Z-score method was used to standardize all features of each sample in every omics set. Only samples present in both the omics sets and the label set were retained for analysis.
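The SCZ preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function names `preprocess_omics` and `shared_samples` are hypothetical, the log transform is assumed to be log2(x + 1), missing values are assumed to be encoded as NaN, and the Z-score is assumed to be computed per feature across samples.

```python
import numpy as np

def preprocess_omics(X, max_bad_frac=0.5):
    """Filter, log-transform and z-score one omics matrix (samples x features).

    Assumptions (not stated in the text): NaN marks a missing value,
    the log transform is log2(x + 1), and z-scores are per feature.
    """
    X = np.asarray(X, dtype=float)
    # A value counts as "bad" if it is missing or exactly zero.
    bad = np.isnan(X) | (X == 0)
    # Keep features whose fraction of bad values does not exceed the threshold.
    keep = bad.mean(axis=0) <= max_bad_frac
    X = X[:, keep]
    # Log transformation (pseudo-count of 1 is an assumption).
    X = np.log2(X + 1.0)
    # Z-score each feature, ignoring any remaining missing values.
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (X - mean) / std, keep

def shared_samples(*id_lists):
    """Return sample IDs present in every list, in the order of the first list."""
    common = set(id_lists[0]).intersection(*id_lists[1:])
    return [s for s in id_lists[0] if s in common]
```

In use, `shared_samples` would be called with the sample identifiers of each omics set and of the label set, and each omics matrix would then be restricted to the returned identifiers before or after calling `preprocess_omics`.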


As genomic data accumulate across various modalities, methods for integrating multi-omics datasets are becoming increasingly important. Error-correcting output codes (ECOC) is an ensemble learning strategy that solves a multiclass problem through a decoding process that aggregates the predictions of multiple classifiers. It therefore lends itself naturally to aggregating predictions across multiple views as well. We applied ECOC to multi-view learning to determine whether this strategy can enhance classifier performance compared with traditional techniques.
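To make the decoding idea concrete, the sketch below shows one simple way ECOC decoding could aggregate predictions across views. It is an illustration under stated assumptions, not necessarily the scheme used in this work: the function name `ecoc_decode` is hypothetical, every view is assumed to share the same code matrix with entries in {-1, +1}, and decoding is assumed to use the Hamming distance summed over all views.

```python
import numpy as np

def ecoc_decode(code_matrix, view_bits):
    """Assign each sample to the class with the nearest codeword.

    code_matrix: (n_classes, n_bits) array with entries in {-1, +1}.
    view_bits:   list with one (n_samples, n_bits) array per view, holding
                 the binary classifiers' predicted bits for that view.
    Assumption: Hamming distances are summed over bits and over views.
    """
    M = np.asarray(code_matrix)
    # Stack the views: shape (n_views, n_samples, n_bits).
    P = np.stack([np.asarray(b) for b in view_bits])
    # Compare every predicted bit with every class codeword:
    # shape (n_views, n_samples, n_classes, n_bits).
    mismatches = P[:, :, None, :] != M[None, None, :, :]
    # Total Hamming distance per sample and class, over all views and bits.
    dist = mismatches.sum(axis=(0, 3))
    # The predicted class is the one whose codeword is closest.
    return dist.argmin(axis=1)
```

Because the distances are pooled before taking the minimum, a bit that is mispredicted in one view can be outvoted by correct bits in the other views, which is the error-correcting property that motivates using ECOC for multi-view aggregation.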
