Name: DGCMF-MSN
Creator: Min Jin
License: https://creativecommons.org/licenses/by/4.0/
Keywords: Artificial Intelligence

Abstract

This is the dataset for DGCMF-MSN. It covers 1,020 drug entities, 5,598 standardized side effects, and 133,750 validated positive drug-side effect association samples. It also includes feature-engineered data derived from the three-modal data of these 1,020 drugs. Drug-se_matrix.txt is the drug-side effect association matrix. Drugs.smiles contains the feature engineering results derived from SMILES representations. Drugs.fpt contains the feature engineering results for molecular fingerprints. The files mpnn_toxcast.npy, nf_toxcast.npy, weave_toxcast.npy, and afp_toxcast.npy hold graph embeddings of the molecular structures generated by the MPNN, NF, Weave, and AFP models, respectively. Drugs1020.json contains the identifiers of the 1,020 drugs.
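As a quick sanity check on the counts reported in the abstract, the share of positive pairs among all possible drug-side effect pairs can be computed directly. The sketch below is only illustrative: the helper name association_density is hypothetical, and it assumes the association matrix is binary, with nonzero entries marking positive associations.

```python
import numpy as np

# Counts as reported in the abstract.
N_DRUGS = 1020
N_SIDE_EFFECTS = 5598
N_POSITIVE = 133750


def association_density(matrix):
    """Return the fraction of nonzero entries in an association matrix.

    Assumes a binary matrix in which nonzero means a positive association.
    """
    matrix = np.asarray(matrix)
    return np.count_nonzero(matrix) / matrix.size


# Density implied by the reported counts alone (no files needed).
reported_density = N_POSITIVE / (N_DRUGS * N_SIDE_EFFECTS)
print(f"{reported_density:.4f}")
```

By these counts, roughly 2.3% of all possible pairs are positive, so the matrix is sparse. Applying association_density to the loaded matrix should give a comparable value if the file matches the reported statistics.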

Instructions:

Comments

The load_data function in test.py shows how to use the dataset.
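Because test.py is not reproduced on this page, the following is a minimal sketch of how the described files might be read, not the author's load_data. The function name load_dgcmf_msn_files is hypothetical. It assumes all files sit in one directory, that Drug-se_matrix.txt is a whitespace-delimited numeric matrix readable by numpy.loadtxt, and that Drugs1020.json is ordinary JSON. Drugs.smiles and Drugs.fpt are left out because their layout is not described here.

```python
import json
from pathlib import Path

import numpy as np

# Prefixes of the four graph-embedding files named in the abstract.
EMBEDDING_MODELS = ("mpnn", "nf", "weave", "afp")


def load_dgcmf_msn_files(data_dir):
    """Load the association matrix, graph embeddings, and drug identifiers.

    Hypothetical helper: file formats are assumptions, not confirmed
    by the dataset page.
    """
    data_dir = Path(data_dir)

    # Assumption: plain-text numeric matrix, whitespace-delimited.
    associations = np.loadtxt(data_dir / "Drug-se_matrix.txt")

    # One embedding array per model, keyed by model prefix.
    embeddings = {
        name: np.load(data_dir / f"{name}_toxcast.npy")
        for name in EMBEDDING_MODELS
    }

    # Assumption: standard JSON holding the drug identifiers.
    with open(data_dir / "Drugs1020.json", encoding="utf-8") as fh:
        drug_ids = json.load(fh)

    return associations, embeddings, drug_ids
```

After loading, it is worth comparing the array shapes against the reported 1,020 drugs and 5,598 side effects before any training, since the orientation of the matrix is not stated on this page.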

Submitted by Min Jin on Thu, 03/27/2025 - 14:04

Dataset Files

Files have not been uploaded for this dataset
