Standard Dataset
DDA-GTN: large-scale drug repurposing on drug-gene-disease heterogenous association networks using graph transformers
- Submitted by: Pu-Feng Du
- Last updated: Mon, 07/08/2024 - 15:58
- DOI: 10.21227/zt4g-d266
Abstract
Drug development is extremely expensive and time-consuming. Computational drug repurposing can help assign new indications to approved drugs, which can reduce the cost of drug development. Machine learning models have long been used for drug repurposing. Recent studies formulate computational drug repurposing as a latent link prediction task on a heterogeneous network, and a number of methods based on graph neural networks have been developed. We propose to use graph transformer networks to learn features of diseases, genes and drugs in a drug-gene-disease heterogeneous network. Building on graph transformer networks, we developed the DDA-GTN method, which predicts latent associations between drugs and diseases on a drug-gene-disease heterogeneous network containing thousands of drugs and thousands of diseases. To the best of our knowledge, the benchmarking dataset in this work is the largest drug-disease association dataset to date. DDA-GTN is the only method that can work with a network of this size. Comparisons indicate that DDA-GTN performs comparably to or better than all state-of-the-art methods. DDA-GTN achieves acceptable computational efficiency on both small dense networks and large sparse networks, with very low memory consumption. We expect DDA-GTN to be useful for drug repurposing in practice. All datasets and code for reproducing the results in our work have been deposited in a public GitHub repository (https://github.com/Siriacc/DDA-GTN).
The dataset is a CSV file.
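As a rough illustration of the link prediction framing described in the abstract, the sketch below reads drug-disease pairs from a CSV and enumerates the unobserved pairs that a model would need to score. It is not the DDA-GTN pipeline. The column names `drug_id` and `disease_id`, the sample identifiers, and the helper functions are all hypothetical; the actual file layout should be checked against the deposited data before adapting this.

```python
import csv
import io

# Hypothetical sample data. The real CSV's column names and
# identifiers are not documented here and may differ.
SAMPLE_CSV = """drug_id,disease_id
D1,S1
D1,S2
D2,S2
D3,S3
"""


def load_associations(handle):
    """Read (drug, disease) pairs from a CSV that has a header row."""
    reader = csv.DictReader(handle)
    return {(row["drug_id"], row["disease_id"]) for row in reader}


def candidate_pairs(known):
    """Return every drug-disease pair with no recorded association."""
    drugs = sorted({drug for drug, _ in known})
    diseases = sorted({disease for _, disease in known})
    return [(d, s) for d in drugs for s in diseases if (d, s) not in known]


known = load_associations(io.StringIO(SAMPLE_CSV))
candidates = candidate_pairs(known)
print(len(known), "known associations;", len(candidates), "unobserved pairs")
```

To use a file on disk, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="")`. Enumerating all pairs is only practical for small inputs; with thousands of drugs and diseases, unobserved pairs would normally be sampled rather than listed in full.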