Abstract 

This dataset integrates three publicly available sources of drug-target interaction data: the Human dataset, the Biosnap dataset, and the DrugBank dataset. It covers a diverse set of human proteins identified as potential drug targets, together with a variety of corresponding drug molecules. Each drug-target pair carries an interaction label indicating whether the drug interacts with the protein target. By merging data from these established biological databases, the dataset provides a foundation for developing predictive models and machine learning techniques for drug discovery and repurposing.

Instructions: 

To use the Human dataset, load both human.txt and humanSeqPdb.txt with Python's pandas library. human.txt provides the protein sequence, the SMILES string of the drug, and the label of each interaction pair. humanSeqPdb.txt provides the identifier of the Protein Data Bank (PDB) structure associated with each protein, which can be useful for structural bioinformatics studies.
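As a rough illustration of the loading step, the sketch below reads an interaction file with pandas. The file layout is not documented here, so the code assumes a whitespace-delimited text file with no header row and three fields per line; the column order in ASSUMED_COLUMNS and the helper names load_interactions and label_counts are hypothetical and should be checked against the first few lines of the actual file.

```python
import pandas as pd

# ASSUMPTION: three whitespace-separated fields per line, no header row.
# The order below is a guess; inspect the file and reorder if needed.
ASSUMED_COLUMNS = ["smiles", "sequence", "label"]


def load_interactions(path, columns=ASSUMED_COLUMNS):
    """Read a header-less, whitespace-delimited interaction file into a DataFrame."""
    return pd.read_csv(path, sep=r"\s+", header=None, names=list(columns))


def label_counts(df):
    """Return how many pairs carry each interaction label."""
    return df["label"].value_counts().to_dict()


if __name__ == "__main__":
    # "human.txt" is the file name given in the instructions above.
    pairs = load_interactions("human.txt")
    print(pairs.head())
    print(label_counts(pairs))
```

If the fields turn out to be tab- or comma-separated, or the file has a header row, the sep and header arguments would need to change accordingly. humanSeqPdb.txt is not covered by this sketch because its layout is not described.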

The usage of the Biosnap and DrugBank datasets is similar.

 

Comments

This dataset combines the Human, Biosnap, and DrugBank datasets for drug-target interaction analysis. Please note that the included datasets are publicly available and have been previously published. The data is provided under an open-access license and can be used for research purposes in drug discovery. For detailed usage instructions, refer to the documentation. If you use this dataset in your research, please cite the corresponding papers.

Submitted by tian wen on Mon, 12/02/2024 - 23:40

Dataset Files

    Files have not been uploaded for this dataset