Standard Dataset
Predicting drug-likeness and molecular activity
- Submitted by: Andres Frederic
- Last updated: Fri, 10/13/2023 - 07:46
- DOI: 10.21227/3zzp-hj56
Abstract
During our research on generating and optimizing molecules as drug candidates by extending deep reinforcement learning and graph neural network algorithms, we used the GEOM data [1]. This gave us the idea of building a dataset from GEOM molecules to predict activity against COVID and drug-likeness. We calculated over 200 descriptors for the molecules using RDKit [2]. We hope you enjoy using it.
References:
[1] Axelrod, S., & Gomez-Bombarelli, R. (2021). GEOM (Version V4) [Computer software]. Harvard Dataverse. https://doi.org/10.7910/DVN/JNGTDF
[2] Greg Landrum, Paolo Tosco, Brian Kelley, sriniker, gedeck, NadineSchneider, Riccardo Vianello, Ric, Andrew Dalke, Brian Cole, AlexanderSavelyev, Matt Swain, Samo Turk, Dan N, Alain Vaucher, Eisuke Kawashima, Maciej Wójcikowski, Daniel Probst, guillaume godin, … DoliathGavid. (2020). rdkit/rdkit: 2020_03_1 (Q1 2020) Release (Release_2020_03_1). Zenodo. https://doi.org/10.5281/zenodo.3732262
You can start by importing the dataset, and then transform the SMILES string of each molecule into an RDKit object or a molecular graph.
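The conversion described above might look like the following minimal sketch. It assumes RDKit is installed and uses a small hand-written list of SMILES strings (ethanol, aspirin, and one deliberately invalid entry) rather than the actual dataset, whose file layout is not described on this page. The helper names `smiles_to_mol`, `mol_to_graph`, and `basic_descriptors` are hypothetical, and the six values computed here are only an illustration, not the 200+ descriptors shipped with the dataset.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, QED


def smiles_to_mol(smiles):
    """Parse a SMILES string into an RDKit Mol; returns None if parsing fails."""
    return Chem.MolFromSmiles(smiles)


def mol_to_graph(mol):
    """Return (nodes, edges) for a Mol.

    nodes: list of atom symbols, indexed by RDKit atom index.
    edges: list of (begin_idx, end_idx, bond_order) tuples, one per bond.
    """
    nodes = [atom.GetSymbol() for atom in mol.GetAtoms()]
    edges = [
        (bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), bond.GetBondTypeAsDouble())
        for bond in mol.GetBonds()
    ]
    return nodes, edges


def basic_descriptors(mol):
    """Compute a handful of common RDKit descriptors (illustrative subset)."""
    return {
        "MolWt": Descriptors.MolWt(mol),
        "MolLogP": Descriptors.MolLogP(mol),
        "NumHDonors": Descriptors.NumHDonors(mol),
        "NumHAcceptors": Descriptors.NumHAcceptors(mol),
        "TPSA": Descriptors.TPSA(mol),
        "QED": QED.qed(mol),  # quantitative estimate of drug-likeness, in [0, 1]
    }


# Hypothetical input, NOT taken from the dataset: ethanol, aspirin, invalid string
example_smiles = ["CCO", "CC(=O)Oc1ccccc1C(=O)O", "not-a-smiles"]

for smi in example_smiles:
    mol = smiles_to_mol(smi)
    if mol is None:
        # RDKit signals unparsable SMILES by returning None
        print(f"{smi}: could not be parsed, skipping")
        continue
    nodes, edges = mol_to_graph(mol)
    desc = basic_descriptors(mol)
    print(f"{smi}: {len(nodes)} atoms, {len(edges)} bonds, QED={desc['QED']:.2f}")
```

The `(nodes, edges)` pair is a plain-Python representation; for graph neural network libraries it would typically be converted into that library's own tensor format.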
Documentation
Attachment | Size |
---|---|
dataset.txt | 135 bytes |
Comments
Hope you enjoy playing with the data!