Real name: 
First Name: 
Andres
Last Name: 
Frederic
Affiliation: 
National Institute of Informatics
Job Title: 
Associate Professor

Datasets & Competitions

During our research on generating or optimizing molecules as drug candidates by extending deep reinforcement learning and graph neural network algorithms, we used the GEOM data [1]. This gave us the idea to build a dataset from GEOM molecules for predicting activity against COVID and drug-likeness. We calculated over 200 descriptors for the molecules using RDKit [2]. We hope you enjoy using it.

 

References:

Categories:
298 Views

The dataset aims to facilitate research in the optimization of the carbon footprint of recipes. Consisting of 30 Excel files processed through various Python scripts and Jupyter notebooks, the dataset serves as a versatile resource for both performance analysis and environmental impact assessment. The unique attribute of this dataset lies in its ability to calculate representative values of carbon footprint optimization through multiple algorithmic implementations.

Categories:
169 Views

This Named Entities dataset was built by applying the widely used Large Language Model (LLM) BERT to the CORD-19 biomedical literature corpus. Fine-tuning the pre-trained BERT on the CORD-NER dataset gives the model the ability to comprehend the context and semantics of biomedical named entities. The fine-tuned model is then applied to CORD-19 to extract more contextually relevant and up-to-date named entities. However, fine-tuning LLMs on large datasets poses a challenge. To counter this, two distinct sampling methodologies are utilized.

Categories:
170 Views

In recent years, teaching-learning methods have moved well beyond the traditional approach. In-person lectures have been converted into online virtual learning, and traditional record-keeping has been replaced by robust learning management systems, which have made the teaching-learning process a lot more efficient and convenient.

Categories:
156 Views

The focus of this dataset is to provide an open-loop solution for a stochastic problem with imperfect state information and chance-constraints adjusted by an optimal gain.

Categories:
72 Views

Dataset for an open-loop solution for a stochastic problem with imperfect state information and chance-constraints adjusted by an optimal gain.

Categories:
29 Views

The 5K EPP Dataset includes 5007 photos of water crystals classified into 13 categories. This dataset was created under the leadership of Prof. Masaru Emoto.

Categories:
254 Views