Machine Learning

This data set contains: 

- Training dataset: 271 CT scans of inner ears used for optimization and training of the model.

- Validation dataset: 70 CT scans of inner ears used for external validation.


Data on 2355 COVID-19 cases recorded from July to December 2021 were extracted from a data set maintained by COVID-19 referral centers in Qazvin province, Iran. We recorded a wide range of clinical characteristics, including age, sex, previous diseases, and hospitalization time. We also collected data on the medications patients received, including Atorvastatin 20 mg, Atorvastatin 40 mg, Ivermectin 3 mg, Ivermectin 40 mg, Dexamethasone, Kaletra, Favipiravir, Famotidine 40 mg, Interferon, Remdesivir, and Hydroxychloroquine.


While researching how to generate and optimize drug-candidate molecules by extending deep reinforcement learning and graph neural network algorithms, we used the GEOM data [1]. This gave us the idea of building a dataset from GEOM molecules to predict activity against COVID and drug-likeness. We calculated over 200 descriptors for the molecules using RDKit [2]. We hope you enjoy using it.
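As a rough illustration of how such descriptors can be computed, the sketch below evaluates every descriptor in RDKit's built-in `Descriptors.descList` for one molecule. It is not the authors' pipeline: the SMILES input format and the helper name `compute_descriptors` are assumptions made for this example, and the exact set of descriptors depends on the installed RDKit version.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors


def compute_descriptors(smiles):
    """Return {descriptor_name: value} for one molecule, or None if parsing fails.

    Hypothetical helper for illustration; the dataset's own descriptor
    selection and preprocessing are not specified in the description.
    """
    mol = Chem.MolFromSmiles(smiles)  # returns None for an invalid SMILES string
    if mol is None:
        return None
    # Descriptors.descList is a list of (name, function) pairs shipped with RDKit.
    return {name: fn(mol) for name, fn in Descriptors.descList}


if __name__ == "__main__":
    ethanol = compute_descriptors("CCO")
    print(len(ethanol), "descriptors computed")
    print("MolWt:", round(ethanol["MolWt"], 2))
```

Applying the helper to each molecule and stacking the resulting dictionaries gives a table with one row per molecule and one column per descriptor, which is the general shape of a descriptor dataset like the one described.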




Forecasting production from wind and solar power plants, and making effective decisions under forecast uncertainty, are essential capabilities in low-carbon energy systems. This competition invites participants to develop state-of-the-art forecasting and energy trading techniques to accelerate the global transition to net-zero and to win a share of $21,000 in prize money. It aims to bridge the gap between academic and industry practice, introduce energy forecasting challenges to new communities, and promote energy analytics and data science education.

Last Updated On: Thu, 02/08/2024 - 11:50
Citation Author(s): Jethro Browell, Sebastian Haglund, Henrik Kälvegren, Edoardo Simioni, Ricardo Bessa, Yi Wang

The dataset contains walking data from 41 volunteers (20 women and 21 men). Each volunteer walked on asphalt and slate roads six times, with each walk lasting less than one minute. The dataset also includes gait data collected on stairs. Data were acquired at 100 Hz using four sensors: sensor 001 on the left knee, sensor 002 on the right wrist, sensor 003 on the left ankle, and sensor 004 on the back of the waist.


This dataset comprises gunshot audio and supporting data released as part of ShotSpotter Tech Note 098, "Precision and accuracy of acoustic gunshot location in an urban environment".

The data derive from a series of live fire tests of the ShotSpotter Respond gunshot location system conducted in Pittsburgh, PA on December 18th, 2018 by the Pittsburgh Bureau of Police. ShotSpotter uses live fire tests to validate that the deployed sensor density is appropriate for the community in question, and to ensure the system is ready for production use.


The pathology files of 194 colon cancer patients, 137 breast cancer patients, 124 gastric cancer patients, and 169 thyroid cancer patients who were referred to the healthcare facilities of Qazvin Province, Iran, were examined for age, sex, surgery type, and pathological information. The information covers the period from 2010 to 2020.


JVNV is a Japanese emotional speech corpus with verbal content and nonverbal vocalizations whose scripts are generated by a large-scale language model.

Existing emotional speech corpora lack not only proper emotional scripts but also nonverbal vocalizations (NVs), which are essential for expressing emotion in spoken language.

We propose an automatic script generation method to produce emotional scripts by providing seed words with sentiment polarity and phrases of nonverbal vocalizations to ChatGPT using prompt engineering.


The Social Media Big Dataset for Research, Analytics, Prediction, and Understanding the Global Climate Change Trends focuses on climate science, climate trends, and public awareness of climate change. It is intended to support analytics and research on global climate change trends.