Machine Learning | IEEE DataPort

EEG data for ADHD / Control children

Participants were 61 children with ADHD and 60 healthy controls (boys and girls, ages 7-12). The children with ADHD were diagnosed by an experienced psychiatrist according to DSM-IV criteria and had taken Ritalin for up to 6 months. None of the children in the control group had a history of psychiatric disorders, epilepsy, or any report of high-risk behaviors.

COVID-19 tweets dataset for Bengali language

This dataset contains Bengali tweets related to COVID-19. It comprises 36117 unique tweet IDs collected from December 2019 to May 2020. The keywords used to crawl the tweets are 'corona', 'covid', 'sarscov2', 'covid19', and 'coronavirus'. To obtain the other 33 fields of data, send an email to "avishekgarain@gmail.com". A code snippet is given in the Documentation file.
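As a rough illustration of how the crawl keywords listed above could be checked against tweet text once the tweets have been retrieved, here is a minimal Python sketch. The function name `matched_keywords` and the case-insensitive substring matching are assumptions made for illustration; the dataset description does not say how the crawler matched keywords, and the snippet in the Documentation file may differ.

```python
# Keywords quoted in the dataset description.
KEYWORDS = ("corona", "covid", "sarscov2", "covid19", "coronavirus")


def matched_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in `text` (hypothetical helper).

    Matching is a case-insensitive substring test, which is an
    assumption; the original crawler may have matched differently.
    """
    lowered = text.lower()
    return [kw for kw in keywords if kw in lowered]


if __name__ == "__main__":
    print(matched_keywords("COVID19 update"))
```

With substring matching, the shorter keywords also hit the longer ones ('covid' matches any text containing 'covid19'), so a single tweet can match several keywords at once.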

COVID-19 tweets dataset for Spanish language

This dataset contains Spanish tweets related to COVID-19. It comprises 18958 unique tweet IDs collected from December 2019 to May 2020. The keywords used to crawl the tweets are 'corona', 'covid', 'sarscov2', 'covid19', and 'coronavirus'. To obtain the other 33 fields of data, send an email to "avishekgarain@gmail.com". A code snippet is given in the Documentation file.

Speech Dataset in Hindi Language

The dataset covers 100 speakers, each with 5 voice samples for training and 1 voice sample for testing, for a total of 600 voice samples. The samples were collected in different audio formats such as mpeg, mp4, mp3, and ogg, and were then preprocessed and converted into .wav format. Each voice sample lasts 5-10 seconds; because the lengths differ, parameters should be tuned before use. The whole dataset is 600 MB in size and 1 hour 40 minutes in duration. It can be used for speech synthesis and speaker identification.
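Because the clips vary in length, one common preprocessing step is to pad or trim every clip to a fixed number of samples. The sketch below shows that idea in plain Python, together with a quick check of the stated totals. The helper names `pad_or_trim` and `total_duration_bounds` are hypothetical, and zero-padding is only one possible choice; the dataset description does not prescribe a method.

```python
def pad_or_trim(samples, target_len, pad_value=0):
    """Return a new list with exactly `target_len` items.

    Longer inputs are cut at the end; shorter inputs are padded
    at the end with `pad_value` (zero-padding is an assumption).
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    if len(samples) >= target_len:
        return list(samples[:target_len])
    return list(samples) + [pad_value] * (target_len - len(samples))


def total_duration_bounds(n_clips, min_seconds, max_seconds):
    """Lower and upper bound on the total duration, in minutes."""
    return n_clips * min_seconds / 60, n_clips * max_seconds / 60


if __name__ == "__main__":
    # 600 clips of 5-10 seconds each give between 50 and 100 minutes.
    print(total_duration_bounds(600, 5, 10))
```

The stated total of 1 hour 40 minutes (100 minutes) equals the upper bound for 600 clips of 5-10 seconds each.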

Intel Open Wi-Fi RTT Dataset

Dataset used for the paper "A Machine Learning Approach for Wi-Fi RTT Ranging" (ION ITM 2019). The dataset includes almost 30,000 raw Wi-Fi RTT (FTM) channel measurements from real-life clients and access points in an office environment. This data can be used for Time of Arrival (ToA), ranging, positioning, navigation, and other types of research in Wi-Fi indoor location. The zip file includes a README file, a CSV file with the dataset, and several Matlab functions that help the user plot the data and demonstrate how to estimate the range.
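For readers unfamiliar with RTT ranging, the sketch below shows the generic timestamp-based range calculation for one FTM exchange: the round-trip time is the total elapsed time minus the peer's turnaround time, and the range is half the round trip multiplied by the speed of light. This is not the paper's machine learning method and not the Matlab code shipped in the zip file. The function name `ftm_range_m` and the use of timestamps in seconds are assumptions for illustration, and the raw channel measurements in the dataset would first need a ToA estimate before a calculation like this applies.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def ftm_range_m(t1, t2, t3, t4):
    """Estimate range in metres from one FTM exchange (hypothetical helper).

    t1: time the measurement frame leaves the sender
    t2: time it arrives at the peer
    t3: time the peer sends its acknowledgement
    t4: time the acknowledgement arrives back at the sender
    All timestamps are in seconds. t1 and t4 share one clock,
    t2 and t3 share the other, so clock offsets cancel out.
    """
    round_trip = (t4 - t1) - (t3 - t2)
    return SPEED_OF_LIGHT_M_PER_S * round_trip / 2.0


if __name__ == "__main__":
    # 200 ns of round-trip flight time corresponds to roughly 30 m.
    print(ftm_range_m(0.0, 100e-9, 200e-9, 300e-9))
```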

A novel fusion Python application of data mining techniques to evaluate airborne magnetic datasets

Depth to the various subsurface anomalies has been the primary interest in all applications of magnetic methods of geophysical prospecting. For any correct structural interpretation of subsurface geology, the depths to the geologic features of interest are more valuable than any other property.

Wearable- and Catheter-Based Cardiovascular Signals during Progressive Exsanguination in a Porcine Model of Hemorrhage

This dataset contains cardiovascular data recorded during progressive exsanguination in a porcine model of hemorrhage. Both wearable and catheter-based sensors were used to capture cardiovascular function; the wearable system contained a fusion of ECG, SCG, and PPG sensors while the catheter-based system was comprised of pressure catheters in the aortic arch, femoral artery, and right and left atria via a Swan-Ganz catheter.

MIT DriveSeg (Semi-auto) Dataset

Solving the external perception problem for autonomous vehicles and driver-assistance systems requires accurate and robust driving scene perception in both regularly-occurring driving scenarios (termed “common cases”) and rare outlier driving scenarios (termed “edge cases”). In order to develop and evaluate driving scene perception models at scale, and more importantly, covering potential edge cases from the real world, we take advantage of the MIT-AVT Clustered Driving Scene Dataset and build a subset for the semantic scene segmentation task.

Anomaly detection dataset

Please refer to each dataset's website for further information.
