Skip to main content

Machine Learning

This dataset is very vast and contains tweets related to COVID-19. There are 226668 unique tweet-ids in the whole dataset that ranges from December 2019 till May 2020 . The keywords that have been used to crawl the tweets are 'corona',  ,  'covid ' , 'sarscov2 ',  'covid19', 'coronavirus '.  For getting the other 33 fields of data drop a mail at "avishekgarain@gmail.com". Twitter doesn't allow public sharing of other details related to tweet data( texts,etc.) so can't upload here.

Categories:

This dataset is very vast and contains Bengali tweets related to COVID-19. There are 36117 unique tweet-ids in the whole dataset that ranges from December 2019 till May 2020 . The keywords that have been used to crawl the tweets are 'corona',  ,  'covid ' , 'sarscov2 ',  'covid19', 'coronavirus '.  For getting the other 33 fields of data drop a mail at "avishekgarain@gmail.com". Code snippet is given in Documentation file.

Categories:

This dataset is very vast and contains Spanish tweets related to COVID-19. There are 18958 unique tweet-ids in the whole dataset that ranges from December 2019 till May 2020 . The keywords that have been used to crawl the tweets are 'corona',  ,  'covid ' , 'sarscov2 ',  'covid19', 'coronavirus '.  For getting the other 33 fields of data drop a mail at "avishekgarain@gmail.com". Code snippet is given in Documentation file.

Categories:

100 Speakers each consisting of 5 voice samples for training data and 1 voice sample for testing data. Total of 600 voice samples collected in different audio formats like mpeg, mp4, mp3, ogg etc. These samples were than preprocessed and converted into .wav format. Each voice sample has a time duration of 5-10 seconds due to different lengths tuning of parameters should be done before usage. Whole Dataset size is 600mb and duration is 1 hour 40 minutes. This dataset can be used for speech synthesis, speaker identification.

Categories:

100 Speakers each consisting of 5 voice samples for training data and 1 voice sample for testing data. Total of 600 voice samples collected in different audio formats like mpeg, mp4, mp3, ogg etc. These samples were than preprocessed and converted into .wav format. Each voice sample has a time duration of 5-10 seconds due to different lengths tuning of parameters should be done before usage. Whole Dataset size is 600mb and duration is 1 hour 40 minutes. This dataset can be used for speech synthesis, speaker identification.

Categories:
    

Dataset used for "A Machine Learning Approach for Wi-Fi RTT Ranging" paper (ION ITM 2019). The dataset includes almost 30,000 Wi-Fi RTT (FTM) raw channel measurements from real-life client and access points, from an office environment. This data can be used for Time of Arrival (ToA), ranging, positioning, navigation and other types of research in Wi-Fi indoor location. The zip file includes a README file, a CSV file with the dataset and several Matlab functions to help the user plot the data and demonstrate how to estimate the range.

Categories:

This dataset contains cardiovascular data recorded during progressive exsanguination in a porcine model of hemorrhage. Both wearable and catheter-based sensors were used to capture cardiovascular function; the wearable system contained a fusion of ECG, SCG, and PPG sensors while the catheter-based system was comprised of pressure catheters in the aortic arch, femoral artery, and right and left atria via a Swan-Ganz catheter.

Categories: