Machine Learning

This multilingual Twitter dataset spans over 2 years from October 2019 to the end of 2021, including 3 months before the outbreak of the COVID-19 pandemic.
- Categories:

The multi-label event classification label of each sample document is encoded with nine bits. The first bit signifies whether an event is present (1) or absent (0). The remaining eight bits signify the presence (1) or absence (0) of (i) covid, (ii) flood, (iii) storm, (iv) heavy rain, (v) cloudburst, (vi) landslide, (vii) earthquake, and (viii) tsunami. The location and impact sentence classification labels are encoded similarly.
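The nine-bit scheme described above can be illustrated with a short decoding sketch. It assumes that each label is stored as a string of nine `0`/`1` characters and that the eight event bits follow the order in which the event types are listed; neither assumption is confirmed by the dataset description, and the names `EVENT_TYPES` and `decode_event_label` are hypothetical.

```python
# Assumed bit order: taken from the order in which the description lists the event types.
EVENT_TYPES = (
    "covid", "flood", "storm", "heavy rain",
    "cloudburst", "landslide", "earthquake", "tsunami",
)


def decode_event_label(bits: str) -> dict:
    """Decode a nine-character 0/1 string into an event flag and event types.

    Hypothetical helper: the actual storage format of the labels may differ.
    """
    if len(bits) != 9 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly nine 0/1 characters")
    return {
        # First bit: is any event present at all?
        "event_present": bits[0] == "1",
        # Remaining eight bits: one flag per event type.
        "event_types": [
            name for name, bit in zip(EVENT_TYPES, bits[1:]) if bit == "1"
        ],
    }
```

For example, under these assumptions `decode_event_label("101000000")` would report an event of type flood, while `"000000000"` would report no event.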
- Categories:
A synthetic dataset is generated for low-power (P ≤ 10 mW) InGaAsP MQW-DFB lasers operating at a wavelength (λ) ranging from 1.53 to 1.57 µm, at a case temperature between -40 ℃ and 85 ℃, with a side mode suppression ratio of more than 35 dB. The data can be used for laser lifetime prediction with machine-learning-based approaches.
- Categories:
The dataset includes processed sequences of optical time domain reflectometry (OTDR) traces incorporating different types of fiber faults, namely fiber cut, fiber eavesdropping (fiber tapping), dirty connector, and bad splice. The dataset can be used for developing ML-based approaches for optical fiber fault detection, localization, identification, and characterization.
- Categories:

The disaster-news headline generation dataset (news_articles_and _titles) contains a set of disaster-news articles and their headlines/titles. This dataset may be used to develop a method for generating good-quality headlines for disaster-news articles.
- Categories:
Ground reaction forces (GRFs) and center of pressure trajectories (CoPs) are required for a comprehensive biomechanical analysis. They are also important outcome measures in sports science and clinical settings. GRFs and CoPs are usually measured with force plates, which are rarely installed on staircases in laboratories. We present a one-dimensional convolutional neural network for estimating GRFs and CoPs during stair ascent and descent using multi-level kinematics as input.
- Categories:

The data collection includes posts from social media networks popular among Russian-speaking people. The posts were gathered using pre-defined keywords ("war," "special military operation," and so on) and are mainly relevant to Ukraine's continuing conflict with Russia. A thorough assessment and analysis of the data detected propaganda and false news. The information gathered has been anonymized. Feature engineering and text preprocessing can extract new insights and information from this data source.
- Categories:
This data set contains cardiopulmonary signals that were recorded simultaneously. The signals are separated into two folders, one titled heart sounds and the other lung sounds. In addition, two MATLAB programs are included: one for recording the signals and another for plotting them in the time and frequency domains. A PDF file that details the nomenclature of the signals is also included.
This data set can be useful for various signal processing algorithms: filtering, PCA, LDA, ICA, CNN, etc.
- Categories:

This data set consists of three-phase current signals for faults and other transient cases on transmission lines connected to DFIG-based wind farms. The faults and other transients were simulated with PSCAD/EMTDC software.
- Categories: