Machine Learning
This dataset contains performance values, namely Area Under the ROC Curve (AUC) and Average Precision (AP), of popular anomaly detection (AD) algorithms evaluated on a set of 9k AD benchmark datasets.
The benchmark datasets were originally published with the following paper:
Kandanaarachchi, S., Muñoz, M. A., Hyndman, R. J., & Smith-Miles, K. (2020). On normalization and algorithm selection for unsupervised outlier detection. Data Mining and Knowledge Discovery, 34(2), 309-354.
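For readers unfamiliar with the two metrics, the following is a minimal sketch of how AUC and AP can be computed from anomaly scores and ground-truth labels. It is not the evaluation code used to build the dataset; the function names `roc_auc` and `average_precision` and the toy labels and scores are illustrative assumptions, and the AP function assumes no tied scores.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random anomaly (label 1) receives a
    higher score than a random normal point (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal point")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AP as the mean of the precision values measured at the rank of each
    true anomaly, after sorting by descending score (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one anomaly")
    return total / hits


# Toy example (made-up values, not taken from the dataset).
toy_labels = [0, 0, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

Both metrics depend only on the ranking of the scores, which is why they are commonly used to compare unsupervised detectors whose raw score scales differ.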
In this paper, a novel time-constrained global and local nonlinear analytic stationary subspace analysis (Tc-GLNASSA) is proposed to enhance blast furnace ironmaking process (BFIP) monitoring. The existing analytic stationary subspace analysis method can derive process-consistent relationships. However, the presence of complex nonlinear, periodic nonstationary, and time-varying smelting conditions makes a satisfactory estimation of stationary projections unattainable.
This dataset contains one month of binary activity data for 4060 urban IoT nodes. Each record contains the node ID, the time stamp, the location of the IoT node in latitude and longitude, and the binary activity of the IoT node. The dataset is mainly intended for use in distributed denial of service (DDoS) attack research.
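The record layout described above can be sketched as a small data structure. The file format is not stated here, so this sketch assumes CSV; the column names (`node_id`, `timestamp`, `lat`, `lon`, `active`), the class name `NodeRecord`, and the sample rows are all hypothetical placeholders, not values from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class NodeRecord:
    node_id: int
    timestamp: datetime
    latitude: float
    longitude: float
    active: bool  # binary activity: True if the node was active


def parse_records(text: str) -> List[NodeRecord]:
    """Parse CSV text with the assumed (hypothetical) column names."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        NodeRecord(
            node_id=int(row["node_id"]),
            timestamp=datetime.fromisoformat(row["timestamp"]),
            latitude=float(row["lat"]),
            longitude=float(row["lon"]),
            active=row["active"] == "1",
        )
        for row in reader
    ]


# Made-up sample rows, for illustration only.
SAMPLE = """node_id,timestamp,lat,lon,active
1,2021-01-01T00:00:00,40.0,-75.0,1
2,2021-01-01T00:00:00,40.1,-75.1,0
"""

records = parse_records(SAMPLE)
active_fraction = sum(r.active for r in records) / len(records)
```

A per-time-stamp activity fraction such as `active_fraction` is one simple feature that could be tracked over time, though which features suit DDoS research is left open here.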
Accurate detection and segmentation of apple trees are crucial in high-throughput phenotyping, which in turn guides apple tree yield and quality management. A LiDAR and a camera were attached to a UAV to acquire RGB information and coordinate information for a whole orchard. The information was integrated by a simultaneous localization and mapping network to form a dataset of RGB-colored point clouds. The dataset can be used for methods related to apple detection and segmentation based on point clouds.
Data preprocessing is a fundamental stage in deep learning modeling and serves as the cornerstone of reliable data analytics. Deep learning models require significant amounts of training data to be effective; small datasets often result in overfitting and poor performance on large datasets. One solution to this problem is parallelization in data modeling, which allows the model to fit the training data more effectively, leading to higher accuracy on large datasets and higher performance overall.
eLearning, or online learning, has reached every corner of the globe in this era of digitization. As a result of the COVID-19 pandemic, the value of eLearning has increased substantially. In eLearning recommendation systems, information overload, personalised suggestion, sparsity, and accuracy are all major problems. A suitable eLearning recommendation system is needed to tailor course recommendations to the user's needs. To create this model, a dataset of User Profile and User Rating data is needed.
Each application has up to two files: one for the memory dataset and another for the control flow dataset. Each dataset is composed of JSON objects, and each instruction is a JSON object.
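Reading such a file might look like the sketch below. It assumes one JSON object per line (JSON Lines), which the description does not confirm; the helper name `iter_instructions` and the keys `pc` and `op` in the sample are hypothetical and serve only as an illustration.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_instructions(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one instruction (a JSON object) per non-empty line.

    Assumes a JSON Lines layout; a file that stores a single JSON array
    would need json.load() on the whole file instead.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obj = json.loads(line)
        if not isinstance(obj, dict):
            raise ValueError("expected a JSON object per instruction")
        yield obj


# Made-up sample lines with hypothetical keys, for illustration only.
sample_lines = [
    '{"pc": "0x400", "op": "load"}',
    "",
    '{"pc": "0x404", "op": "branch"}',
]
instructions = list(iter_instructions(sample_lines))
```

With a real file, the same generator can be fed an open file handle, for example `iter_instructions(open(path, encoding="utf-8"))`, so that large traces are streamed rather than loaded at once.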