Machine Learning

We used DigitalOcean's cloud service to set up three Linux virtual machines, each with 1 vCPU, 1 GB of memory, and a 10 GB disk. The architecture included an API gateway that routes requests to a stateless application service backed by a database for storing application data. The application service runs under a fluctuating workload generated by a load-testing script to simulate real-world usage scenarios. The application service, which is the monitoring target, is integrated with Prometheus, a monitoring tool for gathering system metrics.


This article presents a dataset collected from a real process control network (PCN) to facilitate deep-learning-based anomaly detection and analysis in industrial settings. The dataset aims to provide a realistic environment for researchers to develop, test, and benchmark anomaly detection models without the risk associated with experimenting on live systems. It reflects raw process data from a gas processing plant, offering coverage of critical parameters vital for system performance, safety, and process optimization.


This dataset is derived from Sentinel-2 satellite imagery.
The main goal is to use this dataset to train models that classify images into two classes: with trees and without trees.
The dataset consists of two folders: "tree" (images containing trees) and "no-trees" (images without trees).
Each folder contains 5200 images of its class.
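
The two-folder layout described above can be indexed into (path, label) pairs before training. The following is a minimal sketch, assuming the folder names given in the description; the numeric label encoding and the helper names `index_dataset` and `class_counts` are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Folder names come from the dataset description.
# The numeric label encoding (1 = trees, 0 = no trees) is an assumption.
CLASS_LABELS = {"tree": 1, "no-trees": 0}


def index_dataset(root):
    """Return a list of (image_path, label) pairs, one per file found."""
    samples = []
    for folder, label in CLASS_LABELS.items():
        class_dir = Path(root) / folder
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {class_dir}")
        # Sort for a reproducible order across runs and platforms.
        for path in sorted(class_dir.iterdir()):
            if path.is_file():
                samples.append((path, label))
    return samples


def class_counts(samples):
    """Count how many samples carry each label."""
    counts = {label: 0 for label in CLASS_LABELS.values()}
    for _, label in samples:
        counts[label] += 1
    return counts
```

If the description holds, calling `class_counts(index_dataset(root))` on the full dataset should report the same number of images for both labels, which is a quick sanity check before training.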


Computer vision (CV) techniques help to perform non-destructive seed viability detection (SVD) for faster, more efficient and fairer results. However, the seed vigor dataset currently suffers from an insufficient number of samples, noisy data, and an imbalance between positive and negative samples.


The dataset tracks the performance of eight stock market indices from six countries. The indices are: IPC, S&P 500, DAX, DJIA, FTSE, N225, NDX, and CAC. The time period is from 1 June 2006 to 31 May 2023. The index and the FX data are sourced from Yahoo Finance, and the rest of the variables are retrieved from the OECD.
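
The index list and sample period above can be written down as a small lookup, which also makes the "eight indices, six countries" count easy to check. This is only a sketch: the country assigned to each index is an assumption based on where that index is conventionally listed, and the names `INDEX_COUNTRY` and `in_sample` are hypothetical.

```python
from datetime import date

# Index names as listed in the description. The country for each index is an
# assumption (not stated in the description itself).
INDEX_COUNTRY = {
    "IPC": "Mexico",
    "S&P 500": "United States",
    "DAX": "Germany",
    "DJIA": "United States",
    "FTSE": "United Kingdom",
    "N225": "Japan",
    "NDX": "United States",
    "CAC": "France",
}

# Sample period stated in the description (both ends treated as inclusive).
START, END = date(2006, 6, 1), date(2023, 5, 31)


def in_sample(day):
    """Return True if the given date falls inside the stated period."""
    return START <= day <= END


countries = sorted(set(INDEX_COUNTRY.values()))
```

Under these assumptions the mapping has eight keys and six distinct countries, consistent with the description.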


To improve multi-node temperature estimation with limited training data in Permanent Magnet Synchronous Motors (PMSMs), this paper proposes a novel Lumped-Parameter Thermal Network (LPTN)-informed neural network. Firstly, the parameter and model uncertainties of third- or higher-order LPTNs with global parameter identification for temperature estimation are systematically stated based on numerical analysis.


Towards an accessible vision-based exam and documentation solution using a smartphone/tablet device, we collect a comprehensive multi-test digitized neurological examination (DNE) dataset, namely DNE-113. Collected from 113 participants, DNE-113 is a multi-test DNE database of finger tapping, finger to finger, forearm roll, stand-up and walk, and facial activation tests. Patients in DNE-113 were diagnosed with Parkinson’s disease (PD) or at least one other neurological disorder (OD), based on their clinical records.


Quantification and analysis of global oil trade networks reveal deep insights into a nation's development and influence at a global scale. They also allow us to predict future trends and changes and to adapt state policy accordingly, because the crude oil market influences the balance of power among developed and emerging economies alike and is central to energy needs as well as industrial progress.


This is an example dataset of multimodal medical images of lung tumors. It has 100 each of PET/CT, CT, and PET images and labels for lung tumors. The file names of these images are PET/CT, CT, PET, and label. The dataset also includes a sample file for the test set, named test.


The dataset explores the linguistic characteristics of Ukrainian online community members on "Lviv. Forum Ridne City" based on gender (female/male). It includes vectors of male and female profiles, along with 36 control vectors for 18 women's profiles and 18 men's profiles. The dataset includes 48 linguistic characteristics of gender in online communication. The linguistic features analyzed encompass a wide range, including apology, modal designs, emotions, profanity, sports and politics references, and more.