Permanent magnet DC (PMDC) motors find wide application in electric unity systems. The performance of the motor depends on its overall mechanical vibration, and the two are interrelated. The structural platform or foundation also plays an important role in this regard. Excessive vibration often causes short circuits of short duration. This dataset presents sample vibration-dependent short-circuit current signals, along with normal current, for a PMDC motor with a 12 V supply.


The dataset includes normal current data in column 1 and two sets of short-duration fault data. The faults occur when vibrations of the motor and foundation propagate to the circuit connections and cause a short circuit. If vibration is minimized, they do not occur.


The heating and electricity consumption data are the results of an energy audit program, aggregated over multiple load profiles of a residential customer. These profiles include HVAC system loads, convenience power, elevator, etc. The datasets were gathered between December 2010 and November 2018 at a one-hour time step, giving 140,160 measurements in total, half for heat and half for electricity. In addition to the historical energy consumption values, a concatenation of weather variables is also available.


This is a publicly available dataset of heating and electricity consumption profiles, aggregated from multiple load profiles of a residential customer. The dataset was gathered between December 2010 and November 2018 at a one-hour time step, giving 70,080 measurements per series. In addition to the historical energy consumption values, a concatenation of meteorological variables is also included. The weather variables are air pressure, temperature, humidity, wind speed, and solar irradiation at the predetermined location.
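The measurement counts quoted in these descriptions can be cross-checked with a little arithmetic. The sketch below assumes the series runs from 1 December 2010 up to, but not including, 1 December 2018; these boundary dates are an assumption, since only the months are stated. Under that assumption, 70,080 equals eight 365-day years of hourly values, which is 48 hours fewer than the calendar span, so the two leap days (2012 and 2016) appear not to be counted.

```python
from datetime import datetime

HOURS_PER_DAY = 24

# Assumed boundaries: the source only states "December 2010" to "November 2018".
start = datetime(2010, 12, 1)
end = datetime(2018, 12, 1)  # exclusive end

# Hours in the full calendar span, leap days included.
calendar_hours = int((end - start).total_seconds() // 3600)

# Hours in eight nominal 365-day years.
nominal_hours = 8 * 365 * HOURS_PER_DAY

# Two series (heat and electricity) of equal length.
total_measurements = 2 * nominal_hours

print(calendar_hours, nominal_hours, total_measurements)
```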


The AOLAH databases are contributions from the Aswan Faculty of Engineering, intended to help researchers in online handwriting recognition build a powerful system for recognizing Arabic handwritten script. AOLAH stands for Aswan On-Line Arabic Handwritten: “Aswan” is the small, beautiful city in the south of Egypt; “On-Line” means that the data are captured at the same time as they are written; “Arabic” because the databases cover Arabic characters only; and “Handwritten” because the samples are written naturally by hand.


* There are two databases. The first is for Arabic characters and consists of 2,520 sample files written by 90 writers using a simulation of a stylus pen and a touch screen. The second is for the strokes of Arabic characters and consists of 1,530 sample files for 17 strokes. It is derived from the first database by extracting the strokes from the characters.
* The writers are volunteers from the Aswan Faculty of Engineering, aged 18 to 20.
* The writing is natural, with unrestricted writing styles.
* Each volunteer writes the 28 characters of the Arabic script using the GUI.
* The databases can be used for online Arabic character recognition.
* The data collection tool is code that simulates a stylus pen and a touch screen. Pre-processed character samples are also available to researchers.
* The databases are available to researchers free of charge for academic and research purposes.
* The databases available here are the training databases.


Recent advances in the availability of computational power and in cloud computing have prompted extensive research in epileptic seizure detection and prediction. The EEG (electroencephalogram) datasets from the ‘Dept. of Epileptology, Univ. of Bonn’ and the ‘CHB-MIT Scalp EEG Database’ are publicly available and are the most sought after among researchers. The Bonn dataset is very small compared to CHB-MIT, but researchers still prefer it because it comes in a simple '.txt' format. The dataset published here is a preprocessed form of CHB-MIT, available in '.csv' format.


Procedure:

  1. Preprocessing was done in a Jupyter Notebook (Anaconda) on an 8th-generation Intel i5 processor with 8 GB of RAM.
  2. The dataset was prepared by extracting data points from the '.edf' files using the mne package in Python. Equal amounts of preictal and ictal data were extracted.
  3. A period of 4096 seconds (about 68 minutes) each of preictal and ictal data was extracted from the '.edf' files. All annotated ictal periods for the 24 patients are included in the dataset.
  4. The data points were loaded and preprocessed as dataframes using the pandas package in Python.
  5. As much system RAM as possible should be available, because the dataframes are large.
  6. The file chbmit_preprocessed_data.csv can be used as is for machine learning and deep learning models.
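The extraction step in the procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual notebook: the function names, the file path argument, and the way the window boundaries are supplied are all hypothetical, and in practice the seizure start and end times would have to come from the annotations that accompany the recordings.

```python
def clip_window(t_start, t_end, recording_length):
    """Clamp a requested [t_start, t_end] window (in seconds) to the recording."""
    lo = max(0.0, float(t_start))
    hi = min(float(recording_length), float(t_end))
    if hi <= lo:
        raise ValueError("window lies outside the recording")
    return lo, hi


def edf_segment_to_dataframe(edf_path, t_start, t_end):
    """Read one '.edf' file and return the samples between t_start and t_end.

    Hypothetical helper: edf_path, t_start and t_end are supplied by the caller.
    """
    # Imported lazily so that clip_window can be used without mne installed.
    import mne

    raw = mne.io.read_raw_edf(edf_path, preload=True, verbose="ERROR")
    lo, hi = clip_window(t_start, t_end, raw.times[-1])
    segment = raw.copy().crop(tmin=lo, tmax=hi)
    # Returns a pandas DataFrame with a time column plus one column per channel.
    return segment.to_data_frame()


# The window helper can be checked on its own, without any EEG file.
print(clip_window(-5, 10, 100))
print(clip_window(90, 200, 100))
```

Loading the whole file with `preload=True` keeps the sketch short but uses more memory, which is in line with the note above about RAM.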

Data Availability:

The dataset contains the following files.

  • chbmit_ictal_raw_data.csv: This file contains only ictal data from all 24 patients. The channels vary widely between recordings, which results in 96 columns in this file.
  • chbmit_preictal_raw_data.csv: This file contains only preictal data from all 24 patients. The channels vary widely between recordings, which results in 96 columns in this file.
  • chbmit_preictal_23channels_data.csv: This file contains only preictal data from all 24 patients. Only 23 channels are retained, giving 23 columns in this file.
  • chbmit_ictal_23channels_data.csv: This file contains only ictal data from all 24 patients. Only 23 channels are retained, giving 23 columns in this file.
  • chbmit_preprocessed_data.csv: This file contains balanced preictal and ictal data from all 24 patients. Only 23 channels are retained and an outcome column is added, giving 24 columns in this file. In the outcome column, '0' indicates preictal and '1' indicates ictal.
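The relationship between the two 23-channel files and the balanced file can be illustrated with pandas. The sketch below is an assumption about how such a file could be assembled, not the authors' code: the function name, the column name `Outcome`, and the channel names `ch1` to `ch23` are hypothetical, and small synthetic frames stand in for the real CSV files.

```python
import pandas as pd


def build_balanced_frame(preictal, ictal):
    """Label and stack two frames that share the same channel columns."""
    n = min(len(preictal), len(ictal))          # keep both classes the same size
    pre = preictal.iloc[:n].assign(Outcome=0)   # 0 = preictal
    ict = ictal.iloc[:n].assign(Outcome=1)      # 1 = ictal
    return pd.concat([pre, ict], ignore_index=True)


# Synthetic stand-ins with 23 hypothetical channel columns.
channels = [f"ch{i}" for i in range(1, 24)]
preictal_demo = pd.DataFrame(0.0, index=range(4), columns=channels)
ictal_demo = pd.DataFrame(1.0, index=range(3), columns=channels)

combined = build_balanced_frame(preictal_demo, ictal_demo)
print(combined.shape)                       # 23 channels + 1 outcome column
print(combined["Outcome"].value_counts())
```

With the real data, the two inputs would be read with `pd.read_csv` from chbmit_preictal_23channels_data.csv and chbmit_ictal_23channels_data.csv.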

This dataset was prepared with data reduction techniques. Data cleaning and data transformation still need to be done as appropriate for the application or model under development.

Original Data:


The original raw dataset in '.edf' format is available from PhysioNet and should be cited as:

Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220


IoT dataset



Internet of Things (IoT) technology has revolutionized many aspects of everyday life by making devices smarter. IoT has become more popular in recent years because of its wide range of applications in fields such as smart cities, agriculture, healthcare, ambient assisted living, and animal tracking. Localization of a sensor node refers to determining the node's geographical location in the IoT network.


This dataset supports researchers in validating solutions such as Intrusion Detection Systems (IDS) based on artificial intelligence and machine learning techniques for the detection and categorization of threats in Cyber Physical Systems (CPS). To that end, data were acquired from a Secure Water Treatment (SWaT) hardware-in-the-loop testbed that emulates water passage between nine tanks via solenoid valves, pumps, and pressure and flow sensors. The testbed is composed of a real partition that is virtually connected to a simulated one.


This dataset is related to the paper "A hardware-in-the-loop Secure Water Treatment dataset for cyber-physical security testing".
We provide four different acquisitions:
1) A normal acquisition without attacks ("normal.csv" for network traffic and "dataset_norm.csv" for physical measures)
2) Three acquisitions where different types of attacks and physical faults are reproduced ("attack_1.csv", "attack_2.csv" and "attack_3.csv" for network traffic and "dataset_att_1.csv", "dataset_att_2.csv" and "dataset_att_3.csv" for physical measures)
In addition to .csv files we provide four .pcap files ("attack_1.pcap", "attack_2.pcap", "attack_3.pcap" and "normal.pcap") which refer to network acquisitions for the four previous scenarios.
A README.xlsx file summarizes the key features of the entire dataset.
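The file naming above follows a regular pattern, which the sketch below captures so that the network, physical, and packet-capture files of one scenario can be looked up together. The function name and dictionary keys are hypothetical; only the file names come from the description.

```python
SCENARIOS = ["normal", "attack_1", "attack_2", "attack_3"]


def files_for(scenario):
    """Return the three file names that belong to one acquisition scenario."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario}")
    # Physical-measure files abbreviate the scenario name: norm, att_1, att_2, att_3.
    suffix = "norm" if scenario == "normal" else scenario.replace("attack", "att")
    return {
        "network_csv": f"{scenario}.csv",
        "physical_csv": f"dataset_{suffix}.csv",
        "pcap": f"{scenario}.pcap",
    }


for name in SCENARIOS:
    print(name, files_for(name))
```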


The attached dataset contains recordings of 5 parameters used to ascertain physical soil composition. The data were collected between March 2021 and April 2021. This dataset is the raw data.


This dataset is proposed for human activity recognition tasks. The activities include the static ones (sitting, standing, and lying) as well as walking, running, cycling, and walking upstairs/downstairs. Each activity lasts for 2 minutes, and 23 subjects were involved in the experiments.


Twenty-three subjects were involved in 8 activities, including 3 static ones and 5 periodic activities of walking, running, cycling, and walking upstairs/downstairs. Each activity lasts around 2 minutes.

The columns hold the sensor signals in the following order: " 'wri_Acc_X', 'wri_Acc_Y', 'wri_Acc_Z', 'wri_Gyr_X', 'wri_Gyr_Y', 'wri_Gyr_Z', 'wri_Mag_X', 'wri_Mag_Y', 'wri_Mag_Z', 'ank_Acc_X', 'ank_Acc_Y', 'ank_Acc_Z', 'ank_Gyr_X', 'ank_Gyr_Y', 'ank_Gyr_Z', 'ank_Mag_X', 'ank_Mag_Y', 'ank_Mag_Z', 'bac_Acc_X', 'bac_Acc_Y', 'bac_Acc_Z', 'bac_Gyr_X', 'bac_Gyr_Y', 'bac_Gyr_Z', 'bac_Mag_X', 'bac_Mag_Y', 'bac_Mag_Z' ".
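The 27 column names follow a location, sensor, axis pattern, so they can be generated rather than typed out. In the sketch below the variable names are hypothetical, and reading the prefixes as wrist, ankle, and back, and the sensor codes as accelerometer, gyroscope, and magnetometer, is an assumption based on the abbreviations alone.

```python
from itertools import product

LOCATIONS = ["wri", "ank", "bac"]   # assumed: wrist, ankle, back
SENSORS = ["Acc", "Gyr", "Mag"]     # assumed: accelerometer, gyroscope, magnetometer
AXES = ["X", "Y", "Z"]

# product() varies the last factor fastest, which reproduces the listed order.
COLUMNS = [f"{loc}_{sensor}_{axis}"
           for loc, sensor, axis in product(LOCATIONS, SENSORS, AXES)]

print(len(COLUMNS))
print(COLUMNS[:3], COLUMNS[-3:])
```

A list like this could be passed as the `names` argument when reading a header-less file, if the data files turn out to lack a header row.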


The data acquisition process begins with capturing EEG signals from 20 healthy, skilled volunteers who gave their written consent before performing the experiments. Each volunteer was asked to repeat an experiment 10 times at different frequencies; each experiment was triggered by a visual stimulus.