Machine Learning

The HQA1K dataset was developed for assessing the quality of Computer Generated Holography (CGH) image renderings based on direct human input.
HQA1K comprises 1,000 pairs of natural images matched to simulated CGH renderings of various quality levels. The result is a diverse set of data for evaluating image quality algorithms and models.


This bearing dataset has high data quality and clear fault characteristics, so it is commonly used as a standard dataset for bearing fault diagnosis. From this dataset, three unbalanced datasets under different loads are constructed to test the recognition performance of the proposed method. The test bench is composed of a 2 HP (1.5 kW) induction motor, a fan end bearing, a drive end bearing, a torque transducer and a load motor. Using electrical discharge machining (EDM), single point faults of different depths were machined on the inner race, outer race and rolling element of the test bearing.


This dataset was acquired as part of the dissertation entitled "Optical Camera Communications and Machine Learning for Indoor Visible Light Positioning". The work was carried out in the academic year 2020/2021 at the Instituto de Telecomunicações in Aveiro, within the scope of the Integrated Master in Electronics and Telecommunications Engineering at the Department of Electronics, Telecommunication and Informatics of the University of Aveiro.


Eight-channel monopolar sEMG signals were acquired at a sampling rate of 1000 Hz using the device developed by our research group. Medical gel electrodes (CH50B, Shanghai Hanjie Electronic Technology Co., Ltd., Shanghai, China) were used for data collection. The position of the electrodes is shown in Fig. 2. The REF electrode was placed on the inner side of the upper arm near the elbow, and the RLD electrode was placed on the outer side of the right upper arm near the elbow. Eight monopolar electrodes were placed on the right forearm.


RMUTT-DLD is an aggregated collection of data from the IC3 digital literacy certification program conducted at Rajamangala University of Technology Thanyaburi (RMUTT) in Thailand from 2016 to 2023. The expanded dataset includes demographic details, academic records, and certification results, offering a holistic perspective on the progression of students' digital literacy over time. The dataset can be imported into a wide range of applications and used for various purposes.


This data collection focuses on capturing user-generated content from the popular social network Reddit during the year 2023. This dataset comprises 29 user-friendly CSV files collected from Reddit, containing textual data associated with various emotions and related concepts.


This research studies the stance classification task for parliamentary debates, with the aim of analysing how parliamentarians argue on different debate topics, what their political stances are, and the impact of homophily with respect to their party affiliation. State-level Australian Hansard data were collected, focusing on debates related to obesity and food marketing policies in Australia. The data cover 6 states and 1 territory (NT is excluded) for the period 1/1/2000 to 1/1/2022.


This dataset extracts the entropy of each of the PE sections of benign and ransomware reports to be used for detecting ransomware. Several machine learning classifiers, including Decision Tree, Random Forest, KNN, XGBoost and Naive Bayes, were trained on this dataset. The results show that PE section entropy can accurately detect ransomware, with the decision tree classifier yielding the best overall result: 98.8% accuracy and an AUC of 0.969. The prediction latency of the decision tree classifier was also very low, at 1.509 milliseconds.
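As a rough illustration of the feature described above, the sketch below computes a per-section entropy value. The description does not state how entropy was calculated, so this assumes Shannon entropy over byte frequencies (in bits per byte). The section names and byte contents are hypothetical placeholders, not data from the dataset.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    # H = sum over byte values of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical raw section contents (assumption: in practice these would be
# read from the sections of a PE file).
sections = {
    ".text": bytes(range(256)) * 4,  # every byte value equally likely
    ".rsrc": b"\x00" * 1024,         # a single repeated byte value
}

# One entropy feature per section, keyed by section name.
features = {name: shannon_entropy(raw) for name, raw in sections.items()}
```

A uniform byte distribution gives the maximum of 8 bits per byte, while a constant run gives 0. Encrypted or packed sections tend toward the high end of that range, which is one plausible reason entropy is informative for this task.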


Sensor arrays are ubiquitous. They capture images in digital cameras, record the swipes of our fingers on the screens of our phones and tablets, or map pressure distribution over an area. Soft capacitive sensor arrays have been proposed to make electronic pressure-sensing skins capable of identifying the location and intensity of touch. However, large arrays of those sensors remain challenging to produce, as they require high-resolution patterning of electrodes and routing of long and thin electrical connections.