Machine Learning

This paper presents a dataset of electroencephalogram (EEG) signals recorded while Malayalam vowels and consonants are spoken. The signals were captured with the OpenBCI Cyton device while a volunteer spoke the vowels and consonants. The dataset includes recordings of both sub-vocal and vocal speech. It is intended to support Malayalam speakers who suffer from neurodegenerative diseases.


The BirDrone dataset is compiled by aggregating images of small drones and birds sourced from various online datasets. It comprises 2970 high-resolution images (640x640 pixels), each featuring unique backdrops and lighting conditions. This dataset is designed to enhance machine learning models by simulating real-world scenarios.

Dataset Specifications:


Security for resource-constrained systems is essential for deployment and must not be sacrificed for other performance metrics such as throughput. Physically Unclonable Functions (PUFs) present a promising, cost-effective solution for various security applications, including protection against IC counterfeiting and lightweight authentication. As security blocks, PUFs exploit physical variations to produce intrinsic responses to applied challenges, and the resulting Challenge-Response Pairs (CRPs) uniquely identify each device.
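The challenge-response idea can be illustrated with a minimal software sketch. This is not the dataset authors' design: `SimulatedPUF`, `enroll`, and `authenticate` are hypothetical names, and a hidden random value hashed with SHA-256 merely stands in for the physical variation of a real PUF circuit.

```python
import hashlib
import secrets


class SimulatedPUF:
    """Software stand-in for a PUF (assumption: a hidden random value
    plays the role of the device's physical variation)."""

    def __init__(self):
        self._variation = secrets.token_bytes(16)

    def respond(self, challenge: bytes) -> bytes:
        # A real PUF derives the response from circuit behaviour;
        # a hash of (variation + challenge) imitates that here.
        return hashlib.sha256(self._variation + challenge).digest()[:8]


def enroll(device, n_pairs=4):
    """Record challenge-response pairs (CRPs) for later verification."""
    challenges = [secrets.token_bytes(8) for _ in range(n_pairs)]
    return {c: device.respond(c) for c in challenges}


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def authenticate(device, crp_table, max_bit_errors=0):
    """Consume one stored CRP and check the device's fresh response."""
    challenge, expected = crp_table.popitem()  # each CRP is used only once
    return hamming_distance(device.respond(challenge), expected) <= max_bit_errors


genuine = SimulatedPUF()
table = enroll(genuine)
print(authenticate(genuine, table))         # genuine device reproduces its response
print(authenticate(SimulatedPUF(), table))  # a different device fails the comparison
```

Real PUF responses are noisy, which is why the comparison uses a Hamming-distance threshold (`max_bit_errors`) rather than strict equality; the simulated device here is noise-free, so the default threshold is zero.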


Speech impairment limits an individual's ability to communicate effectively through speech and hearing. To overcome this, affected individuals resort to alternative modes of communication, such as sign language. Despite the increasing prevalence of sign language, people who do not know it still struggle to communicate with those who rely on it as their primary means of communication. Sign languages are a class of languages that use a specific set of hand gestures, movements, and postures to convey messages.


In deep learning, images are widely used because of their rich information content, spatial hierarchies, and translation invariance, which make them well suited to tasks such as object recognition and classification. Image-based malware classification is an important application of deep learning in cybersecurity. Within this context, the Classified Advanced Persistent Threat Dataset is a carefully curated collection intended to advance research and innovation in this field.


Microsoft Office is a widely used productivity suite, but its support for VBA macros, included for automation, gives attackers a way to carry out malicious activities. Research on VBA macros is still looking for efficient ways to detect malicious ones, and it requires up-to-date data for analysis, which is hard to find publicly. To overcome this issue, a dataset was created from VirusTotal, VirusShare, Zenodo, Malware Bazaar, GitHub, and InQuest Labs.


This data repository contains test data and corresponding test code for evaluating the performance of a machine learning model. The dataset includes 950 labeled samples across 7 different classes. The test code provides implementations of several common evaluation metrics, including accuracy, precision, recall, and F1-score. This resource is intended to facilitate the benchmarking and comparison of different machine learning algorithms on a standardized task.
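As an illustration of the metrics named above, the following is a minimal pure-Python sketch of accuracy and macro-averaged precision, recall, and F1-score for a multi-class task. It is not the repository's test code: the function name `classification_metrics`, the choice of macro averaging, and the toy labels are all assumptions made for the example.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for three classes (illustrative only, not from the dataset).
metrics = classification_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
print(metrics)
```

Macro averaging weights every class equally, which matters when the 7 classes are imbalanced; a weighted or micro average would be an equally reasonable choice depending on the benchmark's goals.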


A PCB 130D20 microphone is mounted on a stand facing the tool tip to capture the emitted sound. The microphone is connected to the computer through a specially designed signal conditioner. GoldWave software records the captured sound at a sampling frequency of 44100 Hz. A series of machining experiments was conducted on a turning machine (XLTURN from MTAB) with a carbide insert and an aluminium workpiece of 38 mm diameter. A constant feed rate of 0.5 mm/rotation was maintained throughout the experiments.


This dataset is composed of 2000 time series (1000 read and 1000 write) drawn from the much larger cloud storage workload released to the research community by the Alibaba Group. The original dataset can be downloaded from https://github.com/alibaba/block-traces.

The original dataset, collected over 31 days, contains read/write data for 1000 storage volumes. The schema of each file, in terms of file names and columns per file, is as follows:

