Machine Learning

The Chattel Text was obtained by personnel on-site through the camera in the perception cap due to the lack of open source data. Among them, the Chattel Text dataset is 828 sheets. The Chattel Text dataset in this paper is labeled by labelImg to calibrate the text box and get the labeled document. The label document contains 8 numbers and a text, where the 8 numbers are the horizontal and vertical coordinates of the four vertices of the rectangular text box in the picture. Due to the randomized environment, some of the texts in the picture will be skewed and other characteristics.

Categories:
132 Views

Flow to image conversion is a pivotal preprocessing step in intrusion detection systems (IDS) where the representation of network flow data significantly influences classifier performance. In this study, we explore the effects of three distinct methods of transforming flow data into images on classifier performance.

Categories:
121 Views

One of the Dravidian language spoken majorly by 60 million people in and around Karnataka state of India is known as Kannada. It is one among 22 scheduled languages of India. Kannada langauge is written in Kannada scriptwhich has its traces back from kadamba script (325-550 AD). There are many languages which were used centuries back and aren’t being used currently whereas Kannada is one such language which is used even today for writing official documents and are being taught at schools which means it is going to be for many years.

Categories:
101 Views

This paper presents a dataset of brain Electroencephalogram (EEG) signals created when Malayalam vowels and consonants are spoken. The dataset was created by capturing EEG signals utilizing the OpenBCI Cyton device while a volunteer spoke Malayalam vowels and consonants. It includes recordings obtained from both sub-vocal and vocal. The creation of this dataset aims to support individuals who speak Malayalam and suffer from neurodegenerative diseases.

Categories:
2488 Views

The BirDrone dataset is compiled by aggregating images of small drones and birds sourced from various online datasets. It comprises 2970 high-resolution images (640x640 pixels), each featuring unique backdrops and lighting conditions. This dataset is designed to enhance machine learning models by simulating real-world scenarios.

 

Dataset Specifications:

Categories:
302 Views

The security of systems with limited resources is essential for deployment and cannot be compromised by other performance metrics such as throughput. Physically Unclonable Functions (PUFs) present a promising, cost-effective solution for various security applications, including IC counterfeiting and lightweight authentication. PUFs, as security blocks, exploit physical variations to extract intrinsic responses based on applied challenges, with Challenge-Response Pairs (CRPs) uniquely defining each device.

Categories:
221 Views

Speech impairment constitutes a challenge to an individual's ability to communicate effectively through speech and hearing. To overcome this, affected individuals’ resort to alternative modes of communication, such as sign language. Despite the increasing prevalence of sign language, there still exists a hindrance for non-sign language speakers to effectively communicate with individuals who primarily use sign language for communication purposes. Sign languages are a class of languages that employ a specific set of hand gestures, movements, and postures to convey messages.

Categories:
555 Views

In deep learning, images are utilized due to their rich information content, spatial hierarchies, and translation invariance, rendering them ideal for tasks such as object recognition and classification. The classification of malware using images is an important field for deep learning, especially in cybersecurity. Within this context, the Classified Advanced Persistent Threat Dataset is a thorough collection that has been carefully selected to further this field's study and innovation.

Categories:
971 Views

Microsoft contains a productive tool known as MS Office but the inclusion of VBA Macros inside the MS Office for automation purposes makes it a way for attackers to perform malicious activities. To get an up-to-date dataset, the research regarding VBA macros is still working to find efficient ways to detect it. To perform analysis, the dataset is required which is publically harder to find. To overcome this issue, a dataset is created from VirusTotal, VirusShare, Zenodo, Malware Bazaar, Github and InQuest Labs.

Categories:
885 Views

This data repository contains test data and corresponding test code for evaluating the performance of a machine learning model. The dataset includes 950 labeled samples across 7 different classes. The test code provides implementations of several common evaluation metrics, including accuracy, precision, recall, and F1-score. This resource is intended to facilitate the benchmarking and comparison of different machine learning algorithms on a standardized task.

Categories:
17 Views

Pages