Machine Learning

Speech impairment limits an individual's ability to communicate effectively through speech and hearing. To overcome this, affected individuals resort to alternative modes of communication, such as sign language. Although sign language is increasingly prevalent, people who do not know it still struggle to communicate with those who rely on it as their primary means of communication. Sign languages are a class of languages that employ a specific set of hand gestures, movements, and postures to convey messages.


In deep learning, images are utilized due to their rich information content, spatial hierarchies, and translation invariance, rendering them ideal for tasks such as object recognition and classification. The classification of malware using images is an important field for deep learning, especially in cybersecurity. Within this context, the Classified Advanced Persistent Threat Dataset is a thorough collection that has been carefully selected to further this field's study and innovation.


Microsoft offers a productivity suite known as MS Office, but its support for VBA macros for automation also gives attackers a way to perform malicious activities. Research on VBA macros is still looking for efficient ways to detect malicious ones, and that work depends on an up-to-date dataset. Such datasets are hard to find publicly. To overcome this issue, a dataset was created from VirusTotal, VirusShare, Zenodo, Malware Bazaar, GitHub and InQuest Labs.


This data repository contains test data and corresponding test code for evaluating the performance of a machine learning model. The dataset includes 950 labeled samples across 7 different classes. The test code provides implementations of several common evaluation metrics, including accuracy, precision, recall, and F1-score. This resource is intended to facilitate the benchmarking and comparison of different machine learning algorithms on a standardized task.
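
The four metrics named above can be sketched in a few lines of Python. This is an illustrative sketch only, not the repository's own test code: the function name `classification_metrics` is hypothetical, and macro-averaging over classes is an assumption, since the description does not say how per-class scores are combined.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Per-class counts of true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = set(y_true) | set(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Calling `classification_metrics(y_true, y_pred)` with two equal-length label lists returns a dictionary with one value per metric. Classes that are never predicted get a precision of 0 here, which is one common convention among several.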


A PCB 130D20 microphone was mounted on a stand facing the tool tip to capture the emitted sound. The microphone was connected to the computer through a specially designed signal conditioner. GoldWave software was used to record the captured sound at a sampling frequency of 44,100 Hz. A series of machining experiments was conducted on a turning machine (XLTURN from MTAB) with a carbide insert and an aluminium workpiece of 38 mm diameter. A constant feed rate of 0.5 mm/rotation was maintained throughout the experiments.


This dataset is composed of 2000 time series (1000 read and 1000 write) derived from the much larger cloud storage workload released to the research community by the Alibaba group. The original dataset can be downloaded from https://github.com/alibaba/block-traces.

The original dataset, collected over 31 days, contains read/write data for 1000 storage volumes. The schema of each file, in terms of file names and columns per file, is explained:

A dataset of 600 images of dragon fruit plants, collected from different media and photographed in a dragon fruit field in Rio Branco, Brazil. The images are split into 300 photos of diseased plants (with fish-eye disease, among others) and 300 photos of healthy plants. Many of the photos were taken with a simple smartphone camera.


Precise recognition of soybean pods is crucial for acquiring phenotypic characteristics, such as the number of productive pods and the quantity of seeds per plant. Several techniques exist for counting seeds, each with its own limitations. An automated procedure, such as a machine learning algorithm that takes an image as input and outputs the discrete count of a certain object of interest in the image, can be used for this type of work.


Current neural network solutions for channel estimation are frequently evaluated by training and testing on a single example channel or on similar channels. However, data-driven algorithms often degrade significantly on channels they were not trained on, because they cannot extrapolate beyond their training data. Online training can fine-tune the offline-trained neural networks to compensate for this degradation, but its feasibility is challenged by the tremendous computational resources required.

