Machine Learning

Discovering Mathematical Patterns Behind HIV-1 Genetic Recombination: a new methodology to identify viral features - Supplementary Information

This dataset contains the Supplementary Information of the article "Discovering Mathematical Patterns Behind HIV-1 Genetic Recombination: a new methodology to identify viral features" (Manuscript DOI: 10.1109/ACCESS.2023.3311752).

Categories: Artificial Intelligence, Machine Learning, Image Processing, Biomedical and Health Sciences

296 Views

SYPHAXAR Dataset

The SYPHAXAR dataset is intended for Arabic text detection in the wild. It was collected in Sfax, Tunisia, the country's second-largest city after the capital. A total of 3078 images were gathered manually, one by one, each presenting natural-scene text detection challenges that reflect the real complexity of 15 different routes, including ring roads, intersections, and roundabouts. The annotated images contain more than 31000 objects, each enclosed in a bounding box.

Categories: Artificial Intelligence, Machine Learning, Image Processing, Computer Vision

242 Views

Pre-Training Representations of Binary Code Using Contrastive Learning

Overview

The dataset under consideration is a comprehensive compilation of code snippets, function descriptions, and their respective binary representations aimed at fostering research in software engineering. It contains a variety of code functionalities and serves as a valuable resource for understanding the behavior and characteristics of C programs. This data is sourced from the AnghaBench repository, a well-documented collection of C programs available on GitHub.

Columns and Data Types

Categories: Artificial Intelligence, Machine Learning

143 Views

CRPs Dataset of Ring Oscillator PUF

Physically unclonable functions (PUFs) are foundational components that offer a cost-efficient and promising solution for diverse security applications, including countering integrated circuit (IC) counterfeiting, generating secret keys, and enabling lightweight authentication. PUFs exploit semiconductor variations in ICs to derive inherent responses from imposed challenges, creating unique challenge-response pairs (CRPs) for individual devices. Analyzing PUF security is pivotal for identifying device vulnerabilities and ensuring response credibility.

Categories: Machine Learning, Security

465 Views

10 Gas Datasets

Dataset description:

This dataset contains ten categories of gas data. Each category covers 5 concentrations: 10, 20, 30, 40, and 50 ppm.

There are 160 groups for the 10, 20, 30, and 40 ppm concentrations. Each group contains 6000 sampled voltage signals, and the sampling frequency is 10 Hz.

There are only 80 groups for the 50 ppm concentration, and each group also contains 6000 sampled voltage signals.

The label for each gas includes both category and concentration, so the data can be split by gas category and by concentration.
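
As a quick check on the figures above, the recording length of one group follows from the sample count and the sampling frequency. The sketch below is only illustrative arithmetic; the constant and function names are hypothetical and are not part of the dataset.

```python
# Figures taken from the dataset description above.
SAMPLING_RATE_HZ = 10       # sampling frequency
SAMPLES_PER_GROUP = 6000    # sampled voltage signals per group


def group_duration_seconds(n_samples=SAMPLES_PER_GROUP, rate_hz=SAMPLING_RATE_HZ):
    """Return the duration of one group in seconds (samples / rate)."""
    return n_samples / rate_hz


duration = group_duration_seconds()
print(f"{duration} s per group ({duration / 60} min)")
```

Under these figures, each group corresponds to 600 seconds (10 minutes) of sensor readings.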

Categories: Artificial Intelligence, Machine Learning, Sensors

1220 Views

Energy consumption

The data used in the study's analysis were obtained from the SAP system of the Commercial Department of the Compañía Nacional de Electricidad (CNEL EP), Esmeraldas Business Unit. The data consist of original records of monthly billed electricity consumption (expressed in kilowatt-hours, kWh) over a period of 25 months (January 2021 to January 2023). These records belong to approximately 136218 customers in the residential sector of the province of Esmeraldas.

Categories: Artificial Intelligence, Machine Learning, Power and Energy, Electric Utility

587 Views

Imbalanced Data

Classification learning on non-stationary data may face dynamic changes over time. The major problems are class imbalance and the high cost of labeling instances in the presence of drift. Imbalance arises when the minority class has fewer samples than the majority class, and it leads to the misclassification of data points.
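
One common way to quantify the imbalance described above is the ratio of majority-class to minority-class sample counts. The sketch below is a minimal illustration under that assumption; the function name and the example labels are hypothetical and do not come from the dataset.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (largest class count) / (smallest class count) for a label list."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    return max(counts.values()) / min(counts.values())


# Made-up labels: 90 majority-class samples and 10 minority-class samples.
example_labels = ["majority"] * 90 + ["minority"] * 10
print(imbalance_ratio(example_labels))
```

A ratio of 1 means the classes are balanced; larger values indicate a stronger skew toward the majority class.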

Categories: Machine Learning

714 Views

Adaptive GA-BPNN experimental data

The data file (.rar) contains 16 files in .mat format. The file "origin data after UMAP for training.mat" holds the original training data, and the others hold experimental result data. Each data1_*.mat file is a model test result file containing the simulation results (test_simu_*), model output (output_test_*), and error (error_*).
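
The naming patterns above suggest a simple way to sort the variables of a result file into groups. The sketch below assumes the file has already been read into a dictionary mapping variable names to values (for example, the kind of dictionary returned by `scipy.io.loadmat`, which is an assumption about how the files would be opened). The function name, group names, and sample keys are hypothetical.

```python
from fnmatch import fnmatchcase

# Variable-name patterns taken from the dataset description.
PATTERNS = {
    "simulation": "test_simu_*",
    "output": "output_test_*",
    "error": "error_*",
}


def group_variables(mat_contents):
    """Group variable names in a loaded .mat dictionary by naming pattern."""
    groups = {name: [] for name in PATTERNS}
    for key in sorted(mat_contents):
        if key.startswith("__"):
            continue  # skip metadata entries such as "__header__"
        for name, pattern in PATTERNS.items():
            if fnmatchcase(key, pattern):
                groups[name].append(key)
    return groups


# Made-up stand-in for the contents of one data1_*.mat file.
sample = {"__header__": b"", "test_simu_1": [0.1], "output_test_1": [0.2], "error_1": [0.1]}
print(group_variables(sample))
```

Keeping the grouping separate from file loading means it can be checked without the actual archive at hand.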

Categories: IoT, Machine Learning

39 Views

Search Interests related to Disease X originating from different Geographic Regions

Please cite the following paper when using this dataset:

N. Thakur, K. A. Patel, I. Hall, Y. N. Duggal, and S. Cui, “A Dataset of Search Interests related to Disease X originating from different Geographic Regions”, Preprints 2023, 2023081701, DOI: https://doi.org/10.20944/preprints202308.1701.v1

Abstract:

Categories: Artificial Intelligence, Machine Learning, Standards Research Data, Biomedical and Health Sciences, Computational Intelligence, COVID-19, Health

694 Views

Carbon sequestration

We used Sentinel-2 images to create the dataset in order to estimate the carbon sequestered in above-ground forest biomass. In addition, fieldwork was carried out to gather the corresponding forest biomass volume. The clipped image has a size of 1115 × 955 pixels and consists of bands 3, 4, and 8, which correspond to green, red, and near-infrared.

Categories: Artificial Intelligence, Machine Learning, Image Processing, Climate Change/Environmental, Geoscience and Remote Sensing

996 Views
