

The Travel Recommendation Dataset is a comprehensive dataset designed for building and evaluating conversational recommendation systems in the travel domain. It includes detailed information about users, destinations, and ratings, enabling researchers and developers to create personalized travel recommendation models. The dataset supports use cases such as personalizing travel recommendations, analyzing user behavior, and training machine learning models for recommendation tasks.
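
As an illustration of how such a recommendation dataset is typically consumed, the sketch below builds a user-destination rating matrix and a simple popularity-based recommender with pandas. The file name and columns (user_id, destination_id, rating) are assumptions for illustration, not the dataset's documented schema.

```python
import pandas as pd

# Hypothetical file and column names; adjust to the actual dataset schema.
ratings = pd.read_csv("ratings.csv")   # assumed columns: user_id, destination_id, rating

# User x destination rating matrix, the usual starting point for collaborative filtering.
matrix = ratings.pivot_table(index="user_id", columns="destination_id", values="rating")

def top_unrated(user_id, n=5):
    """Recommend the n best-rated destinations the user has not rated yet."""
    seen = matrix.loc[user_id].dropna().index
    mean_ratings = ratings.groupby("destination_id")["rating"].mean()
    return mean_ratings.drop(seen, errors="ignore").nlargest(n)
```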

ATPAD: An Accessible Tool for Atmospheric Data Processing and Visualization is a Python-based project that enables the analysis and visualization of pre-processed databases in an easy and freely accessible manner. As an example, we apply ATPAD to process and visualize data from the University Network of Atmospheric Observatories (RUOA) of the National Autonomous University of Mexico (UNAM), using three different stations located across Mexico. The analyzed dataset can be found here. For the ATPAD code, please access: https://www.bremex-steaps.net/atpad
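
As a rough sketch of the kind of processing and visualization ATPAD automates, the snippet below resamples and plots a station time series with pandas and matplotlib. The file name and column names (timestamp, temperature) are assumptions; the actual RUOA exports and the ATPAD API may differ.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical pre-processed station file; real column names may differ.
df = pd.read_csv("ruoa_station.csv", parse_dates=["timestamp"]).set_index("timestamp")

# Daily-mean resampling of a temperature series, a typical aggregation step.
daily = df["temperature"].resample("D").mean()

daily.plot(title="Daily mean temperature (illustrative)")
plt.xlabel("Date")
plt.ylabel("Temperature")
plt.tight_layout()
plt.show()
```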

BoardData is constructed from development board data provided by ST. These development boards are primarily utilized for function demonstration and platform development of STM32 series microcontrollers. They incorporate a suite of common sub-circuit modules for electronic devices, including interface modules, digital-to-analog and analog-to-digital converter modules, memory modules, comparators, touch modules, display modules, switch arrays, among others. Consequently, these boards exhibit a high degree of consistency with real PCB circuits.

This ZIP file contains two distinct datasets collected over a 14-day period. The first dataset consists of real-world smart home data, providing detailed logs from six devices: a Plug Fan, Plug PC, Humidity Sensor, Presence Sensor, Light Bulb, and Window Opening Sensor. The data includes device interactions and environmental conditions such as temperature, humidity, and presence. The second dataset is generated by a smart home simulator for the same period, offering simulated device interactions and environmental variables.
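
A natural use of the paired logs is to compare the simulated environment against the real one. The minimal sketch below aligns the two 14-day humidity series on an hourly grid; the file names and the humidity column are assumptions about the export format.

```python
import pandas as pd

# Hypothetical file and column names for the real and simulated 14-day logs.
real = pd.read_csv("real_smart_home.csv", parse_dates=["timestamp"]).set_index("timestamp")
sim = pd.read_csv("simulated_smart_home.csv", parse_dates=["timestamp"]).set_index("timestamp")

# Hourly mean humidity from each source, aligned on a shared time axis.
comparison = pd.concat(
    {"real": real["humidity"].resample("h").mean(),
     "simulated": sim["humidity"].resample("h").mean()},
    axis=1,
)
print(comparison.describe())
print("Correlation:", comparison["real"].corr(comparison["simulated"]))
```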

In this paper we use Natural Language Processing techniques to improve different machine learning approaches (Support Vector Machines (SVM), Local SVM, Random Forests) to the problem of automatic keyphrase extraction from scientific papers. For the evaluation we propose a large and high-quality dataset: 2000 ACM papers from the Computer Science domain. We evaluate by comparison with expert-assigned keyphrases.
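
To make the task concrete, the toy sketch below trains a Random Forest to classify candidate phrases as keyphrases or not, using only TF-IDF features. It is a stand-in, not the paper's pipeline: the actual approach uses richer NLP features and also evaluates SVM and Local SVM models, and the labels here are invented.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Invented candidate phrases and labels (1 = expert keyphrase, 0 = not).
candidates = ["support vector machine", "introduction section",
              "keyphrase extraction", "table of contents"]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      RandomForestClassifier(n_estimators=100, random_state=0))
model.fit(candidates, labels)

print(model.predict(["random forest classifier"]))
```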

Dataset for "SynEL: A Synthetic Benchmark for Entity Linking" paper. The dataset integrates structured information from two primary sources: DBpedia for English, representing a high-resource language environment, and the Russian Public Company Register, a challenging low-resource dataset. Each dataset includes extensive annotations and structured entity links, ensuring high relevance for real-world applications in diverse industries.

Surface electromyography (EMG) can be used to interact with and control robots via intent recognition. However, most machine learning algorithms used to decode EMG signals have been trained on small datasets with limited subjects, impacting their generalization across different users and tasks. Here we developed EMGNet, a large-scale dataset for EMG neural decoding of human movements. EMGNet combines 7 open-source datasets with processed EMG signals for 132 healthy subjects (152 GB total size).
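
As an indication of how such processed signals are typically prepared for a decoder, the sketch below segments one recording into overlapping windows. The file name, array key, and window sizes are assumptions, not EMGNet's documented layout.

```python
import numpy as np

# Hypothetical recording of shape (n_samples, n_channels); the key name is assumed.
emg = np.load("subject_001.npz")["emg"]

def sliding_windows(signal, window=200, step=50):
    """Segment a multi-channel signal into overlapping windows,
    a common step before training a movement classifier."""
    starts = range(0, signal.shape[0] - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

windows = sliding_windows(emg)
print(windows.shape)   # (n_windows, window, n_channels)
```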

This dataset is the outcome of an observation of millet traits under seed coating and covering. For the covering treatment, Germination Percentage (FGP), Germination Index (GI), Mean Germination Time (MGT), Seedling Length (SL), Seedling Vigour Index (SVI), and abnormal seedlings were measured. Moreover, the levels of the enzymes catalase and peroxidase, as well as Malondialdehyde (MDA), were measured.
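
For reference, the germination indices named above are commonly computed as in the sketch below; the study may use variant formulas, and all numbers here are invented.

```python
# Hypothetical germination counts from one treatment.
days = [2, 4, 6, 8]          # day of each count
germinated = [5, 12, 8, 3]   # seeds newly germinated at each count
sown = 50
seedling_length_cm = 7.4

fgp = 100 * sum(germinated) / sown                                     # germination percentage
mgt = sum(n * d for n, d in zip(germinated, days)) / sum(germinated)   # mean germination time
gi = sum(n / d for n, d in zip(germinated, days))                      # germination index
svi = fgp * seedling_length_cm                                         # seedling vigour index

print(f"FGP={fgp:.1f}%  MGT={mgt:.2f} days  GI={gi:.2f}  SVI={svi:.1f}")
```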

UPMVM uses three datasets, named UD1, UD2, and UD3. UD1 is primarily used to collect and retrieve 280 poetry meters (rhythmic patterns, بحر / bahr) and their corresponding feet. Other uses of this dataset include the design of DFA state-function sequences with terminal-state information to align the identified verse meters. UD2 is collected from [GitHub - sayedzeeshan/Aruuz] and updated; this update process involves parsing and tokenizing the UD2 dataset.
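
To illustrate the DFA idea in miniature, the sketch below builds a transition table that accepts exactly one hypothetical syllable-weight pattern (s = short, l = long) and checks scanned verses against it. The real UPMVM state functions cover 280 meters and their feet, so this is illustrative only.

```python
def build_dfa(pattern):
    """Transition table for a DFA that accepts exactly `pattern`;
    the terminal state is the pattern length."""
    transitions = {(i, ch): i + 1 for i, ch in enumerate(pattern)}
    return transitions, len(pattern)

def matches(transitions, terminal, scanned):
    state = 0
    for symbol in scanned:
        state = transitions.get((state, symbol))
        if state is None:        # no transition: the verse breaks the pattern
            return False
    return state == terminal

dfa, terminal = build_dfa("lsllslls")      # hypothetical foot sequence
print(matches(dfa, terminal, "lsllslls"))  # True
print(matches(dfa, terminal, "lslls"))     # False: incomplete verse
```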

This dataset comprises Internet core network data inferred using the methodology detailed in the article titled 'Exploring Internet Evolution Through Analysis of its Core Network'.
