Artificial Intelligence

Marketable Foods (MF) Dataset

The Marketable Foods (MF) dataset was originally constructed to fine-tune the language and visual network layers of text-to-image generative models and to facilitate backdoor injections in them. The dataset consists of images from three popular food corporations with prominent, recognisable brands (Coffee = Starbucks, Burger = McDonald's, Drink = Coca-Cola). Samples were collected from the internet and cleaned using a filtering algorithm discussed in the corresponding paper.

Categories: Artificial Intelligence

793 Views

SWAN2018-2019

Weather radar echo extrapolation is an important approach for convective nowcasting, which predicts the evolution of convective systems over the short term. In recent years, deep learning-based radar echo extrapolation approaches have made significant progress and have been widely applied.

Categories: Artificial Intelligence

61 Views

RITA: a Phraseological dataset of CEFR Assignments and Exams for Italian as a Second Language

RITA (Resource for Italian Tests Assessment) is a new NLP dataset of academic exam texts written in Italian by second-language learners seeking CEFR certification of their proficiency level.
The RITA dataset is available for automatic processing in CSV and XML formats, subject to a citation agreement.

Categories: Artificial Intelligence, Education and Learning Technologies, Machine Learning, Other, Social Sciences, Computational Intelligence, Education

399 Views

Pokemon-Zero-Neg

To thoroughly investigate the non-overlapping registration problem, we created our own datasets: Pokemon-Zero for zero overlap and Pokemon-Neg for negative overlap. In this section, we describe the process of dataset creation.

Categories: Artificial Intelligence

245 Views

Discovering Mathematical Patterns Behind HIV-1 Genetic Recombination: a new methodology to identify viral features - Supplementary Information

This dataset contains the Supplementary Information of the article "Discovering Mathematical Patterns Behind HIV-1 Genetic Recombination: a new methodology to identify viral features" (Manuscript DOI: 10.1109/ACCESS.2023.3311752).

Categories: Artificial Intelligence, Machine Learning, Image Processing, Biomedical and Health Sciences

296 Views

SYPHAXAR Dataset

SYPHAXAR is a dataset for Arabic text detection in the wild. It was collected in the city of Sfax, Tunisia, the second largest Tunisian city after the capital. A total of 3078 images were gathered manually, one by one, with each image presenting text detection challenges found in natural scenes, reflecting the real complexity of 15 different routes as well as ring roads, intersections and roundabouts. The annotated images contain more than 31000 objects, each enclosed within a bounding box.

Categories: Artificial Intelligence, Machine Learning, Image Processing, Computer Vision

242 Views

ChatGPT Study

This dataset comprises data created during research on AI-generated code, with a focus on software engineering use-cases. The purpose of the research was to investigate how AI should be integrated into university software engineering curricula.

Categories: Artificial Intelligence, Education

653 Views

Pre-Training Representations of Binary Code Using Contrastive Learning

Overview

The dataset under consideration is a comprehensive compilation of code snippets, function descriptions, and their respective binary representations aimed at fostering research in software engineering. It contains a variety of code functionalities and serves as a valuable resource for understanding the behavior and characteristics of C programs. This data is sourced from the AnghaBench repository, a well-documented collection of C programs available on GitHub.

Columns and Data Types

Categories: Artificial Intelligence, Machine Learning

143 Views

Queue Waiting Time Dataset

The "Queue Waiting Time Dataset" is a detailed collection of records describing how waiting times in queues evolve. For each customer it contains the arrival time, the service start and finish times, the waiting time, and the queue length. The arrival time is the moment the customer enters the queue, while the start and finish times mark the beginning and end of the service process. The waiting time measures the time spent waiting in the queue, and the queue length is the number of customers in the queue when a new customer arrives.
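The relationship between these fields can be illustrated with a minimal Python sketch. The record layout, the field names (`arrival`, `start`, `finish`) and the sample values are hypothetical and are not taken from the dataset files. The sketch also assumes that a customer already being served no longer counts toward the queue length, which the description does not specify.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical field names; the dataset's actual column names may differ.
    arrival: float  # time the customer joins the queue
    start: float    # time service begins
    finish: float   # time service ends


def waiting_time(c: Customer) -> float:
    """Time spent in the queue before service begins."""
    return c.start - c.arrival


def queue_length_at_arrival(customers: list[Customer], i: int) -> int:
    """Number of earlier customers still waiting when customer i arrives.

    Assumes `customers` is sorted by arrival time and that a customer
    whose service has already started is no longer in the queue.
    """
    t = customers[i].arrival
    return sum(1 for other in customers[:i] if other.start > t)


# Made-up sample: a single-server, first-come-first-served queue.
customers = [
    Customer(arrival=0, start=0, finish=5),
    Customer(arrival=1, start=5, finish=8),
    Customer(arrival=2, start=8, finish=10),
    Customer(arrival=9, start=10, finish=12),
]

waits = [waiting_time(c) for c in customers]
lengths = [queue_length_at_arrival(customers, i) for i in range(len(customers))]
print(waits)
print(lengths)
```

If customers in service should count toward the queue length, the comparison `other.start > t` would become `other.finish > t`.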

Categories: Artificial Intelligence, Continuous-time signal processing, Digital signal processing

2393 Views

33-, 119-, and 136-bus system data for reinforcement learning-based distribution network reconfiguration

The 33-, 119-, and 136-bus datasets are commonly used in the field of power systems and electrical engineering to train reinforcement learning-based algorithms for distribution network reconfiguration. Distribution network reconfiguration involves altering the topology of the electrical distribution grid by opening or closing switches to optimize certain objectives, such as minimizing power losses, improving voltage profiles, or enhancing overall system efficiency. This process is essential for maintaining a reliable and cost-effective power distribution system.
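To make the idea of opening and closing switches concrete, the following Python sketch maps a vector of switch states to a network topology and checks whether that topology is radial (a tree spanning all buses). Radiality is a constraint commonly imposed in distribution network reconfiguration; it is assumed here and is not stated in the dataset description. The 4-bus network and the function names are hypothetical and are not drawn from the 33-, 119-, or 136-bus data.

```python
def closed_lines(lines, switch_state):
    """Return the lines whose switch is closed (state 1)."""
    return [line for line, closed in zip(lines, switch_state) if closed]


def is_radial(num_buses, lines, switch_state):
    """Check that the closed lines form a spanning tree over all buses.

    A graph with num_buses - 1 edges and no loops is connected and
    loop-free, which is the radial structure assumed in this sketch.
    """
    active = closed_lines(lines, switch_state)
    if len(active) != num_buses - 1:
        return False

    parent = list(range(num_buses))  # union-find forest over buses

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in active:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # this line would close a loop
        parent[root_a] = root_b
    return True


# Hypothetical 4-bus network: a triangle (buses 0, 1, 2) plus a feeder to bus 3.
lines = [(0, 1), (1, 2), (2, 0), (2, 3)]

radial_ok = is_radial(4, lines, [1, 1, 0, 1])    # one triangle switch opened
has_loop = is_radial(4, lines, [1, 1, 1, 0])     # triangle closed, bus 3 isolated
all_closed = is_radial(4, lines, [1, 1, 1, 1])   # too many closed lines
print(radial_ok, has_loop, all_closed)
```

In a reinforcement learning setting, a check of this kind could be used to reject or penalise actions that produce an invalid topology, with the optimisation objective (for example, power losses) evaluated separately by a power flow calculation.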

Categories: Artificial Intelligence

1025 Views
