Artificial Intelligence
The Fundus Image Myopia Development (FIMD) dataset contains 70 retinal image pairs, each showing obvious myopia development between the two images. Each pair also has a large overlapping area and shows no other retinopathy. To enable reliable quantitative evaluation of registration results, we follow the annotation method of the Fundus Image Registration (FIRE) dataset [1] and label control points between the two images of each pair with the help of experienced ophthalmologists. Each image pair is labeled with
The LTE_RFFI project sets up a radio frequency fingerprint identification (RFFI) system for LTE devices using deep learning techniques. The LTE uplink signals are collected from ten different LTE devices using a USRP N210 in different locations. The USRP samples at 25 MHz. The received signal is resampled to 30.72 MHz in MATLAB and saved as MAT files. The corresponding processed signals are included in the dataset. More details about the datasets can be found in the README document.
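The resampling step described above (25 MHz to 30.72 MHz) is a rational rate change. The following is a minimal sketch, written in Python rather than the MATLAB used by the project, of how the integer up/down factors for such a change could be derived. The function names `resample_factors` and `resampled_length` are hypothetical and are not part of the LTE_RFFI code.

```python
from fractions import Fraction


def resample_factors(fs_in_hz: int, fs_out_hz: int) -> tuple[int, int]:
    """Return (up, down) in lowest terms so that fs_in * up / down == fs_out."""
    ratio = Fraction(fs_out_hz, fs_in_hz)
    return ratio.numerator, ratio.denominator


def resampled_length(n_samples: int, up: int, down: int) -> int:
    """Sample count after a rational rate change, rounded up (integer math)."""
    return (n_samples * up + down - 1) // down


# Rates taken from the dataset description: 25 MHz in, 30.72 MHz out.
up, down = resample_factors(25_000_000, 30_720_000)
print(up, down)  # 768 625

# 1 ms of signal at 25 MHz is 25,000 samples.
print(resampled_length(25_000, up, down))  # 30720
```

Assuming SciPy is available, these factors could be passed to a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)`. Whether that matches the filter used in the project's MATLAB processing is not stated in the description.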
The Marketable Foods (MF) dataset was originally constructed to fine-tune the language and visual network layers of text-to-image generative models and to facilitate backdoor injections into them. The dataset consists of images from three popular food corporations with prominent, recognisable brands (Coffee = Starbucks, Burger = McDonald's, Drink = Coca-Cola). Samples were collected from the internet and cleaned using a filtering algorithm discussed in the corresponding paper.
Weather radar echo extrapolation is an important approach for convective nowcasting, which predicts the evolution of convective systems over the short term. In recent years, radar echo extrapolation approaches based on deep learning have made significant progress and have been widely applied.
RITA (Resource for Italian Tests Assessment) is a new NLP dataset of academic exam texts written in Italian by second-language learners seeking CEFR proficiency-level certification.
The RITA dataset is available for automatic processing in CSV and XML formats, subject to a citation agreement.
SYPHAXAR is a dataset for Arabic text detection in the wild. It was collected in the city of Sfax, Tunisia, the second-largest Tunisian city after the capital. A total of 3078 images were gathered manually, one by one. Each image captures text detection challenges in natural scenes, reflecting the real-world complexity of 15 different routes that include ring roads, intersections and roundabouts. The annotated images contain more than 31000 objects, each enclosed within a bounding box.
This dataset comprises data created during research on AI-generated code, with a focus on software engineering use-cases. The purpose of the research was to investigate how AI should be integrated into university software engineering curricula.
Overview
This dataset is a compilation of code snippets, function descriptions, and their respective binary representations, intended to support research in software engineering. It covers a variety of code functionalities and serves as a resource for understanding the behavior and characteristics of C programs. The data is sourced from the AnghaBench repository, a well-documented collection of C programs available on GitHub.
Columns and Data Types