Artificial Intelligence

The Human voice Natural Language from On-demand media (HENLO) dataset is a high-quality emotional speech dataset created to address the need for representative and realistic data in speech emotion recognition research. Unlike many existing datasets, which rely on simulated emotions performed by untrained speakers or directed participants, HENLO sources its data from professionally produced films and podcasts available on Media On-Demand (MOD).

In this paper, we use Natural Language Processing techniques to improve different machine learning approaches (Support Vector Machines (SVM), Local SVM, Random Forests) to the problem of automatic keyphrase extraction from scientific papers. For the evaluation, we propose a large, high-quality dataset: 2000 ACM papers from the Computer Science domain. We evaluate by comparison with expert-assigned keyphrases.
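As a rough illustration of what comparison with expert-assigned keyphrases can look like, the sketch below computes exact-match precision, recall, and F1 for one paper. The abstract does not say which matching rule or metric is used, so the lowercase exact-match rule and the function name `keyphrase_prf` are assumptions made for this example only.

```python
def keyphrase_prf(predicted, expert):
    """Exact-match precision, recall, and F1 for one paper.

    Assumption: phrases match if they are equal after lowercasing and
    collapsing whitespace. The source does not specify the matching rule.
    """
    def normalize(phrase):
        return " ".join(phrase.lower().split())

    pred = {normalize(p) for p in predicted}
    gold = {normalize(p) for p in expert}
    hits = len(pred & gold)

    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical keyphrases, for illustration only.
extracted = ["Support Vector Machines", "random forests", "keyphrase"]
assigned = ["support vector machines", "Random  Forests", "NLP", "datasets"]
precision, recall, f1 = keyphrase_prf(extracted, assigned)
```

Stricter or looser matching (stemming, partial overlap) would change the scores, so any reported numbers depend on which rule is chosen.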

Backlog refinement is a critical process within Agile practices that often faces challenges such as ambiguous user stories, prioritization difficulties, and cognitive overload among team members. Teams spend considerable time grooming user stories and refining them based on client or business requirements and customer feedback. In this paper, we present an empirical study exploring the integration of Generative AI (GenAI), specifically Large Language Models, into backlog refinement workflows to address these challenges.

Vision-language (VL) datasets are essential for advancing the capabilities of VL models, particularly in specialized domains like medical imaging. However, existing medical VL datasets are relatively small and predominantly focus on chest X-rays, limiting their applicability to other areas. To address this gap, we introduce the Skin-Path dataset, a comprehensive VL dataset specifically curated for histopathology.

Hyperspectral images are represented by numerous narrow wavelength bands in the visible and near-infrared parts of the electromagnetic spectrum. As hyperspectral imagery gains traction for general computer vision tasks, there is an increased need for large and comprehensive datasets for use as training data. Recent advancements in sensor technology allow us to capture hyperspectral data cubes at higher spatial and temporal resolution. However, there are few publicly available multi-purpose

The necessity for strong security measures to fend off cyberattacks has increased due to the growing use of Industrial Internet of Things (IIoT) technologies. This research introduces IoTForge Pro, a comprehensive security testbed designed to generate a diverse and extensive intrusion dataset for IIoT environments. The testbed simulates various IIoT scenarios, incorporating network topologies and communication protocols to create realistic attack vectors and normal traffic patterns.

This is a lightweight and versatile robustness benchmark built upon the training set of ImageNet-1K. It contains 50,000 images in total, divided into five components and evenly distributed over 1,000 classes. It assesses the performance of a classification model in five aspects: accuracy on intrinsically difficult images (SuperHard, SH), images with partial information (PartialInfo, PI), robustness against low resolution (LowResolution, LR), adversarial attacks (AdversarialAttack, AA), and speckle noise (SpeckleNoise, SN).
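A benchmark organized this way is naturally reported as one accuracy figure per component. The sketch below shows one possible way to tally such scores; the record layout `(component, true_label, predicted_label)` and the function name are hypothetical, and only the five component abbreviations come from the description above.

```python
from collections import defaultdict

# Component abbreviations as given in the benchmark description.
COMPONENTS = ("SH", "PI", "LR", "AA", "SN")


def per_component_accuracy(records):
    """Return {component: accuracy} for an iterable of
    (component, true_label, predicted_label) tuples.

    Assumption: this record layout is invented for the example; the
    benchmark's actual file format is not described in the source.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for component, y_true, y_pred in records:
        if component not in COMPONENTS:
            raise ValueError(f"unknown component: {component!r}")
        total[component] += 1
        correct[component] += int(y_true == y_pred)
    # Components with no records are left out rather than reported as 0.
    return {c: correct[c] / total[c] for c in COMPONENTS if total[c]}


# Hypothetical predictions, for illustration only.
example = [
    ("SH", 3, 3), ("SH", 7, 1),
    ("LR", 5, 5), ("LR", 5, 5),
    ("SN", 2, 9),
]
scores = per_component_accuracy(example)
```

Keeping the five scores separate, instead of averaging them, preserves the distinction between the aspects the benchmark is meant to measure.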

This dataset comprises extensive multi-modal data related to the experimental study of ultrasonically excited pulsating fluid jets used for bone cement removal. Conducted at the Institute of Geonics, Ostrava, Czech Republic, the study explores the effect of varying standoff distances on erosion profiles, under controlled parameters including a fixed nozzle diameter, sonotrode frequency, supply pressure, and robot arm velocity. The dataset includes numerical data representing ablation profiles, captured as a large CSV file, and audio recordings captured using a high-resolution microphone.

With the accelerating pace of population aging, the urgency and necessity for elderly individuals to control smart home systems have become increasingly evident. Smart homes not only enhance the independence of older adults, enabling them to complete daily activities more conveniently, but also ensure safety through health monitoring and emergency alert systems, thereby reducing the caregiving burden on families and society.

Drafting is a game mode in collectible card games where players build their decks from a restricted pool of cards. Throughout one draft, players are offered a series of selections, from which they must build their deck. Although drafting is a popular game variant in Magic: The Gathering, few machine learning models have been developed to learn card selection strategies. We model drafts with a Siamese neural network that is trained on real-world data and predicts human expert selection. Our model learns an embedding space of preferences by comparing cards in the context of a deck.
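To make the Siamese idea concrete, here is a minimal sketch in which one shared encoder embeds both the current deck and each offered card, and the predicted pick is the card closest to the deck in that embedding space. Everything specific here is an assumption for illustration: the single linear layer, the triplet loss, the Euclidean distance, and the premise that decks and cards share one feature space are not taken from the abstract and may differ from the authors' architecture.

```python
import math


def encode(weights, features):
    """Shared encoder: the same linear layer is applied to every input,
    which is what makes the setup Siamese (assumed single layer)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(weights, deck, picked, passed, margin=1.0):
    """Assumed training signal: the card the human picked should lie
    closer to the deck embedding than a card that was passed over."""
    anchor = encode(weights, deck)
    d_pos = distance(anchor, encode(weights, picked))
    d_neg = distance(anchor, encode(weights, passed))
    return max(0.0, d_pos - d_neg + margin)


def predict_pick(weights, deck, pack):
    """Return the index of the offered card nearest to the deck."""
    anchor = encode(weights, deck)
    return min(
        range(len(pack)),
        key=lambda i: distance(anchor, encode(weights, pack[i])),
    )


# Hypothetical two-dimensional features, for illustration only.
identity = [[1.0, 0.0], [0.0, 1.0]]
deck_vec = [1.0, 0.0]
pack = [[0.0, 1.0], [0.9, 0.1]]
choice = predict_pick(identity, deck_vec, pack)
```

In a trained model the weights would be fitted by minimizing the loss over recorded human picks; the identity matrix above only stands in for learned parameters.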
