*.csv

Testing Results from Manuscript "Exploring the Potential of Offline LLMs in Data Science: A Study on Code Generation for Data Analysis"

This dataset contains the testing results presented in the manuscript "Exploring the Potential of Offline LLMs in Data Science: A Study on Code Generation for Data Analysis". It is intended for assessing the code generation capabilities of offline LLMs on data analytics tasks, and is best used after a close reading of the manuscript. A total of 250 testing results were generated for each of the two LLMs evaluated, and the two sets were merged to form this dataset.
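As a rough illustration of the merge described above, the sketch below concatenates two per-model result tables and tags each row with the model it came from. The column names (`task_id`, `passed`), the model labels, and the `merge_results` helper are hypothetical placeholders, not the dataset's actual schema.

```python
import csv
import io


def merge_results(per_model_csv):
    """Concatenate per-model result tables, tagging each row with its model.

    `per_model_csv` maps a model label to CSV text. Labels and columns here
    are illustrative assumptions, not the published schema.
    """
    merged = []
    for model_name, csv_text in per_model_csv.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            merged.append({"model": model_name, **row})
    return merged


# Tiny stand-in tables; the real dataset has 250 rows per model.
model_a = "task_id,passed\n1,yes\n2,no\n"
model_b = "task_id,passed\n1,yes\n2,yes\n"

merged = merge_results({"model_a": model_a, "model_b": model_b})
print(len(merged))
```

Keeping a `model` column in the merged table makes it possible to split the results back out per model later.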

Categories:: Artificial Intelligence

45 Views

PermGuard Android Malware Dataset

The PermGuard dataset is a carefully crafted Android Malware dataset that maps Android permissions to exploitation techniques, providing valuable insights into how malware can exploit these permissions. It consists of 55,911 benign and 55,911 malware apps, creating a balanced dataset for analysis. APK files were sourced from AndroZoo, including applications scanned between January 1, 2019, and July 1, 2024. A novel construction method extracts Android permissions and links them to exploitation techniques, enabling a deeper understanding of permission misuse.

Categories:: Artificial Intelligence
Machine Learning
Security

609 Views

Weapon Engagement Zone Ranges Prediction in BVR Air Combat

This dataset was generated using high-fidelity air combat simulations to develop and evaluate Weapon Engagement Zone (WEZ) prediction models. It contains data for various Beyond Visual Range (BVR) air combat scenarios, capturing diverse conditions and configurations between a shooter aircraft and a target.

The dataset is split into factorial and random design datasets, with outputs representing critical WEZ parameters, including the maximum range (Rmax) and the no-escape zone (Rnez).
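To make the distinction between the two designs concrete, here is a minimal sketch of how a factorial design and a random design differ when sampling simulation inputs. The factor names, levels, and helper functions are assumptions chosen for illustration; they are not taken from the dataset.

```python
import itertools
import random

# Hypothetical input factors and levels (not the dataset's actual variables).
factors = {
    "shooter_altitude_ft": [10000, 25000, 40000],
    "target_altitude_ft": [10000, 25000, 40000],
    "aspect_angle_deg": [0, 90, 180],
}


def factorial_design(factors):
    """Every combination of the listed levels (full factorial)."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in itertools.product(*factors.values())]


def random_design(factors, n, seed=0):
    """n points drawn uniformly between each factor's min and max level."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(min(levels), max(levels))
         for name, levels in factors.items()}
        for _ in range(n)
    ]


grid = factorial_design(factors)
samples = random_design(factors, n=5)
print(len(grid), len(samples))
```

A factorial grid covers the chosen levels exhaustively, while random sampling fills in the space between them. Each input point would then be paired with the simulated Rmax and Rnez outputs.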

Categories:: Other

61 Views

Soil Fertility Data For Fertilizer Recommendation

This dataset provides comprehensive data for predicting the most suitable fertilizer for various crops based on environmental and soil conditions. It includes environmental factors such as temperature, humidity, and moisture, along with soil and crop types and nutrient composition (Nitrogen, Potassium, and Phosphorus). The target variable is the recommended fertilizer name.
The data is already pre-processed and contains no null values.

Categories:: Agriculture
Machine Learning
Sensors

1643 Views

Survey on Agile Practices and Technology Adoption

This dataset contains survey responses collected from Agile practitioners across various roles, including Scrum Masters, Developers, Product Owners, and Agile Coaches, from organizations with diverse Agile practices. The survey aimed to identify the common challenges in backlog refinement, such as time constraints, prioritization issues, and ambiguous user stories. It also explored perceptions of Generative AI's role in streamlining Agile workflows, enhancing productivity, and reducing cognitive load.

Categories:: Artificial Intelligence
Other

80 Views

COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations

Please cite the following paper when using this dataset:

Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).

Abstract:

Categories:: Artificial Intelligence
Education and Learning Technologies
Machine Learning
Computational Intelligence
COVID-19
Demographic
Health

147 Views

Palmer Penguins 100k

To provide machine learning and data science experts with a more robust dataset for model training, the well-known Palmer Penguins dataset has been expanded from its original 344 rows to 100,000 rows. This substantial increase was achieved using an adversarial random forest technique, effectively generating additional synthetic data while maintaining key patterns and features. The method achieved an impressive accuracy of 88%, ensuring the expanded dataset remains realistic and suitable for classification tasks.

Categories:: Machine Learning
Social Sciences

379 Views

MisbehaviorX: Comprehensive V2X Misbehavior Detection Dataset Enabled by the V2X Application Spoofing Platform

This is a comprehensive V2X misbehavior dataset simulated using VASP, an open-source framework. VASP allows the simulation of diverse types of V2X attacks and works as a sub-module of Veins, a well-established open-source framework for running vehicular network simulations. Veins runs on the event-based network simulator OMNeT++ and the road traffic simulator SUMO. Data are collected from the Boston traffic network, which is a good candidate to represent real-world traffic mobility. We ran the VASP simulation for 3,000 simulated seconds to collect benign traces without any attacks.

Categories:: Wireless Networking
Machine Learning
Security
Transportation

341 Views

A Richly Attributed Dataset for Talent Intelligence Evaluation

This is a test case for a talent intelligence evaluation benchmark dataset with rich attributes (Attributes: 11,909; Samples: 244,610), containing information on honors, masterpieces, projects, rankings, and other attributes. Please note that we are providing this for scientific research use only; to use the full dataset, please contact liuying.void@gmail.com.

Categories:: Standards Research Data

52 Views

Deenz Psychopathy Spectrum Scale (DPSS-24) German Population

The Deenz Psychopathy Spectrum Scale (DPSS-24) is a newly developed psychometric instrument aimed at assessing psychopathy traits across diverse adult populations. This study presents preliminary data collected from two distinct samples—a group of 21 participants from an initial testing phase and a German sample of 31 participants. Each participant completed the DPSS-24, a 24-item scale designed to measure various psychopathy-related behaviors, including impulsivity, emotional detachment, and interpersonal difficulties, using a Likert scale ranging from 1 to 5.

Categories:: Social Sciences

97 Views
