
This dataset contains the testing results presented in the manuscript "Exploring the Potential of Offline LLMs in Data Science: A Study on Code Generation for Data Analysis". It is intended for assessing the capabilities of offline LLMs in code generation for data analytics tasks, and it is best used after reading the manuscript. A total of 250 testing results were generated for each of the two LLMs evaluated, and the two sets were merged to create this dataset.
- Categories:
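The merge of the two per-model result sets can be sketched as follows. This is a minimal illustration, not the authors' procedure: the column names (`task_id`, `passed`), the labels `model_a` and `model_b`, and the sample rows are all hypothetical stand-ins, since the actual file layout is described in the manuscript rather than here.

```python
import csv
import io


def merge_results(per_model_csv):
    """Concatenate per-model result tables, tagging each row with its model."""
    merged = []
    for model_name, csv_text in per_model_csv.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            # Keep every original column and add the model label (assumed name).
            merged.append({"model": model_name, **row})
    return merged


# Tiny made-up tables; the real dataset holds 250 results per model.
model_a = "task_id,passed\n1,yes\n2,no\n"
model_b = "task_id,passed\n1,yes\n2,yes\n"

merged = merge_results({"model_a": model_a, "model_b": model_b})
print(len(merged))
```

With 250 rows per model, the same function would yield 500 merged rows, each carrying a `model` label so the two LLMs can still be compared after merging.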

The PermGuard dataset is a carefully crafted Android Malware dataset that maps Android permissions to exploitation techniques, providing valuable insights into how malware can exploit these permissions. It consists of 55,911 benign and 55,911 malware apps, creating a balanced dataset for analysis. APK files were sourced from AndroZoo, including applications scanned between January 1, 2019, and July 1, 2024. A novel construction method extracts Android permissions and links them to exploitation techniques, enabling a deeper understanding of permission misuse.
- Categories:

This dataset was generated using high-fidelity air combat simulations to develop and evaluate Weapon Engagement Zone (WEZ) prediction models. It contains data for various Beyond Visual Range (BVR) air combat scenarios, capturing diverse conditions and configurations between a shooter aircraft and a target.
The dataset is split into factorial and random design datasets, with outputs representing critical WEZ parameters, including the maximum range (Rmax) and the no-escape zone (Rnez).
- Categories:
This dataset provides comprehensive data for predicting the most suitable fertilizer for various crops based on environmental and soil conditions. It includes environmental factors like temperature, humidity, and moisture, along with soil and crop types, and nutrient composition (Nitrogen, Potassium, and Phosphorous). The target variable is the recommended fertilizer name.
The data is already pre-processed and contains no null values.
- Categories:
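The no-null claim is easy to confirm before training. The sketch below counts missing cells per column using only the standard library; the column names, the sample rows, and the set of strings treated as missing are assumptions for illustration, not taken from the dataset.

```python
import csv
import io

# Strings treated as missing; this set is an assumption, adjust as needed.
NULL_TOKENS = {"", "na", "nan", "null", "none"}


def count_missing(csv_text):
    """Return a {column: number_of_missing_cells} mapping for a CSV string."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row; ignored in this sketch.
                continue
            is_missing = value is None or value.strip().lower() in NULL_TOKENS
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


# Made-up rows with one deliberately empty cell to show a non-zero count.
sample = "temperature,humidity,fertilizer\n26,52,Urea\n29,,DAP\n"
print(count_missing(sample))
```

On the real file, every count should be zero if the description holds.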

This dataset contains survey responses collected from Agile practitioners across various roles, including Scrum Masters, Developers, Product Owners, and Agile Coaches, from organizations with diverse Agile practices. The survey aimed to identify the common challenges in backlog refinement, such as time constraints, prioritization issues, and ambiguous user stories. It also explored perceptions of Generative AI's role in streamlining Agile workflows, enhancing productivity, and reducing cognitive load.
- Categories:

Please cite the following paper when using this dataset:
Vanessa Su and Nirmalya Thakur, “COVID-19 on YouTube: A Data-Driven Analysis of Sentiment, Toxicity, and Content Recommendations”, Proceedings of the IEEE 15th Annual Computing and Communication Workshop and Conference 2025, Las Vegas, USA, Jan 06-08, 2025 (Paper accepted for publication, Preprint: https://arxiv.org/abs/2412.17180).
Abstract:
- Categories:
To provide machine learning and data science experts with a more robust dataset for model training, the Palmer Penguins dataset has been expanded from its original 344 rows to 100,000 rows. The additional rows are synthetic data generated with an adversarial random forest technique, which is intended to preserve the key patterns and features of the original data. The method achieved an accuracy of 88%, so the expanded dataset should remain realistic and suitable for classification tasks.
- Categories:

This is a comprehensive V2X misbehavior dataset simulated using VASP, an open-source framework. VASP allows the simulation of diverse types of V2X attacks and works as a sub-module of Veins, a well-established open-source framework for running vehicular network simulations. Veins runs on the event-based network simulator OMNeT++ and the road traffic simulator SUMO. Data are collected from the Boston traffic network, which is a good candidate to represent real-world traffic mobility. We ran the VASP simulation for 3,000 simulated seconds to collect benign traces without any attacks.
- Categories:
This is a test case for a talent intelligence evaluation benchmark dataset with rich attributes (attributes: 11,909; samples: 244,610), containing information on honors, masterpieces, projects, rankings, and other attributes. Please note that it is provided for scientific research use only; to use the full dataset, please contact liuying.void@gmail.com.
- Categories:

The Deenz Psychopathy Spectrum Scale (DPSS-24) is a newly developed psychometric instrument aimed at assessing psychopathy traits across diverse adult populations. This study presents preliminary data collected from two distinct samples: a group of 21 participants from an initial testing phase and a German sample of 31 participants. Each participant completed the DPSS-24, a 24-item scale designed to measure various psychopathy-related behaviors, including impulsivity, emotional detachment, and interpersonal difficulties, using a Likert scale ranging from 1 to 5.
- Categories:
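A response record for a 24-item scale rated 1 to 5 can be validated before analysis, as in the sketch below. The function name is hypothetical, and summing the items into a total is an assumption made here for illustration only: the description does not state the official DPSS-24 scoring rule, subscales, or any reverse-keyed items.

```python
N_ITEMS = 24        # items on the scale, per the description
LIKERT_MIN = 1      # lowest allowed rating
LIKERT_MAX = 5      # highest allowed rating


def total_score(responses):
    """Validate one participant's responses and return their plain sum.

    A plain sum is an illustrative assumption, not the documented scoring.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for index, rating in enumerate(responses, start=1):
        if not isinstance(rating, int) or not LIKERT_MIN <= rating <= LIKERT_MAX:
            raise ValueError(f"item {index} has invalid rating {rating!r}")
    return sum(responses)


print(total_score([3] * N_ITEMS))
```

Under this assumed rule, totals range from 24 (all items rated 1) to 120 (all items rated 5).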