*.csv; *.json;
A fact-checking dataset focused exclusively on quantitative claims. It includes 33,422 fact-checked claims featuring comparative, statistical, interval, and temporal entities. Each claim is accompanied by detailed metadata and supporting evidence, providing a robust foundation for automated verification. This dataset contains claims and their corresponding fact-checking details. It is provided in JSON format, with each entry containing information about a claim, its processed version, fact-checking results, and relevant metadata.
- Categories:
Partial dataset of CHVM-1K dataset for illustration purposes.
{
"question": "What stages can be divided into in the development history of ancient Chinese bronzes? Why?",
"answer": "The development history of ancient Chinese bronzes can be divided into several stages: Xia (2100-1600 BCE), Shang (1600-1046 BCE), Early Western Zhou (1046-771 BCE), Middle Western Zhou (771-720 BCE), Late Western Zhou (720-256 BCE), and Eastern Zhou (256-256 BCE). These stages are marked by technological advancements, stylistic evolution, and cultural significance.",
- Categories:
This paper introduces the Metal-Oxide-Semiconductor Field Effect Transistor (MOSFET) Electrical Simulation Dataset, MESD, an extensive collection of I-V and C-V characteristics data simulated across different foundries' Berkeley Short-channel IGFET Models (BSIMs). The MESD dataset covers a range of bias voltages, temperatures, and MOSFET physical dimensions across several technology nodes from 3 to 350 nm.
- Categories:
We organized and collected two years' worth of complete fault work orders from a wind farm, and structured these work orders into a fault diagnosis event knowledge graph using the proposed algorithm. This graph includes fault modes, fault impacts, fault symptoms, inspection schemes, root cause identification, and maintenance strategies, covering all potential fault information and handling methods for wind turbines. This dataset records the head entity-relation-tail entity information in the form of triples using JSON format.
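A minimal sketch of how head-relation-tail triples stored as JSON could be loaded into a lookup structure. The key names `head`, `relation`, and `tail`, and the sample records themselves, are invented for illustration; the dataset's actual schema and contents may differ.

```python
import json
from collections import defaultdict

# Invented sample records. The keys "head", "relation", "tail" are assumptions,
# not the dataset's documented schema.
SAMPLE = """
[
  {"head": "gearbox overheating", "relation": "has_symptom", "tail": "high oil temperature"},
  {"head": "gearbox overheating", "relation": "has_root_cause", "tail": "cooling fan failure"},
  {"head": "cooling fan failure", "relation": "has_maintenance_strategy", "tail": "replace fan motor"}
]
"""

def build_graph(triples):
    """Index triples as head -> relation -> list of tails."""
    graph = defaultdict(lambda: defaultdict(list))
    for triple in triples:
        graph[triple["head"]][triple["relation"]].append(triple["tail"])
    return graph

triples = json.loads(SAMPLE)
graph = build_graph(triples)

# Look up every tail linked to a fault mode by a given relation.
print(graph["gearbox overheating"]["has_root_cause"])
```

Indexing by head entity first keeps a "what do we know about this fault?" query to a single dictionary lookup, which suits the diagnosis use case described above.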
- Categories:
This dataset contains information about code smells, an important issue in software engineering.
It was built by collecting methods that exhibit code smells from GitHub using the SonarCloud tool.
It covers 5 code smell classes and 1 normal class, with 500 examples each.
Metadata fields: method (function), smellkey, smellid
Smell Type | ID | Description | Reference
java:S100 | 0
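As a rough sanity check on the stated class balance (6 classes of 500 examples, so 3,000 records in total), the sketch below counts records per `smellid`. The field names come from the metadata listed above; the records are synthetic placeholders, and the assumption that the normal class has its own `smellid` value is not confirmed by the description.

```python
from collections import Counter

def check_balance(records, expected_classes=6, expected_per_class=500):
    """Count records per smellid and report whether the classes are balanced."""
    counts = Counter(record["smellid"] for record in records)
    balanced = (
        len(counts) == expected_classes
        and all(n == expected_per_class for n in counts.values())
    )
    return counts, balanced

# Synthetic placeholder records, NOT real dataset rows.
# Assumption: each of the 6 classes (5 smells + normal) has a distinct smellid.
records = [
    {"method": f"void m{i}() {{}}", "smellkey": f"placeholder-{c}", "smellid": c}
    for c in range(6)
    for i in range(500)
]

counts, balanced = check_balance(records)
print(len(records), balanced)
```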
- Categories:
This study investigates the application of advanced machine learning models, specifically Long Short-Term Memory (LSTM) networks and Gradient Booster models, for accurate energy consumption estimation within a Kubernetes cluster environment. It aims to enhance sustainable computing practices by providing precise predictions of energy usage across various computing nodes. Through meticulous analysis of model performance on both master and worker nodes, the research reveals the strengths and potential applications of these models in promoting energy efficiency.
- Categories:
The Text2RDF dataset is primarily designed to facilitate the transformation from text to RDF. It contains 1,000 annotated text segments, encompassing a total of 7,228 triplets. Utilizing this dataset to fine-tune large language models enables the models to extract triplets from text, which can ultimately be used to construct knowledge graphs.
- Categories:
The IEEE Xplore database is vital in democratizing access to high-quality research datasets, fostering global collaboration, and promoting interdisciplinary studies. Insights from the IEEE Xplore database support applications in academic collaboration networks, predictive research trends, recommendation systems, and the evolution of scientific discourse. Our cirdc dataset extracts key information from all articles in the IEEE Xplore database using web data mining methods. Source code and scripts for data collection are provided to promote transparency and reproducibility.
- Categories:
This repo contains the results and analysis data used in the experiment reported in the paper "Anycast and Third-party Libraries: A Recipe for a Privacy Disaster?" (under revision).
To this end, we conducted an experiment where we analyzed the personal data transfers of more than 5,500 Android apps, further identifying the libraries triggering the transfers and the destinations’ geolocation. The results show that 90% of third-party libraries and 98.65% of apps integrating them potentially fail to meet the requirements for international personal data transfers.
- Categories:
This study investigates whether the ingredients listed on restaurant menus can provide insights into a city's socioeconomic status. Using data from an online food delivery system, the study compares menu items with local education rates and rental prices. A machine learning model is developed to predict menu prices based on ingredients and socioeconomic factors. An efficiency metric is proposed to cluster restaurants to address autocorrelation, comparing ingredient averages to socioeconomic indicators.
- Categories: