*.csv; *.json;
This paper introduces the Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET) Electrical Simulation Dataset (MESD), an extensive collection of I-V and C-V characteristic data simulated with Berkeley Short-channel IGFET Models (BSIMs) from different foundries. MESD covers a range of bias voltages, temperatures, and MOSFET physical dimensions across several technology nodes from 3 to 350 nm.
- Categories:
We collected and organized two years of complete fault work orders from a wind farm and structured them into a fault diagnosis event knowledge graph using the proposed algorithm. The graph includes fault modes, fault impacts, fault symptoms, inspection schemes, root cause identification, and maintenance strategies, covering all potential fault information and handling methods for wind turbines. The dataset records head entity, relation, and tail entity information as triples in JSON format.
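A minimal sketch of how such triples might be loaded and indexed. It assumes each record is a JSON object with the keys `head`, `relation`, and `tail`. These key names and the sample values are hypothetical, and the actual schema of the dataset may differ.

```python
import json
from collections import defaultdict

# Hypothetical sample: key names and values are illustrative only,
# not taken from the dataset itself.
SAMPLE = """
[
  {"head": "gearbox overheating", "relation": "has_symptom", "tail": "high oil temperature"},
  {"head": "gearbox overheating", "relation": "has_root_cause", "tail": "cooling fan failure"},
  {"head": "cooling fan failure", "relation": "maintained_by", "tail": "replace fan motor"}
]
"""


def build_graph(triples):
    """Index triples as head -> relation -> list of tails."""
    graph = defaultdict(lambda: defaultdict(list))
    for triple in triples:
        graph[triple["head"]][triple["relation"]].append(triple["tail"])
    return graph


triples = json.loads(SAMPLE)
graph = build_graph(triples)

# Look up the root causes recorded for one fault mode.
print(graph["gearbox overheating"]["has_root_cause"])
```

Indexing by head entity first keeps lookups such as "all symptoms of a given fault" to a pair of dictionary accesses. A graph library could replace the nested dictionaries if path queries are needed.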
- Categories:
This dataset contains information about code smells, an important issue in software engineering.
It was built by collecting methods that exhibit code smells from GitHub using the SonarCloud tool.
There are 5 code smell classes and 1 normal class, with 500 examples each.
Metadata fields: method (function), smellkey, smellid
Smell Type | ID | Description | Reference
java:S100 | 0
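A minimal sketch of a class-balance check over records with the metadata fields named above (`method`, `smellkey`, `smellid`). The records are hypothetical placeholders. Only the pairing of `java:S100` with ID 0 comes from the listing; the second key and ID are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical records. Only "java:S100" with ID 0 appears in the listing;
# "placeholder:normal" and ID 1 are made up for this sketch.
records = [
    {"method": "void Bad_Name() {}", "smellkey": "java:S100", "smellid": 0},
    {"method": "void Other_Name() {}", "smellkey": "java:S100", "smellid": 0},
    {"method": "void fine() {}", "smellkey": "placeholder:normal", "smellid": 1},
]


def class_counts(rows):
    """Count how many examples carry each smellid."""
    return Counter(row["smellid"] for row in rows)


def is_balanced(counts, expected_per_class):
    """Return True when every class has exactly the expected number of examples."""
    return all(n == expected_per_class for n in counts.values())


counts = class_counts(records)
print(counts)
# The description states 500 examples per class; this toy sample does not meet that.
print(is_balanced(counts, 500))
```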
- Categories:
This study investigates the application of advanced machine learning models, specifically Long Short-Term Memory (LSTM) networks and gradient boosting models, for accurate energy consumption estimation within a Kubernetes cluster environment. It aims to enhance sustainable computing practices by providing precise predictions of energy usage across various computing nodes. Through meticulous analysis of model performance on both master and worker nodes, the research reveals the strengths and potential applications of these models in promoting energy efficiency.
- Categories:
The Text2RDF dataset is primarily designed to facilitate the transformation from text to RDF. It contains 1,000 annotated text segments, encompassing a total of 7,228 triplets. Utilizing this dataset to fine-tune large language models enables the models to extract triplets from text, which can ultimately be used to construct knowledge graphs.
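A minimal sketch of the final step, turning extracted triplets into RDF. It serializes (subject, predicate, object) tuples as N-Triples lines using only the standard library. The `http://example.org/` namespace, the helper names, and the sample triplet are assumptions for illustration, not part of the dataset.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the dataset


def iri(term):
    """Turn a free-text term into an IRI under the hypothetical BASE namespace."""
    return "<" + BASE + quote(term.strip().replace(" ", "_"), safe="_") + ">"


def literal(text):
    """Escape backslashes, quotes, and newlines for an N-Triples string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_ntriples(triplets):
    """Serialize (subject, predicate, object) tuples, treating objects as plain literals."""
    lines = []
    for subj, pred, obj in triplets:
        lines.append(f"{iri(subj)} {iri(pred)} {literal(obj)} .")
    return "\n".join(lines)


# Illustrative triplet, as a model fine-tuned for extraction might emit it.
extracted = [("Marie Curie", "born in", "Warsaw")]
print(to_ntriples(extracted))
```

Objects are emitted as plain literals here for simplicity. A fuller pipeline would decide per triplet whether the object is an entity (IRI) or a value (literal).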
- Categories:
The IEEE Xplore database is vital in democratizing access to high-quality research datasets, fostering global collaboration, and promoting interdisciplinary studies. Insights from the IEEE Xplore database support applications in academic collaboration networks, predictive research trends, recommendation systems, and the evolution of scientific discourse. Our cirdc dataset extracts key information from all articles in the IEEE Xplore database using web data mining methods. Source code and scripts for data collection are provided to promote transparency and reproducibility.
- Categories:
This repo contains the results and analysis data used in the experiment reported in the paper "Anycast and Third-party Libraries: A Recipe for a Privacy Disaster?" (under revision).
In the experiment, we analyzed the personal data transfers of more than 5,500 Android apps and identified both the libraries triggering the transfers and the geolocation of their destinations. The results show that 90% of third-party libraries and 98.65% of apps integrating them potentially fail to meet the requirements for international personal data transfers.
- Categories:
This study investigates whether the ingredients listed on restaurant menus can provide insights into a city's socioeconomic status. Using data from an online food delivery system, the study compares menu items with local education rates and rental prices. A machine learning model is developed to predict menu prices based on ingredients and socioeconomic factors. To address autocorrelation, an efficiency metric is proposed for clustering restaurants, and ingredient averages are compared with socioeconomic indicators.
- Categories:
This research introduces the Open Seizure Database and Toolkit as a novel, publicly accessible resource designed to advance non-electroencephalogram seizure detection research. This paper highlights the scarcity of resources in the non-electroencephalogram domain and establishes the Open Seizure Database as the first openly accessible database containing multimodal sensor data from 49 participants in real-world, in-home environments.
- Categories:
Automatic extraction of valuable, structured evidence from the exponentially growing clinical trial literature can help physicians practice evidence-based medicine quickly and accurately. However, current research on evidence extraction has been limited by poor generalization across clinical topics and the high cost of manual annotation. In this work, we address these challenges by constructing a PICO-based evidence dataset, PICO-DS, covering five clinical topics.
- Categories: