*.csv; *.json;

This paper introduces the Metal-Oxide-Semiconductor Field Effect Transistor (MOSFET) Electrical Simulation Dataset, MESD, an extensive collection of I-V and C-V characteristics data simulated across different foundries' Berkeley Short-channel IGFET Models (BSIMs). The MESD dataset covers a range of bias voltages, temperatures, and MOSFET physical dimensions across several technology nodes from 3 to 350 nm.

129 Views

We collected and organized two years of complete fault work orders from a wind farm and structured them into a fault diagnosis event knowledge graph using the proposed algorithm. The graph includes fault modes, fault impacts, fault symptoms, inspection schemes, root cause identification, and maintenance strategies, covering all potential fault information and handling methods for wind turbines. The dataset records each (head entity, relation, tail entity) triple in JSON format.
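As a rough illustration of how triples stored as JSON might be loaded and indexed, the following minimal Python sketch may help. The field names `head`, `relation`, and `tail`, the relation labels, and the sample records are all assumptions invented for this example; the dataset's actual schema and contents may differ.

```python
import json
from collections import defaultdict

# Hypothetical sample: the keys "head", "relation", "tail" and every value
# below are invented for illustration, not taken from the dataset.
sample = """
[
  {"head": "gearbox overheating", "relation": "has_symptom", "tail": "high oil temperature"},
  {"head": "gearbox overheating", "relation": "has_root_cause", "tail": "cooling fan failure"},
  {"head": "cooling fan failure", "relation": "has_maintenance_strategy", "tail": "replace fan motor"}
]
"""


def build_graph(triples):
    """Index triples as head entity -> list of (relation, tail entity) pairs."""
    graph = defaultdict(list)
    for triple in triples:
        graph[triple["head"]].append((triple["relation"], triple["tail"]))
    return dict(graph)


triples = json.loads(sample)
graph = build_graph(triples)

# List every outgoing edge of one head entity.
for relation, tail in graph["gearbox overheating"]:
    print(relation, "->", tail)
```

Indexing by head entity makes it cheap to follow a chain such as fault mode, root cause, and maintenance strategy, which is the kind of lookup a fault diagnosis graph is typically used for.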

745 Views

This dataset contains information about code smells, an important issue in software engineering.

It was built by collecting methods that exhibit code smells from GitHub using the SonarCloud tool.

There are 5 code smell classes and 1 normal class, with 500 examples each.

The metadata fields are: method (function), smellkey, smellid.

Smell Type | ID | Description | Reference
java:S100 | 0

463 Views

This study investigates the application of advanced machine learning models, specifically Long Short-Term Memory (LSTM) networks and Gradient Booster models, for accurate energy consumption estimation within a Kubernetes cluster environment. It aims to enhance sustainable computing practices by providing precise predictions of energy usage across various computing nodes. Through meticulous analysis of model performance on both master and worker nodes, the research reveals the strengths and potential applications of these models in promoting energy efficiency.

453 Views

The Text2RDF dataset is primarily designed to facilitate the transformation from text to RDF. It contains 1,000 annotated text segments, encompassing a total of 7,228 triplets. Utilizing this dataset to fine-tune large language models enables the models to extract triplets from text, which can ultimately be used to construct knowledge graphs.

333 Views

The IEEE Xplore database is vital in democratizing access to high-quality research datasets, fostering global collaboration, and promoting interdisciplinary studies. Insights from the IEEE Xplore database support applications in academic collaboration networks, predictive research trends, recommendation systems, and the evolution of scientific discourse. Our cirdc dataset extracts key information from all articles in the IEEE Xplore database using web data mining methods. Source code and scripts for data collection are provided to promote transparency and reproducibility.

92 Views

This repo contains the results and analysis data used in the experiment reported in the paper "Anycast and Third-party Libraries: A Recipe for a Privacy Disaster?" (under revision).

We conducted an experiment analyzing the personal data transfers of more than 5,500 Android apps, identifying the libraries that trigger the transfers and the geolocation of their destinations. The results show that 90% of third-party libraries and 98.65% of apps integrating them potentially fail to meet the requirements for international personal data transfers.

77 Views

This study investigates whether the ingredients listed on restaurant menus can provide insights into a city's socioeconomic status. Using data from an online food delivery system, the study compares menu items with local education rates and rental prices. A machine learning model is developed to predict menu prices based on ingredients and socioeconomic factors. An efficiency metric is proposed to cluster restaurants to address autocorrelation, comparing ingredient averages to socioeconomic indicators.

274 Views

This research introduces the Open Seizure Database and Toolkit as a novel, publicly accessible resource designed to advance non-electroencephalogram seizure detection research. This paper highlights the scarcity of resources in the non-electroencephalogram domain and establishes the Open Seizure Database as the first openly accessible database containing multimodal sensor data from 49 participants in real-world, in-home environments.

1415 Views

Automatic extraction of valuable, structured evidence from the exponentially growing clinical trial literature can help physicians practice evidence-based medicine quickly and accurately. However, current research on evidence extraction has been limited by poor generalization across clinical topics and the high cost of manual annotation. In this work, we address these challenges by constructing a PICO-based evidence dataset, PICO-DS, covering five clinical topics.

158 Views
