Artificial Intelligence
This dataset is specifically designed for smart contract code search, aiming to facilitate research and development in this domain. To construct the dataset, we collected annotated Solidity smart contract code from two major sources: GitHub and Etherscan. In total, we compiled approximately 470K raw (code, docstring) pairs, ensuring a diverse and comprehensive dataset.
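To illustrate one way (code, docstring) pairs can be mined from annotated Solidity source, the sketch below pairs NatSpec-style `///` comment blocks with the function definitions they precede. This is a simplified assumption for illustration, not the actual collection pipeline used to build the dataset.

```python
import re

def extract_pairs(solidity_source: str):
    """Extract (code, docstring) pairs by matching runs of /// comment
    lines immediately followed by a function signature. Illustrative
    sketch only; the real pipeline is not described in this detail."""
    pattern = re.compile(
        r"((?:^[ \t]*///.*\n)+)"                        # one or more /// lines
        r"([ \t]*function\s+\w+\s*\([^)]*\)[^{]*\{)",   # function signature
        re.MULTILINE,
    )
    pairs = []
    for doc, code in pattern.findall(solidity_source):
        # Join the comment lines into a single docstring, dropping the slashes.
        docstring = " ".join(
            line.strip().lstrip("/").strip() for line in doc.strip().splitlines()
        )
        pairs.append((code.strip(), docstring))
    return pairs
```

A pipeline like this, applied across contracts scraped from GitHub and Etherscan, would yield raw pairs of the kind described, before any deduplication or filtering.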
Dataset 1 includes data on 100 rebar-reinforced rectangular UHPC (ultra-high-performance concrete) beams.
The analysis suggests several innovative ideas for improving English instruction, with an emphasis on current technologies and an inclusive approach: using AI as a peer tutor, exploring virtual reality to create immersive learning environments, analyzing data to produce customized learning materials, integrating local cultural values into instructional materials, implementing a technology-based inclusive learning model, adopting a policy of digital advancement in education, and making the most of contemporary learning resources.
Our paper presents a novel approach to enhance vulnerability descriptions using code-based methods for predicting missing phrase-based concepts. We leverage Large Language Models (LLMs) integrated with 1-hop relationship analysis to address hallucination issues. The code processes Textual Vulnerability Descriptions (TVDs) and extracts relevant security-related concepts such as vulnerability type, root cause, and impact. Our methodology involves generating predictions, verifying these with external knowledge sources, and filtering inaccurate results.
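The verification step described above, keeping only LLM-predicted concepts that an external knowledge source confirms via a 1-hop relationship, can be sketched as follows. The data structures and function names here are hypothetical, not the authors' released code.

```python
def filter_predictions(predictions, knowledge_graph):
    """Keep only (concept, related_concept) predictions confirmed by a
    1-hop edge in an external knowledge source, discarding likely
    hallucinations. Hypothetical sketch of the filtering stage."""
    verified = []
    for concept, related in predictions:
        # 1-hop check: is the predicted related concept a direct neighbor?
        neighbors = knowledge_graph.get(concept, set())
        if related in neighbors:
            verified.append((concept, related))
    return verified
```

In the full pipeline, `predictions` would come from the LLM's phrase-based concept generation over a TVD, and `knowledge_graph` from the external knowledge source used for verification.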
This dataset contains 2,016 sensor responses collected from an array of conductive carbon-black polymer composite sensors exposed to four target analytes (acetonitrile, dichloromethane (DCM), methanol, and toluene) at nine distinct concentration levels ranging from 0.5% to 20% P/P₀. Each sensor was exposed to an analyte for 20 minutes, followed by 20 minutes of nitrogen flushing to restore the baseline. Each response consists of 80 time points (one every 30 seconds), with each time point recording the sensor's resistance under a specific analyte concentration.
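The described layout, 2,016 responses of 80 samples each at 30 s intervals (40 points of exposure plus 40 points of nitrogen flushing), can be sketched as a NumPy array. The random stand-in values and the baseline-normalized feature below are assumptions for illustration, not the dataset's actual file format.

```python
import numpy as np

# Stand-in for the real resistance traces: one row per sensor response.
rng = np.random.default_rng(0)
data = rng.uniform(1.0, 2.0, size=(2016, 80))

exposure = data[:, :40]   # 20 min analyte exposure (40 samples at 30 s)
recovery = data[:, 40:]   # 20 min nitrogen flushing back to baseline

# One common feature per response: maximum fractional resistance change
# relative to the first (baseline) sample.
baseline = data[:, :1]
features = ((exposure - baseline) / baseline).max(axis=1)
```

With real data loaded in place of the random array, `features` would give one scalar descriptor per response, suitable as input to a classifier over the four analytes.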
This dataset provides 98,208 self-play game records of a MahJong agent playing under the MahJong Competition Rules.
All the healthcare facilities in this dataset were collected from the MOH 2018 list of Uganda healthcare facilities (https://library.health.go.ug/sites/default/files/resources/National%20Health%20Facility%20MasterLlist%202017.pdf). Additional features were scraped using the Google Maps API and from some of the healthcare facilities' own websites.
This paper describes a dataset of droplet images captured using the sessile drop technique, intended for applications in wettability analysis, surface characterization, and machine learning model training. The dataset comprises both original and synthetically augmented images to enhance its diversity and robustness for training machine learning models. The original, non-augmented portion of the dataset consists of 420 images of sessile droplets. To increase the dataset size and variability, an augmentation process was applied, generating 1008 additional images.
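An augmentation step like the one described, expanding 420 original droplet images with synthetic variants, could look like the sketch below, using simple geometric transforms on arrays. The specific transforms applied to produce the 1,008 additional images are an assumption; the paper's actual augmentation pipeline may differ.

```python
import numpy as np

def augment(image):
    """Return simple geometric variants of a droplet image (as a 2-D array):
    a horizontal flip and two 90-degree rotations. Illustrative only; the
    dataset's real augmentation choices are not specified here."""
    return [np.fliplr(image), np.rot90(image, 1), np.rot90(image, 3)]
```

Applying a per-image set of variants like this to each of the 420 originals is how an augmented portion of roughly this size could be generated.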
The datasets used for the TASLP manuscript. These datasets are widely recognized and validated within the summarization research community, and using these well-established datasets ensures that our results are comparable with the vast majority of related work. In each dataset, we use 80% as the training set and the remaining 20% as the test set. The range of datasets used for collecting knowledge is consistent with the above. For the large dataset, we use only a part of it, because collecting knowledge is time-consuming.
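The 80/20 split described above can be sketched as a shuffled partition; the seed and shuffling strategy here are assumptions, since the manuscript does not specify them.

```python
import random

def split_80_20(examples, seed=13):
    """Shuffle indices and return (train, test) with an 80/20 split.
    The seed is an assumed value for reproducibility of this sketch."""
    rng = random.Random(seed)
    idx = list(range(len(examples)))
    rng.shuffle(idx)
    cut = int(0.8 * len(examples))
    train = [examples[i] for i in idx[:cut]]
    test = [examples[i] for i in idx[cut:]]
    return train, test
```

A fixed seed keeps the partition stable across runs, which matters when the same split is reused both for model training and for the knowledge-collection stage.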