
Text2RDF: LLM Fine-tuning Dataset for NER and RE

The Text2RDF dataset is primarily designed to facilitate the transformation from text to RDF. It contains 1,000 annotated text segments, encompassing a total of 7,228 triplets. Utilizing this dataset to fine-tune large language models enables the models to extract triplets from text, which can ultimately be used to construct knowledge graphs.

Categories: Artificial Intelligence

395 Views

CIRDC

The IEEE Xplore database is vital in democratizing access to high-quality research datasets, fostering global collaboration, and promoting interdisciplinary studies. Insights from the IEEE Xplore database support applications in academic collaboration networks, predictive research trends, recommendation systems, and the evolution of scientific discourse. Our CIRDC dataset extracts key information from all articles in the IEEE Xplore database using web data mining methods. Source code and scripts for data collection are provided to promote transparency and reproducibility.

Categories: Other

112 Views

Anycast and Third-party Libraries: A Recipe for a Privacy Disaster?

This repository contains the results and analysis data from the experiment reported in the paper "Anycast and Third-party Libraries: A Recipe for a Privacy Disaster?".

In this experiment, we analyzed the personal data transfers of more than 5,500 Android apps, identifying the libraries that trigger the transfers and the geolocation of the destinations. The results show that 90% of third-party libraries and 98.65% of apps integrating them potentially fail to meet the requirements for international personal data transfers.

Categories: Communications

100 Views

Pita and pizza restaurants and menus in Ankara

This study investigates whether the ingredients listed on restaurant menus can provide insights into a city's socioeconomic status. Using data from an online food delivery system, the study compares menu items with local education rates and rental prices. A machine learning model is developed to predict menu prices based on ingredients and socioeconomic factors. An efficiency metric is proposed to cluster restaurants to address autocorrelation, comparing ingredient averages to socioeconomic indicators.

Categories: Other, Social Sciences, Demographic, Education, Financial

314 Views

Open Seizure Database v1.0.0

This work introduces the Open Seizure Database and Toolkit as a novel, publicly accessible resource designed to advance non-electroencephalogram seizure detection research. It highlights the scarcity of resources in the non-electroencephalogram domain and establishes the Open Seizure Database as the first openly accessible database containing multimodal sensor data from 49 participants in real-world, in-home environments.

Categories: Artificial Intelligence, Signal Processing, Discrete-time signal processing, Sensors, Biomedical and Health Sciences

1768 Views

PICO-DS

Automatic extraction of valuable, structured evidence from the exponentially growing clinical trial literature can help physicians practice evidence-based medicine quickly and accurately. However, current research on evidence extraction has been limited by poor generalization across clinical topics and the high cost of manual annotation. In this work, we address these challenges by constructing PICO-DS, a PICO-based evidence dataset covering five clinical topics.