CSV

Along with the continuous growth of Internet usage, mobile users are becoming increasingly relevant as they are responsible for the largest percentage of web traffic. Consequently, a large and growing body of literature has been based on cellular data to gain a deeper understanding of several Internet-related concerns. Nevertheless, accessing high-quality cellular datasets can be a challenge for research teams due to scarcity and restricted access.

The networks are stored under the data/ folder, one file per network. The filename should be <network>.csv.
One line per interaction/edge.
Each line should be: user, item, timestamp, state label, comma-separated array of features.
First line is the network format.
User and item fields can be alphanumeric.
Timestamp should be in cardinal (numeric) format, not in datetime format.
State label should be 1 whenever the user state changes, 0 otherwise. If there are no state labels, use 0 for all interactions.
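
The layout above can be read with a few lines of Python. The following is a minimal sketch, not the dataset authors' loader: the names `Interaction` and `read_network`, the sample rows, and the assumptions that the first line is a header to skip and that all features are numeric are illustrative only.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Interaction(NamedTuple):
    """One user-item interaction (edge); a hypothetical container."""
    user: str
    item: str
    timestamp: float
    state_label: int
    features: List[float]


def read_network(lines: Iterable[str]) -> List[Interaction]:
    """Parse rows of: user, item, timestamp, state label, features..."""
    reader = csv.reader(lines)
    next(reader, None)  # assumption: the first line describes the format, so skip it
    interactions = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        user, item, timestamp, label, *features = row
        interactions.append(
            Interaction(
                user.strip(),             # may be alphanumeric
                item.strip(),             # may be alphanumeric
                float(timestamp),         # cardinal value, not a datetime
                int(label),               # 1 on a state change, else 0
                [float(x) for x in features],  # assumption: numeric features
            )
        )
    return interactions


# Made-up sample standing in for a data/<network>.csv file.
sample = io.StringIO(
    "user,item,timestamp,state_label,features\n"
    "u1,i7,0,0,0.1,0.5\n"
    "u2,i7,36,1,0.3,0.2\n"
)
interactions = read_network(sample)
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `open("data/<network>.csv", newline="")` with the actual network name substituted.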

The JKU-ITS AVDM contains data from 17 participants performing different tasks with various levels of distraction.
The data collection was carried out in accordance with the relevant guidelines and regulations and informed consent was obtained from all participants.
The dataset was collected using the JKU-ITS research vehicle with automated capabilities under different illumination and weather conditions along a secure test route within the

Nasal Cytology, or Rhinology, is a subfield of otolaryngology focused on the microscopic observation of samples of the nasal mucosa, with the aim of recognizing cells of different types and of spotting and diagnosing ongoing pathologies. The methodology can claim good accuracy in diagnosing rhinitis and infections, and it is very cheap and accessible, requiring no instrument more complex than a microscope, even an optical one.

a Sinkhole attack in a dataset, we'll generate data that typically reflects network traffic and interactions, where some nodes act as sinkholes by attracting all or most of the traffic due to malicious intent. Here's how I'll structure the dataset for 80,000 records:

The LIAR dataset has been widely followed by fake news detection researchers since its release, and, alongside a great deal of research, the community has provided a variety of feedback to improve it. We incorporated this feedback and released the LIAR2 dataset, a new benchmark dataset of ~23k statements manually labeled by professional fact-checkers for fake news detection tasks.

Raw datasets of PDA10A receiver signals acquired with an MSO2024 oscilloscope for Time Difference of Arrival Optical Wireless Positioning measurements.
The data are stored in CSV, NumPy, and pickle formats so they can be easily imported and processed in Python.
The first dataset consists of measurements on a grid without reflections.
The second dataset consists of identical measurements, but with a corner geometry at the fourth receiver that creates diffuse reflections.
