*.csv

Internet Traffic (KDD Cup 99)

<p><span style="font-family: 'Times New Roman'; font-size: medium;">This is the data set used for The Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99 The Fifth International Conference on Knowledge Discovery and Data Mining. The competition task was to build a network intrusion detector, a predictive model capable of distinguishing between ``bad'' connections, called intrusions or attacks, and ``good'' normal connections.

Categories:: Machine Learning

13 Views

Large-scale Statistical Keyword Co-occurrence Network

We collect metadata including published year and keywords for 84,725 papers published in 42 statistical journals from 1992 to 2021 from the Web of Science (www.webofscience.com). After combining different expressions of the same keyword and filtering out keywords with low frequency, we finally obtain 5,037 keywords. Multiple keywords co-exist within a paper, and this co-occurrence relationship can be utilized to construct the keyword co-occurrence network.

Categories:: Social Sciences

23 Views

Electricity Demand and Weather Data (2021–2023)

This dataset contains hourly electricity demand data and corresponding weather indicators collected from 2021 to 2023. The electricity data was sourced from the U.S. Energy Information Administration (EIA), covering both winter and summer periods across three years. Weather features—including temperature, wind speed, and humidity—were collected to capture the external conditions affecting demand. All files are stored in CSV format and aligned by timestamp. This dataset supports research in time series forecasting, demand prediction, and energy systems modeling.

Categories:: Power and Energy

123 Views

Droidware Android Malware Dataset

Droidware is an Android malware dataset developed at the Cybersecurity Lab, GLA University, India. It comprises 253,527 applications, including 129,950 benign and 123,577 malicious samples. The dataset captures 68 features extracted from function call graphs, permissions, and Java source code, providing a comprehensive view of Android malware behavior. This latest and up-to-date dataset supports the training of AI-based malware detection models, aiding in the development of robust malware classification and threat mitigation strategies for cybersecurity research.

Categories:: Artificial Intelligence
Machine Learning

44 Views

VOCs from blood culture broth samples

This dataset comprises volatile organic compound (VOC) profiles collected from blood culture broth samples using an electronic nose (E-nose) system. The samples include cultures positive for Candida spp., including C. albicans, C. glabrata, C. tropicalis, among others, as well as negative control samples.

Categories:: Sensors

19 Views

Bitcoin Tweets 2022

Bitcoin(₿) is a cryptocurrency invented in 2008 by an unknown person or group of people using the pseudonym Satoshi Nakamoto. The currency began use in 2009 when its implementation was released as open-source software.

Categories:: Artificial Intelligence
Social Sciences
Communications

154 Views

Quality of Service 5G

A rapid growth of wireless communication networks, particularly in 5G Non-Standalone (NSA) deployments, has necessitated advanced multiple access techniques to enhance spectral efficiency, interference management, and energy optimization [1-3]. Rate-Splitting Multiple Access (RSMA) has arisen as a strong candidate to replace conventional Non-Orthogonal Multiple Access (NOMA) by efficiently splitting user data into common and private components. [1-2].

Categories:: Communications

52 Views

MET nonlinear

This dataset comprises a collection of CSV files containing paired time-series measurements essential for nonlinear compensation research in electrochemical seismometers (MET). Each CSV file, named according to specific magnitude-frequency combinations (magX_freqY.csv), contains two columns: 'origin' representing the original system response and 'target' representing the desired compensated output.

Categories:: Sensors

32 Views

SFI Smart Ocean dataset for underwater acoustic communications

At-sea testing of underwater acoustic communication systems requires resources unavailable to the wider research community, and researchers often resort to simplified channel models to test new protocols. The present dataset comprises in-situ hydrophone recordings of communications and channel probing waveforms, featuring an assortment of popular modulation formats. The waveforms were transmitted in three frequency bands (4-8 kHz, 9-14 kHz, and 24-32 kHz) during an overnight experiment in an enclosed fjord environment, and were recorded on two hydrophone receivers.

Categories:: Signal Processing
Communications

154 Views

CIR for Rapid and Gradual Deceleration

This study presents a deep learning-based framework for detecting vehicle deceleration patterns using Ultra-Wideband (UWB) Channel Impulse Response (CIR) analysis. Unlike traditional GPS or IMU-based systems, which struggle in GPS-denied environments such as tunnels, the proposed method leverages UWB CIR signal variations to classify two key driving behaviors: rapid deceleration and gradual deceleration. All data were collected from real-world experiments using UWB devices installed on actual vehicles at a professional highway testing site.

Categories:: IoT
Machine Learning
Sensors
Transportation

19 Views

*.csv

*.csv

Pages