This dataset contains a list of 284 popular websites and the URLs of their privacy statements. The websites belong to the three largest South Asian economies: India, Pakistan, and Bangladesh. Each website is assigned to one of 10 sectors: e-commerce, finance/banking, education, healthcare, news, government, telecom, buy and sell, job/freelance, and blogging/discussion. We hope that this dataset will help researchers investigate website privacy compliance.

Instructions: 

The dataset is split by country, with one sheet per country. Each sheet contains five columns:

  • Column 1: the sector that the website belongs to.
  • Column 2: the sources used to collect the websites belonging to that sector.
  • Column 3: the name of the website.
  • Column 4: the URL of the website.
  • Column 5: the URL of the privacy policy/statement, if the website provides one. An empty cell in this column indicates that the website does not provide a privacy statement.
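As a minimal sketch of how the five-column layout could be used, the Python snippet below groups websites that lack a privacy statement by sector. It assumes each row has already been read from a sheet as a five-item tuple in the column order described above. The function name and the sample rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def sites_without_policy(rows):
    """Group website names by sector for rows whose privacy-URL cell is empty."""
    missing = defaultdict(list)
    for sector, _sources, name, _site_url, policy_url in rows:
        # An empty (or missing) cell in column 5 means no privacy statement.
        if not (policy_url or "").strip():
            missing[sector].append(name)
    return dict(missing)


# Made-up rows for illustration only; they do not come from the dataset.
sample_rows = [
    ("news", "example source", "Example News", "https://news.example", "https://news.example/privacy"),
    ("news", "example source", "Example Daily", "https://daily.example", ""),
    ("telecom", "example source", "Example Telecom", "https://telecom.example", None),
]

print(sites_without_policy(sample_rows))
```

The same function could be run once per country sheet to compare how often privacy statements are missing across sectors.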

3 Views

Today, more and more Internet of Things devices are deployed, and the field of applications for decentralized, self-organizing networks keeps growing. The growth also makes these systems more attractive to attackers. Sybil attacks are a common issue, especially in decentralized networks and networks deployed in scenarios with irregular or unreliable Internet connectivity.

Instructions: 

The main part of the data set is the file "20200623_Rechained--v08.xml", which contains a CPN model for the Rechained protocol. To use this file, rename it to "20200623_Rechained--v08.cpn". The renamed file can be opened with "CPN Tools", an open-source, GPL-licensed tool for working with Colored Petri nets. CPN Tools is available at: http://cpntools.org/
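The renaming step can also be scripted. The Python sketch below copies the model to a sibling file with the ".cpn" extension instead of renaming it, so the downloaded file stays untouched. That is a choice made here, not a requirement of the data set. The helper name is hypothetical, and the demo runs on a placeholder file in a temporary directory rather than on the real model.

```python
import shutil
import tempfile
from pathlib import Path


def copy_as_cpn(xml_path):
    """Copy the model to a sibling file with a .cpn extension and return the new path."""
    src = Path(xml_path)
    dst = src.with_suffix(".cpn")  # replaces only the final ".xml" suffix
    shutil.copyfile(src, dst)
    return dst


# Demo on a placeholder file; point copy_as_cpn at the downloaded model instead.
with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "20200623_Rechained--v08.xml"
    model.write_text("<placeholder/>")
    print(copy_as_cpn(model).name)
```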

 

The file "20200617_Protocol-Semantics.pdf" contains the protocol semantics, CPN token color sets, names and acronyms used.

 

The file "20200623_Rechained--v08--state-space-analysis.txt" contains the results of the state space analysis.

35 Views

This is a dataset for testing purposes.

36 Views

The following data set is modelled after the implementers’ test data in 3GPP TS 33.501 “Security architecture and procedures for 5G System” with the same terminology. The data set corresponds to SUCI (Subscription Concealed Identifier) computation in the 5G UE (User Equipment) for IMSI (International Mobile Subscriber Identity) based SUPI (Subscription Permanent Identifier) and ECIES Profile A.

Instructions: 

The following data set is modelled after the implementers’ test data in 3GPP TS 33.501 “Security architecture and procedures for 5G System” with the same terminology. The data set corresponds to SUCI (Subscription Concealed Identifier) computation in the 5G UE (User Equipment) for IMSI (International Mobile Subscriber Identity) based SUPI (Subscription Permanent Identifier) and ECIES Profile A. The MCC|MNC part of the IMSI is '274012'.

In the 5G system, the globally unique 5G subscription permanent identifier is called the SUPI, as defined in 3GPP TS 23.501. For privacy reasons, the SUPI from 5G devices should not be transferred in clear text and is instead concealed inside the privacy-preserving SUCI. Consequently, the SUPI is privacy protected over the air on the 5G radio network by using the SUCI. For SUCIs containing an IMSI-based SUPI, the UE in essence conceals the MSIN (Mobile Subscriber Identification Number) part of the IMSI. On the 5G operator side, the SIDF (Subscription Identifier De-concealing Function) of the UDM (Unified Data Management) is responsible for de-concealing the SUCI and resolves the SUPI from the SUCI based on the protection scheme used to generate the SUCI.

The SUCI protection scheme used in this data set is ECIES Profile A. The scheme output consists of a 256-bit public key, a 64-bit MAC, and a 40-bit encrypted MSIN, 360 bits in total. The SUCI scheme-input MSIN is coded as hexadecimal digits using packed BCD coding where the order of digits within an octet is the same as the order of the MSIN digits. Because the MSINs have an odd number of digits, bits 5 to 8 of the final octet are coded as ‘1111’.
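The packed BCD coding of the scheme input can be sketched in Python as follows. This is an illustration under one stated assumption: since the filler ‘1111’ occupies bits 5 to 8 of the final octet, the sketch places the earlier digit of each pair in bits 1 to 4 and the later digit in bits 5 to 8. The function name and the sample MSIN are illustrative and are not taken from the data set files.

```python
def msin_to_packed_bcd(msin: str) -> bytes:
    """Pack MSIN digits two per octet, padding an odd-length MSIN with 0xF."""
    if not (msin.isascii() and msin.isdigit()):
        raise ValueError("MSIN must consist of decimal digits only")
    digits = [int(c) for c in msin]
    if len(digits) % 2:
        digits.append(0xF)  # filler '1111' ends up in bits 5 to 8 of the final octet
    packed = bytearray()
    for i in range(0, len(digits), 2):
        earlier, later = digits[i], digits[i + 1]
        # Assumption: earlier digit in bits 1-4 (low nibble), later digit in bits 5-8.
        packed.append((later << 4) | earlier)
    return bytes(packed)


encoded = msin_to_packed_bcd("001002086")  # illustrative 9-digit MSIN
print(encoded.hex().upper())               # 00012080F6
print(len(encoded) * 8)                    # 40 bits, matching the encrypted MSIN size
```

A 9-digit MSIN packs into 5 octets, which is consistent with the 40-bit encrypted MSIN in the scheme output.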

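# Example Python code to load the data into a Spark DataFrame

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists
df = spark.read.format("csv").option("inferSchema", "true").option("header", "true").option("sep", ",").load("5g_suci_using_ecies_profile_a_100k.gz")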

292 Views

Secure cryptographic protocols are indispensable for modern communication systems. Their security is realized through encryption. In quantum cryptography, Quantum Key Distribution (QKD) is a widely used quantum communication scheme that enables two parties to establish a shared secret key that can be used to encrypt and decrypt messages.

27 Views

Presented here is a dataset used for our SCADA cybersecurity research. The dataset was built using our SCADA system testbed described in our paper below [*]. The purpose of our testbed was to emulate real-world industrial systems closely. It allowed us to carry out realistic cyber-attacks.

 

Instructions: 

The provided dataset has been cleansed and pre-processed and is ready to use. Users may modify it as they wish, but please cite the dataset as below.

M. A. Teixeira, M. Zolanvari, R. Jain, "WUSTL-IIOT-2018 Dataset for ICS (SCADA) Cybersecurity Research," 2018. [Online]. Available: https://www.cse.wustl.edu/~jain/iiot/index.html.

165 Views

We introduce a new database of voice recordings with the goal of supporting research on vulnerabilities and protection of voice-controlled systems (VCSs). In contrast to prior efforts, the proposed database contains both genuine voice commands and replayed recordings of such commands, collected in realistic VCS usage scenarios and using modern voice assistant development kits.

Instructions: 

The corpus consists of three sets: the core, evaluation, and complete sets. The complete set contains all the data (i.e., complete set = core set + evaluation set) and allows the user to freely choose a training/test split. The core and evaluation sets suggest a default training/test split. For each set, all *.wav files are in the /data directory and the meta information is in the meta.csv file. The protocol is described in readme.txt. A PyTorch data loader script is provided as an example of how to use the data. A Python resampling script is provided for resampling the dataset to the desired sample rate.

166 Views

The Message Queuing Telemetry Transport (MQTT) protocol is one of the most recent standards used in Internet of Things (IoT) machine-to-machine communication. The increase in the number of available IoT devices and protocols in use reinforces the need for new and robust Intrusion Detection Systems (IDS). However, building an IoT IDS requires datasets to process, train, and evaluate the underlying models. The dataset presented in this paper is the first to simulate an MQTT-based network. The dataset is generated using a simulated MQTT network architecture.

Instructions: 

The dataset consists of 5 pcap files, namely, normal.pcap, sparta.pcap, scan_A.pcap, mqtt_bruteforce.pcap and scan_sU.pcap. Each file is a recording of one scenario: normal operation, Sparta SSH brute-force, aggressive scan, MQTT brute-force and UDP scan, respectively. The attack pcap files also contain background normal operations. The attacker IP address is “192.168.2.5”. Basic packet features are extracted from the pcap files into CSV files with the same names as the pcap files. The features include flags, length, MQTT message parameters, etc. Unidirectional and bidirectional flow features are then extracted. Note that for the bidirectional flows, some features (marked with *) have two values: one for the forward flow and one for the backward flow. The two values are recorded and distinguished by the prefix “fwd_” for forward and “bwd_” for backward.
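Because the attack recordings also contain background normal traffic, a per-packet label has to combine the file's scenario with the attacker address. The Python sketch below shows one possible labelling rule. It assumes that a packet in an attack file counts as attack traffic only when the attacker IP is its source or destination. That rule, the function name, and the sample addresses other than the attacker IP are assumptions and are not taken from the dataset's own labels.

```python
# Scenario recorded in each pcap file, as listed in the description above.
SCENARIOS = {
    "normal.pcap": "normal operation",
    "sparta.pcap": "Sparta SSH brute-force",
    "scan_A.pcap": "aggressive scan",
    "mqtt_bruteforce.pcap": "MQTT brute-force",
    "scan_sU.pcap": "UDP scan",
}
ATTACKER_IP = "192.168.2.5"


def label_packet(pcap_name, src_ip, dst_ip):
    """Return the scenario label for one packet (hypothetical labelling rule)."""
    scenario = SCENARIOS[pcap_name]
    # Assumption: only packets sent or received by the attacker are attack traffic;
    # everything else in an attack file is background normal operation.
    if pcap_name != "normal.pcap" and ATTACKER_IP in (src_ip, dst_ip):
        return scenario
    return "normal operation"


print(label_packet("scan_sU.pcap", "192.168.2.5", "192.168.1.7"))  # UDP scan
print(label_packet("scan_sU.pcap", "192.168.1.3", "192.168.1.7"))  # normal operation
```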

 

421 Views

This dataset accompanies the article "Palisade: A Framework for Anomaly Detection in Embedded Systems."  It contains traces, programs, and specifications used in the case studies from the paper.

Instructions: 

Case Study 1: Autonomous Vehicle - Comparison between Siddhi and Palisade nfer processor

  • cs1_gear_flip_flop_data.csv - the data used in the Gear Flip-Flop anomaly study and the comparison with Siddhi
  • cs1_comparison.nfer - the nfer specification used in the comparison with Siddhi
  • cs1_comparison.siddhi - the Siddhi specification used in the comparison with Siddhi

 

Case Study 2: ADAS-on-a-treadmill - Comparison between Beep Beep 3 and Palisade rangeCheck and lossDetect processors

  • cs2_platoon_dead_spot_data.csv - the data used in the Platoon Dead-Spot anomaly study and the comparison with Beep Beep 3
  • cs2_platoon_no_anomaly_data.csv - data used for training in the Platoon Dead-Spot anomaly study
  • cs2_platoon_range_model.json - trained model used by the rangeCheck processor
  • RangeCheck.java - Beep Beep 3 program to check both range and loss
  • BenchSink.java - Beep Beep 3 program to print events
  • BenchPublisher.java - Beep Beep 3 program to read from a file and publish events to the RangeCheck program
  • BenchEvent.java - Custom Beep Beep 3 event class used in the comparison

 

65 Views
