This dataset contains a list of popular websites and their privacy statements. The websites belong to the three largest South Asian economies, namely India, Pakistan, and Bangladesh. Each website is assigned to one of 10 sectors: e-commerce, finance/banking, education, healthcare, news, government, telecom, buy and sell, job/freelance, and blogging/discussion. We hope that this dataset will help researchers investigate website privacy compliance.
We build an original dataset of thermal videos and images that simulate illegal movements around the border and in protected areas and are designed for training machine learning and deep learning models. The videos are recorded in areas around the forest, at night, in different weather conditions (clear weather, rain, and fog), and with people in different body positions (upright, hunched) and movement speeds (regular walking, running) at different ranges from the camera.
About 20 minutes of recorded material from the clear weather scenario, 13 minutes from the fog scenario, and about 15 minutes from rainy weather were processed. The longer videos were cut into sequences and from these sequences individual frames were extracted, resulting in 11,900 images for the clear weather, 4,905 images for the fog, and 7,030 images for the rainy weather scenarios.
A total of 6,111 frames were manually annotated so that they could be used to train a supervised model for person detection. The frames were selected to cover the different weather conditions: the set contains 2,663 frames shot in clear weather, 1,135 frames shot in fog, and 2,313 frames shot in rain.
The annotations were made using the open-source Yolo BBox Annotation Tool, which can simultaneously store annotations in the three most popular machine learning annotation formats (YOLO, VOC, and MS COCO), so all three annotation formats are available. Each image annotation consists of the centroid position of the bounding box around each object of interest, the size of the bounding box in terms of width and height, and the corresponding class label (Human or Dog).
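To illustrate the centroid-based annotation described above, the following minimal Python sketch converts one YOLO-style annotation line into pixel corner coordinates, which is the form used by VOC-style boxes. It assumes the common YOLO text convention (class index, then centre x, centre y, width, and height, all normalised to the image size). The function name, the image size, and the meaning of the class index are hypothetical and should be checked against the dataset files.

```python
def yolo_to_corners(line, img_w, img_h):
    """Convert 'cls cx cy w h' (normalised) to (cls, xmin, ymin, xmax, ymax) in pixels.

    Assumption: the line follows the common YOLO text convention; the
    mapping from class index to 'Human' or 'Dog' is not assumed here.
    """
    parts = line.split()
    cls = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    return cls, xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # Hypothetical annotation line and image size, for illustration only.
    print(yolo_to_corners("0 0.5 0.5 0.25 0.5", 640, 480))
```

Keeping the conversion in one small function makes it easy to cross-check the YOLO files against the VOC files for the same frame.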
Presented here is a dataset used for our SCADA cybersecurity research. The dataset was built using our SCADA system testbed described in our paper below [*]. The purpose of our testbed was to emulate real-world industrial systems closely. It allowed us to carry out realistic cyber-attacks.
The provided dataset is cleaned, pre-processed, and ready to use. Users may modify it as they wish, but please cite the dataset as below.
M. A. Teixeira, M. Zolanvari, R. Jain, "WUSTL-IIOT-2018 Dataset for ICS (SCADA) Cybersecurity Research," 2018. [Online]. Available: https://www.cse.wustl.edu/~jain/iiot/index.html.
This repository contains the results of running more than 70 samples of ransomware, from different families, dating from 2015 onwards. It contains the network traffic (DNS and TCP) and the Input/Output (I/O) operations generated by the malware while encrypting a network shared directory. These data are contained in three files for each ransomware sample: one with the information from the DNS requests, another with the TCP connections, and a third containing the I/O operations. This information can be useful for testing new and old ransomware detection tools and comparing their results.
The dataset is organised as one zip file containing all the text files, with one directory for each ransomware sample. A second zip file with all the trace files organised in the same manner would have been extremely large (more than 650GB after compression). To make downloading easier, we have uploaded the trace files in separate zip files, one for each directory or scenario. We have also published the trace files on an external website (link), where they can be downloaded individually. To download a single trace file, we recommend visiting the website.
For each malware sample, three text files are generated (dnsInfo.txt, TCPconnInfo.txt and IOops.txt) and placed in the directory with the ransomware strain’s name. The structure of all the directories and subdirectories is shown in the README.pdf file and in the text file “repositoryStructure.txt”.
The I/O operations file contains one text line for each operation (open or close file, read, write, rename, delete, etc). Each line contains fields separated by the blank space character (ASCII 0x20), with the useful metadata about the operation (file name, read and write offset and length, timestamp, etc). The file README.pdf explains all the fields in the I/O operations file.
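As a rough illustration of the line format described above, the following Python sketch reads an I/O operations file and splits each line on the space character (ASCII 0x20). The function names are hypothetical, and the sketch deliberately does not assign meanings to the fields, since their order and content are defined in README.pdf. It also assumes that no field contains an embedded space; a file name with spaces would be split into several tokens.

```python
def split_io_line(line):
    """Split one I/O operation line into its space-separated fields."""
    return line.rstrip("\r\n").split(" ")


def read_io_operations(path):
    """Yield the list of fields for every non-empty line of the file.

    Assumption: the file is plain text with one operation per line;
    undecodable bytes are replaced rather than raising an error.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.strip():
                yield split_io_line(line)
```

A call such as `read_io_operations("IOops.txt")` would then yield one list of strings per operation, which can be mapped to named columns once the field order has been taken from README.pdf.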
The DNS info file has one line per DNS request made by the user machine. The DNS server is ‘22.214.171.124’ for all traces. The file README.pdf explains each column. The TCP info file has one line per TCP connection. If the connection contains an HTTP request, the method, response code, and URL are present in this file. As in the previous cases, the README file explains the columns and structure of the file.
We started downloading ransomware samples in 2015 from hybrid-analysis.com and malware-traffic-analysis.com. The samples were executed on one machine, and the DNS and HTTP requests were collected by a traffic probe mirroring the traffic. The ‘infected’ machine has a mounted directory, shared by a server. The content of this directory is encrypted by the ransomware during its activity. The operations on this directory were captured by the same traffic probe and processed with specialised software to extract the I/O operations in the format explained in the README.pdf file.
In order to analyse the ransomware behaviour, we created different shared directories and ran some samples against more than one of them. These shared directories follow a statistical distribution for the file sizes and locations, in order to simulate a user's file set. By changing the seed used to generate the directory, we can create similar directories with different numbers of files, distributions, and subdirectories. The trace files of the ransomware run in these cases can be found in zip files named ‘5GvXdirectory.zip’, where X goes from 2 to 10. We have also run samples with a shared directory of 10GB, whose trace files are placed in the zip file called ‘10Gdirectory’.
We have also run one sample while sweeping the network speed, to simulate ransomware encrypting the files slowly. These traces can be found in the ‘networkSpeed.zip’ file. Finally, the samples run in a scenario with a Windows 10 user and server generated traffic traces that are placed in the file ‘W10scenario.zip’. There are no text files for these samples because the traffic is encrypted in version 3 of the SMB protocol (used by Windows 10 machines).
As explained above, the trace files can be downloaded individually from an external link, but the text files associated with them are placed in a single zip file (they can be downloaded all together because of their smaller size).
1. Folder “1.SlibCrypto 4DIAC project” includes a 4DIAC (1.10.3) project that can be added to the 4DIAC workspace. It contains the various standalone applications representing security mechanisms and the extended one shown in Fig. 5.
BS-HMS-Dataset is a dataset of users' brainwave signals and the corresponding hand movement signals from a large number of volunteer participants. The dataset has two parts: (1) the Neurosky-based dataset (collected over several months in 2016 from 32 volunteer participants), and (2) the Emotiv-based dataset (collected from 27 volunteer participants over several months in 2019).
There are two folders under each user: session I and session II. Each session folder contains four folders, one for each activity performed by the user. Each activity folder contains three .csv files: (1) EEG data (brainwave.csv), (2) hand movement accelerometer data (accelerometer.csv), and (3) hand movement gyroscope data (gyroscope.csv).
A more detailed description of the data is given in the BS-HMS-Dataset-Documentation.pdf file.
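The folder layout described above (user, session, activity, three .csv files) can be traversed with a short Python sketch such as the one below. It is only an illustration: the function name is hypothetical, the exact folder names and file-name casing are assumptions to be checked against BS-HMS-Dataset-Documentation.pdf, and the sketch relies only on the three file names given in the text.

```python
from pathlib import Path

# File names as listed in the description; exact casing is an assumption.
SIGNAL_FILES = ("brainwave.csv", "accelerometer.csv", "gyroscope.csv")


def find_activity_folders(root):
    """Return the folders under root that contain all three signal files."""
    root = Path(root)
    folders = []
    for eeg in sorted(root.rglob("brainwave.csv")):
        folder = eeg.parent
        if all((folder / name).is_file() for name in SIGNAL_FILES):
            folders.append(folder)
    return folders
```

Searching for the files rather than hard-coding the user and session folder names keeps the sketch independent of how those folders are actually named.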
Acknowledgement: This data collection was supported in part by the National Science Foundation (NSF) under grant SaTC-1527795.
Please cite:  Diksha Shukla, Sicong Chen, Yao Lu, Partha Pratim Kundu, Ravichandra Malapati, Sujit Poudel, Zhanpeng Jin, Vir Phoha, "Brain Signals and the Corresponding Hand Movement Signals Dataset (BS-HMS-Dataset)", IEEE Dataport, 2019. [Online]. Available: http://dx.doi.org/10.21227/my1k-dd23. Accessed: Dec. 05, 2019.
The dataset contains measurement results of the Radar Cross Section (RCS) of different Unmanned Aerial Vehicles (UAVs) at 26-40 GHz. The measurements were performed for the quasi-monostatic case (when the transmitter and receiver are spatially co-located) in an anechoic chamber. The data show how radio waves are scattered by different UAVs in the specified frequency range.
Several DJI, Walkera, Parrot, and Kyosho drones were measured.
The data is in ".csv" format. Each file contains the following information: frequency, theta, phi, and RCS.
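A file with the four columns listed above could be read with a short Python sketch like the following. It is a sketch under stated assumptions rather than a description of the actual files: the function name is hypothetical, and it assumes comma-separated values, the column order frequency, theta, phi, RCS, and an optional single header row. Units are not stated in the description and are not assumed here.

```python
import csv


def read_rcs_rows(path, has_header=True):
    """Yield (frequency, theta, phi, rcs) tuples of floats from one file.

    Assumptions: comma-delimited, columns in the order listed in the
    description, and (by default) one header row that is skipped.
    """
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        if has_header:
            next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) < 4:
                continue  # ignore blank or incomplete lines
            freq, theta, phi, rcs = (float(v) for v in row[:4])
            yield freq, theta, phi, rcs
```

If the files turn out to have no header row, passing `has_header=False` keeps the first data row.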
The RCS signatures of the following drone models are available:
-DJI Phantom 4 Pro
-DJI Mavic Pro
-DJI Matrice M100
-Walkera Voyager 4
-Custom built hexacopter
-Tricopter HMF600, frame only
Polarization is mentioned in the file name:
HH - horizontal polarization of the transmitter and the receiver
HV/VH - horizontal polarization of the transmitter and vertical polarization of the receiver, or vice versa
VV - vertical polarization of the transmitter and the receiver
In addition, the RCS of a 6S LiPo battery is available.
The published article can be found at: https://ieeexplore.ieee.org/document/9032332
The proliferation of IoT systems has seen them targeted by malicious third parties. To address this challenge, realistic protection and investigation countermeasures, such as network intrusion detection and network forensic systems, need to be effectively developed. For this purpose, a well-structured and representative dataset is paramount for training and validating the credibility of the systems. Although there are several network datasets, in most cases, not much information is given about the Botnet scenarios that were used.