Open Access
Open Repository for the Evaluation of Ransomware Detection Tools
- Submitted by: Eduardo Berrueta
- Last updated: Tue, 05/17/2022 - 22:21
- DOI: 10.21227/qnyn-q136
Abstract
This repository contains the results of running more than 70 ransomware samples from different families, dating from 2015 onwards. It contains the network traffic (DNS and TCP) and the input/output (I/O) operations generated by the malware while encrypting a network shared directory. These data are stored in three files for each ransomware sample: one with the information from the DNS requests, another with the TCP connections, and a third containing the I/O operations. This information can be useful for testing new and existing ransomware detection tools and comparing their results. More details about this dataset can be found in the paper published in IEEE Access with DOI 10.1109/ACCESS.2020.2984187.
The text files are provided in a single zip file, organised in one directory per ransomware sample. The trace files could have been uploaded in a second zip file organised in the same way, but that file would be extremely large (more than 650 GB after compression). To make downloading easier, we have uploaded the trace files in separate zip files, one for each directory or scenario. We have also published the trace files on an external website (link), where they can be downloaded individually. If you need a single trace file, we recommend downloading it from that website.
For each malware sample, three text files are generated (dnsInfo.txt, TCPconnInfo.txt and IOops.txt) and placed in the directory named after the ransomware strain. The structure of all directories and subdirectories is shown in the README.pdf file and in the text file “repositoryStructure.txt”.
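The layout described above can be checked after extracting the text files. The following is a minimal sketch, assuming the archive has been extracted to a local folder; the function name `find_samples` and the rule "a directory is a sample directory if it holds at least one of the three files" are illustrative assumptions, not part of the dataset. The authoritative layout is the one in repositoryStructure.txt.

```python
from pathlib import Path

# File names as given in the dataset description.
EXPECTED_FILES = ("dnsInfo.txt", "TCPconnInfo.txt", "IOops.txt")


def find_samples(root):
    """Map each sample directory (relative to root) to the expected files it lacks.

    Assumption: a directory counts as a sample directory if it holds at
    least one of the three expected text files. Nesting depth is not assumed.
    """
    root = Path(root)
    samples = {}
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        present = {name for name in EXPECTED_FILES if (directory / name).is_file()}
        if present:
            missing = [name for name in EXPECTED_FILES if name not in present]
            samples[str(directory.relative_to(root))] = missing
    return samples
```

An empty list for a sample means all three text files were found; a non-empty list names the files that are absent from that directory.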
The I/O operations file contains one text line for each operation (open or close file, read, write, rename, delete, etc.). Each line contains fields separated by the space character (ASCII 0x20) with the relevant metadata about the operation (file name, read and write offset and length, timestamp, etc.). The file README.pdf explains all the fields in the I/O operations file.
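Since the I/O operations file is line-oriented and space-separated, it can be read as a stream rather than loaded whole (the example file alone is 34.47 MB). The sketch below makes no assumption about what each column means, because the field order is defined only in README.pdf. The function names and the UTF-8 decoding with replacement are assumptions made for illustration. Note that a plain split on the space character would also split any file name that itself contains spaces, so the README should be consulted before relying on column positions.

```python
from collections import Counter


def iter_io_records(path):
    """Yield each non-empty line of an I/O operations file as a list of fields.

    Fields are split on the single space character (ASCII 0x20), so two
    consecutive spaces produce an empty field instead of being merged.
    """
    # Assumption: decoding as UTF-8 with replacement is good enough to
    # inspect the file; the dataset description does not state an encoding.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                yield line.split(" ")


def field_count_histogram(path):
    """Count how many lines have each number of fields."""
    return Counter(len(fields) for fields in iter_io_records(path))
```

A histogram with more than one key suggests that lines differ in length by operation type or that some field values contain spaces, which is worth checking against the README before further parsing.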
The DNS info file has one line per DNS request made by the user machine. The DNS server is ‘8.8.8.8’ for all traces. The file README.pdf explains each column. The TCP info file has one line per TCP connection. If the connection contains an HTTP request, the method, response code and URL are present in this file. As in the previous cases, the README file explains the columns and structure of the file.
We started downloading ransomware samples in 2015 from hybrid-analysis.com and malware-traffic-analysis.com. The samples were executed on one machine, and the DNS and HTTP requests were collected by a traffic probe mirroring the traffic. The ‘infected’ machine has a mounted directory shared by a server. The ransomware encrypts the content of this directory during its activity. The operations on this directory were captured by the same traffic probe and processed with specialised software to extract the I/O operations in the format explained in the README.pdf file.
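Because both files hold one record per line, a first look at a sample's network activity only needs a line count. The sketch below is written under two assumptions that are not confirmed here: the files contain no header line, and blank lines carry no record. The function names and the returned dictionary keys are hypothetical.

```python
from pathlib import Path


def count_records(path):
    """Count the non-blank lines of a text file, reading it line by line."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def summarise_sample(sample_dir):
    """Return the number of DNS requests and TCP connections for one sample.

    Assumption: neither file has a header line, so every non-blank line is
    one record. Check README.pdf before trusting the counts.
    """
    sample_dir = Path(sample_dir)
    return {
        "dns_requests": count_records(sample_dir / "dnsInfo.txt"),
        "tcp_connections": count_records(sample_dir / "TCPconnInfo.txt"),
    }
```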
To analyse the ransomware behaviour, we created different shared directories and ran some samples against more than one of them. The file sizes and locations in these shared directories follow a statistical distribution intended to simulate a user’s file set. By changing the seed used to generate the directory, we can create similar directories with different numbers of files, distributions and subdirectories. The trace files of the ransomware samples run in these cases can be found in zip files named ‘5GvXdirectory.zip’, where X goes from 2 to 10. We have also run samples with a shared directory of 10 GB, whose trace files are placed in the zip file called ‘10Gdirectory.zip’.
We have also run one sample while sweeping the network speed, to simulate ransomware that encrypts files slowly. These traces can be found in the ‘networkSpeed.zip’ file. Finally, the samples run in a scenario with a Windows 10 user and server generated the traffic traces placed in the file ‘W10scenario.zip’. There are no text files for these samples because the traffic is encrypted in version 3 of the SMB protocol (used by Windows 10 machines).
As explained above, the trace files can be downloaded individually from an external link, whereas the text files associated with them are placed in a single zip file (their smaller size makes it possible to download them all together).
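The naming pattern for the directory-variant archives can be expanded programmatically, for example to check which archives are still to be downloaded. This is a minimal sketch: the archive names follow the pattern stated above, while the function names and the idea of a local download folder are assumptions for illustration.

```python
from pathlib import Path


def directory_archive_names():
    """List the archive names for the generated shared-directory variants."""
    # '5GvXdirectory.zip' with X from 2 to 10, as stated in the description.
    names = [f"5Gv{version}directory.zip" for version in range(2, 11)]
    # The 10 GB shared directory has its own archive.
    names.append("10Gdirectory.zip")
    return names


def missing_archives(download_dir):
    """Return the expected archive names not yet present in download_dir."""
    download_dir = Path(download_dir)
    return [
        name
        for name in directory_archive_names()
        if not (download_dir / name).is_file()
    ]
```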
Dataset Files
- Example of DNS requests of one ransomware sample example_DNSinfo.txt (1.08 kB)
- Example of I/O operations file of one ransomware sample example_IOops.txt (34.47 MB)
- Example of TCP connections of one ransomware sample example_TCPconnInfo.txt (315 bytes)
- Output of the 'tree' command for the full directory repositoryStructure.txt (38.51 kB)
- Trace files for the version 1 of the 5GB directory. Original scenario originalScenario.zip (147.01 GB)
- Trace files for the version 2 of the 5GB directory. Original scenario. 5Gv2directory.zip (24.01 GB)
- Trace files for the version 3 of the 5GB directory. Original scenario. 5Gv3directory.zip (23.90 GB)
- Trace files for the version 4 of the 5GB directory. Original scenario. 5Gv4directory.zip (22.88 GB)
- Trace files for the version 5 of the 5GB directory. Original scenario. 5Gv5directory.zip (15.23 GB)
- Trace files for the version 6 of the 5GB directory. Original scenario. 5Gv6directory.zip (10.66 GB)
- Trace files for the version 7 of the 5GB directory. Original scenario. 5Gv7directory.zip (15.80 GB)
- Trace files for the version 8 of the 5GB directory. Original scenario. 5Gv8directory.zip (17.74 GB)
- Trace files for the version 9 of the 5GB directory. Original scenario. 5Gv9directory.zip (16.14 GB)
- Trace files for the version 10 of the 5GB directory. Original scenario. 5Gv10directory.zip (17.52 GB)
- Trace files for the 10G directory. Original scenario. 10Gdirectory.zip (7.79 GB)
- Trace files for the 5G directory with different link bandwidth networkSpeed.zip (412.79 GB)
- Text files textFiles.zip (1.04 GB)
- Trace files for the 5G directory in NAT scenario NATscenario.zip (138.09 GB)
- Trace files for the 5G directory in Windows 10 scenario W10scenario.zip (3.78 GB)
Documentation
Attachment | Size |
---|---|
PDF with information about the files and the capture process | 194.13 KB |