Standard Dataset
Variable-length File Fragment Dataset (VFF-16)
- Citation Author(s):
- Submitted by: YI WANG
- Last updated: Mon, 03/06/2023 - 20:30
- DOI: 10.21227/qb7g-g653
- Data Format:
- License:
- Categories:
- Keywords:
Abstract
The variable-length file fragment dataset (VFF-16) covers 16 file types and is designed to reflect file system fragmentation. Sequential memory sectors carry contextual information about file fragments. The 16 file types are ‘jpg’, ‘gif’, ‘doc’, ‘xls’, ‘ppt’, ‘html’, ‘text’, ‘pdf’, ‘rtf’, ‘png’, ‘log’, ‘csv’, ‘gz’, ‘swf’, ‘eps’, and ‘ps’. We split the dataset into training and test sets at a ratio of about 4:1. For a sector size of 512 bytes, there are 1,310,918 training samples and 328,599 test samples; for a sector size of 4,096 bytes, there are 167,564 training samples and 41,993 test samples.
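As a quick sanity check on the "about 4:1" split, the sample counts quoted in the abstract can be turned into totals and ratios. The sketch below uses only those four numbers; the names `SPLITS` and `split_summary` are illustrative and are not part of the dataset or its readme.

```python
# Sample counts copied from the abstract, keyed by sector size in bytes.
SPLITS = {
    512: {"train": 1_310_918, "test": 328_599},
    4096: {"train": 167_564, "test": 41_993},
}


def split_summary(train: int, test: int) -> dict:
    """Return the total sample count, train:test ratio, and train share."""
    total = train + test
    return {
        "total": total,
        "train_to_test": train / test,
        "train_fraction": train / total,
    }


for sector_size, counts in SPLITS.items():
    s = split_summary(counts["train"], counts["test"])
    print(
        f"{sector_size:>4}-byte sectors: {s['total']:,} samples, "
        f"train:test = {s['train_to_test']:.3f}:1, "
        f"train share = {s['train_fraction']:.1%}"
    )
```

For both sector sizes the train-to-test ratio comes out just under 4:1, which matches the wording in the abstract.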
Instructions:
See readme.md for details.
Dataset Files
- Memory sector size 4,096 bytes: 4k.zip (454.44 MB)
- Memory sector size 512 bytes: 512.zip (443.96 MB)
Documentation
Attachment | Size |
---|---|
Readme.md | 5.4 KB |