The FFT-75 dataset contains randomly sampled, potentially overlapping file fragments from 75 popular file types (see details below). To the best of our knowledge, it is the most diverse and balanced dataset available. The dataset is labeled with class IDs and is ready for training supervised machine learning models. We distinguish 6 scenarios of differing granularity and provide variants with 512-byte and 4096-byte blocks. In each case, we sampled a balanced dataset and split the data as follows: 80% for training, 10% for testing, and 10% for validation.
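The sampling and splitting described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the function names `sample_fragments` and `split_80_10_10` are hypothetical, start offsets are assumed to be drawn uniformly at byte granularity (the real dataset may align fragments differently), and the split is a plain random shuffle rather than a per-class stratified one.

```python
import numpy as np


def sample_fragments(data: bytes, block_size: int, count: int,
                     rng: np.random.Generator) -> np.ndarray:
    """Draw `count` random fragments of `block_size` bytes from `data`.

    Hypothetical helper: start offsets are drawn independently, so two
    fragments may overlap, matching the "potentially overlapping" wording.
    """
    if len(data) < block_size:
        raise ValueError("file is shorter than one block")
    buf = np.frombuffer(data, dtype=np.uint8)
    # Upper bound is exclusive, so the last valid start is len - block_size.
    starts = rng.integers(0, len(buf) - block_size + 1, size=count)
    return np.stack([buf[s:s + block_size] for s in starts])


def split_80_10_10(x: np.ndarray, y: np.ndarray, rng: np.random.Generator):
    """Shuffle and split into 80% train, 10% test, 10% validation."""
    order = rng.permutation(len(x))
    n_train = len(x) * 8 // 10   # integer arithmetic avoids float rounding
    n_test = len(x) // 10
    train, test, val = np.split(order, [n_train, n_train + n_test])
    return (x[train], y[train]), (x[test], y[test]), (x[val], y[val])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_file = bytes(range(256)) * 40          # stand-in for a real file
    x = sample_fragments(fake_file, 512, 100, rng)
    y = np.zeros(len(x), dtype=np.int64)        # placeholder class ID
    train, test, val = split_80_10_10(x, y, rng)
    print(x.shape, len(train[0]), len(test[0]), len(val[0]))
```

Passing `block_size=4096` instead of 512 would give the larger block variant. To keep every split balanced across classes, one would apply the split per class (or use a stratified splitter) rather than shuffling all samples together.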

[1] Govind Mittal, Pawel Korus, Nasir Memon, "File Fragment Type (FFT) - 75 Dataset", IEEE Dataport, 2019. [Online]. Available: http://dx.doi.org/10.21227/kfxw-8084. Accessed: Jan. 22, 2025.
@data{kfxw-8084-19,
  doi = {10.21227/kfxw-8084},
  url = {http://dx.doi.org/10.21227/kfxw-8084},
  author = {Govind Mittal and Pawel Korus and Nasir Memon},
  publisher = {IEEE Dataport},
  title = {File Fragment Type (FFT) - 75 Dataset},
  year = {2019}
}