Audio Steganalysis Dataset
- Submitted by: yuntao wang
- Last updated: Tue, 05/17/2022 - 22:17
- DOI: 10.21227/rab0-vf56
Abstract
The steganography and steganalysis of audio, especially compressed audio, have drawn increasing attention in recent years, and various algorithms have been proposed. However, there is no standard public dataset with which to verify the effectiveness of each proposed algorithm. To promote research in this field, we constructed a dataset of 33038 stereo WAV audio clips with a sampling rate of 44.1 kHz and a duration of 10 s. All audio files were collected from the Internet through data crawling, to better simulate a real detection environment. At this stage, the dataset is used for MP3 steganalysis. We provide the corresponding MP3 encoder, LAME, and steganographic encoders such as HCM and EECS, which were developed based on LAME. In addition, some useful Python scripts are supplied for making samples in batch. The dataset is still expanding, and we will include AAC, AMR, and other audio formats in the future.
Keywords: Audio, MP3, Steganalysis, Steganography
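The clip properties stated in the abstract (stereo, 44.1 kHz, 10 s) can be checked after download. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_clip` and the 0.05 s duration tolerance are assumptions made for illustration and are not part of the dataset's own scripts.

```python
import wave

# Properties stated in the abstract; the tolerance below is an assumption.
EXPECTED_CHANNELS = 2
EXPECTED_RATE = 44100
EXPECTED_SECONDS = 10.0


def check_clip(path, tolerance=0.05):
    """Return a list of mismatches between a WAV file and the stated spec."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate

    problems = []
    if channels != EXPECTED_CHANNELS:
        problems.append(f"channels={channels}")
    if rate != EXPECTED_RATE:
        problems.append(f"rate={rate}")
    if abs(seconds - EXPECTED_SECONDS) > tolerance:
        problems.append(f"duration={seconds:.2f}s")
    return problems
```

An empty list means the clip matches the stated format; any other result names the properties that differ.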
Dataset Files
- The secret message files for embedding. secret_message.zip (5.72 MB)
- MP3 audio encoder and steganographic encoder. encoder.zip (2.73 MB)
- A tiny dataset including 1000 WAV audio files. wav_10s demo.zip (1.58 GB)
- A small dataset including 5000 WAV audio files. wav_10s_lite.zip (7.88 GB)
- A medium dataset including 10000 WAV audio files. wav_10s_middle.zip (15.65 GB)
- A script for making steganographic samples. samples_make.py (30.63 kB)
- A script for extracting QMDCT coefficient matrices. QMDCT_extraction.py (4.52 kB)
- A utilities script. utils.py (1.49 kB)
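As a rough illustration of batch processing with the files above, the sketch below builds one command line per WAV file for the standard LAME command-line encoder, to produce cover MP3s. It is not the logic of samples_make.py, whose contents are not shown on this page. The function name `lame_commands`, the directory arguments, and the default 128 kbps bitrate are assumptions; the options of the steganographic encoders in encoder.zip are not documented here, so they are not shown.

```python
from pathlib import Path


def lame_commands(wav_dir, mp3_dir, bitrate_kbps=128):
    """Build one `lame` command per WAV file; nothing is executed here."""
    wav_dir, mp3_dir = Path(wav_dir), Path(mp3_dir)
    commands = []
    for wav_path in sorted(wav_dir.glob("*.wav")):
        mp3_path = mp3_dir / (wav_path.stem + ".mp3")
        # `-b` sets the bitrate in kbps in the standard LAME CLI.
        commands.append(
            ["lame", "-b", str(bitrate_kbps), str(wav_path), str(mp3_path)]
        )
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the `lame` binary is on the path. Building the commands separately from running them makes it easy to inspect or log them first.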
Documentation
Attachment | Size |
---|---|
instruction.md | 3.19 KB |
Comments
useful dataset
This dataset is well-organized and extremely valuable for audio steganalysis. I believe that this dataset will inspire much related research.
A nice dataset. Anyone engaged in audio steganography and steganalysis can try it out.
good good good
A very good dataset
A good dataset for researchers delving into audio steganography and steganalysis!
it's very useful. thank you for sharing.
Very specialized data set, very nice
The dataset plays an important role in audio steganalysis research. Especially useful.
Thanks to the author for providing a good data set for audio steganography and steganographic analysis, very useful.