The VifFRUC Dataset

Citation Author(s):
Ran Li, Xinyang Normal University
Juan Dai, Xinyang Normal University
Submitted by:
Ran Li
DOI:
10.21227/jw0n-d193

Abstract 

Video is a sequence of pictures captured by a camera at short time intervals. Each picture in a video is called a frame, and the number of frames per second is the frame rate, which denotes the temporal resolution of the video. A higher frame rate preserves more detail, improving visual quality for human viewers and providing a finer representation for automatic machine perception. However, a high frame rate depends on the camera hardware: the higher the frame rate, the more expensive the device. To reduce the cost of capturing high-frame-rate videos, an efficient alternative is Frame Rate Up-Conversion (FRUC), a post-processing algorithm that periodically inserts new frames into a video sequence. At present, many videos on the Internet are produced by applying FRUC after capture with a cheap camera, so their visual quality does not match their nominal frame rate. It is therefore necessary to detect whether a video has its true frame rate.

To support the design of FRUC detection methods, we build a large-scale, high-quality video dataset, VifFRUC, whose full name is Videos forged by FRUC. The dataset contains two types of video clips: clips naturally captured by a camera, and clips forged by several known FRUC algorithms. By applying new FRUC methods to the original videos, the variety of forged videos can be extended continually, so VifFRUC can grow through collaboration among researchers. We expect the VifFRUC dataset to be developed by the research community to further facilitate and promote the experimental evaluation of FRUC detection.

Instructions: 

The VifFRUC dataset stems from 657 video sequences downloaded from pexels.com (https://www.pexels.com/), covering a large variety of natural scenes and actions. These videos were captured at 720p and 30 Hz and encoded with lossless compression, i.e., their spatial resolution is 1280×720 pixels and their frame rate is 30 frames per second (fps). We split these sequences into 5143 clips of 61 frames each and regard these clips as the original videos; 4088 clips are used to produce the training set, and 1055 clips are used to produce the testing set.
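The splitting into fixed-length, non-overlapping clips can be sketched as follows. This is a minimal illustration, not the authors' actual preprocessing script; the function name and the policy of discarding trailing frames that do not fill a final clip are assumptions.

```python
def split_into_clips(num_frames, clip_len=61):
    """Return (start, end) frame-index pairs for non-overlapping clips.

    Trailing frames that do not fill a complete clip are discarded
    (an assumed policy; the dataset paper does not state it).
    """
    return [(s, s + clip_len)
            for s in range(0, num_frames - clip_len + 1, clip_len)]

# A 200-frame sequence yields 3 clips of 61 frames; frames 183-199 are dropped.
print(split_into_clips(200))  # [(0, 61), (61, 122), (122, 183)]
```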

For each original video, we delete its even frames and reconstruct them from the remaining odd frames, using each of the following popular FRUC algorithms in turn:
(1) Frame Copying (FC), which interpolates the absent frames by copying their temporally adjacent frames.
(2) Unidirectional Motion Estimation (UME), which is from the reference “T.-H. Tsai, A.-T. Shi, and K.-T. Huang, Accurate Frame Rate Up-Conversion for Advanced Visual Quality, IEEE Transactions on Broadcasting, vol. 62, no. 2, pp. 426-435, Jun. 2016”.
(3) Bidirectional Motion Estimation (BME), which is from the reference “S. Yoon, H. Kim and M. Kim, Hierarchical Extended Bilateral Motion Estimation-Based Frame Rate Upconversion Using Learning-Based Linear Mapping, IEEE Transactions on Image Processing, vol. 27, no. 12, pp. 5918-5932, Dec. 2018”.
(4) Multiple-Hypotheses Motion Estimation (MHME), which is from the reference “S. Jeong, C. Lee and C. Kim, Motion-Compensated Frame Interpolation Based on Multihypothesis Motion Estimation and Texture Optimization, IEEE Transactions on Image Processing, vol. 22, no. 11, pp. 4497-4509, Nov. 2013”.
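The simplest of these, Frame Copying (FC), can be sketched in a few lines: each kept (odd) frame is duplicated into the deleted slot that follows it, so 31 kept frames rebuild a 61-frame clip. The function name is hypothetical, and frames are represented abstractly (any object, e.g. a NumPy array, would work).

```python
def frame_copy_upconvert(kept_frames):
    """Frame Copying (FC): rebuild a clip of 2N-1 frames from N kept frames
    by copying each kept frame into the missing slot that follows it."""
    out = []
    for i, frame in enumerate(kept_frames):
        out.append(frame)           # the kept (odd) frame
        if i < len(kept_frames) - 1:
            out.append(frame)       # copy fills the deleted (even) slot
    return out

clip = frame_copy_upconvert(list(range(31)))
print(len(clip))  # 61
```

The motion-estimation methods (UME, BME, MHME) replace the copy step with motion-compensated interpolation; see the cited references for their details.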

The videos processed by the FRUC algorithms are called the forged videos. Since lossy compression must be performed in practice for transmission and storage, we compress all original and forged videos using the FFmpeg tool (https://ffmpeg.org/). By setting the Constant Rate Factor (CRF) to 12, 18, 24, and 30, respectively, videos with different bitrates are produced. The smaller the CRF, the higher the visual quality and the larger the number of bits.
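As a sketch, the four-CRF compression pass could be driven as below. The exact FFmpeg options the authors used (codec, presets) are not stated, so this assumes the common H.264 encoder `libx264` with its standard `-crf` option; file names and the `dst_template` pattern are hypothetical.

```python
CRF_VALUES = [12, 18, 24, 30]

def compress_commands(src, dst_template):
    """Build one FFmpeg H.264 encode command per CRF value.

    dst_template is a pattern such as 'clip_crf{crf}.mp4' (hypothetical).
    Each returned list can be executed with subprocess.run(cmd, check=True).
    """
    return [["ffmpeg", "-y", "-i", src,
             "-c:v", "libx264", "-crf", str(crf),
             dst_template.format(crf=crf)]
            for crf in CRF_VALUES]
```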

The above steps construct the VifFRUC dataset. VTrain is the directory for the training set; it has four sub-directories, crf12, crf18, crf24, and crf30, each named after the CRF value used in FFmpeg compression (e.g., every video in the sub-directory crf12 is compressed by FFmpeg with a CRF of 12). Each crf sub-directory in turn has five sub-directories named 00, 01, 02, 03, and 04: 00 holds the original videos, 01 the videos forged by FC, 02 the videos forged by UME, 03 the videos forged by BME, and 04 the videos forged by MHME. VTest is the directory for the testing set, and its structure is the same as that of VTrain. Each bottom-level directory in VTrain contains 4088 clips, and each one in VTest contains 1055 clips. Original videos from the 00 directory are regarded as negative instances, and forged videos from the 01 to 04 directories are regarded as positive instances.
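Given that layout, gathering labeled (path, label) pairs from one CRF setting reduces to a directory walk. This is a minimal sketch under the directory convention described above; the function name and the choice to process a single CRF sub-directory at a time are assumptions.

```python
import os

def collect_instances(root, crf="crf12"):
    """Gather (path, label) pairs from one CRF sub-directory of VTrain or VTest.

    Label 0 = original (sub-directory '00'), label 1 = forged ('01'-'04').
    """
    instances = []
    for sub in ["00", "01", "02", "03", "04"]:
        label = 0 if sub == "00" else 1
        folder = os.path.join(root, crf, sub)
        for name in sorted(os.listdir(folder)):
            instances.append((os.path.join(folder, name), label))
    return instances
```

For example, `collect_instances("VTrain")` would yield 5 × 4088 pairs, of which 4088 are negative and the rest positive.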

We share the VifFRUC dataset via Baidu Netdisk (TeraBox). The download links are:
(1) Testing set only (24.4 GB): https://pan.baidu.com/s/1703V4YZx2j6dKWxVybTp7g, password: fruc
(2) Training set only (96.7 GB): https://pan.baidu.com/s/1BKLfhtmai-3NLYVawmhAiA, password: fruc

The author of VifFRUC is Ran Li (liran@xynu.edu.cn). If the VifFRUC dataset cannot be downloaded using the above links, please contact Ran Li by email.


Documentation

Attachment: An Introduction on VifFRUC Dataset (180.82 KB)