Object-Wise Just Recognizable Distortion Dataset
Just Recognizable Distortion (JRD) refers to the minimum distortion that notably affects the recognition performance of a machine vision model. If the distortion added to an image or video stays below this JRD threshold, the degradation in recognition performance is negligible. This property is useful for Video Coding for Machine (VCM), where the goal is to minimize the bit rate while maintaining recognition performance on compressed images. In this study, we construct a large Object-Wise JRD (OW-JRD) image dataset containing 29,218 original object images across 80 object categories, each compressed into 64 distorted versions using Versatile Video Coding (VVC).
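As a rough illustration of the JRD definition above, the following Python sketch looks for the first distortion level at which an object is no longer recognized. It is only a sketch under stated assumptions: that distortion levels are indexed by QP, that a higher QP means stronger distortion, and that the first failure is taken as the JRD point. The function name find_jrd_qp, the QP range 0 to 63, and the example data are hypothetical and are not taken from the dataset.

```python
from typing import Optional, Sequence


def find_jrd_qp(qps: Sequence[int], recognized: Sequence[bool]) -> Optional[int]:
    """Return the smallest QP at which recognition fails, or None if it never fails.

    Assumption (not from the dataset description): higher QP means stronger
    distortion, and the first failure marks the JRD point.
    """
    if len(qps) != len(recognized):
        raise ValueError("qps and recognized must have the same length")
    # Walk through the distortion levels from weakest to strongest.
    for qp, ok in sorted(zip(qps, recognized)):
        if not ok:
            return qp
    return None


# Hypothetical example: 64 distortion levels, object recognized up to QP 37.
example_qps = list(range(64))
example_recognized = [qp < 38 for qp in example_qps]
example_jrd = find_jrd_qp(example_qps, example_recognized)
```

Any distorted version with a QP below the returned value would, under these assumptions, leave the recognition result unchanged.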
We selected 8,961 images from COCO2017 as source images. First, these source images were encoded and decoded with the VVC reference software, VTM14.0, using 64 quantization parameters (QPs) under the All-Intra configuration, which produced 64 distorted versions of each source image. Second, we detected objects in both the source and distorted images using Faster R-CNN with a ResNet-101 backbone to generate object labels. A threshold T_object was set to 0.9 to ensure that all objects were detected with a confidence higher than 90%. In total, 29,218 objects were detected in the 8,961 source images. Third, three criteria are used to evaluate object detection performance: bounding box, confidence, and category. Details can be found in our work "Learning to Predict Object-Wise Just Recognizable Distortion for Image and Video Compression".
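The three evaluation criteria can be sketched as a single check that compares a detection on a distorted image against the reference detection on the source image. This is a minimal sketch, not the procedure used to build the dataset: the Detection class, the still_recognized function, the IoU-based bounding box test, and the IoU threshold of 0.5 are all assumptions made for illustration. Only the confidence threshold of 0.9 comes from the description above.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    box: Box
    confidence: float
    category: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def still_recognized(ref: Detection, det: Detection,
                     t_object: float = 0.9,
                     iou_threshold: float = 0.5) -> bool:
    """Check category, confidence, and bounding box against the reference.

    t_object = 0.9 follows the text; iou_threshold = 0.5 is an assumed value.
    """
    return (det.category == ref.category
            and det.confidence > t_object
            and iou(ref.box, det.box) >= iou_threshold)


# Hypothetical reference detection and a detection on a distorted version.
ref = Detection((0.0, 0.0, 10.0, 10.0), 0.99, "dog")
shifted = Detection((5.0, 0.0, 15.0, 10.0), 0.95, "dog")
```

With these example values, the shifted detection keeps the right category and confidence but overlaps the reference box by only one third, so the check fails on the bounding box criterion.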