SCAN Faster R-CNN Image Features
- Submitted by: yongle huang
- Last updated: Fri, 08/16/2024 - 00:11
- DOI: 10.21227/r7ft-ke86
Abstract
This dataset contains precomputed Faster R-CNN image features for MS-COCO and Flickr30K, which are all the data needed to reproduce the experiments in "Stacked Cross Attention for Image-Text Matching", our ECCV 2018 paper. We use the splits produced by Andrej Karpathy. The raw images can be downloaded from their original sources: http://nlp.cs.illinois.edu/HockenmaierGroup/Framing_Image_Description/KC..., http://shannon.cs.illinois.edu/DenotationGraph/, and http://mscoco.org/.
The precomputed MS-COCO image features originally come from https://github.com/peteanderson80/bottom-up-attention. The Flickr30K image features were extracted from the raw Flickr30K images using the bottom-up attention model from the same repository.
The image features are stored in the ./data directory, and vocabulary mapping files are stored in the ./vocab directory.
The prefixes 'train', 'dev', and 'test' denote the training, validation, and test sets, respectively. For the MS-COCO dataset, the prefix 'testall' denotes the complete test set, and the prefix 'test' denotes a subset of it.
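The prefix convention above can be sketched in Python as a small helper that groups files by split. This is only an illustrative sketch: the function names `split_of` and `group_by_split` and the example file names are hypothetical, and the actual file names and directory layout inside ./data are not specified here. One detail worth noting is that 'testall' must be checked before 'test', since every name starting with 'testall' also starts with 'test'.

```python
from collections import defaultdict
from pathlib import Path

# Split prefixes described above. 'testall' is listed before 'test' so that
# the longer prefix is matched first.
SPLIT_PREFIXES = ("testall", "train", "dev", "test")


def split_of(filename):
    """Return the split prefix a file name starts with, or None if no prefix matches."""
    for prefix in SPLIT_PREFIXES:
        if filename.startswith(prefix):
            return prefix
    return None


def group_by_split(data_dir):
    """Group all files under data_dir (searched recursively) by split prefix."""
    groups = defaultdict(list)
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            split = split_of(path.name)
            if split is not None:
                groups[split].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Assumes the archive was unpacked so that ./data exists in the working directory.
    for split, files in group_by_split("./data").items():
        print(split, len(files))
```

Files whose names match none of the four prefixes are skipped rather than reported as errors, which seems the safer default when the exact contents of the directory are unknown.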