Datasets of PCMDA

Citation Author(s):
ZiFeng Xie
Submitted by:
ZiFeng Xie
Last updated:
Fri, 06/07/2024 - 09:18
DOI:
10.21227/f5w5-1995
License:
0

Abstract 

This dataset comprises three benchmarks: Digits-5, PACS, and office_caltech_10. Digits-5 is a set of handwritten digit images sampled from five domains: MNIST, MNIST-M, USPS, SynthDigits, and SVHN. All samples are images of digits from 0 to 9. PACS comprises four subsets, each representing a different visual domain: Photo, Art Painting, Cartoon, and Sketch. It contains 9,944 images, including 1,792 real photos, 2,048 art paintings, 2,344 cartoon images, and 2,760 sketches. Each image is labeled with one of seven categories: dog, elephant, giraffe, guitar, horse, house, and person. Office_caltech_10 is a commonly used domain adaptation dataset consisting of four domains: Amazon, Webcam, DSLR, and Caltech. It contains 9,000 images, with 2,533 images from Amazon, 795 images from Webcam, 498 images from DSLR, and 4,174 images from Caltech. Each image is labeled with one of ten categories, including bike, calculator, headphone, laptop, motorcycle, mug, projector, skateboard, stapler, and umbrella.

Instructions: 

Digits-5 data are stored in MAT file format; each sample is 32x32 pixels. Each PACS sample is an RGB image of size 224x224 pixels. Each Office_caltech_10 sample is an RGB image of size 252x252 pixels.
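
As a minimal sketch of how the stated formats could be checked after download, the Python snippet below lists the arrays in a MAT file and tests whether an array shape matches the sample size given for each benchmark. It assumes SciPy is installed and that the files can be read by `scipy.io.loadmat` (which does not read MATLAB v7.3 files). The variable names inside the Digits-5 MAT files are not documented on this page, so the sketch makes no assumption about them and simply reports what it finds. The helper names `has_expected_size` and `describe_mat_file` are hypothetical and not part of the dataset.

```python
# Expected sample sizes (height, width), taken from the instructions above.
EXPECTED_SIZE = {
    "Digits-5": (32, 32),
    "PACS": (224, 224),
    "Office_caltech_10": (252, 252),
}


def has_expected_size(shape, benchmark):
    """Return True if the shape contains the expected height and width as two
    adjacent axes, e.g. (N, 32, 32), (N, 32, 32, 3) or (3, 32, 32)."""
    height_width = EXPECTED_SIZE[benchmark]
    shape = tuple(shape)
    return any(shape[i:i + 2] == height_width for i in range(len(shape) - 1))


def describe_mat_file(path):
    """Map each variable stored in a MAT file to its array shape.

    Assumption: the file is readable by scipy.io.loadmat. The variable names
    are unknown in advance, so every array found is reported.
    """
    from scipy.io import loadmat  # imported lazily; only needed for MAT files

    contents = loadmat(path)
    # loadmat adds metadata keys such as '__header__'; skip those.
    return {
        name: value.shape
        for name, value in contents.items()
        if not name.startswith("__") and hasattr(value, "shape")
    }
```

A possible use is to call `describe_mat_file` on one of the downloaded Digits-5 files, then pass each reported shape to `has_expected_size(shape, "Digits-5")` to see which variables hold the 32x32 images and which hold labels.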