Datasets
Standard Dataset
Marketable Foods (MF) Dataset
- Citation Author(s):
- Submitted by:
- Jordan Vice
- Last updated:
- Mon, 09/04/2023 - 01:56
- DOI:
- 10.21227/56e9-7a71
- Data Format:
- Research Article Link:
- License:
- Categories:
- Keywords:
Abstract
The Marketable Foods (MF) dataset was originally constructed to fine-tune the language and visual network layers and facilitates backdoor injections in text-to-image generative models. The dataset consists of images from three popular food corporations with prominent, recognisable brands (Coffee = Starbucks, Burger = McDonald's, Drink = Coca Cola). Samples were collected from the internet and were cleaned using a filtering algorithm discussed in the corresponding paper. While we propose using the dataset for fine-tuning text-to-image models, the image data can be used for tasks like classification or object detection. Ultimately, the MF dataset can be used induce or detect bias towards particular coffee, burger or soft-drink brands. The dataset contains: (i) 250 McDonald's branded images, (ii) 499 Starbucks branded images and (iii) 616 Coca Cola branded images. We label images such that the first letter of each file can be used to identify the image class i.e.: 'b' = burger, 'c' = coffee and 'd' = drink.
The dataset contains 3 subdirectories - one for each class where: burger = McDonald's images, coffee = Starbucks images and drink = Coca Cola images. All images are in a .png format, with the first character being used to identify the class the image belongs to (first character of parent directory). Each image is also named using a unique integer identifier i.e. from 'b0.png' to 'd1368.png'.
Comments
this is for training purpose
The dataset contains class and image data so it can be used for training, note that the labels are single token captions.
just for training purpose
The dataset contains class and image data so it can be used for training, note that the labels are single token captions.