Marketable Foods (MF) Dataset

Citation Author(s):
Jordan
Vice
Submitted by:
Jordan Vice
Last updated:
Mon, 09/04/2023 - 01:56
DOI:
10.21227/56e9-7a71
Data Format:
Research Article Link:
License:
0
0 ratings - Please login to submit your rating.

Abstract 

The Marketable Foods (MF) dataset was originally constructed to fine-tune the language and visual network layers and facilitates backdoor injections in text-to-image generative models. The dataset consists of images from three popular food corporations with prominent, recognisable brands (Coffee = Starbucks, Burger = McDonald's, Drink = Coca Cola). Samples were collected from the internet and were cleaned using a filtering algorithm discussed in the corresponding paper. While we propose using the dataset for fine-tuning text-to-image models, the image data can be used for tasks like classification or object detection. Ultimately, the MF dataset can be used induce or detect bias towards particular coffee, burger or soft-drink brands. The dataset contains: (i) 250 McDonald's branded images, (ii) 499 Starbucks branded images and (iii) 616 Coca Cola branded images. We label images such that the first letter of each file can be used to identify the image class i.e.: 'b' = burger, 'c' = coffee and 'd' = drink.

Instructions: 

The dataset contains 3 subdirectories - one for each class where: burger = McDonald's images, coffee = Starbucks images and drink = Coca Cola images. All images are in a .png format, with the first character being used to identify the class the image belongs to (first character of parent directory). Each image is also named using a unique integer identifier i.e. from 'b0.png' to 'd1368.png'.

Funding Agency: 
This research was supported by National Intelligence and Security Discovery Research Grants (project# NS220100007), funded by the Department of Defence Australia
Grant Number: 
(project# NS220100007)

Comments

this is for training purpose

Submitted by Addisu Mengistu on Tue, 01/09/2024 - 19:09

The dataset contains class and image data so it can be used for training, note that the labels are single token captions. 

Submitted by Jordan Vice on Fri, 12/13/2024 - 01:12

just for training purpose

Submitted by ZIxiang Liu on Mon, 10/14/2024 - 04:55

The dataset contains class and image data so it can be used for training, note that the labels are single token captions. 

Submitted by Jordan Vice on Fri, 12/13/2024 - 01:12