Art category prediction

Citation Author(s):
Tatsuya Tojima
Submitted by:
Tatsuya Tojima
Last updated:
Sat, 12/28/2024 - 00:49
DOI:
10.21227/4nse-mj53

Abstract 

The raw data were collected from Artsy (https://www.artsy.net/) using an original Python scraping program. Artsy labels the categories as follows: "Painting", "Work on Paper", "Sculpture", "Print", "Photography", and "Textile Arts". The task is to predict these categories from the other explanatory variables. The rationale for choosing this task is that price levels differ greatly across categories. In general, categories that can be reproduced with relative ease, such as Print and Photography, tend to be less expensive, while those that are more challenging to reproduce, such as Painting and Work on Paper, tend to be more expensive.

  • auction_results_top_500_medium.csv:
    • lists items from the top 500 mediums ranked by frequency, with 20 random samples taken from each medium.
  • fold.csv:
    • the randomly assigned group numbers used to split the data for cross-validation.

 

Instructions: 

Column details for the two files are described in the attached PDF file (dataport.pdf).
The objective (target) variable is medium_category.
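
As a rough illustration of how the group numbers in fold.csv could drive cross-validation with medium_category as the target, the sketch below computes a majority-class baseline using only the Python standard library. It is not the author's method. The helper names (fold_splits, majority_baseline_accuracy, read_column) are hypothetical, and the column name "fold" and the assumption that rows of fold.csv line up one-to-one with rows of auction_results_top_500_medium.csv are guesses; check dataport.pdf for the actual column details.

```python
import csv
from collections import Counter


def fold_splits(fold_ids):
    """Yield (fold, train_idx, test_idx) for each distinct group number."""
    for fold in sorted(set(fold_ids)):
        test_idx = [i for i, f in enumerate(fold_ids) if f == fold]
        train_idx = [i for i, f in enumerate(fold_ids) if f != fold]
        yield fold, train_idx, test_idx


def majority_baseline_accuracy(labels, fold_ids):
    """Mean held-out accuracy when always predicting the most common training label."""
    scores = []
    for _, train_idx, test_idx in fold_splits(fold_ids):
        most_common = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
        hits = sum(labels[i] == most_common for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def read_column(path, column):
    """Read one named column from a CSV file as a list of strings."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row[column] for row in csv.DictReader(f)]


# Tiny in-memory demonstration with made-up labels and group numbers.
demo_labels = ["Painting", "Print", "Painting", "Sculpture", "Painting", "Painting"]
demo_folds = [0, 0, 1, 1, 2, 2]
demo_accuracy = majority_baseline_accuracy(demo_labels, demo_folds)

# With the real files (ASSUMED column name "fold" and ASSUMED row alignment):
# labels = read_column("auction_results_top_500_medium.csv", "medium_category")
# folds = read_column("fold.csv", "fold")
# print(majority_baseline_accuracy(labels, folds))
```

Any real classifier should beat this baseline on the same splits. Reusing the provided group numbers, rather than drawing new random splits, keeps results comparable across experiments.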

Documentation

Attachment: dataport.pdf
Size: 309.7 KB