Amazon-Book and MovieLens-1M

Citation Author(s): Ziyang Liu (Tsinghua University)
Submitted by: Ziyang Liu
Last updated: Tue, 01/28/2025 - 22:35
DOI: 10.21227/ztyg-gp55

Abstract 

  1. MovieLens-1M: This dataset contains user ratings for movies from the MovieLens website. It covers 6,040 users and 3,952 movies. Users rate movies on a 5-star scale; for our analysis, we convert the ratings into binary signals (positive and negative feedback) using a threshold of 3.5. The training set contains 456,138 positive and 337,990 negative feedback interactions, and the test set contains 111,412 positive feedback interactions.
  2. Amazon-Book: This dataset is drawn from a large crawl of product reviews on Amazon. It comprises 35,736 users, 38,121 items, and 1,960,674 ratings on a 5-star scale. As with MovieLens-1M, we convert the ratings into binary signals using a threshold of 3.5. The training set contains 1,252,292 positive and 302,056 negative feedback interactions, and the test set contains 327,682 positive feedback interactions. The dataset reflects real user behavior in an online retail environment and poses challenges related to exposure bias.
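
The binarization step described for both datasets can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function name `binarize`, the `(user_id, item_id, rating)` tuple layout, and the choice to treat ratings strictly above the threshold as positive (so 4 and 5 stars on an integer scale) are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (user_id, item_id, rating on a 5-star scale).
Interaction = Tuple[int, int, float]
Pair = Tuple[int, int]


def binarize(
    interactions: Iterable[Interaction], threshold: float = 3.5
) -> Tuple[List[Pair], List[Pair]]:
    """Split rated interactions into positive and negative feedback.

    Assumption: a rating strictly above `threshold` counts as positive,
    everything else as negative.
    """
    positives: List[Pair] = []
    negatives: List[Pair] = []
    for user_id, item_id, rating in interactions:
        if rating > threshold:
            positives.append((user_id, item_id))
        else:
            negatives.append((user_id, item_id))
    return positives, negatives


# Toy data for illustration only; these are not records from either dataset.
sample = [(1, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0), (2, 12, 1.0)]
pos, neg = binarize(sample)
```

With a threshold of 3.5 and whole-star ratings, the choice between a strict and a non-strict comparison makes no difference, since no rating can equal the threshold.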
Instructions: 

None.

Dataset Files

    Files have not been uploaded for this dataset