Standard Dataset

Amazon-Book and MovieLens-1M

Citation Author(s):
Ziyang Liu (Tsinghua University)
Submitted by:
Ziyang Liu
DOI:
10.21227/ztyg-gp55

Abstract

  1. MovieLens-1M: This dataset contains user ratings for movies from the MovieLens website. It covers 6,040 users and 3,952 movies. Users rate movies on a 5-star scale; for our analysis, we convert the ratings into binary signals (positive and negative feedback) using a threshold of 3.5. The training set contains 456,138 positive and 337,990 negative feedback interactions, and the test set contains 111,412 positive feedback interactions.
  2. Amazon-Book: We take the Amazon-Book dataset from a large crawl of product reviews on Amazon. It comprises 35,736 users, 38,121 items, and 1,960,674 ratings on a 5-star scale. As with MovieLens-1M, we use a threshold of 3.5 to convert ratings into binary signals. The training set contains 1,252,292 positive and 302,056 negative feedback interactions, and the test set contains 327,682 positive feedback interactions. The dataset reflects real user behavior in an online retail environment and poses challenges related to exposure bias.
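
The thresholding step described for both datasets can be sketched roughly as follows. This is a minimal illustration, not the authors' preprocessing code: the `(user_id, item_id, stars)` tuple layout, the `binarize` function name, and the choice to treat ratings strictly above 3.5 as positive are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layouts, chosen only for this sketch.
Rating = Tuple[int, int, float]   # (user_id, item_id, stars)
Feedback = Tuple[int, int, int]   # (user_id, item_id, label)


def binarize(ratings: Iterable[Rating], threshold: float = 3.5) -> List[Feedback]:
    """Map star ratings to 1 (positive) or 0 (negative) feedback.

    Assumption: a rating strictly above the threshold counts as positive,
    everything else as negative.
    """
    return [(u, i, 1 if stars > threshold else 0) for u, i, stars in ratings]


# Toy ratings, invented for illustration; not taken from either dataset.
sample = [(1, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0), (2, 12, 1.0)]
labeled = binarize(sample)

# Split into the positive and negative interaction sets.
positives = [(u, i) for u, i, y in labeled if y == 1]
negatives = [(u, i) for u, i, y in labeled if y == 0]
print(len(positives), len(negatives))
```

With integer star ratings, a threshold of 3.5 under this assumption places 4 and 5 stars in the positive set and 1 to 3 stars in the negative set.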

Instructions:

None.

Tks bro. I will use it for my school project. Btw this dataset is very helpful

Sang Nguyen, Sat, 04/19/2025 - 02:10

Dataset Files

Files have not been uploaded for this dataset