Standard Dataset
Embedding Imputation
- Submitted by: Uras Varolgunes
- Last updated: Mon, 07/08/2024 - 15:58
- DOI: 10.21227/50qj-9k55
Abstract
These datasets are used to test an embedding imputation model in two experiments: finance and mobile applications.
Finance datasets priceMat_small and priceMat_large: Each company has an industry category label; e.g., Google belongs to the IT industry, while BlackRock belongs to the financial industry. There are eleven category labels, one per industry sector. Every company also has historical daily trading returns. The small dataset covers daily stock returns from 2016-08-24 to 2018-08-27. The large dataset contains the daily returns for 400 trading days ending on 2018-11-01.
AppleStore: contains details of more than 7,000 Apple iOS mobile applications, extracted from the iTunes Search API on the Apple Inc. website. Each app has a name and a primary genre, such as Games, Sports, or Business. There are 23 possible genres in total.
Detailed information is available on Kaggle:
https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps
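As an illustration of working with the genre labels, the following minimal sketch tallies apps per primary genre. It assumes the CSV has a `prime_genre` column, as in the Kaggle version of this file; the column name is an assumption and should be checked against the header of the downloaded file. The function name `count_genres` and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import Counter


def count_genres(csv_file, genre_column="prime_genre"):
    """Return a Counter mapping each genre to its number of apps.

    `genre_column` is an assumed column name; adjust it to match the file.
    """
    reader = csv.DictReader(csv_file)
    if genre_column not in (reader.fieldnames or []):
        raise KeyError(
            f"column {genre_column!r} not found; got {reader.fieldnames}"
        )
    return Counter(row[genre_column] for row in reader)


# Tiny in-memory stand-in for AppleStore.csv (made-up rows, illustration only).
sample = io.StringIO(
    "track_name,prime_genre\n"
    "Example Game A,Games\n"
    "Example Game B,Games\n"
    "Example Ledger,Business\n"
)
genre_counts = count_genres(sample)
print(genre_counts.most_common())
```

For the real file, the same function can be called on an open file handle, e.g. `with open("AppleStore.csv", newline="", encoding="utf-8") as f: count_genres(f)`. If the description above holds, the result should contain 23 distinct genres.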
Dataset Files
- AppleStore.csv (818.02 kB)
- priceMat_small.csv (5.04 MB)
- priceMat_large.csv (32.25 MB)
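The column layout of the priceMat files is not described here, so a reasonable first step is to inspect the header and dimensions of each file before use. The sketch below does this with the Python standard library only. The helper name `csv_shape` and the sample column names are hypothetical, and it assumes the first row of the file is a header.

```python
import csv
import io


def csv_shape(csv_file):
    """Return (header, number_of_data_rows, number_of_columns).

    Assumes the first row is a header row.
    """
    reader = csv.reader(csv_file)
    header = next(reader, [])
    n_rows = sum(1 for _ in reader)  # streams the file; no full load in memory
    return header, n_rows, len(header)


# Made-up stand-in data; the real files may be laid out differently.
sample = io.StringIO("id,col_a,col_b\nx,0.01,-0.02\ny,0.00,0.03\n")
header, n_rows, n_cols = csv_shape(sample)
print(header, n_rows, n_cols)
```

Because the rows are streamed rather than loaded at once, the same call works on the 32.25 MB priceMat_large.csv, e.g. `with open("priceMat_large.csv", newline="") as f: csv_shape(f)`.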