Abstract

These datasets are used to test an embedding imputation model in two experiments: one on finance data and one on mobile applications.

Instructions:

Finance datasets priceMat_small and priceMat_large: Each company has an industry category label, e.g., Google belongs to the IT industry, while BlackRock belongs to the financial industry. There are eleven category labels, representing eleven industry sectors. Each company also has historical daily trading returns. The small dataset covers daily stock returns from 2016-08-24 to 2018-08-27. The large dataset contains the daily returns for 400 trading days ending on 2018-11-01.
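
The column layout of the priceMat files is not described here, so a reasonable first step is to check a file's dimensions before relying on any particular structure. The sketch below is one way to do that with the Python standard library. The helper name describe_csv is hypothetical, and treating the first line as a header is an assumption that may not hold for these files.

```python
import csv
import io


def describe_csv(handle, preview=5):
    """Return (data_rows, columns, first header fields) for a CSV stream.

    Assumption: the first line is a header row. If the real file has no
    header, the row count is off by one and the preview shows data values.
    """
    reader = csv.reader(handle)
    header = next(reader, [])
    n_rows = sum(1 for _ in reader)  # rows after the first line
    return n_rows, len(header), header[:preview]


# Tiny in-memory stand-in for illustration; NOT the real file contents.
sample = io.StringIO("a,b,c\n0.01,-0.02,0.00\n0.03,0.01,-0.01\n")
summary = describe_csv(sample)
```

For a downloaded file, the same helper can be called as describe_csv(open("priceMat_small.csv", newline="")), which reads the file line by line instead of loading it all into memory.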

AppleStore: details of more than 7,000 Apple iOS mobile applications, extracted from the iTunes Search API on the Apple Inc. website. Each app has a name and a primary genre, such as Games, Sports, or Business. There are 23 possible genres in total.

Detailed information is available on Kaggle:

https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps
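
As a quick check of the genre labels, the apps per primary genre can be tallied with the Python standard library. This is a minimal sketch: the column names track_name and prime_genre are assumed from the Kaggle version of the file and should be verified against the actual header, and genre_counts is a hypothetical helper name.

```python
import csv
import io
from collections import Counter


def genre_counts(handle, genre_column="prime_genre"):
    """Count rows per primary genre in a CSV stream.

    Assumption: the genre column is named "prime_genre"; pass another
    name if the header of the downloaded file differs.
    """
    reader = csv.DictReader(handle)
    return Counter(row[genre_column] for row in reader)


# Tiny in-memory stand-in for illustration; NOT the real file contents.
sample = io.StringIO(
    "track_name,prime_genre\n"
    "App A,Games\n"
    "App B,Sports\n"
    "App C,Games\n"
)
counts = genre_counts(sample)
```

On the full file, len(counts) gives the number of distinct genres, which can be compared with the 23 genres stated above.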

Dataset Files

AppleStore.csv (818.02 kB)
priceMat_small.csv (5.04 MB)
priceMat_large.csv (32.25 MB)
