Standard Dataset

[S&P 500] and [FTSE 100] minute-level market records from May 16, 2021, to April 20, 2022

Citation Author(s):
Guan-Ju Peng (https://datascience.nchu.edu.tw/en/index.php)
Yu-Hsi Chen (https://datascience.nchu.edu.tw/en/index.php)
Yi-Chieh Chen (https://phddsia.nchu.edu.tw/)
Submitted by:
YI CHIEH CHEN
Last updated:
DOI:
10.21227/cbdv-tg61

Abstract

This dataset contains one-minute-frequency financial data for 19 assets, which together form the spatial domain, across five categories: stock indices and futures (NASDAQ Composite, S&P 500, FTSE 100, DJIA, Nikkei 225, E-mini Russell 2000, and S&P 500, DJIA, and NASDAQ futures contracts), commodity futures (Crude Oil/USD, Silver/USD, Gold/USD), cryptocurrencies (BTC/USD, CMC Crypto 200 Index), foreign exchange pairs (GBP/USD, EUR/USD, USD/JPY), and other instruments (the 10-Year U.S. Treasury Yield and the Volatility Index, VIX).

For each asset and each minute, the dataset provides normalized price features: the high, low, and open values scaled by the closing price, the complement of the scaled open (1 - open/close), and six n-minute moving averages of the closing price (n ∈ {5, 10, 15, 20, 25, 30}), each normalized by the current closing price. These features are concatenated into a 10-dimensional feature vector per asset per time step.

The data spans from May 16, 2021, to April 20, 2022, and covers all regular trading days. It is suitable for tasks such as financial signal denoising, feature extraction, time-series modeling, and spatio-temporal learning.

Instructions:

Dataset Title: Multi-Asset Minute-Level Financial Dataset (May 2021 – April 2022)

Overview:
This dataset contains one-minute frequency financial data collected from 19 assets across five categories:
- Stock Indices and Futures: NASDAQ Composite, S&P 500, DJIA, Nikkei 225, FTSE 100, E-mini Russell 2000, and futures (S&P 500 Futures, DJIA Futures, NASDAQ Futures)
- Commodity Futures: Crude Oil/USD, Silver/USD, Gold/USD
- Cryptocurrencies: BTC/USD, CMC Crypto 200 Index
- Foreign Exchange: GBP/USD, EUR/USD, USD/JPY
- Other Instruments: 10-Year U.S. Treasury Yield, Volatility Index (VIX)

Time Span:
May 16, 2021 – April 20, 2022  
(Only market open days are included)

Data Format:
- Minute-level records
- Each asset has 10-dimensional features per time step:
   [high/end, low/end, open/end, 1 - open/end, 
    m5, m10, m15, m20, m25, m30]
- open and end denote the opening and closing prices.
- high and low denote the highest and lowest prices.
- Each moving average (mN) is calculated over the closing prices of N minutes and normalized by the current closing price.
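
As a minimal sketch of how the 10 features listed above could be derived for one asset from raw minute bars: the function name `minute_features` and its input arrays are hypothetical, and the moving average is assumed to be a trailing simple mean of the closing price that includes the current minute. The dataset's actual preprocessing may differ, for example in how the first N - 1 minutes are handled.

```python
import numpy as np


def minute_features(open_, high, low, end, windows=(5, 10, 15, 20, 25, 30)):
    """Build a (T, 10) feature matrix for one asset from minute bars.

    Assumption (not confirmed by the dataset description): mN is a trailing
    simple mean of the closing price over the last N minutes, current minute
    included, divided by the current close. Rows that do not yet have a full
    window are left as NaN.
    """
    open_, high, low, end = (
        np.asarray(a, dtype=float) for a in (open_, high, low, end)
    )
    # First four features: prices scaled by the close, plus 1 - open/end.
    cols = [high / end, low / end, open_ / end, 1.0 - open_ / end]

    # Prefix sums give every trailing window sum in O(T).
    csum = np.concatenate(([0.0], np.cumsum(end)))
    for n in windows:
        ma = np.full(end.shape, np.nan)
        ma[n - 1:] = (csum[n:] - csum[:-n]) / n
        cols.append(ma / end)
    return np.stack(cols, axis=1)


# Synthetic minute bars, for illustration only.
rng = np.random.default_rng(0)
end = 100 + np.cumsum(rng.normal(0, 0.05, size=60))
open_ = np.roll(end, 1)
open_[0] = end[0]
high = np.maximum(open_, end) + 0.01
low = np.minimum(open_, end) - 0.01

feats = minute_features(open_, high, low, end)
print(feats.shape)  # (60, 10)
```

Because every feature is a ratio to the current close, the values are scale-free and comparable across assets with very different price levels.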


File Format:
All files are in NumPy binary format (.npy) and can be loaded with:
   import numpy as np
   features = np.load('ftse100/features.npy')

File Structure:
- sp500/
   ├── features.npy        # Shape: (3636, 32, 190)
   ├── targets.npy         # Shape: (3636, 24, 2)
   └── targets_eval.npy    # Shape: (3636, 24, 2)
- ftse100/
   ├── features.npy        # Shape: (4848, 32, 190)
   ├── targets.npy         # Shape: (4848, 24, 2)
   └── targets_eval.npy    # Shape: (4848, 24, 2)
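
A small loading helper can catch a truncated download or a wrong folder early by comparing each array against the shapes listed above. The sketch below is one way to do that; the function `load_market`, its `n_windows` override, and the returned dictionary layout are illustrative choices, not part of the dataset.

```python
from pathlib import Path

import numpy as np

# Window counts taken from the file structure listed above.
N_WINDOWS = {"sp500": 3636, "ftse100": 4848}

# Trailing dimensions of each file: (sequence length, channels).
FILES = {
    "features.npy": (32, 190),
    "targets.npy": (24, 2),
    "targets_eval.npy": (24, 2),
}


def load_market(root, market, n_windows=None):
    """Load the three arrays of one market folder and verify their shapes.

    `n_windows` overrides the expected number of sliding windows, which is
    useful when working with a subset of the data.
    """
    expected_t = N_WINDOWS[market] if n_windows is None else n_windows
    arrays = {}
    for filename, tail in FILES.items():
        array = np.load(Path(root) / market / filename)
        if array.shape != (expected_t, *tail):
            raise ValueError(
                f"{market}/{filename}: unexpected shape {array.shape}"
            )
        arrays[Path(filename).stem] = array
    return arrays
```

Assuming the archive was extracted into the current directory, `load_market(".", "ftse100")` would return a dictionary with the keys `features`, `targets`, and `targets_eval`.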

Description:
1. features.npy
  - Shape: (T, N, D)
    - T: Number of time steps (sliding windows): 3636 for S&P 500, 4848 for FTSE 100
    - N: Input sequence length = 32
    - D: Feature dimension = 190 (19 financial assets × 10 features each)
2. targets.npy
  - Shape: (T, H, 2)
  - H: Prediction sequence length = 24
  - Each label is a one-hot binary classification vector:
    - [1, 0] for downward movement
    - [0, 1] for upward movement

3. targets_eval.npy
  - Shape: (T, H, 2)
  - H: Prediction sequence length = 24
  - Each entry:
    - [negative_return_abs, 0] if return is negative
    - [0, positive_return] if return is positive
  - Only one of the two elements is non-zero per time step.
  - Used for calculating return-based performance (e.g., Sharpe ratio).
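
The array layouts described above can be unpacked with a few NumPy operations. The sketch below is illustrative only: all function names are hypothetical, the per-asset reshape assumes the 190 columns are stored asset by asset (10 consecutive columns each), and the Sharpe ratio assumes a simple strategy that goes long one unit on a predicted rise and short one unit on a predicted fall. None of these choices is confirmed by the dataset description.

```python
import numpy as np


def per_asset_view(features, n_assets=19, n_feats=10):
    """Reshape (T, N, 190) features to (T, N, 19, 10).

    Assumption: columns are laid out asset by asset, 10 columns per asset.
    """
    t, n, d = features.shape
    if d != n_assets * n_feats:
        raise ValueError(f"expected {n_assets * n_feats} columns, got {d}")
    return features.reshape(t, n, n_assets, n_feats)


def direction_labels(targets):
    """One-hot (T, H, 2) labels -> class indices (T, H): 0 = down, 1 = up."""
    return targets.argmax(axis=-1)


def signed_returns(targets_eval):
    """(T, H, 2) -> signed returns (T, H).

    Slot 1 holds the positive return and slot 0 the magnitude of the
    negative return, so the signed return is their difference.
    """
    return targets_eval[..., 1] - targets_eval[..., 0]


def sharpe_ratio(predicted_up, targets_eval):
    """Non-annualized Sharpe ratio of a long/short strategy (illustrative).

    `predicted_up` is a boolean (T, H) array: True means a predicted rise.
    """
    position = np.where(predicted_up, 1.0, -1.0)
    pnl = (position * signed_returns(targets_eval)).ravel()
    std = pnl.std()
    return pnl.mean() / std if std > 0 else float("nan")


# Tiny synthetic example: one window, two prediction steps.
targets = np.array([[[0, 1], [1, 0]]])                 # up, then down
targets_eval = np.array([[[0.0, 0.02], [0.01, 0.0]]])  # +2%, then -1%

labels = direction_labels(targets)
print(labels)                        # [[1 0]]
print(signed_returns(targets_eval))  # +0.02, then -0.01
print(sharpe_ratio(labels == 1, targets_eval))
```

Since consecutive sliding windows overlap, per-step returns pooled across windows are not independent, so a ratio computed this way is best read as a rough comparison metric rather than a tradable estimate.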

Funding Agency
National Science and Technology Council
Grant Number
113-2115-M-005-005-MY2