
Analysis

Model 1.5.1

Citation Author(s):
Konstantinos Theodorakos
Submitted by:
Konstantinos Theodorakos
Last updated:

Abstract

 


 

DART (Dropouts meet Multiple Additive Regression Trees) combined with the average monthly ratios of clustered meters.

 

Monthly consumption (kWh) clustering:

 

1. Twelve Spectral Clustering models, one per signup month.

2. Clustering uses features extracted from the daily consumption (kWh) series. The cluster count was set to 4, which worked better empirically than the 5 or 6 suggested by the elbow method on distortion.

 

Forecasting for a meter: the meter's mean monthly consumption is combined with the month-to-month ratios of its cluster.
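As a rough illustration of how a mean consumption and a set of cluster ratios could be combined, the sketch below chains month-to-month ratios into a relative profile and scales it by a meter's mean monthly consumption. The exact combination rule is not stated here, so the normalisation to a mean of 1, the function names (`profile_from_ratios`, `forecast_meter`) and the numbers are all assumptions made for the example.

```python
from statistics import fmean


def profile_from_ratios(ratios):
    """Chain month-to-month ratios (next month / previous month, as fractions)
    into a relative monthly profile whose mean is 1."""
    levels = [1.0]
    for ratio in ratios:
        levels.append(levels[-1] * ratio)
    mean_level = fmean(levels)
    return [level / mean_level for level in levels]


def forecast_meter(mean_monthly_kwh, ratios):
    """Scale the relative cluster profile by the meter's mean monthly kWh."""
    return [mean_monthly_kwh * p for p in profile_from_ratios(ratios)]


# Hypothetical cluster ratios for three month-to-month transitions.
cluster_ratios = [2.0, 0.5, 1.0]
forecast = forecast_meter(100.0, cluster_ratios)
print(forecast)
```

Because the profile is normalised, the forecast keeps the meter's own mean level and only borrows the seasonal shape from the cluster.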

 

Clustering

 

Features (170), computed on the weekend, weekday and full series:

- Statistical: median, variance, quantiles, ...

- Time-series: autocorrelations, trends, seasonalities, ...
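A minimal sketch of this kind of feature extraction, using only the Python standard library, is shown below. It covers a handful of the statistical features and one autocorrelation, not the full set of 170; the helper names, the choice of lag 1 and the synthetic data are assumptions for illustration.

```python
from datetime import date, timedelta
from statistics import fmean, median, pvariance, quantiles


def autocorrelation(values, lag=1):
    """Sample autocorrelation at `lag` (0.0 for constant or too-short input)."""
    mean = fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0 or lag >= len(values):
        return 0.0
    num = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(len(values) - lag))
    return num / denom


def describe(values, prefix):
    """A few statistical and time-series features for one view of the data."""
    q25, _, q75 = quantiles(values, n=4)
    return {
        f"{prefix}_median": median(values),
        f"{prefix}_variance": pvariance(values),
        f"{prefix}_q25": q25,
        f"{prefix}_q75": q75,
        f"{prefix}_acf1": autocorrelation(values, lag=1),
    }


def extract_features(daily):
    """daily: list of (date, kWh) pairs for one meter."""
    views = {
        "full": [kwh for _, kwh in daily],
        "weekday": [kwh for day, kwh in daily if day.weekday() < 5],
        "weekend": [kwh for day, kwh in daily if day.weekday() >= 5],
    }
    features = {}
    for name, values in views.items():
        features.update(describe(values, name))
    return features


# Synthetic example: four weeks starting on Monday 2018-01-01,
# 10 kWh on weekdays and 14 kWh on weekends.
start = date(2018, 1, 1)
daily = []
for offset in range(28):
    day = start + timedelta(days=offset)
    daily.append((day, 14.0 if day.weekday() >= 5 else 10.0))

features = extract_features(daily)
```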

 

Month-to-month ratios:

- The trend component of a robust STL decomposition (Seasonal-Trend decomposition using LOESS) was used to calculate the month-to-month percentage ratios. All monthly averages were replaced with the newly computed ratios.
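The ratio step can be sketched as follows, assuming the monthly trend values have already been extracted (for example with the `STL` class in statsmodels and `robust=True`; the decomposition itself is not shown). The function name and the sample trend are hypothetical, and a month-to-month ratio is taken here to mean each month's trend value as a percentage of the previous month's.

```python
def month_to_month_ratios(trend):
    """Percentage ratio of each month's trend value to the previous month's.

    `trend` is assumed to be a list of positive monthly trend values;
    the result has one entry fewer than the input.
    """
    return [100.0 * current / previous
            for previous, current in zip(trend, trend[1:])]


# Hypothetical trend values for four consecutive months (kWh).
trend = [100.0, 110.0, 99.0, 99.0]
ratios = month_to_month_ratios(trend)
print(ratios)
```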

 

Preprocessing:

- Time-series: zero values were converted to NaN and then dropped.
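A small stdlib-only sketch of this cleaning step is given below; the function name and sample readings are made up for the example. With pandas, the same effect is commonly obtained with `Series.replace(0, np.nan).dropna()`.

```python
import math


def drop_zero_readings(series):
    """Mark zero readings as NaN, then drop every NaN entry.

    `series` is a list of (timestamp, kWh) pairs.
    """
    marked = [(t, math.nan if value == 0 else value) for t, value in series]
    return [(t, value) for t, value in marked if not math.isnan(value)]


# Hypothetical daily readings; the zero on 2018-01-02 is treated as missing.
readings = [("2018-01-01", 5.2), ("2018-01-02", 0.0), ("2018-01-03", 4.8)]
cleaned = drop_zero_readings(readings)
print(cleaned)
```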

 

Regression Forecasting:

- DART Gradient Boosted Trees (GBT) were fitted on the calculated 2018 month ratios and evaluated with nested cross-validation. Dual annealing was used for hyper-parameter optimization.
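The nested cross-validation structure can be sketched with plain index splits, as below. This only illustrates the split layout: the outer folds estimate generalisation error, while the hyper-parameter search (dual annealing in this work, available for instance as `scipy.optimize.dual_annealing`) would run on the inner folds of each outer training set. The fold counts, function names and interleaved fold assignment are assumptions, and no model is fitted here.

```python
def k_fold_indices(indices, k):
    """Yield (train, test) index lists for k interleaved folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    inner_splits is a list of (train, validation) pairs drawn only from
    outer_train, so the outer test fold is never seen during tuning.
    """
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        inner_splits = list(k_fold_indices(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


# Hypothetical setup: 12 samples, 3 outer folds, 2 inner folds.
splits = list(nested_cv_splits(12, outer_k=3, inner_k=2))
for outer_train, outer_test, inner_splits in splits:
    print(len(outer_train), len(outer_test), len(inner_splits))
```

Keeping the tuning inside the inner folds is what stops the outer error estimate from being biased by the hyper-parameter search.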