Echo State Network
University Project #Machine Learning

K-Step Ahead Forecasting#

A custom implementation of an echo state network (ESN) to perform k-step ahead forecasting on the 2sin and Lorenz functions.

NOTE

Please click the link in the header to see the full write-up along with all the code and diagrams used.

Intro#

A k-step ahead forecasting task consists of predicting the value of a time series at time $t+k$ from its value at time $t$, where $k \geq 1$ is called the forecasting horizon. In general, the predicted value is unidimensional (i.e. a single number). However, it is possible to use multiple input values in order to improve the results. Notably, once $k$ is decided, the output to be predicted is the value of the time series at time $t+k$, and the input may be a vector containing the values of the time series at times $t, t-1, \ldots, t-n$, where $n \geq 0$ is defined by the user and sets the dimensionality of the input vector.
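As a concrete illustration, the sketch below builds these (input, target) pairs with NumPy; the helper name `make_kstep_dataset` and the exact form of the 2sin series (assumed here to be a sum of two sinusoids) are illustrative, not taken from the original code.

```python
import numpy as np

def make_kstep_dataset(series, k, n=0):
    """Build (input, target) pairs for k-step ahead forecasting.

    Each input is the window [y(t-n), ..., y(t)] of n+1 values and the
    target is the single value y(t+k).
    """
    X, Y = [], []
    for t in range(n, len(series) - k):
        X.append(series[t - n:t + 1])  # n+1 past values, ending at time t
        Y.append(series[t + k])        # the single value k steps ahead
    return np.array(X), np.array(Y)

# Example: an assumed 2sin-style series
ts = np.linspace(0, 50, 2000)
series = np.sin(ts) + np.sin(2 * ts)
X, Y = make_kstep_dataset(series, k=3, n=2)
print(X.shape, Y.shape)  # (1995, 3) (1995,)
```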

Conclusion#

Overall, there are several conclusions that can be drawn from all of these models.

Firstly, as $k$ increases in K-Step Ahead Forecasting, the correlation between the MSE and the reservoir size $N_r$ seems to decrease, potentially becoming negative.

Secondly, K-Step Ahead Forecasting will, in general, provide better results the lower $k$ is, with the best results obtained when $k = 1$. This is because, to predict $k$ steps ahead of the current time-step, the network must predict all the steps between time-step $t$ and time-step $t+k$. This causes the error introduced by the ESN to compound on itself, as each state calculation needs to use a prediction as the input $u_t$.
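To make the compounding concrete, here is a minimal sketch of this closed-loop roll-out, assuming the standard $\tanh$ reservoir update; the names `forecast_k_steps`, `W_in`, `W`, and `W_out` are assumptions, not the write-up's actual implementation.

```python
import numpy as np

def forecast_k_steps(u_t, x, k, W_in, W, W_out):
    """Roll an ESN forward k steps in closed loop.

    Assumed shapes: W_in (Nr,) input weights, W (Nr, Nr) reservoir
    weights, W_out (Nr,) trained linear readout. After the first step,
    each state update uses the previous *prediction* as its input, so
    readout errors compound as k grows.
    """
    u = u_t
    for _ in range(k):
        x = np.tanh(W_in * u + W @ x)  # standard tanh reservoir update
        u = W_out @ x                  # readout; prediction fed back as input
    return u  # estimate of the series value k steps ahead
```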

Thirdly, the difference between a local minimum and the global minimum in hyper-parameter space can result in vastly different Mean Squared Errors, as shown by 2Sin’s 2-Step and 3-Step predictions.

Fourthly, if the comparison made between gradient descent and the hyper-parameter search in the description of my optimize_hyperparams function is correct, the addition of a gradient descent-like algorithm to optimize the weights (hyper-parameters) would likely result in better forecasts for all $k$.
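For context, the kind of hyper-parameter search being compared against gradient descent might look like the random-search sketch below. This is not the write-up's `optimize_hyperparams`; the search ranges and the assumed `train_and_score` callback (which would build an ESN with the given settings and return its validation MSE) are illustrative.

```python
import numpy as np

def random_search(train_and_score, n_trials=50, seed=0):
    """Illustrative random search over typical ESN hyper-parameters."""
    rng = np.random.default_rng(seed)
    best_mse, best_params = np.inf, None
    for _ in range(n_trials):
        params = {
            "Nr": int(rng.integers(50, 500)),          # reservoir size
            "spectral_radius": rng.uniform(0.1, 1.4),  # reservoir weight scaling
            "input_scaling": rng.uniform(0.1, 2.0),
            "ridge": 10.0 ** rng.uniform(-8, -2),      # readout regularization
        }
        mse = train_and_score(**params)  # assumed: trains an ESN, returns MSE
        if mse < best_mse:
            best_mse, best_params = mse, params
    return best_mse, best_params
```

A gradient-descent-like refinement would replace this blind sampling with local steps in the direction that decreases the validation MSE.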
