Applied Energy, Vol.239, 610-625, 2019
Prediction and explanation of the formation of the Spanish day-ahead electricity price through machine learning regression
Until recently, the detailed information on the power system state needed to estimate future spot prices by regression analysis was generally restricted to qualified parties. However, to ensure transparency in operation, the Spanish Transmission System Operator has launched an informative website on which a sizable amount of real-time energy-related data can be consulted through a graphical interface. This gives non-qualified parties the opportunity to develop applications and algorithms that require price forecasts and, possibly, knowledge of how the price is determined. This paper uses data extracted from that interface with two aims: predicting the day-ahead price in a simple way, and exploring the influence that the underlying energy drivers have on it. For the prediction we specified a quantile regression model based on Gradient Boosted Regression Trees. It improves accuracy over multiple linear regression models at the cost of more complexity, yet it is simpler to specify and tune than other machine learning approaches. The calculated metrics show that our model produces remarkably low prediction errors when the median is used as the point prediction (RMSE = 2.78 €/MWh, MAE = 1.94 €/MWh, and MAPE = 0.059). The quantile regression model also inherently defines prediction intervals, which offer a different interpretation of accuracy. Our results show that, on average, the prediction error does not exceed 6.8 €/MWh 90% of the time. We also implemented a partial dependence analysis on that model. This implementation, to our knowledge the first applied to the formation of electricity prices, proved highly useful in detecting strongly non-linear relationships.
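The idea behind using the median as the point prediction and a pair of outer quantiles as a prediction interval can be illustrated with the pinball (quantile) loss, which is the objective that quantile regression minimizes. The sketch below is not the authors' implementation: the function names `pinball_loss` and `best_constant`, the quantile levels 0.05 and 0.95, and the price values are all hypothetical, and a constant predictor stands in for the boosted trees.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def best_constant(y_true, q, candidates):
    """Return the candidate constant prediction with the lowest pinball loss."""
    return min(
        candidates,
        key=lambda c: pinball_loss(y_true, [c] * len(y_true), q),
    )


# Hypothetical hourly prices in EUR/MWh (illustrative values, not from the paper).
prices = [38.0, 41.5, 45.0, 47.2, 50.1, 52.3, 55.0, 58.4, 61.0, 70.5, 42.0]
candidates = sorted(prices)

# q = 0.5 recovers the sample median, used here as the point prediction.
median_pred = best_constant(prices, 0.5, candidates)

# Assumed levels 0.05 and 0.95 bound a nominal 90% prediction interval.
lower_pred = best_constant(prices, 0.05, candidates)
upper_pred = best_constant(prices, 0.95, candidates)

print(median_pred, lower_pred, upper_pred)
```

In a full model the constant is replaced by a regressor fitted once per quantile level; scikit-learn's `GradientBoostingRegressor`, for example, accepts `loss="quantile"` together with an `alpha` parameter for this purpose, although the abstract does not state which library was used.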