Eric Zelikman*, Sharon Zhou*, Jeremy Irvin*, Cooper Raterink, Hao Sheng, Anand Avati, Jack Kelly, Ram Rajagopal, Andrew Y. Ng†, David J. Gagne†
Increasing adoption of renewable energy in the electricity sector is essential to the reduction of anthropogenic greenhouse gas emissions. Due to the high volatility and intermittency of solar energy, solar forecasting methods have become necessary to increase the penetration of solar power into the grid while ensuring cost-effectiveness and security. Solar forecasting methods that characterize uncertainty have the potential to aid real-time grid integration of solar energy and help gauge when to deploy new storage.
Figure 1. NGBoost 5-minute-resolution probabilistic forecasts, starting 5 minutes ahead and extending to one hour ahead, shown here for several days at the Boulder SURFRAD station.
We used public data from NOAA’s Surface Radiation (SURFRAD) network, which consists of seven stations throughout the continental U.S. that measure a variety of meteorological variables. The model inputs were relative humidity, wind speed, wind direction, air pressure, time of day, solar zenith angle, air temperature, and the five previous irradiance values up to the forecast time; the output was daytime global horizontal irradiance (GHI).
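To make the input construction concrete, the following is a minimal sketch of pairing the five previous irradiance values with a future GHI target. It is not the preprocessing code used in this work: the function name `make_lagged_samples`, its parameters, and the toy series are illustrative assumptions, and the meteorological inputs (humidity, wind, pressure, and so on) are omitted for brevity.

```python
def make_lagged_samples(ghi, n_lags=5, horizon=1):
    """Pair each window of `n_lags` past GHI values with the value
    `horizon` steps ahead. Hypothetical helper for illustration only."""
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    samples = []
    # t is the index of the most recent observation available at forecast time.
    for t in range(n_lags - 1, len(ghi) - horizon):
        past = list(ghi[t - n_lags + 1 : t + 1])  # the n_lags latest values
        target = ghi[t + horizon]                 # value to be forecast
        samples.append((past, target))
    return samples


# Toy 5-minute GHI series in W/m^2 (made-up numbers).
series = [100.0, 120.0, 150.0, 170.0, 200.0, 210.0, 190.0]
samples = make_lagged_samples(series, n_lags=5, horizon=1)
for past, target in samples:
    print(past, "->", target)
```

With 5-minute data, `horizon=1` corresponds to a 5-minute-ahead forecast and `horizon=12` to a one-hour-ahead forecast.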
We developed four probabilistic models which output a probability distribution over the outcome space instead of a point prediction: a Gaussian process regression model, a neural network with uncertainty based on dropout variation (Dropout Neural Network), a neural network whose predictions parameterize a Gaussian distribution optimized to maximize likelihood (Variational Neural Network), and a decision-tree based model using natural gradient boosting (NGBoost) assuming a Gaussian output distribution.
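Several of these models assume a Gaussian output distribution and are fit by maximizing likelihood. As a rough illustration of that objective (a sketch, not the training code used in this work), the negative log-likelihood of an observation under a predicted mean and standard deviation can be written as follows. The function names and the example numbers are assumptions made for illustration.

```python
import math


def gaussian_nll(mu, sigma, y):
    """Negative log-likelihood of observation y under N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


def mean_gaussian_nll(mus, sigmas, ys):
    """Average NLL over a set of forecasts; lower is better."""
    losses = [gaussian_nll(m, s, y) for m, s, y in zip(mus, sigmas, ys)]
    return sum(losses) / len(losses)


# An overconfident forecast (small sigma, large error) is penalized heavily.
confident_but_wrong = gaussian_nll(mu=0.0, sigma=0.1, y=1.0)
appropriately_uncertain = gaussian_nll(mu=0.0, sigma=1.0, y=1.0)
print(confident_but_wrong, appropriately_uncertain)
```

Minimizing this loss rewards a model for predicting both an accurate mean and a standard deviation that matches the size of its errors.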
We additionally explored the use of post-hoc probabilistic calibration methods for encouraging well-calibrated predictions, including the Kuleshov method, CRUDE, and maximum likelihood estimation (MLE).
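The general idea behind empirical, residual-based recalibration can be sketched as follows. This is an illustrative simplification and not the implementation of any of the methods named above: it assumes Gaussian-style forecasts given as a mean and standard deviation, collects standardized residuals on a held-out calibration set, and then reads calibrated quantiles off their empirical distribution. All function names and numbers are hypothetical.

```python
import math


def standardized_residuals(mus, sigmas, ys):
    """Sorted z-scores (y - mu) / sigma on a held-out calibration set."""
    return sorted((y - m) / s for m, s, y in zip(mus, sigmas, ys))


def empirical_quantile(sorted_z, p):
    """Linearly interpolated empirical quantile for p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    pos = p * (len(sorted_z) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_z) - 1)
    frac = pos - lo
    return sorted_z[lo] * (1.0 - frac) + sorted_z[hi] * frac


def calibrated_quantile(mu, sigma, sorted_z, p):
    """Shift and scale the empirical z-score quantile by a new forecast."""
    return mu + sigma * empirical_quantile(sorted_z, p)


# Toy calibration set in which errors are wider than the predicted sigma of 1.
z_scores = standardized_residuals(
    mus=[0.0] * 5, sigmas=[1.0] * 5, ys=[-4.0, -2.0, 0.0, 2.0, 4.0]
)
median = calibrated_quantile(10.0, 2.0, z_scores, 0.5)
upper = calibrated_quantile(10.0, 2.0, z_scores, 0.75)
print(median, upper)
```

Because the calibration residuals in this toy example are more spread out than the model predicted, the recalibrated upper quantile lands farther from the mean than the uncalibrated Gaussian quantile would.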
We computed the test continuous ranked probability score (CRPS; lower is better) of the probabilistic models with different calibration methods across all seven stations, averaged over each 5-minute horizon from 5 minutes to an hour. NGBoost attained the best performance on all stations.
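For a Gaussian forecast, the CRPS has a well-known closed form, which the sketch below evaluates and averages over forecast horizons. It is an illustration under the Gaussian assumption rather than the evaluation code used in this work; the function names and the per-horizon numbers are made up.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of the forecast N(mu, sigma^2) at observation y."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return sigma * (
        z * (2.0 * normal_cdf(z) - 1.0)
        + 2.0 * normal_pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


def mean_crps(forecasts, observations):
    """Average CRPS over (mu, sigma) forecasts, e.g. one per horizon."""
    scores = [crps_gaussian(m, s, y) for (m, s), y in zip(forecasts, observations)]
    return sum(scores) / len(scores)


# Hypothetical forecasts for three horizons (W/m^2) and the observed values.
forecasts = [(500.0, 20.0), (510.0, 35.0), (520.0, 50.0)]
observations = [505.0, 490.0, 560.0]
print(mean_crps(forecasts, observations))
```

The CRPS is expressed in the units of the forecast variable and reduces to the absolute error as the predicted standard deviation approaches zero, which makes it convenient for comparing probabilistic and point forecasts.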
We compared NGBoost to benchmark solar irradiance forecasting models on the intra-hourly resolution task, namely the complete history persistence ensemble (CH-P), persistence ensemble (PeEn), and a Markov-chain Mixture model (MCM). NGBoost outperformed each of these models, with improvements over MCM ranging from 5% to over 15%.
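Ensemble benchmarks such as these can be scored with the sample-based CRPS, and a relative improvement can then be computed against a reference model. The sketch below assumes a simple persistence-style ensemble formed from the most recent observations, which may differ from how the benchmarks in the paper are defined; the function names and all numbers are hypothetical.

```python
def crps_ensemble(members, y):
    """Sample-based CRPS: mean |x_i - y| minus half the mean |x_i - x_j|."""
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must be non-empty")
    term1 = sum(abs(x - y) for x in members) / m
    term2 = sum(abs(a - b) for a in members for b in members) / (2.0 * m * m)
    return term1 - term2


def persistence_ensemble(history, n_members):
    """Assumed persistence-style ensemble: the n most recent observations."""
    return list(history[-n_members:])


def percent_improvement(crps_model, crps_reference):
    """Relative CRPS reduction of a model over a reference, in percent."""
    return 100.0 * (1.0 - crps_model / crps_reference)


# Toy example with made-up GHI values in W/m^2.
history = [480.0, 495.0, 510.0, 505.0, 520.0]
ensemble = persistence_ensemble(history, n_members=3)
reference_score = crps_ensemble(ensemble, y=515.0)
print(reference_score)
print(percent_improvement(crps_model=0.85, crps_reference=1.0))
```

A positive improvement means the model has a lower CRPS than the reference; for example, a model CRPS of 0.85 against a reference of 1.0 corresponds to an improvement of about 15%.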
We compared NGBoost with CRUDE post-hoc calibration, denoted NGBoost(+C), to benchmark solar irradiance forecasting models on the hourly resolution task, including the complete history persistence ensemble (CH-P), Gaussian error distribution (Gau), and a standard numerical weather prediction model (NWP) from the European Centre for Medium-Range Weather Forecasts. NGBoost(+C) was the best model on three stations (CO, NV, MT), the NWP model was the best on three stations (IL, MS, SD), and the Gaussian error distribution was the best on one station (PA).
If you have questions about our work, contact us at: ezelikman@cs.stanford.edu, sharonz@cs.stanford.edu, and jirvin16@cs.stanford.edu