Automated Adaptive-Ensemble Framework for Large Wind Power Prediction in Poland Using Deep Learning Models

Prediction of large-scale wind power is a significant factor in the robust and resilient operation of modern power systems. As a result, many studies have addressed wind power forecasting up to a day ahead. Taking into account the abilities of machine learning (ML) models and their combinations, this paper presents a framework for 60-hour wind power forecasting in Poland. Possessing accurate wind power forecasts for more than one day ahead enables the transmission system operator (TSO) and distribution system operator (DSO) to improve the control of grid traffic and to assimilate power generated by renewable energy sources (RES) more efficiently. The presented method uses the geographical coordinates of wind farms in the area of interest together with their wind-power curves, and combines them with weather parameters such as wind speed and direction, wind gust, air temperature, and pressure for improved accuracy. The novelty of the proposed method is that the model can adapt autonomously according to its past accuracy. A back-score accuracy is tracked based on past predictions and measured wind power generation. Furthermore, the model combines different ML models that can be adapted or retrained if prediction performance drops below an acceptable level. The paper presents the adaptive ensemble-model method and a performance analysis in which its accuracy is calculated against actual data measured in the National Power System of Poland. A further novelty is that the proposed framework can start from only an approximate model of large wind generation, which is then fine-tuned during operation as back-accuracy measurements trigger ensemble-model adaptation.


INTRODUCTION
Wind generation forecasting is commonly based on numerical weather prediction (NWP) combined with wind farms' geographical locations and technical data [1,2]. The increasing computational abilities of servers have allowed classical wind generation forecasting to be combined with different ML methods, as proposed in [3], to achieve better prediction performance [4]. In [5], the Authors proposed a Support Vector Machine (SVM)-based regression algorithm for day-ahead prediction of wind energy, while in [6] the wind generation forecast was made using SVM, and the probability distribution of the SVM forecast error was estimated by Sparse Bayesian Learning (SBL). There are also methods based on Long Short-Term Memory (LSTM), which have been discussed in, among others, [7][8][9]. Different combinations of LSTM with other approaches or neural network models have been used for wind power prediction. In [8], a time-section fusion pattern classification forecasting model is proposed. It includes multiple forecasting models based on different machine learning theories, such as LSTM, SVM, and autoregressive integrated moving average (ARIMA), which are used to predict wind power. In [7], wind speed and direction from neighboring points in the vicinity of the wind turbine or wind farm were included. Furthermore, to improve forecasting, the use of Convolutional Long Short-Term Recurrent Neural Networks (convLSTM) was proposed. Their advantage is that they capture both temporal dependencies in the wind time series and spatial dependencies obtained from geographically scattered wind forecasts.
A similar solution can be found in [10], where a multi-task convolutional long short-term recurrent neural network (convLSTM) was applied to predict the energy output of the wind turbine while simultaneously modeling the spatio-temporal structure of the input wind flow. In [11], the Authors proposed a novel two-layer nonlinear combination method for short-term wind speed prediction problems, such as 10-min-ahead and 1-h-ahead forecasts. The first layer uses an extreme learning machine (ExLM), an Elman neural network (ENN), and LSTM to forecast wind speed separately, exploiting their respective merits of calculation speed or strong forecasting ability. The second layer uses an ENN-based nonlinear aggregation mechanism to alleviate the inherent weaknesses of a single method and of linear combination. In [12], the Authors proposed a new model that considers climate factors such as temperature and air pressure. It consists of a mathematical morphology decomposer (MMD) and two LSTM networks that perform ultra-short-term wind speed forecasts. The MMD is developed to improve the forecast accuracy by separating the wind speed into two parts: a stationary long-term baseline and a nonstationary short-term residue. In [13], a prediction system is proposed that uses gated recurrent units (GRU) in a deep learning prediction model combined with the relevant historical wind power data.
According to the above review, the prediction of large-scale wind power is an essential factor in the robust and resilient operation of modern power systems. To the best knowledge of the Authors, existing research addresses day-ahead wind power forecasting, such as [14][15][16][17][18][19], where different ML techniques have been proposed, such as tree-based learning algorithms with the XGBoost method [14] or a cascaded convolutional neural network (CNN) with an aggregator in the form of radial basis function neural networks (RBF). Furthermore, ensemble methods were proposed in [16][17][18][19], where different and modified combinations of neural networks with statistical methods were used for wind power forecasting. As stated above, ensemble methods using other ML models and algorithms with chosen statistical processing have proven accurate for wind power forecasting. Inspired by this fact, this work presents a framework for 2.5-day (60-hour) large-scale wind power forecasting, which, in the Authors' opinion, fills the gap in longer-horizon forecasting of large wind power. Possessing accurate wind power predictions for more than one day ahead allows the TSO and DSO to control the grid traffic precisely and to assimilate the RES-generated energy efficiently. The presented method uses the geographical coordinates of each wind turbine in the area of interest together with its wind-power curves. It combines NWP with weather parameters such as air temperature, wind direction, gust, and pressure for improved accuracy. The novelty of the proposed method is that the presented ML ensemble model is checked against a performance error every six hours, which in this case means with every new NWP. If the performance error is below a certain level, the model remains unchanged and performs a 60-hour wind power prediction.
On the other hand, if the error is higher than the error threshold, the model is adapted, trained using the newest data, and assembled into a new architecture before performing the 60-hour forecast. The rest of the paper is structured as follows. First, the ML methods and component models used for large-scale wind power forecasting in Poland are introduced with a comparative analysis. Then, the adaptive ensemble ML framework for large wind power forecasting in Poland is proposed, with results compared to the wind-generated power measured by the TSO. Finally, the conclusions gather the interpretation and analysis of the achieved results together with future work.

Data acquisition and assessment of wind generation in Poland
As stated in the Introduction, predicting the wind generation in the whole country is essential from the TSO and DSO perspectives. The more accurate the wind power prediction, the better the thermal generation scheduling, the use of transmission infrastructure, and, in the future, CO2 management.

General concept
According to research [1,2], accurate wind generation forecasting requires a model that combines the NWP with wind farms' geographical and technical data. In the proposed framework, the NWP acquired from a remote service provides the weather parameters influencing wind generation, such as wind speed at 90 m, wind gust at 90 m, air temperature, atmospheric pressure, and wind direction. Then, the wind speed and gust from the NWP are combined with the farms' coordinates and the wind turbines' (WTs') technical data to estimate the upper and lower bounds of expected wind generation. This first stage of data preparation for large wind power prediction is presented in Figure 1. It is worth noting that the wind generation model created at this stage, developed from the wind turbines' technical data and power curves merged with their locations, acts as an ideal model. In natural conditions, however, it is characterized by significant inaccuracy. As a result, it can only calculate the minimal and maximal wind generation potential in the area of interest, in the presented case in Poland. The model uncertainty comes mainly from the fact that it calculates the generated power purely from the wind turbines' power curves and the predicted wind conditions in their locations, according to Equations 1 and 2, respectively, where: N is the number of wind turbines, v_w is the wind speed at 90 meters above ground level (m/s), v_g is the wind speed during gusts (m/s), f_i is the generation curve of the i-th wind turbine as a function of wind speed v_w or v_g at the j-th time, and j is the NWP wind forecast sample number.
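Equations 1 and 2 themselves are not reproduced in this text. A form consistent with the symbol definitions given for them, offered only as a sketch, sums each turbine's power curve evaluated at the forecast wind speed and at the forecast gust speed. The labels P_w and P_g are assumed names rather than the Authors' notation, and v_w and v_g are understood as the forecast values at the location of the i-th turbine.

```latex
% Sketch only: P_w and P_g are assumed labels for the two bounding estimates.
\begin{align}
P_{w}(j) &= \sum_{i=1}^{N} f_i\bigl(v_w(j)\bigr), \tag{1}\\
P_{g}(j) &= \sum_{i=1}^{N} f_i\bigl(v_g(j)\bigr). \tag{2}
\end{align}
```

Under this reading, the two sums evaluated for every forecast sample j give the two bounding curves of expected wind generation.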
The operational code for grid-connected wind farms requires that wind farm owners commit their power and energy production over a time horizon based on the NWP, according to grid limitations set by the TSO. In such a case, the prediction of large wind generation can be made more accurate using ML techniques. The results of the augmented raw ML model used for wind generation forecasting, based on the NWP combined with turbine locations and their power curves, are compared with the values measured by the TSO in the National Power System in Figure 2.
Figure 2 presents a chosen period, with the NWP acquired from June 2020 to July 2021, together with the estimated upper and lower bounds of large wind generation, calculated based on the wind at 90 m and wind gust at 90 m, respectively. It is worth noting that the measured values of wind generation stay between the upper and lower bounds almost all the time. In addition, a varying moment of inertia can be observed as a function of the generation level. To improve the accuracy of long-term large wind generation forecasts, the Authors propose a method named ML Model with Adaptive-Ensemble Architecture. Figure 3 presents the measured wind generation in Poland as a function of the predictions of non-adaptive models; the conclusion is that it is hard to predict wind generation accurately using nonlinear but non-adaptive models.
The analyzed techniques typically combine different ML models in parallel in the first layer. In the second layer, their outputs are aggregated using varying structures of ML models and fuzzy or statistical methods. In [20], a promising approach was also presented that uses LSTM with an aggregator in the form of the Choquet integral or, for example, a Dempster-Shafer based function [21]. An alternative ML forecasting model, proposed for short-time wind power forecasting, can be found in [22].
In general, parallel ML models are used to reflect different features of the anticipated time series over varying time horizons. Thus, different aggregation methods result in different final forecasts and overall model accuracy. Nevertheless, the typical forecasting approach assumes that an architecture, once prepared and trained, is used for all future predictions. Such a model can be adapted or retrained, but the architecture remains static. Based on the above considerations, such a forecasting model is valid only in a particular time horizon, since the static or, on the same level, quasi-dynamic characteristics of large wind generation can be non-stationary and may change over time. These non-stationary features can be affected by many factors, such as slow aging or minor damage of the wind turbines' blades and rotors, and changes in terrain roughness or its growth during the year. Slow changes are challenging to identify in large wind generation, e.g., in a whole country. In such cases, a forecasting model, once trained, can lose its prediction accuracy over time, and its validity can vary during the year. Therefore, to avoid accuracy degradation and keep forecasting errors acceptable at all times, an architecture consisting of a few ML models can be refreshed or rearranged. In the method proposed in this paper, the component models and the model architecture can be changed based on the measured accuracy of the component models. Moreover, a dynamic complex model structure is proposed in Figure 4. An example of the resulting architecture is shown in Figure 5, and the automated adaptive-ensemble algorithm is described in Figure 6.

Automated adaptive-ensemble model architecture
The proposed automated adaptive-ensemble architecture model allows for accurate 60-hour wind generation prediction using the NWP. As presented in Figure 6, in the first stage the models are trained separately, with 70%, 15%, and 15% of the data used for training, validation, and testing, respectively. At this stage, the architecture of every model is also tuned using the Bayesian search method [23,24], finding the number of layers, neurons, and activation functions that give the best performance and generalization. All optimizations were performed using the Optuna Python package [25]. According to Figure 6, if the back-scored MAPE over the last 6 hours is below 5%, the current model is assumed to be accurate, so it is kept without any changes for future predictions. If 5% < MAPE < 10%, the model is retrained on the current architecture, and if MAPE > 10%, a new architecture is built (the model is optimized again). In the model optimization procedure, all unit models are trained along with the optimization of their structure, and every component is trained and validated using the newest data included in the training/validation dataset (in the presented research, the datasets covered 12 months, sampled at 15-minute intervals). In the first step of the optimization procedure, all models are retrained, and their accuracy is measured. Then, the four best models are taken into the first layer, where all four are connected in parallel. In the second stage, the outputs of the first-layer models are used as training data to obtain an aggregator module. All available component models are trained and optimized in the second stage, focusing on their structure. Finally, only the single best-performing model is taken as the second-layer aggregator.
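The decision rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Authors' implementation: the names `Action`, `mape`, `choose_action`, and `select_first_layer` are hypothetical, the handling of MAPE values exactly equal to 5% or 10% (treated here as "retrain") is an assumption because the text leaves those boundaries open, and the component-model names in the usage note are placeholders.

```python
from enum import Enum


class Action(Enum):
    KEEP = "keep current model"
    RETRAIN = "retrain on current architecture"
    REOPTIMIZE = "optimize a new architecture"


def mape(measured, predicted):
    """Mean absolute percentage error in percent.

    Samples with a zero measurement are skipped to avoid division by zero
    (an assumption; the source does not say how such samples are handled).
    """
    pairs = [(y, x) for y, x in zip(measured, predicted) if y != 0]
    if not pairs:
        raise ValueError("no non-zero measurements to score against")
    return 100.0 * sum(abs((y - x) / y) for y, x in pairs) / len(pairs)


def choose_action(back_scored_mape, keep_below=5.0, reoptimize_above=10.0):
    """Map the back-scored MAPE of the last 6 hours to an adaptation step."""
    if back_scored_mape < keep_below:
        return Action.KEEP
    if back_scored_mape <= reoptimize_above:
        # Boundary values (exactly 5% or 10%) fall here by assumption.
        return Action.RETRAIN
    return Action.REOPTIMIZE


def select_first_layer(validation_errors, n_models=4):
    """Return the names of the n_models with the lowest validation error."""
    ranked = sorted(validation_errors, key=validation_errors.get)
    return ranked[:n_models]
```

For example, `choose_action(mape(measured, predicted))` would be evaluated with every new NWP, and `select_first_layer({"lstm": 4.1, "gru": 3.9, "cnn": 6.0, "mlp": 5.2, "svr": 7.5})` returns the four lowest-error candidates for the parallel first layer.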

Results of operation of adaptive-ensemble ML
This section presents the operation of the Adaptive Ensemble Architecture Model and the forecasting results achieved when it is applied to large wind generation prediction in Poland. As stated before, the acquisition of the NWP began in June 2020. However, the regular operation of the presented method started on the 1st [...] Table 1, Table 2, and Table 3, respectively.

Accuracy analysis of the developed adaptive-ensemble architecture method
To compare results and prove the applicability of the presented adaptive-ensemble architecture method (AEAM), the base model [...] where: y_i is the wind generation measured by the TSO, x_i is the predicted wind generation, n = 240 is the number of samples in 60 hours, x̄ is the mean error, σ is the standard deviation, and z is the confidence interval coefficient chosen according to Table 4.
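The accuracy measures built from these symbols can be illustrated with a short Python sketch. It is not the Authors' formulation, which is not reproduced in this text. It assumes that the error is defined as measured minus predicted (y_i - x_i), that σ is the sample standard deviation, and that the confidence interval has the common form x̄ ± z·σ/√n. The function name `error_summary` and the default z = 1.96 (the usual coefficient for a 95% interval) are assumptions; Table 4 may prescribe other values.

```python
import math
import statistics


def error_summary(measured, predicted, z=1.96):
    """Summarize forecast errors over one forecast window.

    measured  -- y_i, wind generation measured by the TSO
    predicted -- x_i, predicted wind generation
    z         -- confidence interval coefficient (assumed default: 1.96)
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    errors = [y - x for y, x in zip(measured, predicted)]  # assumed sign convention
    n = len(errors)
    if n < 2:
        raise ValueError("at least two samples are needed for a standard deviation")

    mean_error = statistics.fmean(errors)
    mean_abs_error = statistics.fmean([abs(e) for e in errors])
    sigma = statistics.stdev(errors)  # sample standard deviation
    half_width = z * sigma / math.sqrt(n)

    return {
        "n": n,
        "mean_error": mean_error,
        "mean_abs_error": mean_abs_error,
        "sigma": sigma,
        "ci_lower": mean_error - half_width,
        "ci_upper": mean_error + half_width,
    }
```

For a 60-hour forecast sampled every 15 minutes, both input sequences would contain n = 240 values.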
The results achieved using the adaptive-ensemble model method are presented in Figure 10 (error of wind power prediction) and Figure 11 (error in energy generation prediction).

CONCLUSIONS
This paper presents a novel method, the Adaptive Ensemble-Model Method for 60-hour wind power prediction in Poland, together with its performance analysis. The final architecture used for prediction is optimized based on the most recent data. The accuracy of the proposed method has been calculated by comparison with data from the National Power System of Poland. The prediction accuracy was evaluated over one month of continuous operation of the presented method: at the 60-hour horizon, the mean error in energy production was 4.02%, and the mean error in generated wind power was 22.16 MW. It is worth noting that the presented method tends to improve its prediction accuracy by searching for the best architecture in consecutive iterations.

Acknowledgments
The Authors would like to thank the Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, for supplying the Numerical Weather Predictions used in this work.