Uncertainty-Aware Forecasting of Renewable Energy Sources

Smart grid systems are designed to enable the efficient capture and intelligent distribution of electricity across a distributed set of utilities. They are an essential component in the use of increasingly important renewable energy sources, where it is vital to forecast the levels of energy being fed into and drawn from the grid. However, because of the high levels of uncertainty affecting real-world environments, accurate forecasting is extremely challenging, for example for wind power generation, which is directly dependent on meteorological parameters and climatic conditions. Fuzzy logic systems are frequently used in control systems to leverage their capacity for handling varying levels of uncertainty. In most cases, while the uncertainty affecting the systems is captured in fuzzy sets (FSs), the final output of such systems is reduced to a crisp number (e.g. a control output). This reduction, while providing an efficient pathway to generating a specific control output, implies substantial information loss, as the uncertainty information captured in the FS outputs of these systems is effectively discarded. In this paper, we explore the potential of Mamdani fuzzy logic system based forecasting to generate not only a numeric forecast of the energy generated, but also uncertainty intervals around said forecast indicating the level of uncertainty associated with the prediction. The proposed model is explored using both synthetic and smart-grid specific real-world (wind power) time series datasets. The results of the study indicate that utilising the 'complete' FS output can provide valuable additional information on the reliability of the forecast at virtually no extra computational cost. At a general level, the approach indicates strong potential for leveraging the uncertainty information in fuzzy system outputs, which is commonly discarded, in real-world applications.


I. INTRODUCTION
Smart grid systems are designed to enable the efficient capture and intelligent distribution of electricity across a distributed set of utilities. Increasingly, renewable energy is taking a more prominent role in smart grid applications, due to its advantages [1] over fossil fuels. Wind power is frequently one of the most cost-effective resources among renewable energy technologies [2], already delivering substantial energy through wind parks across the world. While wind power and similar renewable energy resources such as solar have substantial benefits, their use also poses new challenges to smart grid infrastructure. Specifically, as these sources are weather dependent, maintaining a stable electricity grid requires accurate forecasting of the energy they produce. Such forecasts are vital for managing the production level and power output of other energy sources (such as pumped water storage, nuclear and fossil fuel), ensuring that any shortfall in, for example, wind energy production is matched by other sources.
Real-world environments are exposed to many different sources of high levels of uncertainty, e.g. climatic conditions. These uncertainty sources have a detrimental effect on forecasting. In the literature, many different methods have been developed or improved to handle the uncertainty in real-world settings and enable increasingly accurate energy time series forecasting [3]-[5]. Beyond the forecasting of discrete time series, recent approaches have also targeted the forecasting of what are effectively interval-valued time series, providing a confidence interval for each step of the forecast, rather than only a discrete value. The key rationale here is that in practice, a numeric forecast is of limited value: what is really needed is information on the expected minimum and maximum power output which will arise from renewable energy sources, to inform the production level of alternative sources (i.e. produce enough to match the minimum expected renewable power output, but definitely not more than is needed).
Limited research has addressed the prediction of intervals rather than crisp values. In [6], one of the key foci was the forecast interval coverage, as well as the point forecast accuracy, across three models: Forecast Pro, ARIMA and exponential smoothing based algorithms. In [7], a new end-to-end Bayesian neural network architecture is utilised to provide more accurate time series predictions as well as uncertainty estimates for the predictions. In [8], the results of ARIMA forecasting models are converted to FSs and an alpha-cut is applied to the generated FSs to obtain forecasting intervals. In [9], Takagi-Sugeno (T-S) fuzzy models are utilised and, following the principles in [10], upper and lower prediction interval bounds, based on a given percentage value, are provided by utilising the covariance of the data. Thereafter, a new prediction interval modelling methodology based on fuzzy numbers is proposed in [11], which generates the size of the prediction intervals based on a given percentage value. Fuzzy Set (FS) theory was introduced by Zadeh [12] and is applied in Fuzzy Logic Systems (FLSs), which are considered explainable and robust systems for capturing and handling uncertainty in decision making applications. In the context of FLSs, various approaches have also been put forward to improve time series forecasting applications [13]-[18].
In the conventional Mamdani fuzzy model [19], the output FSs are defuzzified into a crisp value through a defuzzification process (see 'x', marking the defuzzified centroid in Fig. 1). This process of acquiring a crisp value from the FS-valued output discards substantial information on the uncertainty captured in the output FS of the model. In this paper, we explore a different approach designed to minimise this information loss by not only reducing the FS-valued output to a numeric value used for forecasting, but by also generating an uncertainty interval around said forecast to provide information akin to an associated level of confidence (see Fig. 1).
The overall approach to generate uncertainty intervals is based on the idea of using α-level cuts of the output FSs to generate interval outputs (illustrated in Fig. 1) for each individual forecasting value. The resulting predictions incur a very minimal additional computational cost, as the output FSs are commonly generated as-standard in Mamdani fuzzy logic systems.
This paper puts forward a first exploration of this approach, highlighting its potential and articulating outstanding questions, including the relationship between different levels of α and traditional confidence levels, and the general advances required to optimise fuzzy systems for generating meaningful interval-valued outputs. In the experiments conducted in this paper, a synthetic chaotic Mackey-Glass time series [20] and a real-world wind speed dataset are used to demonstrate the proposed approach and conduct a preliminary evaluation.
The structure of this paper is as follows: Section II provides brief background information about Smart Grid, Fuzzy Sets, α-cuts and Mamdani Inference model. Section III gives the proposed method to provide uncertainty intervals around the forecasting points. In Section IV, the experimental setup and associated results of the proposed method are provided with the discussion. Lastly, in Section V, the conclusions of the current work with possible future work directions are given.

II. BACKGROUND
A. Energy Management System in Smart Grid
The Energy Management System (EMS) in smart grid systems is designed to provide continuous operation under variable generation and load. EMSs can be structurally categorised under two main approaches: centralised and distributed. In a centralised design, the dispatch of units is determined through an optimisation procedure. For this optimisation to be effective, the relevant information is provided to the system. Generally, this information describes each generation unit and load (e.g., cost functions, technical characteristics/limitations, network parameters, and modes of operation), and includes information from forecasting systems (e.g., local load, wind speed, and solar irradiance).

B. Fuzzy Sets
Fuzzy Set (FS) theory was introduced by Zadeh [12], who states the following definition: "A fuzzy set is a class with a continuum of membership grades." An FS I is thus formed in a universe of discourse X by a membership function (MF) that associates with each element x ∈ X a membership grade μ_I(x), which takes values in the range [0, 1]. The definition is as follows:
$$I = \{(x, \mu_I(x)) \mid x \in X\}, \qquad \mu_I : X \rightarrow [0, 1]$$
where μ_I(x) is the membership degree of x in the FS I. FSs can be convex or non-convex, and normal or sub-normal. As an example, a non-singleton Gaussian input FS is illustrated in Fig. 2 and formulated in (3):
$$\mu_I(x) = \exp\left(-\frac{1}{2}\left(\frac{x - \bar{x}}{\sigma}\right)^{2}\right) \tag{3}$$
where $\bar{x}$ is the mean and σ is the standard deviation of the FS.
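As a minimal illustration (not part of the original experiments), the Gaussian MF described above can be sketched in Python. The function name `gaussian_mf` and the parameter names `mean` and `sigma` are our own assumptions, and the form used is the standard Gaussian MF.

```python
import math


def gaussian_mf(x: float, mean: float, sigma: float) -> float:
    """Membership grade of x in a Gaussian FS with the given mean and standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# The grade is 1 at the mean and decays symmetrically on either side of it.
for x in (3.0, 4.0, 5.0, 6.0):
    print(x, round(gaussian_mf(x, mean=5.0, sigma=1.0), 3))
```

A larger `sigma` widens the set, i.e. it models more uncertainty around the measured input value.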

C. Alpha Cut
The general idea of the alpha cut (α-cut) is to decompose fuzzy sets into a collection of crisp sets related together via the α levels [21], [22]. The α level is defined in [0, 1], the range of the membership degrees, and given a fuzzy set A in a universe of discourse X, the α-cut is usually defined as follows:
$$A_\alpha = \{x \in X \mid \mu_A(x) \geq \alpha\} \tag{4}$$
An illustration of an α-cut can be seen in Fig. 1.
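For a discretised FS, the α-cut can be computed by a single pass over the samples. The following is a minimal sketch under our own assumptions: the FS is given as sorted sample points `xs` with grades `mu`, and the helper name `alpha_cut` is hypothetical. It returns a list of intervals, because a non-convex set can produce more than one.

```python
from typing import List, Tuple


def alpha_cut(xs: List[float], mu: List[float], alpha: float) -> List[Tuple[float, float]]:
    """Return the alpha-cut of a sampled FS as a list of closed intervals.

    Consecutive samples with mu >= alpha are merged into one interval, so a
    non-convex set may yield several intervals and a set whose height is
    below alpha yields an empty list.
    """
    intervals: List[Tuple[float, float]] = []
    start = None
    end = None
    for x, m in zip(xs, mu):
        if m >= alpha:
            if start is None:
                start = x
            end = x
        elif start is not None:
            intervals.append((start, end))
            start = None
    if start is not None:
        intervals.append((start, end))
    return intervals


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mu = [0.0, 0.5, 1.0, 0.5, 0.2, 0.6, 0.1]   # non-convex example set
print(alpha_cut(xs, mu, 0.4))   # [(1.0, 3.0), (5.0, 5.0)]
print(alpha_cut(xs, mu, 1.1))   # []
```

The resolution of the intervals is limited by the sampling of the universe of discourse; interpolating between samples would be a natural refinement.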

D. Mamdani FLS
Traditional Mamdani [19] FLSs comprise four main steps, which are illustrated in Fig. 3. In the rule base step, different approaches can be used to create the rule set of the model. In the fuzzification step, crisp input values are transformed into FSs, which can take various shapes (convex or non-convex). In the inference engine step, the generated input FSs are processed over the rules using the selected operators and the output FS is formed. In the defuzzification step, the obtained output FS is reduced to a crisp value by using a selected defuzzification technique. The details of these procedures can be found in [23].
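The four steps can be sketched for a toy single-input system as follows. This is an illustrative sketch only: the triangular MFs, the three rules, the sampled output universe and all function names are our own assumptions, with singleton fuzzification, min for rule firing and implication, max for aggregation, and centroid defuzzification.

```python
from typing import Callable, List, Optional, Sequence, Tuple

MF = Callable[[float], float]
Rule = Tuple[Sequence[MF], MF]  # (antecedent MFs, consequent MF)


def tri(a: float, b: float, c: float) -> MF:
    """Triangular membership function with feet at a and c and peak at b."""
    def mf(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mf


def infer(inputs: Sequence[float], rules: Sequence[Rule], ys: Sequence[float]) -> List[float]:
    """Sampled output FS for crisp (singleton) inputs."""
    out = [0.0] * len(ys)
    for antecedents, consequent in rules:
        # Firing strength: min over the antecedent grades of this rule.
        firing = min(mf(x) for mf, x in zip(antecedents, inputs))
        for i, y in enumerate(ys):
            # Clip the consequent at the firing strength, aggregate with max.
            out[i] = max(out[i], min(firing, consequent(y)))
    return out


def centroid(ys: Sequence[float], mu: Sequence[float]) -> Optional[float]:
    """Centroid defuzzification; None if the output FS is empty."""
    total = sum(mu)
    return sum(y * m for y, m in zip(ys, mu)) / total if total > 0 else None


low, mid, high = tri(-5, 0, 5), tri(0, 5, 10), tri(5, 10, 15)
rules = [([low], low), ([mid], mid), ([high], high)]   # hypothetical rule base
ys = [i * 0.5 for i in range(21)]                      # output universe 0, 0.5, ..., 10
output_fs = infer([2.0], rules, ys)
crisp = centroid(ys, output_fs)
print(max(output_fs), crisp)
```

Note that `output_fs` is available before `centroid` is called; it is this intermediate result that is normally discarded once the crisp value has been computed.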

III. METHODOLOGY
In the literature, there are many different time series forecasting implementations which utilise various models and techniques [6], [8], [9], [11], [13]-[18]. However, the predicted values are generally constrained to crisp numbers, whereas uncertainty is likely to exist with respect to the predicted values. Therefore, in this paper, along with the predicted crisp values, we focus on the generation of an interval capturing the uncertainty level associated with the prediction, akin to a confidence interval. Alpha cuts on the output FSs of a Mamdani [19] fuzzy logic system (commonly defuzzified to a crisp number to generate a discrete output) are used to provide this interval using the following 4-step process:
• Step 1 Designing the Mamdani FLS: A Mamdani [19] FLS is designed to conduct time series prediction. Standard approaches to FLS time series prediction are applied, including input FS design [16]-[18], rule set generation [24], etc. We note that in future the design process may be tailored specifically to optimally generating interval-valued predictions, but we do not address this in this pilot paper.
• Step 2 Generating Output FSs: After constructing the standard Mamdani FLS, the inference over the input FSs and the rules is carried out with conventional operators [23]. By applying the selected operators in the inference step, the output FSs of the Mamdani FLS are generated.
• Step 3 Defining alpha-cut level: The output FSs of Mamdani FLSs can be complex, including non-normal and non-convex FSs, and there is substantial scope for research on how to select an appropriate level of α for a given application. In general, the higher the selected value of α, the narrower the output interval (Fig. 5a). However, the height of each output FS may differ. Thus, some α levels (those which are greater than the height of the output FS) may not result in any uncertainty interval output (the α-cut is an empty set), as shown in Fig. 5b. In addition, as illustrated in Fig. 5c, some α levels may lead to the traditional centroid defuzzification result (marked as a blue x) falling outside of the generated uncertainty interval. The prospect of addressing non-convexity/non-normality and selecting a particular alpha-cut level serves as an incentive for future research. While a detailed discussion of this α level selection is outside the scope of this paper, we note as a possible research direction that one direct approach would be the application of a multi (two) objective optimisation technique, optimising the FLS toward prediction quality and maximum α level (minimum uncertainty interval). For simplicity, in this paper, we use α = 0.4 throughout.
• Step 4 Generating the uncertainty interval: The α-cut of the output FS at the selected α level is used as the uncertainty interval of the prediction, whereas conventionally only the output set is defuzzified and a crisp value is calculated.
The general idea of using output FSs is applicable in many different areas. In this paper, we proceed to develop one specific instance of the general framework for time series forecasting, as shown in Fig. 4. In the first step, the Mamdani FLS is built and the inputs x_1 ... x_5 are processed by the model rules conventionally. In the second step, the model output (O) is obtained and retained. In the next step, the α level is selected and the uncertainty bounds ([μ]_α) are determined. In the last step, the uncertainty interval of the prediction (x_6) is visualised (interval in grayscale).
Overall, employing the output FSs to derive uncertainty intervals provides more information regarding the confidence of the prediction at very minimal additional computational cost, as the output FSs are generated as standard in Mamdani FLSs.
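The interval generation can be sketched as follows for a sampled output FS. This is a hedged sketch rather than the implementation used in the experiments: the helper name `forecast_with_interval` is hypothetical, and we assume the interval is taken as the minimum and maximum of the α-cut, which for a non-convex set spans any gaps in the cut.

```python
from typing import Optional, Sequence, Tuple


def forecast_with_interval(
    ys: Sequence[float], mu: Sequence[float], alpha: float
) -> Tuple[Optional[float], Optional[Tuple[float, float]]]:
    """Return (centroid, alpha-cut bounds) for a sampled output FS.

    The bounds are None when alpha exceeds the height of the output FS.
    """
    members = [y for y, m in zip(ys, mu) if m >= alpha]
    interval = (min(members), max(members)) if members else None
    total = sum(mu)
    crisp = sum(y * m for y, m in zip(ys, mu)) / total if total > 0 else None
    return crisp, interval


ys = [float(i) for i in range(9)]

# Sub-normal output FS with height 0.6: alpha = 0.4 gives an interval,
# alpha = 0.7 gives none.
mu_a = [0.0, 0.3, 0.6, 0.6, 0.3, 0.1, 0.0, 0.0, 0.0]
print(forecast_with_interval(ys, mu_a, 0.4))
print(forecast_with_interval(ys, mu_a, 0.7))

# Non-convex output FS: at alpha = 0.8 the centroid lies outside the interval.
mu_b = [0.9, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5]
print(forecast_with_interval(ys, mu_b, 0.8))
```

Both quantities are derived from the same output FS, so the interval adds only a scan over the samples to the cost of the conventional defuzzification.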

IV. EXPERIMENT AND RESULT
In the experiments of this study, a synthetic chaotic Mackey-Glass (MG) time series [20] and a real-world wind speed dataset are used to implement the forecasting experiments. The wind speed time series dataset was collected with a sampling time [...]. Considering the varying uncertain circumstances in the real world and the chaotic behaviour of the MG time series respectively, an accurate crisp value prediction may not be possible. Therefore, the output sets are used to provide uncertainty intervals around the prediction. By doing so, the level of uncertainty associated with the prediction can be captured, in turn providing valuable information, such as for the control of other power resources in the case of smart grid management for renewable energy resources.
For both time series, 70% is used to train the model and 30% is used for testing. In the experiments, one-step-ahead (15 minutes in the case of the wind dataset) forecasting is implemented. In the rule generation phase of the Mamdani [19] fuzzy model, the commonly used one-pass Wang-Mendel method [24] is implemented as follows:
• The domain interval of the training set [x_min, x_max] is defined and evenly split into 2L−1 regions, where L is defined as 6 to obtain 11 antecedent FSs (A_1, A_2, ..., A_11). The generated FSs can be seen in Fig. 6.
• Nine past values are used as inputs and the following (10th) value is predicted, i.e. it is the output. Examples of the input-output pairs can be seen in (5):
$$(x_1, \ldots, x_9;\ y_1 = x_{10}),\quad (x_2, \ldots, x_{10};\ y_2 = x_{11}),\quad \ldots,\quad (x_{n-9}, \ldots, x_{n-1};\ y_N = x_n) \tag{5}$$
where n is the number of values in the training set and N is the number of input-output pairs.
• After constructing the antecedent FSs and input-output pairs, the inputs are assigned to the corresponding antecedent FSs. For the consequent FSs, the same 11 FSs are used and the outputs (y_i) are assigned to the corresponding FSs as well.
• Thereafter, a rule reduction procedure is implemented on the conflicting rules. For details, please see [24].
During prediction, input values are fuzzified to singleton FSs and they are processed with respect to the rules, where the min and max operators are used for the t-norm and t-conorm respectively. As noted above, while in standard Mamdani FLSs the final output FS is defuzzified to a crisp number, in the approach considered in this paper, an α-cut at a prespecified α level is used as the final prediction output. In the experiments, we define the α level to be 0.4; however, we note that in practice, different α levels can be investigated as well.
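The rule generation steps above can be sketched as follows. This is an illustrative sketch under several assumptions of our own: evenly spaced triangular FSs stand in for the FSs of Fig. 6, conflicts are resolved by keeping the rule with the highest degree (product of grades, as in the Wang-Mendel method [24]), and all function names and the toy series are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def make_centres(x_min: float, x_max: float, n_sets: int = 11) -> List[float]:
    """Centres of n_sets evenly spaced FSs covering [x_min, x_max]."""
    step = (x_max - x_min) / (n_sets - 1)
    return [x_min + i * step for i in range(n_sets)]


def best_set(x: float, centres: Sequence[float]) -> Tuple[int, float]:
    """Index and grade of the triangular FS in which x has maximum membership."""
    width = centres[1] - centres[0]
    grades = [max(0.0, 1.0 - abs(x - c) / width) for c in centres]
    idx = max(range(len(centres)), key=grades.__getitem__)
    return idx, grades[idx]


def windows(series: Sequence[float], n_inputs: int = 9):
    """Sliding input-output pairs: n_inputs past values and the next value."""
    return [(series[i:i + n_inputs], series[i + n_inputs])
            for i in range(len(series) - n_inputs)]


def wang_mendel(series: Sequence[float], centres: Sequence[float],
                n_inputs: int = 9) -> Dict[Tuple[int, ...], int]:
    """One-pass rule generation: antecedent FS indices -> consequent FS index."""
    best: Dict[Tuple[int, ...], Tuple[int, float]] = {}
    for inputs, target in windows(series, n_inputs):
        degree = 1.0
        antecedent = []
        for x in inputs:
            idx, grade = best_set(x, centres)
            antecedent.append(idx)
            degree *= grade
        out_idx, grade = best_set(target, centres)
        degree *= grade
        key = tuple(antecedent)
        # Conflicting rules share an antecedent; keep the highest degree.
        if key not in best or degree > best[key][1]:
            best[key] = (out_idx, degree)
    return {key: value[0] for key, value in best.items()}


series = [float(i % 4) for i in range(40)]   # toy periodic series
split = int(0.7 * len(series))               # 70% training, 30% testing
train, test = series[:split], series[split:]
centres = make_centres(min(train), max(train))
rules = wang_mendel(train, centres)
print(len(windows(train)), len(rules))
```

For the toy periodic series, only four distinct antecedent patterns occur, so the 19 training pairs collapse into four rules.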
The performance of the proposed method is assessed on the basis of whether the actual time series values (the ground truth) lie within the prediction interval. In other words, we establish the 'coverage percentage' where perfect performance means that the intervals capture the actual time series for 100% of the predicted values. This approach to performance assessment is valuable, as it directly reflects the real-world requirement for a prediction method which provides accurate min-max bounds for the expected production of renewable energy to enable the smart grid to balance power production across the network. In other words, in this application, a less specific prediction (i.e. an interval rather than a crisp value), which is correct (i.e. which covers the actual wind level) is more useful than a more specific prediction (e.g. a crisp prediction), which is incorrect. Further, as part of the experiments, we establish the relationship between various levels of α and the coverage percentage. This analysis will be key in future work in supporting the establishment of what suitable levels of α are for specific applications.
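The coverage percentage can be computed as in the following sketch. The function name and the representation of an empty α-cut as `None` are our own assumptions; an empty interval is counted as not covering the ground truth.

```python
from typing import Optional, Sequence, Tuple

Interval = Optional[Tuple[float, float]]


def coverage_percentage(truth: Sequence[float], intervals: Sequence[Interval]) -> float:
    """Percentage of ground truth values lying within their prediction interval."""
    if not truth or len(truth) != len(intervals):
        raise ValueError("truth and intervals must be non-empty and of equal length")
    covered = sum(
        1 for y, iv in zip(truth, intervals)
        if iv is not None and iv[0] <= y <= iv[1]
    )
    return 100.0 * covered / len(truth)


truth = [1.0, 2.0, 3.0, 4.0]
intervals = [(0.5, 1.5), (2.5, 3.0), None, (3.0, 5.0)]
print(coverage_percentage(truth, intervals))   # 50.0
```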

A. Experiment 1 - Mackey-Glass Time Series Prediction
As part of the experiments, the commonly used chaotic Mackey-Glass time series is used to implement time series forecasting. In order to generate the respective datasets, initially 2000 samples (from t = −999 to t = 1000) are generated and, in order to avoid fluctuations in the initial part of the time series, only the last 1000 (from t = 1 to t = 1000) points are preserved for use in the experiments.
Specifically, the Mackey-Glass (MG) time series is generated by using the nonlinear time delay differential equation:
$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^{n}(t-\tau)} - b\,x(t) \tag{6}$$
where a, b and n are constant real numbers, t is the current time and τ is the delay time. For τ > 17, (6) is known to exhibit chaotic behaviour. In this paper, the values are set as τ = 30, a = 0.2 and b = 0.1.
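A minimal sketch of generating such a series is given below, using the standard form of the MG delay differential equation. It rests on assumptions that are not stated in the text: the exponent n = 10 (a commonly used value), a constant initial history of 1.2, and a simple Euler integration with unit step. The function name is hypothetical, and the original study may have used a different integration scheme.

```python
from typing import List


def mackey_glass(n_samples: int = 2000, tau: int = 30, a: float = 0.2,
                 b: float = 0.1, n: int = 10, x0: float = 1.2) -> List[float]:
    """Generate n_samples of the MG series by Euler integration with step 1."""
    history = [x0] * (tau + 1)          # assumed constant initial history
    for _ in range(n_samples):
        x_t = history[-1]               # x(t)
        x_tau = history[-1 - tau]       # x(t - tau)
        dx = a * x_tau / (1.0 + x_tau ** n) - b * x_t
        history.append(x_t + dx)
    return history[-n_samples:]


series = mackey_glass()
kept = series[-1000:]                    # discard the initial transient
split = int(0.7 * len(kept))
train, test = kept[:split], kept[split:]
print(len(kept), len(train), len(test))  # 1000 700 300
```

Discarding the first 1000 generated samples mirrors the procedure described above for avoiding fluctuations in the initial part of the series.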
After generating 1000 values of the MG time series as mentioned above, nine past points are utilised to make a one-step-ahead prediction and the Wang-Mendel method is implemented to generate rules from the first 70% of the time series dataset. Thereafter, the remaining 30% of the time series dataset is used for testing. Example prediction results can be seen in Fig. 7, where the proposed prediction results are illustrated as gray vertical intervals, the standard Mamdani centroid defuzzification results are marked as blue 'x's and the ground truth MG values are shown as black dashed lines.

B. Experiment 2 - Wind Speed Time Series Prediction
For Experiment 2, the same procedure as in the previous experiment is followed using the real-world wind speed time series dataset, which contains 11254 samples. Here, no initial data is omitted, as the problem of 'settling', which is relevant for the MG time series, is not relevant for the real-world dataset. As before, the first 70% is used for training and the remaining 30% is used for testing. Part of the prediction results can be seen in Fig. 8.

C. Results
As noted, the main measure of performance is the coverage of the interval-valued prediction with respect to the ground truth of the time series. For Experiment 1, coverage is 79%, i.e. 79% of the numeric ground truth samples are covered by the prediction interval. In Experiment 2, this percentage is substantially higher at 97.3%.
Further analysis examines the relationship between the chosen α level and the coverage; the results are reported in Figs. 9 and 10. As can be seen in Fig. 9, above a level of 0.3, the coverage declines rapidly and above 0.6, almost no ground truth is captured. Fig. 10 also indicates that above an α level of 0.5, there is a sharp reduction in coverage percentage. Conversely, the figures also highlight that at an α level of 0.2, we achieve close to 100% coverage. In other words, the FLS time series prediction with α = 0.2 provides extremely reliable forecasts expressed as the min-max bounds of the expected MG or wind speed values. On the other hand, for small values of α, the actual predictions are of low specificity (i.e. they result in large intervals), which can limit the utility of the prediction. In real-world applications, there will be a trade-off between the levels of reliability and specificity required for forecasts, but in key applications such as wind speed prediction, reliability is of primary concern (to maintain a stable grid).

D. Discussion
Two different time series prediction experiments are implemented using the proposed approach to generating interval-valued outputs with a standard Mamdani FLS. As shown in Figs. 7 and 8, the resulting predictions provide a direct measure of the uncertainty associated with each prediction, captured through the varying size of the intervals over time. In addition, we note that in cases where the actual (traditional) centroid-based prediction ('x' shape) is inaccurate, the corresponding intervals are wider, providing a useful indicator of the level of 'confidence' associated with the discrete prediction. The latter aspect will require further research but does provide an interesting glimpse of the value of preserving more information than a crisp output from the generally rich FS-valued output.
Fig. 9: Mackey-Glass time series prediction coverage percentage with respect to the different α-cut levels applied to the output FSs.
Finally, additional analysis is conducted on different α levels with respect to the coverage percentage of the proposed method. Figs. 9 and 10 show a clear declining trend above certain α levels in both datasets. This sharp drop can be explained by the fact that few output FSs have a height greater than the given α level. When the α level is greater than the height of an output FS, no uncertainty bounds are generated, which results in no coverage. Again, determining the appropriate level of α for a given setting will be a key aspect of future research.
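The effect of the output FS height on the α-cut can be illustrated with a short sketch. The sampled output FSs below are invented for illustration only and are not taken from the experiments; the function name is likewise hypothetical.

```python
from typing import Sequence


def nonempty_cut_fraction(output_sets: Sequence[Sequence[float]], alpha: float) -> float:
    """Fraction of sampled output FSs whose height reaches alpha.

    Only these sets produce a non-empty alpha-cut, so this fraction is an
    upper bound on the achievable coverage at the given alpha level.
    """
    if not output_sets:
        raise ValueError("at least one output FS is required")
    heights = [max(mu) for mu in output_sets]
    return sum(1 for h in heights if h >= alpha) / len(heights)


# Hypothetical membership grades of four output FSs (heights 0.3, 0.5, 0.9, 0.45).
output_sets = [
    [0.0, 0.3, 0.1],
    [0.2, 0.5, 0.4],
    [0.1, 0.9, 0.6],
    [0.45, 0.2, 0.0],
]
for alpha in (0.2, 0.4, 0.6, 1.0):
    print(alpha, nonempty_cut_fraction(output_sets, alpha))
```

As α rises past the heights of more and more output FSs, the fraction of predictions that produce any interval at all falls, which caps the coverage regardless of how well the remaining intervals are placed.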

V. CONCLUSION
In this paper, we explore an alternative approach to using Mamdani FLSs in order to generate time series forecasts which are tailored to the requirements of smart grid and similar applications. Specifically, the proposed approach is designed to preserve uncertainty information in the FLS output FS by generating an interval, rather than only a discrete prediction. Through this approach, the resulting FLSs not only generate predictions which can provide very high reliability with respect to capturing the actual (ground truth) value being predicted, but also provide a measure of uncertainty which can be used on its own or in association with a traditional (centroid-based) numeric forecast. The proposed method requires virtually no additional computational effort in comparison to traditional approaches, as it leverages information which is already captured within standard output FSs but which is commonly not used.
While this paper puts forward the conceptual idea of the approach and provides an initial empirical demonstration for a synthetic and a real-world smart-grid dataset in the context of renewable energy production, in future, we expect substantial research efforts targeting, in particular, the appropriate selection of specific α-cut levels, the relationship of the generated intervals to traditional confidence intervals, and the appropriate optimisation of FLSs designed to generate interval-valued, rather than crisp, outputs.