Neural Network based Weighting Factor Selection of MPC for Optimal Battery and Load Management in MEA

This paper presents a Neural Network (NN)-based weighting factor (WF) selection method for the multi-objective cost function in Model Predictive Control (MPC). MPC is adopted to schedule the loads and the charging/discharging of the battery on More-Electric Aircraft (MEA). The decisions made while the MPC is running rely on a cost function that combines different objectives through WFs. The final overall evaluation is performed after the whole operation, with full knowledge of what happened, by combining several objectives with weights chosen by the user. The WFs that allow the MPC to achieve the best overall result usually differ from the weights used in the final evaluation. An NN is trained to predict the effects of different combinations of WF values, which facilitates an optimisation that finds the minimum evaluation index, i.e. the most suitable WFs for the applied MPC.


I. INTRODUCTION
Loads are increasingly driven by electric power on More-Electric Aircraft (MEA) with the replacement of hydraulic and pneumatic power for higher efficiency [1]. Those responsible for flight safety are named critical loads and must be powered in all flight scenarios regardless of high power peaks or fault-causing emergencies [2]. Use of the Energy Storage System (ESS) and shedding of noncritical loads both help to maintain power for critical loads when power shortages occur. Intelligently combining the load shedding with the ESS capabilities is essential for safe on-board Electric Power System (EPS) operation [3].
A Model Predictive Control (MPC) framework is adopted in this paper for scheduling the battery charging/discharging and load shedding. This online optimisation-based control aims to keep the energy storage (ES) highly charged, within a defined target range, while also minimising load shedding and reducing switching, which improves device lifetimes and avoids unnecessary transients. Using a Mixed-Integer Linear Programming (MILP) formulation for MPC modelling, the cost function can combine different objectives linearly by means of Weighting Factors (WFs) [4]. The selection of suitable WFs has a vital impact on achieving the required control performance, since the objectives may conflict with each other. However, the most commonly used selection methodology relies on time-consuming simulations and a trial-and-error approach to test the effect of different WFs [5], [6]. A more efficient selection method is therefore required.
This paper proposes a Neural Network (NN)-based approach to select the WFs of MPC for optimal battery and load management in MEA. An MPC model for a main source - energy storage - load (MS-ES-L) system is first built and the ranges for the WFs in the cost function are given. A multi-objective evaluation index is then proposed, which quantifies the required control performance of the MPC. After that, a range of values for each WF is swept with large sample steps, and the system is simulated for each combination. The corresponding evaluation index values are collected as the training data for the NN. The training data is arranged into an input/output matrix in which every row corresponds to a combination of WF values and the associated index value. This matrix is used to train an NN, which serves as a fast surrogate model of the system and its evaluation, thus allowing a fast selection of the optimised WF combination with fine WF granularity in the given range. The simulation results obtained with the optimised WFs and with an empirical design that adopts the evaluation weights as WFs are compared to verify the effectiveness of the proposed method.
This paper is organised as follows. Section II explains the architecture of the studied MS-ES-L system and the MPC model of this system. Section III describes the evaluation index for each objective and proposes a multi-objective evaluation index with evaluation weights. The NN-based WF selection method is presented in Section IV. The MPC-based system is simulated with different WFs and the results are presented in Section V. The paper is concluded in Section VI.
This work is funded by the INNOVATIVE doctoral programme.

II. SYSTEM DESCRIPTION AND MPC MODEL
The studied MS-ES-L system is illustrated in Fig. 1. The MS and ES are connected to one low-voltage (LV) bus to supply the loads. Three types of loads are considered: critical loads, high priority noncritical loads and low priority noncritical loads. The control of the MS-ES-L system can be formulated as an MILP model. The system operation is represented as a group of linearised constraints in the MILP, including power balance, storage dynamics, state-of-charge (SOC) target range, battery charging/discharging modes, and power bounds. The cost function aims at minimising the total load shedding and the switching activities, and at maximising the stored battery energy. An MPC scheme is adopted on top of this model. At each time step, the controller obtains the system status and the load prediction over a prediction horizon to update the constraints in the MILP model. The model is then solved for the prediction horizon to obtain a sequence of control decisions for the discrete time instants k (k = 1, 2, …). Only the control decisions for the first instant, which consist of the input power from the MS and the load shedding, are applied to the system. The battery power is then determined accordingly.
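The receding-horizon principle described above can be illustrated with a minimal sketch. It is not the MILP model of this paper: the solver is replaced by a brute-force search over the on/off state of a single noncritical load, and the battery capacity, time step, MS power limit, load profile and WF values are all hypothetical. Only the SOC range [0.3, 0.9] is taken from the text.

```python
from itertools import product

LO, HI = 0.3, 0.9        # SOC target range quoted in the text
CAP, DT = 10.0, 1.0      # assumed battery capacity and time step (arbitrary units)
P_MS_MAX = 5.0           # assumed main-source power limit

def next_soc(soc, load):
    """Battery absorbs the gap between the MS limit and the load.
    Returns the next SOC (capped at HI), or None if it would fall below LO."""
    soc_new = min(HI, soc + (P_MS_MAX - load) * DT / CAP)
    return soc_new if soc_new >= LO - 1e-9 else None

def solve_horizon(soc, prev_x, crit, noncrit, w):
    """Brute-force stand-in for the MILP solve over one prediction horizon."""
    best_cost, best_seq = float("inf"), None
    for seq in product((0, 1), repeat=len(crit)):   # 1 = noncritical load connected
        s, last, cost = soc, prev_x, 0.0
        for x, c, n in zip(seq, crit, noncrit):
            s = next_soc(s, c + x * n)
            if s is None:          # SOC bound violated: sequence is infeasible
                break
            # shedding cost + switching cost + distance of SOC from HI
            cost += w["S"] * (1 - x) + w["Ss"] * abs(x - last) + w["SOC"] * (HI - s)
            last = x
        else:                      # loop finished without violating the bound
            if cost < best_cost:
                best_cost, best_seq = cost, seq
    return best_seq

def run_mpc(crit, noncrit, soc0, horizon, w):
    """Receding horizon: re-solve at every step, apply only the first decision."""
    soc, prev_x, decisions, socs = soc0, 1, [], []
    for k in range(len(crit)):
        seq = solve_horizon(soc, prev_x, crit[k:k + horizon], noncrit[k:k + horizon], w)
        x = seq[0]
        soc = next_soc(soc, crit[k] + x * noncrit[k])
        decisions.append(x)
        socs.append(soc)
        prev_x = x
    return decisions, socs

# Hypothetical load profile: critical load always below the MS limit,
# so shedding the noncritical load is always a feasible fallback.
crit = [2.0] * 8
noncrit = [2.0, 2.0, 5.0, 5.0, 5.0, 2.0, 2.0, 2.0]
decisions, socs = run_mpc(crit, noncrit, soc0=0.5, horizon=3,
                          w={"S": 5.0, "Ss": 1.0, "SOC": 1.0})
```

Changing the entries of `w` in this toy setting shifts the balance between shedding, switching and SOC, which is the sensitivity that the WF selection method addresses.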

A. Nomenclature
The parameters and decision variables used in the proposed formulation are described in TABLE 1, where K and M are the total numbers of time intervals for H and T. The following two subsections discuss the three objectives in the cost function and the four groups of constraints.

B. Cost functions
Three control objectives are considered in this work. The MPC controller first aims to minimise the total number and duration of noncritical load shedding events; in addition, the high priority load should be shed less than the low priority load. The cost function for this objective is given in (1).
Since frequent load connection and shedding leads to transients and instability problems in the system, a cost function is introduced in (2) to minimise the switching activities.
The ES supplies power to the loads when the input power from the MS is insufficient. However, the battery SOC must be kept within a target range [LO, HI] to improve the battery lifetime and the system safety. Moreover, a high SOC means that more energy can potentially be supplied. Hence the MPC controller has the objective of keeping the SOC as close as possible to its upper limit HI, which helps keep the battery SOC in the target range while storing more energy in the battery. The cost function for this objective is presented in (3).
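For orientation, the linear combination of the three objectives can be written compactly as below. This is only a sketch of the general form: J_S, J_Ss and J_SOC are placeholder symbols introduced here for the costs in (1), (2) and (3), whose exact expressions are those of the original equations. The definition of wSUM as the sum of the three WFs is an assumption, although it is consistent with the normalised ratios reported later in the paper.

```latex
J = w_{S}\,J_{S} + w_{Ss}\,J_{Ss} + w_{SOC}\,J_{SOC},
\qquad
w_{SUM} = w_{S} + w_{Ss} + w_{SOC}
```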

C. Constraints
1) Power balance constraints
For the bus, the sum of the power flowing into and out of it equals zero, assuming no losses within the bus. Equation (5) indicates that the power from the MS and the ES equals the power of the connected critical and noncritical loads. The battery power in (5) can be calculated as shown in (6).
The 23rd International Conference on Electrical Machines and Systems (ICEMS)

2) Battery dynamics
The battery SOC dynamics are calculated from the charging/discharging power in a time step using (7) [7].
The battery SOC varies from 0 to 1, where 1 indicates a fully charged battery and 0 a depleted one. In the aircraft, the SOC is preferably kept within a target range [LO, HI]. In this case, we select LO = 0.3 and HI = 0.9; the upper and lower bounds are therefore defined as in (8).
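A commonly used discrete-time form of such an SOC update, together with the target range, is sketched below. It is an assumed generic form rather than a reproduction of (7) and (8): P_b is taken as positive when charging, E_cap denotes the battery energy capacity, Δt the time step, and conversion losses are neglected.

```latex
SOC(k+1) = SOC(k) + \frac{P_{b}(k)\,\Delta t}{E_{cap}},
\qquad
LO \le SOC(k) \le HI, \quad LO = 0.3,\; HI = 0.9
```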

3) Charging/discharging mode constraints
The battery can be in either charging or discharging mode, subject to the maximum charging/discharging power.
By introducing an indicator for each mode, the selection of a single mode can be represented by the constraint in (9) [8]. In each mode, the battery charging/discharging power varies from 0 to its maximum, as expressed in (10) and (11).
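A standard way to express such mode selection in an MILP is sketched below. The symbols are assumptions introduced for illustration: δ_ch and δ_dis are binary mode indicators, and P_ch and P_dis are the charging and discharging powers with their respective maxima. The original constraint (9) may use an equality instead of the inequality shown here.

```latex
\begin{aligned}
&\delta_{ch}(k) + \delta_{dis}(k) \le 1, \qquad \delta_{ch}(k),\ \delta_{dis}(k) \in \{0, 1\} \\
&0 \le P_{ch}(k) \le \delta_{ch}(k)\,P_{ch}^{\max} \\
&0 \le P_{dis}(k) \le \delta_{dis}(k)\,P_{dis}^{\max}
\end{aligned}
```

With this form, a mode whose indicator is 0 is forced to zero power, so charging and discharging cannot occur in the same time step.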

4) Bounds of input power from MS side
The power obtained from the MS side should not exceed its maximum value, as expressed in (12).

III. EVALUATION WITH DIFFERENT WEIGHTS
As presented in Section II, the MPC cost function combines multiple objectives with WFs. Different combinations of the WFs lead to different energy management results. To evaluate the predictive control performance for a flight load profile, a set of evaluation indices is proposed so that the performance can be quantified and compared.
In this section, three evaluation indices are introduced first. The weighting factors are then sampled in a given range, and the resulting evaluation values are recorded.

A. Evaluation indices
Corresponding to the three objectives in Section II, three evaluation indices are defined. A smaller value of each index indicates better control performance.
The load shedding index g(S) calculates the ratio of shed load over the duration of the whole flight, as presented in (13).
Then, wSs and wSOC are sampled with large steps: a step size of 0.2 over the range [0.1, 5] and a step size of 3 over the range [5, 62]. For each weighting factor combination, the MPC model is simulated with a load profile of 240 load samples, and the evaluation values of each simulation are recorded.
Fig. 2 illustrates the three evaluation index values over the given ranges of wSs and wSOC. Fig. 2 (a) shows how the amount of load shedding g(S) varies with the sampled values of wSs and wSOC. When wSs ≤ 4 and wSOC ≤ 30 (wSs/wSUM ≤ 0.1025, wSOC/wSUM ≤ 0.7692), the system performs the least load shedding. When wSs ≤ 20 and wSOC increases from 30 to 60 (0.2353 ≤ wSs/wSUM ≤ 0.3636, 0.5455 ≤ wSOC/wSUM ≤ 0.7059), load shedding increases considerably, while g(SOC) in Fig. 2 (c) decreases to its minimum value in this range. Similarly, Fig. 2 (b) shows how the switching activity index g(Ss) changes with the sampled values of wSs and wSOC. When wSs ≤ 2 (wSs/wSUM ≤ 0.2816), the system performs the most on/off switching.
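The size of the coarse sweep and the normalised ratios quoted above can be checked with a short sketch. It assumes that the first range is sampled as 0.1, 0.3, ..., 4.9 and the second as 5, 8, ..., 62, and that wS is fixed at 5 (the value reported for the optimised WFs), which is consistent with the ratios in the text.

```python
# Coarse sampling grid described in the text (exact end-point handling is assumed).
fine_part = [round(0.1 + 0.2 * i, 1) for i in range(25)]   # 0.1, 0.3, ..., 4.9
coarse_part = [5 + 3 * i for i in range(20)]               # 5, 8, ..., 62
grid = fine_part + coarse_part

# One MPC simulation per (wSs, wSOC) pair.
n_runs = len(grid) ** 2

W_S = 5.0   # assumption: wS is held at 5 while wSs and wSOC are swept

def ratios(w_ss, w_soc, w_s=W_S):
    """Return (wSs/wSUM, wSOC/wSUM) rounded to four decimals."""
    total = w_s + w_ss + w_soc
    return round(w_ss / total, 4), round(w_soc / total, 4)

corner_low = ratios(20, 30)    # wSs = 20, wSOC = 30
corner_high = ratios(20, 60)   # wSs = 20, wSOC = 60
```

Under these assumptions the sweep contains 45 values per WF, i.e. 2025 MPC simulations, and the two corner points reproduce the bounds 0.3636/0.5455 and 0.2353/0.7059 given in the text.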

C. Multi-objective evaluation function
To select proper WFs for the MPC, a multi-objective evaluation function should be provided, which sets a selection criterion for the WFs of the objective function in (4). The multi-objective evaluation function is given in (16), where vS, vSs, and vSOC are the weights of the individual evaluation indices.
The trade-offs between each pair of evaluation indices are presented in Fig. 3, which shows the results of each execution (with different WFs). The range of the third index value is indicated by the colour of each point. Fig. 3 (a) indicates that the best achievable values of g(SOC) decrease as those of g(S) increase. Similar relationships are observed in Fig. 3 (b) for g(Ss) and g(S), and in Fig. 3 (c) for g(SOC) and g(Ss).
It is noted that the weights for combining multiple objectives in an evaluation function are usually determined by practical requirements, i.e. priorities of each objective. In this paper, rather than choosing arbitrary weights, we normalise their ranges, using the ranges of the best values shown in Fig. 3, to give the weights, so that varying each objective can then have a similar effect in MPC.
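One possible reading of this range-based normalisation is sketched below: each evaluation weight is taken inversely proportional to the range of its index, and the weights are scaled so that the SOC weight equals 1. The function names and the numerical ranges are hypothetical and serve only to illustrate the idea; the actual ranges are those read from Fig. 3.

```python
def range_normalised_weights(ranges):
    """Weights inversely proportional to each index range, scaled so that the SOC weight is 1."""
    inverse = {name: 1.0 / span for name, span in ranges.items()}
    reference = inverse["SOC"]
    return {name: value / reference for name, value in inverse.items()}

def g_obj(indices, weights):
    """Weighted sum of the evaluation indices g(S), g(Ss) and g(SOC)."""
    return sum(weights[name] * indices[name] for name in weights)

# Hypothetical index ranges (not taken from Fig. 3).
ranges = {"S": 0.25, "Ss": 0.5, "SOC": 1.0}
v = range_normalised_weights(ranges)   # {"S": 4.0, "Ss": 2.0, "SOC": 1.0}

# Hypothetical index values for one simulation run.
score = g_obj({"S": 0.1, "Ss": 0.2, "SOC": 0.3}, v)
```

An index with a narrow range receives a larger weight, so that a change across its full range contributes as much to the total as a change across the full range of any other index.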
The acceptable ranges for g(S), g(Ss) and g(SOC) can be obtained from Fig. 3.
Fig. 3. The relation between each pair of evaluation indices: (a) relation between g(S) and g(SOC), with g(Ss) in colour map; (b) relation between g(S) and g(Ss), with g(SOC) in colour map; (c) relation between g(Ss) and g(SOC), with g(S) in colour map.
Therefore, the multi-objective evaluation function in (16) can be rewritten as (20), and the results in Fig. 2 can be expressed using (20) as shown in Fig. 4. The optimal weighting factor combination is the one with the minimum g(Obj) value.

IV. NN-BASED WF SELECTION METHOD
As shown in Fig. 4, if the sample step for wSs and wSOC is small, the optimal weights can in theory be obtained directly from this heat map. However, small sample steps require the detailed system to run for tens or even hundreds of hours, which makes this method impractical.
To solve this, relatively large sample steps are used so that the model runs within a reasonable time, as presented in Section I. An NN is then trained on the collected data. The trained NN is subsequently used to exhaustively predict the g(Obj) values for small sample steps (e.g. 0.1 for each WF) in the given range, from which the optimised weighting factors are finally obtained.
Fig. 5 depicts the schematic of the feedforward NN used. A basic feedforward artificial NN (ANN) comprises an input layer, one or more hidden layers, and an output layer. The numbers of neurons in the input and output layers are determined by the sample design, while the number of neurons in the hidden layers can be changed. The NN in this study directly maps the two weighting factors in (4) to g(Obj). As mentioned, the NN is trained with the sample data obtained from the detailed MILP-MPC simulations. Then, based on fine-step sampling of the 2D input space, the variation of g(Obj) against the two factors can be obtained globally. Since the NN is a mathematical model with a simple structure, the generation of g(Obj) is very fast (less than 0.1 s for 0.36 million data points). Fig. 6 shows the global g(Obj) results based on the trained NN. The optimised WFs for the evaluation function in (20) are obtained as wS = 5, wSs = 34.7, wSOC = 56.8; correspondingly, wS/wSUM = 0.0518, wSs/wSUM = 0.3596, wSOC/wSUM = 0.5886. The predicted minimum value of g(Obj) is 1.76. The root-mean-square error (RMSE) between the raw dataset and the NN-predicted data is 0.0169.

V. SIMULATION RESULTS
In this section, the MPC with the optimised WFs is simulated to obtain the best g(Obj) in (20). The optimised WFs differ from the g(Obj) weights, which are adopted as the empirical WFs. Both sets of WFs are therefore compared in simulation. Fig. 7 shows the load connection results based on (a) the optimised WFs and (b) the empirical WFs (wS = 2.8, wSs = 1.68, wSOC = 1).
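The surrogate workflow of the NN-based selection (train on coarse samples, predict on a fine grid, take the minimiser) can be sketched as follows. This is a minimal illustration, not the network or data of this paper: the expensive MPC simulation is replaced by a hypothetical smooth function `true_index`, the inputs are scaled to [0, 1], and the network is a small one-hidden-layer tanh model trained by plain stochastic gradient descent with arbitrarily chosen hyperparameters.

```python
import math
import random

def train_mlp(samples, hidden=8, epochs=300, lr=0.03, seed=0):
    """Train a 2-input, 1-output tanh network with per-sample gradient descent."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x1, x2, y in data:
            h = [math.tanh(w1[j][0] * x1 + w1[j][1] * x2 + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            for j in range(hidden):
                # Gradient through tanh, computed before w2[j] is updated.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j][0] -= lr * grad_h * x1
                w1[j][1] -= lr * grad_h * x2
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x1, x2):
        h = [math.tanh(w1[j][0] * x1 + w1[j][1] * x2 + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2

    return predict

def true_index(a, b):
    """Hypothetical stand-in for one expensive MPC simulation (scaled inputs)."""
    return 1.8 + (a - 0.6) ** 2 + (b - 0.4) ** 2

# Coarse sweep: 11 x 11 "simulations" provide the training data.
coarse = [i / 10 for i in range(11)]
samples = [(a, b, true_index(a, b)) for a in coarse for b in coarse]
predict = train_mlp(samples)

# RMSE between the sampled data and the surrogate predictions.
rmse = math.sqrt(sum((predict(a, b) - y) ** 2 for a, b, y in samples) / len(samples))

# Fine sweep on the surrogate only: 101 x 101 predictions, then take the minimiser.
fine = [i / 100 for i in range(101)]
best_value, best_a, best_b = min((predict(a, b), a, b) for a in fine for b in fine)
```

The fine sweep calls only `predict`, so its cost is independent of how long one detailed simulation takes. The candidate found this way would still need to be confirmed by a detailed simulation, as is done in Section V.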
By adopting the optimised WFs, the low priority noncritical load is shed for a longer time to reduce the switching activities. The load shedding with the optimised WFs is 1.17 times that with the empirical ones, while the switching activities are reduced by a factor of 4.17.
The results in Fig. 7 and Fig. 8 are evaluated by the multi-objective evaluation function in (20). With the optimised WFs, g(Obj) = 1.776, whereas the empirical WFs give g(Obj) = 1.97, i.e. the optimised WFs selected in Section IV reduce the evaluation index by about 10%. Moreover, the minimum g(Obj) among the sampled WFs in Fig. 4 is 1.778. Hence, the WFs obtained from the NN perform slightly better than the best sampled WFs, which verifies the effectiveness of the proposed NN approach.
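The relative differences between the reported g(Obj) values can be worked out directly; the short sketch below uses only the four numbers given in the text (the variable names are introduced here for illustration).

```python
g_optimised = 1.776      # simulated g(Obj) with the NN-selected WFs
g_empirical = 1.97       # simulated g(Obj) with the empirical WFs
g_best_sampled = 1.778   # minimum g(Obj) among the coarse samples
g_nn_predicted = 1.76    # minimum g(Obj) predicted by the NN

# Relative differences in percent.
gain_vs_empirical = 100 * (g_empirical - g_optimised) / g_empirical
gain_vs_sampled = 100 * (g_best_sampled - g_optimised) / g_best_sampled
nn_gap = 100 * (g_optimised - g_nn_predicted) / g_optimised
```

The optimised WFs lower g(Obj) by about 9.8% relative to the empirical WFs and by about 0.1% relative to the best coarse sample, while the NN prediction of the minimum is about 0.9% below the value obtained in the detailed simulation.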

VI. CONCLUSIONS
This work presents a novel method for selecting the WFs in the multi-objective cost function of the MPC algorithm. An NN is used to predict the effects of different weights so that the user can more easily find appropriate weights. The NN is trained with sampled WFs and the corresponding evaluation values in a given range, using large step sizes. The NN model is then used to predict the evaluation value for WFs with finer granularity in order to find optimised WFs that give a lower evaluation index. The NN therefore saves the time otherwise needed to run the MPC model on a fine grid. The optimised WFs differ from the weights in the evaluation index, and the simulation results verify that the optimised WFs selected by the proposed method achieve better performance.