Covariance Local Search for Memetic Frameworks: A Fitness Landscape Analysis Approach

The design of each agent composing a Memetic Algorithm (MA) is a delicate task which often requires prior knowledge of the problem to be effective. This paper proposes a method to analyse one feature of the fitness landscape, that is the epistasis, with the aim of designing efficient local search algorithms for Memetic Frameworks. The proposed Analysis of Epistasis performs a sampling of points within the basin of attraction and builds a data set containing those candidate solutions whose objective function value falls below a threshold.The covariance matrix associated with this data set is then calculated. The eigenvectors of this covariance matrix are then computed and used as the reference system for the local search: a change of variables is performed and then the local search is performed on the new variables. The Analysis of Epistasis has been implemented on the three local search algorithms composing a popular MA called Multiple Trajectory Search (MTS). Numerical results show that the three modified local search algorithms outperform their original counterparts.


I. INTRODUCTION
Local search is a fundamental element of Memetic Computing. Within Memetic Algorithms (MAs), properly designed local search algorithms may have a major impact on the performance of the entire Memetic Framework [1].
Within the context of the algorithmic design, studies on Fitness Landscape Analysis (FLA) [2], [3] showed that testing and understanding the optimisation problem enable the design of powerful problem specific heuristics [4]. For example, the Fitness Cloud allows to estimate the potential of the search [5] while the negative slope [6] and hardness measure [7] estimates the difficulty of the problem. Some other examples of fundamental features of an optimisation problem are modality, epistasis, ruggedness and deceptiveness [8]. Although most of the work on FLA currently present in the literature tends to focus on the combinatorial domain, recent studies showed that FLA has great potential also on continuous problems, see [4], [8].
In Memetic Computing, FLA provides precious pieces of information to design global and local search algorithms employed within its framework [9]- [11]. For example, the balance between global and local search should be related to the modality of the problem [12]. The most studied feature in the continuous domain is perhaps the ruggedness, that is the number and location of local optima, see [13], [14]. A popular way to analyse the ruggedness of a problem consists of building up a random walk which, while moving according to a pattern, samples and saves the points of the landscape, see e.g. [8].
The present paper focuses on local search design based on the analysis of epistasis. Furthermore, this study refers to optimisation in the continuous domain. The concept of epistasis is borrowed from biology where it refers to the dependency between genes and phenotype. In Optimisation and more specifically in Genetic Algorithms (GAs) epistasis has been reinterpreted by means of an analogy: epistasis refers to the degree of dependency between genes in a chromosome and its objective function/fitness value [15].
For optimisation problems in general (when we do not necessarily have the GA metaphor of chromosome), epistasis is the degree of interdependency between variables with respect to the objective function value. In other words, a multi-variate optimisation problem is characterised by low epistasis if it can be decomposed into multiple independent problems with lower dimensionality while it is characterised by high epistasis if this decomposition is not possible. This property is also referred to as operational/non-linear separability and is broadly used to address large scale optimisation problems, see [16]- [18]. This paper proposes a method for handling highly epistatic local problems by means of FLA. The proposed FLA at first generates a data set of candidate solutions whose fitness value is below (minimisation) a certain threshold. This data set describes the geometry of the basin of attraction [19]. The covariance matrix of the points composing this set is calculated and its eigenvectors extracted. A linear transformation is performed: the directions of the eigenvectors are used as the new reference system. The local search is then designed to move alongside the new system of coordinates. In this paper, we demonstrate and test the proposed approach on the three local search algorithms composing an MA called Multiple Trajectory Search (MTS) proposed in [20].
The remainder of this paper is organised in the following way. Section II describes the proposed analysis of epistasis and consequent change of reference system. Section III shows how the analysis of epistasis is used to modify the local search algorithms of MTS. Section IV displays the numerical results of this study and demonstrates that local search designed after the analysis outperforms the original local search for the three algorithms composing MTS. Section V provides the conclusion of this study.

II. ANALYSIS OF EPISTASIS AND NEW REFERENCE SYSTEM
In order to clarify the notation used in this article, we will refer to the minimisation problem of an objective function f (x) in the continuous domain. The candidate solution x is a vector of n design variables in a hyper-cubical decision space D ⊂ R n : Each vector x ∈ D is expressed in the orthonormal reference system of its variables. In other words, any vector x can be interpreted as the linear combination of the orthonormal basis B e = {e 1 , e 2 , . . . , e n , } where e k is a vector of length n composed of all zeros and a 1 in the k th design variable, see [21].
Let us now consider that, at some point of the functioning of an MA, a basin of attraction has been detected and a local search must be activated, see [19], [22], [23].
The proposed analysis of epistasis consists of the following steps. A number of candidate solutions/points are sampled in the decision space D and their objective function values are calculated. The function values are compared with a threshold thr and those values that are below thr are saved in a data structure, while the others are discarded. The purpose of this step is to have a sample of points whose distribution describes the geometry of the problem.
The m vectors/candidate solutions whose objective function value is below the threshold thr . . , x m n ) are allocated in the data structure V.
These points can be interpreted as a multivariate statistical distribution characterised by a mean vector: x i n and a covariance matrix: Subsequently, the n eigenvectors of the matrix C are calculated by means of Cholesky Factorisation, see [21], [24]. These eigenvectors are the columns p i of a matrix P: The directions of the eigenvectors are then used by the algorithms to perform the search. Algorithm 1 displays the pseudocode of the Analysis of Epistasis.

Algorithm 1 Analysis of Epistasis
INPUT objective function f (x), decision space D, and parameters thr and Process the data structure V to calculate the mean vector µ and covariance matrix C Apply Cholesky Factorisation to extract the eigenvectors P = p 1 , p 2 , . . . , p n OUTPUT Matrix P From a theoretical standpoint, the analysis of epistasis is a procedure that returns a new reference system that is identified by the eigenvectors p k of the matrix C. From linear algebra, we know that a change of coordinates is the multiplication of a matrix (whose columns are vectors of the new reference system) by the vector in the old reference system [21]. Hence, to express x in the reference system identified by P, we calculate the vector x in the new system as x = Px. Equivalently, every point x in the new reference system can be interpreted as a linear combination of the vectors of the basis B P = {p 1 , p 2 , . . . , p n }.
In order to justify the proposed new system of coordinates, we have to consider the fundamental meaning of the covariance matrix of a multivariate distribution, see [25]. The diagonal elements of the matrix directly describe the geometry of the problem since they represent the extent of the distribution along a variable. A diagonal element much larger in value than that of other diagonal elements means that the distribution is stretched along a design variable.
The extra-diagonal elements represent the correlation between pairs of variables. A large value means high correlation while zero means no correlation. In our case, since the points are sampled below a threshold, the correlation is meant with respect to the objective function: zero means that the function can be decomposed over the variables while a large value means that this decomposition is not possible. In order to intuitively visualise this fact, the covariance matrix associated with a sphere or a non-rotated ellipsoid would be diagonal while that associated with a rotated ellipsoid would be full.
The proposed analysis of epistasis detects that basis of vectors (system of coordinates) P that diagonalises the covariance matrix C [21], since where D is a diagonal matrix whose elements are the eigenvalues of the matrix C. Then, within this basin of attraction, we propose to perform the local search alongside the directions of the new variables. This operation can be interpreted as a problem transformation by the change of reference system. Since in the new system the problem is characterised by low epistasis, it can be addressed by moving alongside the new variables. This concept is broadly used in other contexts, especially in Data Science, and is closely related to Principal Component Analysis (PCA) [26].
In order to graphically represent the analysis of epistasis of the landscape let us consider the following objective functions in two dimensions within [−100, 100] 2 , see e.g. [27].
Let us consider that these problems have been shifted by means of a bias and rotated by means of an orthogonal matrix. Fig. 1 shows the data set of points below the threshold (in blue) and the corresponding new reference system (dashed black lines) identified by the columns of the matrix P. Furthermore, we may easily observe that the covariance matrix C is symmetric, i.e. with reference to its definition above c j,l = c l,j , ∀j, l. Hence, the following observation is valid, see [21].
Observation 1: Since the covariance matrix C is symmetric, it follows that 1) C is always diagonalisable and hence there always exists a non-singular transformation matrix P that diagonalises C. 2) each pair of the eigenvectors of C (column of matrix P) is orthogonal (P −1 = P T ), the proposed reference system is orthogonal. The meaning of this observation is that we are theoretically guaranteed that Algorithm 1 always returns a coherent output i.e. a new orthogonal reference system that diagonalises the covariance matrix.

A. Considerations about the Directional Derivatives
In order to better highlight the functioning of the proposed Analysis of Epistasis, we tested the values of the directional derivatives alongside the directions of the variables as well as those alongside the directions of the eigenvectors of the matrix C (columns on the matrix P).
With reference to Bent Cigar in Fig. 1 by using the detected minimum as a start point we performed ten steps of size 0.01 along the directions of the variables and those of the eigenvectors. At each step, we saved the objective function value and calculated the numerical directional derivative.  The test shows that one of the directional derivatives along one eigenvector has a high value while the other has nearly zero. The landscape analysis identifies a search direction along which small steps quickly lead to large improvements (steep slope) and another direction along which the problem is nearly "flat". The directional derivatives along the variables present intermediate features.
We repeated the same test on a number of functions and multiple dimensions ranging from 2 to 100 dimensions. Apart from the sphere where, due to the central symmetry, all the lines of the derivative plots collapse in one line, the analysis of epistasis systematically detected a direction whose gradient was higher than that of any variable directions. In other words, the proposed analysis of epistasis appears to be able to detect some preferential directions along which a local search displays a high convergence rate.
In order to show an example of the gradient in higher dimensions, Fig. 3 illustrates the results of the test for the rotated ill-conditioned ellipsoid in ten variables: where the design variables z i are obtained from x i after shifting and rotation, see [27]. The plots in Fig. 3 clearly show that the proposed analysis yields to the detection of one direction whose derivative is higher than that of the directional derivative along any variable directions. We may intuitively observe the correlation between this result and PCA [26].

B. Limitation of the proposed Analysis of Epistasis
An evident limitation of the proposed Analysis of Epistasis outlined in Algorithm 1 is that its success depends on the threshold parameter thr. This parameter should allow the generation of a data set that describes the geometry of the problem. With reference to minimisation, a value that is too low would result in a small or empty data set, while a value that is too high would generate a dataset that does not contain the geometrical features of the basin of attraction. Furthermore, this parameter is problem dependent as it depends on the specific objective function values. Besides being inelegant, the sampling procedure can also be computationally expensive in the high dimensional domain.
On the other hand, the latter is an issue common to other FLA methods and is a consequence of the concept of FLA. Moreover, in the context of a Memetic Framework, we may suggest that part of the points visited in other phases of the search can be saved in an archive and used for a local analysis of the fitness landscape. Finally, we observed that the selection of a reasonably good thr could be automatically performed with modest computational effort: we sample a small number of points (10 × n) in the domain and selected the lowest objective function value among those of the sampled points as a threshold thr.

III. LOCAL SEARCH DESIGN OF MULTIPLE TRAJECTORY SEARCH
MTS is an MA composed of three local search algorithms, each of them improving upon the performance of a single solution x. These three local search algorithms are supervised by a framework that processes a population of solutions and activates the local search on single solutions. The selection of the local search is adaptive and based on a mechanism that rewards the most successful local search algorithms, see [28].
The three local search algorithms, namely Local Search 1 (LS1), Local Search 2 (LS2), Local Search 3 (LS3) are briefly outlined in the following.
• LS1 is a greedy, deterministic local search [29] belonging to the family of pattern search [30] that explores all the variables of the problem in both directions • LS2 is a randomised version of LS1 where only some of the design variables (about 25% of the variables) in a randomly selected direction are explored at each iteration • LS3 is a steepest descent local search [31] that concurrently performs two actions: the deterministic perturbation and a randomised exploration along all the variables This section reports the implementation of LS1, LS2, and LS3 in the new system of coordinates determined by the Analysis of Epistasis in Algorithm 1. Let us consider that a local basin of attraction has been identified and a matrix P = p 1 , p 2 , . . . , p n has been determined. The three modified local search algorithms, namely Covariance Local Search 1 (CLS1), Covariance Local Search 2 (CLS2), and Covariance Local Search 3 (CLS3) are reported in Algorithms 2, 3, and 4, respectively.
Although the pseudocodes are straightforward and only slightly modify those in [20], it is worthwhile mentioning that the change of variables alters the update rule. While in its original version the local search varies one single design variable at the time, in the proposed scheme the update of one variable in the new reference system corresponds to a sum of vectors. For example, with reference to Algorithm 2, the update of a single design variable that is a sum of numbers.
In the new reference system, the corresponding proposed update rule is that is, unavoidably, a sum of vectors.
Algorithm 2 Covariance Local Search 1 INPUT x and P = p 1 , p 2 , . . . , p n while budget condition do It is worth mentioning the initialisation of ρ Algorithms 2 and 3 which means that 80% of the space is reachable with the first move. The stopping condition ρ > 10 −15 refers to the minimal radius allowed. The other parameters in Algorithms 2, 3, and 4 are the same parameters used in [20].

IV. NUMERICAL RESULTS
In order to experimentally demonstrate the effectiveness of the proposed Analysis of Epistasis, we tested the performance of CLS1, CLS2, and CLS3 and compared it against their original versions, LS1, LS2, LS3, respectively. For each experiment in this paper, LS1, LS2, and LS3 have been executed with a budget of 5000 × n function calls where n is the problem dimensionality. In order to guarantee a fair comparison, the budget of CLS1, CLS2, and CLS3 has been split into two parts: 2500 × n function calls have been used to build the covariance matrix C whilst 2500 × n function calls have been spent to execute the algorithm. The bound handling has been performed by saturating the design variable to the bound. We preferred the saturation to the bound over the toroidal insertion or reflection [32] since the latter two mechanisms would be equivalent to the sampling of a point. We evaluated that this sampling would modify the original logic of the local search algorithms.
Each algorithm for each problem had been run 30 times. Since the purpose of this experimental study is to check multiple pairwise comparisons LS# vs CLS# for each problem, we strengthen the statistical significance of the tests by the application of the Wilcoxon rank-sum test, see [33], [34]. In the Tables in this section, a "+" indicates that CLS# significantly outperforms LS#, a "-" indicates that LS# Algorithm 3 Covariance Local Search 2 INPUT x and P = p 1 , p 2 , . . . , p n while budget condition do end if end while end while RETURN x significantly outperforms CLS#, and a "=" indicates that there is no significant difference in performance.
To test the enhancement in performance due to the new reference system (directions of eigenvectors) we built the following testbed. We simulated the local search conditions we considered the unimodal functions of CEC2008 [35] and CEC2013 [27]. More specifically the following testbed has been used.
Numerical results displayed in Tables I, II, III show that the proposed Analysis of Epistasis is overall beneficial to the performance of the three local search algorithms. As a first comment, we should not expect an improvement for f 1 since a sphere is a lowly epistatic problem due to its central symmetry. In other words, the reference system will not affect the local search performance. The benefits of the proposed fitness landscape analysis are evident for LS1 where all the directions of the new basis of vectors (p 1 , p 2 , . . . ; p n ) are exploited. Results on LS3 display that the proposed approach is beneficial also to for this local search. The performance of CLS3 is marginally but consistently better than that of LS3. It should be observed that also in this case the entire basis of vectors is exploited. However, most of the exploration is set in the close neighbourhood of the current best point and since the radius is never reduced there is no clear mechanism to follow the variations of the gradient. Hence, the potentials of the information provided by the Analysis of Epistasis are not fully exploited. Furthermore, we observed that the randomisation in LS3 appears to be detrimental to the proposed fitness landscape approach. The advantages of the proposed fitness landscape approach on LS2 are, albeit present, not substantial. In this case, the local search selects randomly only 25% (on average) of the variables/search directions. This mechanism does not exploit the potentials of the proposed fitness landscape analysis since the new variables with high directional derivatives may be excluded from the search, see Fig.s 2 and 3.
In order to better highlight the impact of the proposed approach on the local search performance, Fig.s 4, 5, 6 show the performance trends of LS1 vs CLS1, LS2 vs CLS2, and LS3 vs CLS3, respectively. The performance trends in Fig.s 4, 5, and 6 show that for these problems the use of the eigenvectors of the covariance matrix as new variables is very efficient. It should be highlighted that the CLS# algorithms require half of the computational budget to analyse the landscape and build the covariance matrix. This explains why there are no improvements in the fitness values for the first half of the trend. However, the displayed results show that in many cases this investment in the budget pays off with a rapid improvement in performance during the second half of the run. This consideration in aligned with the algorithmic philosophy of Fitness Landscape Analysis [4], [8], [14] and should be taken into account while local search elements are designed within Memetic Frameworks [38], [39].      This paper proposes a method to analyse the fitness landscape of continuous optimisation problems. The analysis aims to assess the epistasis of a basin of attraction to exploit this information to design an effective local search within Memetic Frameworks. The proposed method makes use of sampled points describing the geometry of the problem and the covariance matrix associated with the distribution of these points. The local search explores the basin of attraction along the directions of the eigenvectors of this covariance matrix. In other words, the local search works on an alternative reference system of variables. This alternative reference system ensures that the covariance matrix is diagonalised and then the problem appears to be lowly epistatic.
The proposed method has been implemented on the three local search algorithms composing a popular MA called Multiple Trajectory Search. Multiple experiments have been run on a number of rotated unimodal problems. These unimodal problems simulate basins of attraction detected during the search of an MA. Numerical results show that the proposed landscape fitness analysis detects a set of preferential search directions. One of these directions is associated with a directional derivative higher than that of the directional derivatives associated with the directions of all the variables. Furthermore, each local search designed on the proposed fitness landscape analysis outperforms its original counterpart.
The proposed fitness landscape analysis can be potentially integrated within Memetic Frameworks to automatically design problem-specific local search algorithms.