Analysis of Objectives Relationships in Multiobjective Problems Using Trade-Off Region Maps

Understanding the relationships between objectives in many-objective optimisation problems is desirable in order to develop more effective algorithms. We propose a technique for the analysis and visualisation of complex relationships between many (three or more) objectives. This technique looks at conflicting, harmonious and independent objectives relationships from different perspectives. To do that, it uses correlation, trade-off regions maps and scatter-plots in a four step approach. We apply the proposed technique to a set of instances of the well-known multiobjective multidimensional knapsack problem. The experimental results show that with the proposed technique we can identify local and complex relationships between objectives, trade-offs not derived from pairwise relationships, gaps in the fitness landscape, and regions of interest. Such information can be used to tailor the development of algorithms.


INTRODUCTION
It is important to understand the relationships between objectives in multiobjective optimisation problems (MOPs) because this can help to tailor the search according to the multiobjective fitness landscape. This is particularly important when tackling large real-world MOPs with many objectives (more than two). In the multiobjective optimisation literature, the focus is often on MOPs that exhibit strong conflict relationships between objectives, as this is part of the motivation for applying multiobjective techniques. However, the conflict relationship between objectives could be local rather than global. A global conflict relationship would hold throughout most, if not all, of the search space. On the other hand, a local conflict relationship would hold in a restricted region of the search space, e.g. the objectives could be in conflict early in the search but not later. This was discussed by Knowles and Corne [1] in the context of the multiobjective quadratic assignment problem. Castro-Gutierrez et al. [2] studied the objectives' relationships in multiobjective vehicle routing problems.
GECCO '15, July 11-15, 2015
As the number of objectives increases, composite relationships between objectives might emerge, i.e. relationships between objectives (conflicting or otherwise) that are not global but localised and more complex. Several techniques have previously been applied to the analysis and visualisation of relationships between objectives in MOPs. These include parallel coordinates and scatter-plots (both graphical representations), Kendall correlation [3] (a quantitative metric), and statistical measures [4], among others. Purshouse and Fleming [5] discussed these techniques in their research into the relationships between objectives in MOPs. Other works that have used some of these techniques include Castro-Gutierrez et al. [2] on multiobjective vehicle routing problems, and Ishibuchi et al. [6] on many-objective problems with correlated objectives. One limitation of these techniques is that they are best suited to identifying pairwise relationships between objectives only.
In this work, we propose a technique to analyse and visualise global and local relationships between many objectives in order to achieve a clearer understanding of the fitness landscape in a MOP. The technique requires a set of non-dominated solutions to be supplied (which can be obtained in any way) and uses Karnaugh maps [7] to visualise composite relationships between many objectives. The technique also uses correlation and scatter-plots to complement the analysis. The technique involves four steps: analyse global pairwise relationships between objectives in the given non-dominated set, estimate the range of values for each objective, identify objective trade-offs using Karnaugh region maps, and identify local relationships between objectives using scatter-plots. This technique seeks to analyse the objectives' relationships from multiple perspectives in order to better understand the fitness landscape in many-objective combinatorial optimisation problems, which are well known for having irregular and difficult to assess fitness landscapes. Section 2 surveys some related work. Section 3 provides the motivation for this work. Section 4 describes the proposed technique while experimental results applying the proposed technique to several instances of the multiobjective multidimensional 0-1 knapsack problem [8,9] are presented in Section 5. Our main observation is that different instances of the same problem may exhibit very different relationships between objectives. Finally, Section 6 concludes the paper.

RELATED WORK
A better understanding of fitness landscapes has been beneficial in multiobjective combinatorial optimisation (MOCO) problems. For example, Garrett and Dasgupta [10,11] adapted single-objective landscape analysis techniques (distribution of optima, fitness distance correlation, ruggedness, random walk analysis and analysis of the geometry of the solution space) for tailoring multiobjective evolutionary algorithms. They applied such techniques to quadratic assignment and generalised assignment problems with two objectives. They concluded that the performance of hybrid algorithms benefits from using knowledge of the fitness landscape. Castro-Gutierrez et al. [2] used the objectives' pairwise dependency correlation analysis proposed by Purshouse and Fleming [5] to assess the conflicting nature of objectives in multiobjective vehicle routing problems.
Brownlee and Wright [12] proposed a visualisation technique to evaluate the quality of a non-dominated set based on a ranking of the objectives. However, this technique may not be suitable for large solution sets as it is based on the individual analysis of solutions. Other visualisation techniques include objective wheels, bar graphs and colour stacks as explored by Anderson and Dror [13].
Verel et al. [14] adapted single-objective landscape analysis techniques to set-based multiobjective problems with objective correlation. Later, Verel et al. [15] conducted a study on the landscape of local optima in such problems. Verel et al. [16] proposed to carry out a priori analysis of a problem by evaluating the problem size, its epistasis, the number of objectives and the correlation values between objectives, to suggest the best way to tackle it. They concluded that, depending on the problem features, different types of algorithms (scalar or Pareto approach) and sizes of the solution archive should be employed.
Walker et al. [17] reviewed different methods (scatter plots, parallel coordinates and heat maps) to visualise solution sets for many-objective problems. They also proposed two techniques: a data mining visualisation tool to plot a convex graph, and a new similarity measure of solutions to plot them in a two-dimensional space.
Giagkiozis and Fleming [18] proposed a technique to estimate the Pareto front of a continuous optimisation problem, and then use the estimated front to obtain values for the decision variables of interesting solutions. They proposed using a multiobjective algorithm to obtain an initial solution set, which is then used to calculate a projection matrix of the optimal Pareto set. They tested their technique on convex benchmark problems.
Tusar and Filipic [19] presented a comprehensive survey and assessment of several visualisation techniques for many-objective approximation sets. They also presented a visualisation method that uses orthogonal projections of a section and applied it to four-dimensional approximation sets.
It is clear that understanding the fitness landscape of multiobjective optimisation can help to develop better solution methods. It is also clear that the analysis and visualisation of objectives' relationships, particularly in combinatorial landscapes with many objectives, is a topic of interest for researchers. The technique proposed in this paper seeks to make a contribution in this area.

OBJECTIVES RELATIONSHIPS IN MULTIOBJECTIVE OPTIMISATION
The focus of this research is to investigate the relationship between objectives in MOPs by analysing the nondominated approximation set and its coverage of the solution space. We use the concepts of conflict, harmony and independence between objectives as proposed by Purshouse and Fleming [5].
Results from some existing techniques to assess the conflicting nature of objectives can be deceiving. The literature includes studies of pairwise relationships between objectives [2,6,20]. However, analysis techniques such as Kendall correlation [3] only manage to identify global relationships between objectives. Figure 1 shows a Pareto-front between two maximisation objectives, Z1 and Z2, in a scenario with three objectives (we omit the scatter-plots for Z3). Clearly, when Z1 < 0.5, the objectives are conflicting while when Z1 > 0.5 the objectives are harmonious. However, if the number of solutions with Z1 lower than 0.5 is roughly the same as the number of solutions with Z1 higher than 0.5, simply applying the Kendall correlation technique would result in a correlation value close to 0. The conclusion could be drawn that the objectives are independent, when in reality there may be local relationships that could be exploited.
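The effect described above can be reproduced with a minimal sketch. The data below is hypothetical (a V-shaped relationship with its turning point at 50% of the Z1 range, not the data behind Figure 1), and `kendall_tau` is a plain tau-a implementation written for this illustration rather than the exact variant used in [3] or [5].

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs; tied pairs add 0."""
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        score += (product > 0) - (product < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)

# Hypothetical V-shaped relationship, in percent of the normalised range:
# Z2 falls while Z1 < 50 (conflict) and rises while Z1 > 50 (harmony).
z1 = list(range(0, 101, 5))
z2 = [abs(v - 50) for v in z1]

tau_all = kendall_tau(z1, z2)             # whole set
tau_low = kendall_tau(z1[:11], z2[:11])   # solutions with Z1 <= 50
tau_high = kendall_tau(z1[10:], z2[10:])  # solutions with Z1 >= 50
print(tau_all, tau_low, tau_high)
```

On this symmetric toy set the global value cancels out, while each half on its own shows a perfect local relationship, which is the situation the global coefficient hides.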
Some existing techniques might not reveal trade-offs which result from composite relationships between two or more objectives. Figure 2a lists a set of non-dominated fitness vectors for three maximisation objectives and Figure 2b shows their scatter-plot matrix and correlation values. We can observe that high values (51-100) appear simultaneously in at most two of the three objectives. Two points in Figure 2a have only one good objective value. Four points are good for only Z1 and Z2. Three points are good for only Z1 and Z3, and the remaining points are good for only Z2 and Z3. No solution has values higher than 50 for all three objectives simultaneously. The scatter plot and correlation values do not help us to appreciate the three-way trade-off. Likewise, we can see that the correlation values do not indicate any strong pairwise correlation.
Hence, to better analyse and visualise the multiobjective nature of optimisation problems, we need techniques that help us to identify global, local and composite relationships between objectives.

A FOUR-STEP ANALYSIS AND VISUALISATION TECHNIQUE
We propose a four-step technique to analyse and visualise objectives relationships. It requires some knowledge of the problem domain (the desirable range of objective values) and an approximation set of non-dominated solutions, which could be obtained using any multiobjective algorithm (MOA), such as those available in frameworks like jMetal [21] and ParadisEO [22]. The quality of the given approximation set may affect the conclusions of the analysis, because inaccurate ranges and scatter plots could lead to inaccurate observations; thus, combining the results from a number of well-accepted MOAs may be wise.
The scope of our technique is to aid the study of a subset of problem instances, in order to help tailor an algorithm for solving other problem instances. Our aim in this work is to investigate the suitability of our technique; hence, we test it on scenarios for which good algorithms are known. Although the approach requires some instances to be solved beforehand, the increased understanding of the problem should help in identifying the strengths and weaknesses of novel algorithms. Solving these many-objective problems is computationally expensive, so any help in tailoring fast techniques that can provide good-enough results has value, and this technique may allow a user to identify similarities between instances which could be exploited.
Each step in the proposed technique aggregates some information about the relationship between objectives, the coverage of the feasible solution space and the trade-offs in the fitness landscape. The four steps are described here and are illustrated by applying them to benchmark instances of the multiobjective multidimensional knapsack problem.

Step 1 - Global Pairwise Relationship Analysis: First, the Kendall correlation values [3] are calculated, as in [5], to identify global pairwise relationships. Strongly conflicting correlations (values < −0.5) immediately indicate that a trade-off surface exists, whilst strongly harmonious correlations (values > 0.5) indicate that objectives could be aggregated or clustered. Correlation values showing that objectives are independent indicate that the data is not globally dependent, but do not imply the absence of local trade-offs in the fitness landscape. If independence is detected, the problem could be decomposed by separating the decision variables according to the objectives, so that each objective (or group of objectives) is solved separately, as such an approach provides improved performance [23].
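Step 1 can be sketched as a small routine that labels every pair of objectives using the ±0.5 cut-offs mentioned above. The function names, the tau-a variant and the five fitness vectors are assumptions made for this illustration; they are not taken from the benchmark instances.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a over all pairs of observations (ties contribute 0)."""
    pairs = list(combinations(zip(x, y), 2))
    score = 0
    for a, b in pairs:
        product = (a[0] - b[0]) * (a[1] - b[1])
        score += (product > 0) - (product < 0)
    return score / len(pairs)

def classify_pairs(solutions, strong=0.5):
    """Label each objective pair (1-based indices) from its global correlation."""
    columns = list(zip(*solutions))  # one tuple of values per objective
    result = {}
    for i, j in combinations(range(len(columns)), 2):
        tau = kendall_tau(columns[i], columns[j])
        if tau > strong:
            label = "harmonious"
        elif tau < -strong:
            label = "conflicting"
        else:
            label = "independent"  # no strong GLOBAL relationship detected
        result[(i + 1, j + 1)] = (tau, label)
    return result

# Hypothetical non-dominated fitness vectors (Z1, Z2, Z3, Z4), maximisation.
solutions = [(1, 5, 3, 10), (2, 4, 1, 20), (3, 3, 5, 30), (4, 2, 2, 40), (5, 1, 4, 50)]
relationships = classify_pairs(solutions)
for pair, (tau, label) in relationships.items():
    print(pair, round(tau, 2), label)
```

A pair labelled "independent" here only lacks a strong global correlation; as noted in the text, local trade-offs may still exist.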
Step 2 - Objective Range Analysis: Here, the range (difference between best and worst values) is calculated for each objective in the given approximation set. Then, using problem domain knowledge, the objectives which are interesting for further exploration are identified. A meaningful objective is an objective with a range which is large enough that solutions can be classified into different quality categories regarding that objective value (e.g. good to bad, high to low, etc.). Analogously, a non-meaningful objective is an objective with a range so small that the variability in solution quality regarding that objective is considered negligible, and thus not worth exploring further.
One way to deal with non-meaningful objectives is to ignore them during the optimisation. It is possible that, by optimising the other objectives, the non-meaningful ones will present acceptable values within their small range anyway. However, a small range on the given approximation set does not mean that the range will be small across the entire solution space. To check this, the non-meaningful objective could be removed, the MOA re-executed and the range recalculated for the excluded objective. If the new solution set still exhibits only a small range for the excluded objective, it can be safely ignored. Another way to deal with non-meaningful objectives is to combine or cluster them [20].
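A minimal sketch of the range computation follows. It assumes positive maximisation objectives and expresses each range as a percentage of the best value found; the cut-off `min_range_pct` that separates meaningful from non-meaningful objectives is a hypothetical parameter standing in for the domain knowledge the step calls for.

```python
def objective_ranges(solutions):
    """Range (best - worst) of each objective, as a percentage of its best value."""
    columns = list(zip(*solutions))
    return [100.0 * (max(c) - min(c)) / max(c) for c in columns]

def meaningful_objectives(solutions, min_range_pct):
    """1-based indices of objectives whose range reaches the chosen cut-off."""
    ranges = objective_ranges(solutions)
    return [i + 1 for i, r in enumerate(ranges) if r >= min_range_pct]

# Hypothetical non-dominated set with three maximisation objectives.
solutions = [(100, 998, 10), (80, 1000, 25), (50, 999, 40)]
ranges = objective_ranges(solutions)
keep = meaningful_objectives(solutions, min_range_pct=5.0)
print(ranges, keep)
```

In this toy set the second objective varies by a fraction of a percent, so it would be a candidate for being ignored or clustered, subject to the re-execution check described above.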
Step 3 - Trade-off Regions Analysis: We apply a quantitative method, namely Karnaugh maps [7], to classify the solution space into regions to help with the identification of trade-offs and the complex relationships between objectives. A Karnaugh map is a method for simplifying Boolean algebra expressions using a truth table. The map has 2^i cells, where i is the number of variables. The cells are labelled with binary numbers following the Gray code, meaning that any two adjacent cells differ in exactly one bit. Hence, in a three-variable scenario, the cells adjacent to cell 0 (binary 000) are cells 1 (001), 2 (010) and 4 (100). Karnaugh maps make it easy to visualise patterns that are used to group Boolean variables.
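The Gray-code labelling can be illustrated with a short sketch. The helper names are assumptions for this example; the code only shows the ordering and adjacency property, not how a full Karnaugh map is drawn.

```python
def gray_code(bits):
    """Reflected binary Gray code sequence for the given number of bits."""
    return [i ^ (i >> 1) for i in range(2 ** bits)]

def neighbours(cell, bits):
    """Cells whose label differs from `cell` in exactly one bit."""
    return sorted(cell ^ (1 << b) for b in range(bits))

sequence = gray_code(3)
adjacent_to_zero = neighbours(0, 3)
print([format(c, "03b") for c in sequence])
print(adjacent_to_zero)
```

Consecutive labels in `sequence` differ in a single bit, and `neighbours(0, 3)` gives the cells 1, 2 and 4 named in the three-variable example.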
In this step, for each objective Zi (i = 1, 2, ..., m), we define a threshold ti such that values above ti (in a maximisation problem) are considered good or acceptable, and values below ti are considered inadequate. These thresholds can be set using some knowledge of the problem domain or empirically, for example by using the average value of each objective as its threshold. Next, we classify each objective value in each solution as good (↑) or bad (↓). Finally, we draw a region map similar to a Karnaugh map, but showing the count of solutions in each region rather than 0-1 output variables, and using the solution classifications (labelled ↑ and ↓) instead of the input variable values. The map is built with 2^m regions such that each region represents a single combination of good and/or bad objectives. We number the regions rk using a binary encoding such that the least significant digit represents Z1 and the most significant digit represents Zm, with ↑ = 0 and ↓ = 1. For instance, the region with Z3↓, Z2↑ and Z1↓ would be region r5, since binary 101 is 5. Figure 3 presents the region map schematics for 3, 4 and 5 objectives. Each region is identified by its rk. Regions with the same number of good objectives are highlighted with the same shade of grey, such that lighter tones represent a higher number of good objectives while darker tones represent fewer good objectives.
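The region numbering can be sketched as follows. The function names and the sample vectors are hypothetical, and treating a value exactly equal to its threshold as bad is an assumption, since the text only distinguishes values above and below ti.

```python
from collections import Counter

def region_index(solution, thresholds):
    """Region number k: bit i-1 is 1 when objective Zi is bad, 0 when it is good."""
    k = 0
    for bit, (value, t) in enumerate(zip(solution, thresholds)):
        if value <= t:  # assumption: a value equal to the threshold counts as bad
            k |= 1 << bit
    return k

def region_counts(solutions, thresholds):
    """Number of solutions in each of the 2^m regions, indexed by region number."""
    counts = Counter(region_index(s, thresholds) for s in solutions)
    return [counts.get(k, 0) for k in range(2 ** len(thresholds))]

# Hypothetical fitness vectors (Z1, Z2, Z3) with a threshold of 50 per objective.
thresholds = (50, 50, 50)
solutions = [(90, 80, 10), (80, 20, 70), (10, 90, 85), (10, 90, 20)]
example = region_index((10, 90, 20), thresholds)  # Z3 bad, Z2 good, Z1 bad
counts = region_counts(solutions, thresholds)
print(example, counts)
```

The list returned by `region_counts` holds the values that would be written into the cells of the region map; arranging them in Gray-code order is a separate layout step.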
The main advantage of the region map is that we can easily identify which objectives simultaneously present good values, and whether trade-offs exist. If region r0 is not empty, then we have solutions with acceptable values in all objectives, meaning that the problem could potentially be tackled with single-objective algorithms. A range analysis of the solutions in this region could provide additional information on which approach is appropriate. Conversely, when most solutions fall into the last region, rk with k = 2^m − 1 (all objectives bad), the thresholds may have been set too high and should be lowered for more accurate results. When there are no solutions in r0 but there are solutions scattered across the other regions, there are trade-offs and the map can be used to visualise them.
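The reading rules above could be automated along the following lines. This is only a sketch: the summary keys and the region counts are invented for illustration, and it assumes the counts list has a power-of-two length and contains at least one solution.

```python
def good_objectives(region, m):
    """Number of good objectives in region rk (zero bits among the m lowest bits)."""
    return m - bin(region).count("1")

def summarise(counts):
    """Basic readings of a region map given as a list of 2^m solution counts."""
    m = len(counts).bit_length() - 1
    total = sum(counts)
    occupied = [k for k, c in enumerate(counts) if c > 0]
    return {
        "r0_non_empty": counts[0] > 0,
        "share_all_bad": counts[-1] / total,
        "max_simultaneously_good": max(good_objectives(k, m) for k in occupied),
    }

# Hypothetical map for four objectives (16 regions).
counts = [0] * 16
counts[3], counts[5], counts[7], counts[10], counts[12] = 10, 40, 10, 35, 5
summary = summarise(counts)
print(summary)
```

For this invented map, r0 is empty and no occupied region has more than two good objectives, which is the kind of trade-off pattern the map is meant to expose.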
Step 4 - Multiobjective Scatter-plot Analysis: The last step is an analysis using scatter plots. First, for each instance we normalise the values of all objectives. Then, we select an objective and draw a scatter plot of all remaining objectives against the selected objective. Finally, we can combine all scatter plots into a single one. By visually inspecting this combined graph we can identify local relationships (conflicts and harmony), interesting patterns, gaps in the solution space, and well-spread trade-offs or isolated regions. This information can help us to tailor a solution algorithm by directing the search towards the regions of interest. When the landscape of the solution space is consistent throughout all instances analysed in this way, we could have a clearer idea of what type of solutions to expect when solving unseen instances.
When picking an objective for this process, it is preferable to select one that has a wider range of values rather than being concentrated in only a small range, otherwise the resulting graph may be more difficult to read. It may be interesting to test different objectives in order to spot which provide more useful information, or, if multiple objectives provide different insights, all of them could be considered instead of only one.
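The data preparation for Step 4 might look like the sketch below. Min-max normalisation is an assumption (the text does not say which normalisation is used), the function names are hypothetical, and the actual drawing is left to whichever plotting library is preferred.

```python
def normalise(solutions):
    """Min-max normalise every objective to [0, 1] (constant objectives map to 0)."""
    columns = list(zip(*solutions))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(s, lows, highs))
        for s in solutions
    ]

def scatter_series(solutions, selected):
    """Points (selected objective, other objective) for each remaining objective.

    `selected` and the dictionary keys are 0-based objective indices.
    """
    norm = normalise(solutions)
    return {
        j: sorted((s[selected], s[j]) for s in norm)
        for j in range(len(norm[0])) if j != selected
    }

# Hypothetical non-dominated set (Z1, Z2, Z3); Z1 is used as the horizontal axis.
solutions = [(10, 200, 5), (20, 180, 7), (30, 150, 9)]
series = scatter_series(solutions, selected=0)
for objective, points in series.items():
    print("Z%d vs Z1:" % (objective + 1), points)
```

Plotting every series on the same axes, one marker style per objective, gives the combined graph used for the visual inspection described in this step.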
The next section presents experimental results from applying the proposed analysis and visualisation technique to five sets of benchmark instances of the multiobjective multidimensional knapsack problem.

SAMPLE ANALYSIS
In order to illustrate the analysis technique we apply it to different scenarios of the multiobjective multidimensional knapsack problem (MOMKP) [8]. We aim to show that within the same problem, the proposed technique can identify multiple scenarios with distinct multiobjective natures.
In the MOMKP, we have n items (i = 1, ..., n), each with m weights w^i_j (j = 1, ..., m) and p profits c^i_k (k = 1, ..., p). A set of items must be selected to maximise the p profits while not exceeding the capacities Wj of the knapsack. This problem can be formulated as follows:

maximise \sum_{i=1}^{n} c^i_k x_i, k = 1, ..., p
subject to \sum_{i=1}^{n} w^i_j x_i \le W_j, j = 1, ..., m
x_i \in \{0, 1\}, i = 1, ..., n

where x_i = 1 if item i is selected and x_i = 0 otherwise.

We considered five MOMKP datasets, each with five instances, all with m = 4, p = 4, n = 1000 and Wj = 50000. The first four datasets were generated following the guidelines in Bazgan et al. [9]:
• Set A contains only independent objectives.
• In set B, all objectives are harmonious.
• Set C contains three pairs of conflicting objectives, (Z2, Z1), (Z3, Z2) and (Z3, Z4), while the weights are uncorrelated.
• Set D has conflicting objectives, as in set C, but the weights are correlated with the objective values.
The fifth set, X, was generated using data from a real-world home health-care scheduling problem.

Table 1: Results for the pairwise relationship analysis (1.0 is completely harmonious, -1.0 is completely conflicting).
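To make the MOMKP definition concrete, the sketch below evaluates a candidate selection of items. The data layout (`profits[k][i]`, `weights[j][i]`), the function name and the tiny instance are assumptions for illustration and are unrelated to the benchmark sets.

```python
def evaluate(x, profits, weights, capacities):
    """Return the p profit totals of selection x and whether all m capacities hold.

    x[i] is 1 if item i is selected, else 0; profits[k][i] and weights[j][i]
    are the k-th profit and j-th weight of item i.
    """
    totals = tuple(sum(c * xi for c, xi in zip(row, x)) for row in profits)
    loads = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    feasible = all(load <= cap for load, cap in zip(loads, capacities))
    return totals, feasible

# Hypothetical instance: n = 4 items, p = 2 profits, m = 2 weight constraints.
profits = [[10, 4, 7, 3], [2, 9, 5, 8]]
weights = [[5, 4, 6, 3], [4, 6, 5, 2]]
capacities = [10, 10]

light = evaluate([1, 0, 0, 1], profits, weights, capacities)
heavy = evaluate([1, 1, 1, 0], profits, weights, capacities)
print(light, heavy)
```

The first selection fits both capacities; the second yields higher profits but overloads the knapsack, so it is infeasible.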
For each instance, we ran a single-objective genetic algorithm on each objective alone, and then both the NSGA-II [24] and MOEA/D [25] algorithms on each pair and triple of objectives. We then combined all the non-dominated solutions obtained into an archive. We randomly drew half of the individuals for the initial population from the archive, and the other half were randomly generated. We performed three runs of MOEA/D and three runs of NSGA-II with all objectives. The approximation non-dominated set was formed from all non-dominated solutions found in the process. Both NSGA-II and MOEA/D used a population of 200 individuals, binary tournament selection and half uniform crossover [26] for 1,500,000 function evaluations. Overall, we obtained non-dominated sets with approximately 900 solutions for set A, 3 for set B, 550 for set C, 1500 for set D and 2500 solutions for set X.
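Combining the outputs of several runs into one non-dominated archive can be sketched with a simple quadratic-time filter. The function names and the sample vectors are assumptions for this example; the actual archive used in the experiments may have been maintained differently.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated(vectors):
    """Keep the fitness vectors that no other vector dominates (maximisation)."""
    unique = list(dict.fromkeys(vectors))  # drop duplicates, keep first-seen order
    return [v for v in unique if not any(dominates(u, v) for u in unique)]

# Hypothetical fitness vectors gathered from two different runs.
run_a = [(3, 1), (2, 2), (2, 1)]
run_b = [(1, 3), (1, 1), (2, 2)]
archive = non_dominated(run_a + run_b)
print(archive)
```

The pairwise comparison costs O(N^2) per call, which is workable for sets of a few thousand solutions such as those reported here.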
Only a few non-dominated solutions were found for set B because the objectives there are strongly harmonious: when one of the objectives is maximised, the other objectives are also maximised. The data is therefore not sufficient for some of the analysis steps. However, the results are presented for completeness and illustrate that the number of solutions obtained can have a major impact on the analysis, and that it is important to have a comprehensive set with enough well-spread solutions.
Table 2: Results for the objective ranges analysis.

Application of the Proposed Technique
Step 1 - Global Pairwise Relationships Analysis. The results for this step are in Table 1. Coefficient values for set B are not provided for the reason given above. The table presents the individual pairwise correlation value for each combination of objectives. As expected for fully independent objectives, set A has values close to 0. The values for sets C and D are also predictably close to either 1 or −1, indicating global conflicting or harmonious relationships. Set X has values similar to those of set A: they do not reveal a strong global relationship between objectives, as they are closer to 0 than to 1 or −1. Also, in set X it is not possible to decompose the decision variables according to the objectives, as every item has all weights and values above zero.
Step 2 - Objective Range Analysis. The results for this step are in Table 2. Set B presents small ranges of less than 0.3% on average. In this dataset all objectives are harmonious and the solutions found are all located in a small region of the solution space. These few solutions dominated all other solutions explored. Sets C and D present similar results to each other, with large ranges for each objective (over 60%). This is expected since these are instances with conflicting objectives and present global trade-offs. The large ranges mean that while we have solutions with good values for a given objective, at least one other objective has a poor value.
Finally, we highlight that while the global pairwise relationship analysis (step 1) hinted that sets A and X were similar, the difference between them now becomes clear with the results from step 2. In set A, each objective range is around 24.0% of the maximum value (the smallest ranges, excluding the harmonious instances), whilst in set X the ranges go up to 84.9%, the largest range found. Thus, we can see that the ranges for set X are closer to those for set D, a conflicting scenario with global trade-offs.
Step 3 - Trade-Off Regions Analysis. The results for this step are in Table 3. For each instance we computed the number of solutions in each region, and the map shows the average percentage for each set. We set the range threshold to 1% above the mean value found for each objective, thus considering a value slightly above the average to be good. We can observe that on set A the front is well distributed, as we have solutions in all regions, scattered throughout the solution space, as a result of the independent objectives. Additionally, note that we have solutions both in r0 and r15. This is due to the map presenting the combined results for all five instances in that set: in some instances we have solutions in r0 only and in other instances in r15 only.
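The threshold choice and the per-set averaging can be sketched as follows. Reading "1% above the mean" as the mean multiplied by 1.01 is an assumption, as are the function names and the two four-region maps, which are shortened to keep the example small.

```python
def thresholds_above_mean(solutions, margin=0.01):
    """Per-objective threshold set `margin` (default 1%) above the mean value."""
    columns = list(zip(*solutions))
    return [(1 + margin) * sum(c) / len(c) for c in columns]

def to_percentages(counts):
    """Convert the region counts of one instance into percentages of its solutions."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]

def average_map(instance_counts):
    """Average the per-instance percentage maps region by region."""
    maps = [to_percentages(c) for c in instance_counts]
    return [sum(cell) / len(maps) for cell in zip(*maps)]

thresholds = thresholds_above_mean([(100, 10), (300, 30)])

# Hypothetical region counts for two instances of the same set.
instance_1 = [2, 4, 4, 0]  # solutions in the first region, none in the last
instance_2 = [0, 5, 3, 2]  # solutions in the last region, none in the first
combined = average_map([instance_1, instance_2])
print(thresholds, combined)
```

The combined map has non-zero entries in both the first and the last region even though no single instance does, which mirrors the observation made for set A.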
In sets C and D we clearly identify the global relationships. There are no solutions with good values in all objectives and most instances present no solution with good values in three objectives. The majority of the solutions are situated where Z1 and Z3 alone have good values and where Z2 and Z4 alone have good values, as these are the harmonious pairs. Additionally, we can observe that almost no solutions are present in conflicting areas, for instance where Z1 and Z2 present good values simultaneously. Moreover, any solutions in conflicting areas should be close to the chosen threshold.
The set X does not contain solutions in r0 and there are no solutions in regions r1, r2, r4 and r8, meaning that no high values can be simultaneously found for three or more objectives. We can find good values simultaneously only for up to two objectives. The map for set X resembles the ones for sets C and D in the sense that we can clearly see that there are several regions without solutions. Thus, we have trade-offs to present to the decision-maker. This means that the decision-maker has to choose between up to two good objective values to the detriment of the remaining objectives, since all of the regions containing solutions have at most two simultaneously good values.
Step 4 - Multiobjective Scatter Plot Analysis. This analysis was performed for each instance and Figure 4 presents the results for all instances of each set combined. We can see in Figure 4a that although instances in dataset A are completely random, all of them show similar landscapes with a high concentration of solutions towards the (1, 1) corner. Moreover, no local relationships can be identified, which is expected as the data is completely random.
On sets C and D we can clearly see the trade-off regions. Also, there is a noticeable gap in the solution space when Z1 is in the range from 0 to 0.5 and when the remaining objectives are in the range from 0 to 0.4, approximately. Moreover, the landscape of the solution space appears to be similar for all instances of each of the sets C and D.
Since the data was uniformly generated, these gaps are unlikely to arise from the data itself and could instead represent limitations in the solution algorithms, indicating that they did not explore the entire front. It is well known that the performance of some MOEAs is limited when the number of objectives is more than three [27].
Set X presents a unique scenario, as we can identify patterns and gaps in the solution space. The first feature worth noticing is that there is a lack of solutions with values within [0.85, 1]. Again, this is due to limitations of the solution algorithms. However, we can see that the size of the gap is small, confirming that instances with strongly conflicting objectives present a bigger challenge to the algorithms. We can also identify several local relationships. When Z1 ranges from 0.5 to 0.8, the three remaining objectives simultaneously conflict and harmonise. Knowing that only two simultaneous objectives present high values (from the region map analysis), we can conclude that whenever Z1 increases, only one of the other objectives simultaneously increases too.

Discussion
With the proposed analysis and visualisation technique, we can better understand the multiobjective nature of the problem instances considered. Looking only at the correlation coefficients, we could conclude that sets A and X do not present interesting multiobjective traits, that set B is inconclusive, and that sets C and D present conflicting and harmonious objectives. However, by applying our analysis and visualisation technique we can reach a more comprehensive understanding of these instance sets.
Being composed of fully random instances, dataset A does not present relevant global or local pairwise relationships according to the global pairwise analysis (step 1) and the multiobjective scatter plot analysis (step 4). Additionally, the objective range analysis (step 2) shows that even though there is a large set of non-dominated solutions, these are concentrated in a reasonably small area of the search space. For this dataset we can use the information from the trade-off region maps to interact with the decision-maker to identify which regions are of more interest, and then use single-objective optimisation algorithms to find solutions in those regions. Since we have solutions in all of the regions of the map, any objective vector could provide an adequate solution.
Set B presents a completely harmonious case. By analysing the ranges, and bearing in mind that the algorithms found just a handful of solutions, we can assume that a single-objective algorithm aiming to maximise any of the objectives could provide a reasonably good solution.
Sets C and D present similar scenarios, hence the correlation between weights and coefficients does not impact on the nature of the problem. The entire solution set represents a huge trade-off. We also notice that the algorithms found it very difficult to expand along the front and that they mainly explored the region surrounding the intersection of the trade-off. Nonetheless, by perceiving that all instances in these sets have similar landscapes and by knowing the approximate boundaries of each objective (by applying single-objective algorithms to each objective alone), we can estimate the landscape of solutions for other instances in those sets. Therefore, we could direct the search to the regions of interest after presenting the expected trade-offs to the decision-maker. However, if it is imperative to use an a posteriori approach, the global pairwise analysis and the scatter plots provide sufficient information to make feasible the grouping of harmonious objectives.
Finally, set X presents a quite different picture. By evaluating only the global pairwise analysis (step 1), we would conclude that there is no strong pairwise relationship between objectives. However, the objective range analysis (step 2) shows that in fact we have non-dominated solutions that vary greatly in quality. This is an indication of the existence of trade-offs (as we can see by comparing this set with sets C and D). The trade-off region analysis (step 3) showed the existence of overall trade-offs, as it is not possible to have solutions with good values in more than two objectives simultaneously. Finally, the multiobjective scatter plot analysis (step 4) identified local relationships between objectives and gaps in the solution space, pointing to the existence of local conflicts. Therefore, instances in dataset X exhibit a distinctive multiobjective nature, perhaps with interesting options for a decision-maker. A sound possibility for tackling this problem would be to use the region map to identify the regions of interest and then locate those regions in the scatter plot. If a selected region contains a local conflict, we can use the algorithm proposed by Knowles and Corne [1] to reach the trade-off front and then expand through it.

CONCLUSION
We proposed a technique that uses correlation, trade-off region maps and scatter-plots as tools for the analysis and visualisation of objectives' relationships in multiobjective optimisation problems. The technique consists of four steps: 1) evaluate the global correlation values, 2) compute the range of values for all objectives, 3) compute the distribution of solutions in the different trade-off regions, and 4) conduct a scatter-plot analysis of the objectives.
We applied the proposed technique to five sets of instances of the multiobjective multidimensional knapsack problem. The proposed technique helps to identify features such as local and complex relationships between objectives, trade-offs not derived from pairwise relationships, gaps in the fitness landscape and regions of interest. Different conclusions can be reached about the objectives' relationships for different instances, even though they are scenarios from the same problem. We also discussed how the analysis and visualisation technique could be used to better understand the fitness landscape of the problem in hand.
Future work includes applying the proposed technique to other optimisation problems to validate it further. It is also important to study the impact that the initial approximation set has on the accuracy of the analysis. Finally, we intend to investigate how the components of this technique, such as the trade-off region map, could be employed during the optimisation process of a many-objective algorithm to direct the search towards regions of interest.