Diversity-based adaptive genetic algorithm for a Workforce Scheduling and Routing Problem

The Workforce Scheduling and Routing Problem refers to the assignment of personnel to visits across various geographical locations. Solving this problem demands tackling numerous scheduling and routing constraints while aiming to minimise total operational cost. One of the main obstacles in designing a genetic algorithm for this highly-constrained combinatorial optimisation problem is the amount of empirical tests required for parameter tuning. This paper presents a genetic algorithm that uses a diversity-based adaptive parameter control method. Experimental results show the effectiveness of this parameter control method to enhance the performance of the genetic algorithm. This study makes a contribution to research on adaptive evolutionary algorithms applied to real-world problems.


I. INTRODUCTION
The Workforce Scheduling and Routing Problem (WSRP) is described as the assignment of personnel to visits, requested by customers, across different geographical locations. This problem combines scheduling and routing problems, both of which are known to be NP-Hard [1]. The scheduling aspect of the problem assigns personnel to visits in order to fulfil work demands and other requirements. The routing aspect of the problem consists of generating routes for the workers to service customers across various locations within given time-windows. The objective is to minimise operational costs while attending the additional requirements expressed by customers, workers and the business. A type of WSRP arises in home health care where nurses and care workers should be assigned to visit patients in their homes in order to carry out some tasks, e.g. administering medication, monitoring serious illness and unstable health status and injections. Genetic Algorithms (GAs) have been shown to be effective approaches to find solutions for problems combining scheduling and routing where exact methods are less effective, e.g. [2]- [4]. A base-line GA was proposed by [5] that identified the best set of operators and parameters for each instance of WSRP. Despite its success in obtaining good solutions, the base-line GA performance was limited by the tuning of parameters being computationally expensive. Although parameter values can be tuned before the algorithm execution, different values might provide better results at different stages of the evolutionary process [6].
An adaptive parameter control approach that seeks to maintain diversity in the population of solutions is proposed here to enhance the performance of a genetic algorithm. The aim is to be able to tackle a wider variety of WSRP scenarios. A total of 42 instances from six different real-world home health care (HHC) scenarios are used in the experiments here. The specific objectives of this paper are: • To identify the features that contribute to the early convergence of the genetic algorithm.
• To propose six different variations of the adaptive parameter control method, with respect to the identified features, that aim to enhance algorithm efficiency.
• To analyses the impact of adaptive parameters on the feasibility of solutions and the algorithm speed.
In what follows, Section II reviews related work. Section III describes the WSRP, its formulation and the instances used in this paper. Section IV outlines the proposed algorithm. Section V presents the experimental study and discusses the obtained results. The paper is then concluded in Section VI.

II. RELATED WORK
Recent research on the WSRP considered here is reviewed next. A mixed integer programming (MIP) with decomposition method [7] required considerable computation time (up to several hours) to solve larger problem instances with hundreds of tasks, indicating the need for faster solution methods. A Variable Neighbourhood Search (VNS) algorithm using problem-specific neighbourhood heuristics was presented in [8]. The VNS obtained high-quality solutions and in less computation time for the same set of problem instances used in [7]. An investigation was presented in [4], [9] comparing various genetic operators within two simple GAs to tackle the subject problem. A more efficient genetic algorithm with tuned parameters and using a customised solution representation to maintain feasibility of solutions was proposed in [5].
A number of studies have applied GAs to real-world problems where scheduling and routing are combined. Examples include [2], [3]. In those works, the focus has been on algorithm design to obtain good solutions. However, less attention has been given to developing an adaptive algorithm for this type of problems. There has been a number of works describing adaptive parameter control within evolutionary algorithms (EAs) in real-world applications [6], [10]. The study proposed by [11] used a method that incorporated global and local selections to maintain a balance of healthy individuals in the population. The effects of changing population size was investigated in [12]. Other works have used an adaptive process for managing dynamic populations in the algorithm. The study by [13] proposed to adaptively resize the GA population. This paper investigates the process of dynamic population resizing tailored especially for WSRP.
The intended contribution of this work is to present an efficient diversity-based adaptive GA for tackling real-world WSRP instances. The proposed method is capable of producing competitive feasible solutions for a set of real-world instances of highly-constrained WSRP scenarios. The proposed GA harnesses the power of the adaptive parameter control method with problem-specific genetic operators to solve the WSRP instances considered.

III. PROBLEM DESCRIPTION
A WSRP solution is a daily plan of visits, i.e. a set of workers W = {w 1 , w 2 , . . . , w |W | } assigned to perform a set of tasks T = {t 1 , t 2 , . . . , t |T | } for customers at different locations. The assignment of a worker to travel to a customer location in order to perform a task is called a visit. Thus, x w i,j is a binary decision variable that indicates if a path connects two nodes (visit i and visit j) or not. The assignment x w i,j = 1 means that worker w travels from visit i to visit j, thus w makes both visits. For visit j, if x w i,j = 1 then y j ≤ r j − 1 where r j is the number of workers required for visit j and y j is an integer decision variable indicating the number of unsatisfied assignments, hence w∈W i∈T j∈T x w i,j + y j = r j . This paper tackles a home health care (HHC) planning problem, in which workers are nurses, doctors, health carers, etc., and customers are patients receiving health care at their home. Several features have been identified as important in solutions to HHC scenarios, such as distance travelled and customers' and workers' requirements and preferences [14]. A good quality solution for an HHC planning problem should have low operational cost as well as all tasks being assigned in such a way as to satisfy the existing requirements. Table I lists the objectives and constraints in the WSRP considered here. See [7] for details of the MIP model of this WSRP. Note that in [7], unassigned visits is considered a soft constraint. However, here this is a hard constraint, hence all visits must be assigned. Additionally, time-conflict constraint is considered in this paper, while the study by [7] avoided time-conflicts. A time-conflict occurs when a worker is assigned to visits overlapping in time. A solution S is a set of assignments of workers to make visits. The objective function includes the operational cost and the penalty cost. The operational cost includes wages plus journey costs for all workers and is given by the accumulated cost d i,j + p w j , where d i,j is the distance travelled between visit i to visit j and p w j is the cost of assigning worker w to visit j. These costs are set by the service provider in the HHC scenarios used here. The penalty cost is the accumulated penalty for the violations of soft-constraints. An assignment can be written as a tuple (x w i,j , y j , a w j , ψ w j , θ w j , τ w j ), where a w j is the arrival time of a worker w to the location of visit j. The assignment is also composed of binary decision variables indicating an assignment of worker w to visit j with violations on area availability (ψ w j ), time availability (θ w j ) and conflicting assignments (τ w j ). The non-satisfaction of preferences is also included in the penalty cost. There are three types of preferences including preferred worker-customer pairing, worker's preferred region and customer's preferred skills. There is a degree of satisfaction for these preferences when assigning a worker w to a task j and is given by ρ w j which has a value that ranges between 0, 3 . For each assignment, the satisfaction value for each preference ranges between 0, 1 , from not satisfied to satisfied. The satisfaction level is reverted to a penalty by subtracting it from the full satisfaction score, which is 3r j for a visit j.
The best solution should have: the least operational cost and the least penalty cost. A weighted sum is proposed to combine the objectives into a single scalar value [5], [8]. The objective function is written as in equation (1), where weights λ 1 , . . . , λ 5 are defined to establish priority between objectives (more about the weights used here later in the paper).
IV. DIVERSITY-BASED ADAPTIVE GA (DAGA) This section describes the proposed adaptive aspects of the GA as an extension to the approach proposed in [5]. The proposal here is an adaptive control mechanism for parameters values, particularly genetic operator rates P c and P m in addition to the population size M . The work presented in [5] showed that each WSRP instance has features that affected the selection of parameter values in the GA. The parameter values and genetic operators performance were dependent on the problem size reflected in the chromosome length. It was observed that an increase in the chromosome length required a smaller population size in the algorithm and vice versa. There was also an inverse relationship between the population size and the number of generations required for the GA to perform well. Therefore, in order to enhance the GA efficiency, an adaptive parameter control mechanism is proposed here. Adaptiveness of parameter values and genetic operator rates has been approached in the literature through rules [15] and randomness [16]. Hence, two elements are investigated in the adaptive parameter control proposed here. One element is feedback through adaptive rates and population resizing. The other element is randomly changing the crossover when no further improvements are observed. This method is used to delay the premature convergence by having diverse performances provided by different crossovers [17].
The proposed DAGA method works as follows. First, an initial population P of M individuals (one-day plans) is created based on the indirect chromosome encoding that constructs feasible solutions. The representation is designed to include all assigned visits |V |, thus chromosome length vary according to the problem-size. For example, if instances are relatively small, a large M is required as a result of the large degree of similarity between individuals in P . For a large instance on the other hand there is less chance to have similar solutions, thus, a small M is sufficient. The population diversity has an effect on the setting of parameter values. Hence, a diversity-based scale is used to calculate the required change in the adaptive features. At the start of each generation, 9M/20 pairs of parent individuals are selected using binary tournaments. Then, a new population is created through crossover and mutation. The suggested crossover is applied with probability P c to each pair of parents to generate two offspring. Each offspring goes through a flat-cost flip (FCF) mutation, with some probability P m . The operator performance influences the information feedback process where the adjustment of a parameter value is done. At the end of each generation t, the population is updated using an elitism strategy. The M/10 best individuals from the current population are kept and along with the 9M/20 offspring individuals generated, the new population of M solutions is formed. After the new population is created, if there are no cost value improvements on the best so far solutions for a number of generations, the GA is considered to have stagnated. Thus, the updates to parameter values occur in response to the feedback process that seeks to maintain a diverse population to improve the effectiveness of the search. Such feedback from the search is used to generate the next population. Details of the diversitybased adaptive control approach to update parameter values during the GA's execution are described next.

A. Diversity Measurements
To measure the population diversity, a metric to measure the level of variation in a population was adopted from [18]. This method has also been useful for similar representation schemes used in the office space allocation problem [19]. The percentage of non-similarity within a set of chromosomes is given by equation (2), where d ( j) is the number of different genes in the j th position and |V | is the chromosome size for an individual in P . The diversity value for a population is then

B. Search Operators Rates Updates
Initial parameter values were selected after an offline tuning method [5]. If the GA stagnates for a number of generations, crossover and mutation rates are adjusted according to the calculations presented by [11]. That adjusting approach is used due to its capability of adapting quickly on highly multi-modal fitness landscapes. A resetting process is also used for the adaptive rates P c and P m which forgets the rates history if the rates values are out of range. This forgetting step means ignoring all the previous feedback process that led up to the inflation of the adapted values. Details for the calculation of parameter rates are described next.
1) Adaptive Crossover Rate P c : In general, crossover encourages the population to converge by reducing diversity. This is because after a number of generations, similarity between individuals increases as a result of only recombining existing genetic material. When similar individuals are recombined, inbreeding occurs and no additional diversity is added through crossover. Thus, a diversity-based adaptive parameter control is used to maintain a diverse population and avoid early convergence. Upper and lower values are identified for the control mechanism to generate values in that range. The adaptive crossover rate here ranges between Q 1 ≤ P c ≤ Q 2 , with Q 1 = 45% and Q 2 = 100%.
The crossover rate P c is then updated according to equation (3). The P c value is updated with a value within the range, based on the population diversity, where 0 < Diversity ≤ Diversity max . The population diversity influence is the ratio between the current diversity Diversity and Diversity max provided this value does not exceed Diversity max , which is the value of the population variance, i.e. how far each individual in the population is from the mean.
2) Adaptive Mutation Rate P m : To introduce diversification into the population, mutation is usually applied. The selected range for the mutation rate P m is Q 1 ≤ P m ≤ Q 2 , where Q 1 = 10% and Q 2 = 60%. The overall mutation rate P m is calculated with respect to the fitness improvements given by a mutation and the current diversity of the population. Thus, the new P m is the average value of the calculations in equations (5) and (6) as stated in equation (4).
Equation (5) calculates the mutation rate with respect to the diversity. A diverse population avoids early convergence, hence more individuals are explored. However, a highly converged population is subject to higher mutation rates. The value of P m is equal to Q 1 when the population has the highest diversity given by Diversity max , i.e. the population is almost in complete disruption.
To encourage further exploration, mutation is applied on individuals with low fitness. Equation (6) generates a value that uses the parent fitness f (S), where f (S) max and f (S) min correspond to the best and worst fitness values of the individuals in the population respectively. The population is maximally diverged when the selected parents are the worst individual in the population.

C. Dynamic Population Resizing
Preserving the population diversity is a key element for increasing the search coverage by performing an in-depth exploration of the fitness landscape [20]. The population diversity is affected by the population size. If the population size is too small the algorithm gets trapped in a suboptimal region of the search space [21]. This is because the algorithm is unable to find better quality solutions as a result of inbreeding. Certain population size might be good at some point of the search but requires a change at different stages of the evolutionary process. Therefore, an adaptive population resizing is implemented here which works as follows. For population P , its size M is changed dynamically during the run when no improvement is observed. The deterministic change occurs with respect to the step size δ, only if M ranges between 100 ≤ M ≤ 550. To preserve the best found solutions, the best M/10 individuals of the population are always kept. The value δ is the step-size which is relative to the current diversity D C urrent and the population size, that is, δ = D (Current) × M .
Two methods of population resizing are used as stated in Equation (7). The first is to increase M by δ individuals in the next population. The addition occurs by constructing new individuals. The second case is to shrink M by removing the δ worst individuals from P .
This method maintains the best individuals while maintaining the diversity level, to obtain better results and/or faster convergence. For example, when M is small, the diversity is lost between iterations. As a result, adding more individuals, regardless of their fitness value, promotes diversity and encourages exploration in the search. On the other hand, when M is large, the diversity is increasing throughout the generations. Thus, decreasing M allows the crossover to focus on better regions of the fitness landscape and henceforth computing results faster. All these changes are triggered by an increase or a decrease in the population diversity. If there is no change in the diversity, M remains the same. Equations (8) ensures keeping the population size within range.

D. Random Crossover Selection
This method allows exploration of the solution space by using different operators when a particular operator fails to improve the search. The idea is to choose an arbitrarily crossover, if there has been no improvements for the last four generations. Different operators require different rates at each stage of the search. Therefore, further improvements are made on the global performance of the whole population by the adaptive crossover rates. The crossovers used in this study are: flat-cost (FCX) and partially-matched flat-cost crossovers Update M according to equation (7) 14: else 15: Update M according to equation (8) 16: end if (PMFCX) [5], single point (1PX), two point (2PX), uniform (UX) and Half Uniform (HUX) crossovers [22]. Algorithm 1 presents the diversity-based adaptive control for the GA. After the new population is created, if there is no improvement on the best so far solution for a number of generations, the GA is considered to have stagnated. Thus, the updates to the parameter values occur in response to the feedback process (according to the current population diversity) to maintain a diverse population when generating the next population. Initially, the crossover operator is changed by selecting a random crossover (step 1). Then, the crossover rate P c and a mutation rate P m are adjusted (steps 2-11) followed by the population resizing (step 12-16).

V. EXPERIMENTAL STUDY AND RESULTS
This section describes and compares several variations of the adaptive aspects against the previously established baseline GA presented in [5]. The algorithm variations investigated are: adaptive rates only (AGA), adaptive rates with dynamic population re-sizing (AGADP), dynamic population re-sizing only (GADP), adaptive rates with random crossover (RCAGA), random crossover with dynamic population resizing (RCGADP), adaptive rates and dynamic population size with random crossover (RCAGADP).
The parameter values are set for each groups of problem instances before the run as suggested by [5]. For A, B: P m = 50%, P c = 50%, M = 500. For C: P m = 30%, P c = 50%, M = 500. For D, E: P m = 10%, P c = 0.1, M = 100. In AGA, AGADP and GADP, a set of crossover operators were chosen for each problem set as follows. For A, B cost-based greedy crossover (CGX) is used. For C partially-matched greedy crossover (PMGreedyX) is used. For D, E, F flatcosts crossover (FCX) is used. During the run, parameters values were updated with respect to the population diversity as explained in Section IV. Based on preliminary tests, the parameter ranges are empirically set as follows: mutation rate P m ∈ 0.10, 0.55 , crossover rate P c ∈ 0.45, 0.100 and the dynamic population size M ∈ 100, 550 . A random crossover is selected and applied over time when the algorithm is stuck at a local optima. To assess the statistical significance of the obtained results, each run was executed 8 times, seeded with the same initial population and all methods had the same  A  B  01  02  03  04  05  06  07  mean  01  02  03  04  05  06  07  mean  Number of Visits  31  31  38  28  13  28  13  26  36  12  69  30  61  57  61  46  Number of Workers  23  22  22  19  19  21  21  21  25  25  34  34  32  32  32  30  Number of Areas  6  4  5  4  4  8  4  5  6  5  7  5  8  8  7  7   C  D  01  02  03  04  05  06  07  mean  01  02  03  04  05  06  07  mean  Number of Visits  177  7  150  32  29  158  6  80  483  454  585  520  538  610  611  543  Number of Workers  1037  618  1077  979  821  816  349  813  164  166  174  174  173  174  173  amount of computation time. The implementation was in Java running on a PC with I7 four-core processor with hyperthreading enabled and 16GB of RAM.

A. Problem Instances
Problem instances from three UK real-world HHC scenarios are used as instances of WSRP 1 in this study. There are 7 problem instances in each scenario for a total of 21 instances. Table II shows the main features of each problem instance. Scenario A instances are considered the smallest, while instances in scenario F are the largest. Problem instances in scenario C are very different to the instances in the other 6 scenarios by having a much larger number of workers than the number of visits.

B. Effect of Adaptive Control on the Population Diversity
For every experiment, the diversity trace was recorded for each algorithm. Figure 1 shows 6 sub-plots of the average diversity change in the first 10 iterations for each problem instance. These graphs reveal that in all algorithms the diversity converges in the initial generations. In all instances of scenarios A and B there was an apparent change in the diversity value while using GADP, RCGADP and RCAGADP. This is a result of the disturbance caused by the dynamic population resizing which increases the number of dissimilar individuals. A closer inspection of the diversity values in generations 100 to 400 is displayed in Figure 2.
For instances in scenarios A, B and C, high initial value of the mutation rate in all methods delayed convergence in GA as it can be seen in the top plots of Figure 2. When GA, GADP and RCGADP were used, higher diversity was maintained in later generations. This indicates the merit of the dynamic population re-sizing with the help of crossovers exchange. On the other hand, algorithm variations with adaptive rates only, i.e. AGA and RCAGA, exhibited loss of diversity in sets A, B and C. This indicates that the adaptive rates require more time to work. This was corroborated for the larger instances in sets D, E and F, in which adaptive rates variations excelled in comparison to dynamic population re-sizing variations. The adaptive rates also enhanced the search by increasing population diversity even though a low mutation rate was set initially.
The bottom plots in Figure 2 show that there is a high level of intensification in the later generations. However, when random exchanges of crossover were used (RCAGA and RCAGADP), they promoted diversity values at all stages of the search. Henceforth, adaptive rates variations were able to enforce a change in the diversity when given more computational time. Thus, dynamic population re-sizing methods were capable of adapting quickly to environmental change only in smaller instances. On the other hand, dynamic re-sizing succeeded in increasing diversity in RCAGADP, but GADP and RCGADP had a loss in the diversity as a result of the low mutation rate used. Thus, it can be argued that the maintenance of mutation rates encourages the diversity further in larger problem sets along with the random crossover exchange.

C. Statistical Analysis of Adaptive Control Methods
To establish statistical significant in EAs, [23] recommended the use of Friedman analysis as a non-parametric statistical test to compare and rank methods. Here, an IBM SPSS 22 two-way analysis was used to compare the variances of seven related-samples, with a significant level of α = 0.05 and 95% as a confidence interval. Table III provides the results generated by the Friedman analysis on the 42 problem instances including: the mean value, the standard deviation, the minimum cost-value, the maximum cost-value and the mean rank. The results presented in the mean rank row show the methods order based on the statistical analysis, where a low rank value indicates a better method and a high rank indicates a worse method. Four additional values were calculated and used to measure the performance of each algorithms Gap% is the average gap percentage to the best-known value, Dev. is the average percentage deviation from the best-known value (best solution of all the algorithms applied), Best is the fraction of instances in a set for which an algorithm matches the bestknown value (best solution of all the algorithms applied). Additionally, the Score is the fraction of the instances for which a competing algorithm 'wins', i.e. produces better solutions than the configuration being scored. This score is calculated as ((q × (p − 1)) − r)/(q × (p − 1)), where p is the number of methods compared, q is the number of problem instances, and r is the number of instances in which the p − 1 competing configurations find a better result. Hence, the best score value is 1, when r = 0, and the worst score value is 0, when r = q × (p − 1). Table III, have a significance level = 0.0, degrees of freedom = 6 and χ 2 = 55.507. This indicates    Table IV shows the Wilcoxon signed-rank test. Column Z indicates the z-ratio as the unit for the normal distribution. Column Sig. (2-tailed) indicates the p value. Column sign indicates the statistical comparison of the null hypothesise (h 0 ) that denotes the significant difference between two methods. The S+ (S-) signs means that a method i was significantly better (worse) than a method j, within a confidence interval of 95%. In comparing the two related methods, the GA was worse than all other methods. This outcome further proves the superiority of adaptive methods over the non-adaptive GA for WSRP. The Wilcoxon signed-rank test showed that AGA, GADP and AGADP have a statistically significant advantage over RCGADP with (Z = -3.353, p = 0.0008), (Z = -3.476, p = 0.0005) and (Z = -3.473, p = 0.0005) respectively and no statistically significance difference across other methods. On the other hand, AGA, GADP and AGADP are worse than RCAGA and RCAGADP while RCAGA was worse than RCAGADP with no significant difference. These results statically prove that adaptive rates preforms the best when combined with a full adaptive method for WSRP scenarios.

D. Performance Comparison of Adaptive Control Methods
Table V presents a comparison between the results of the best adaptive algorithm variations (AGA and RCAGADP) against results obtained by the MIP method with decomposition in [7], the VNS in [8] and the GA in [5]. For each problem instance, the table shows the cost-value f (S) and the computation-time in seconds Cpt in which a solution by a method was found. The best results are highlighted in bold.
In respect of the total number of best solutions obtained by each method, VNS and AGA performed the best producing best results on 45.24% and 35.71% of all instances respectively. In the set A, a full adaptive method obtained better results than AGA. In the set B, better results that those produced by GA were achieved by adaptive methods, but still not better than the VNS results. A similar observation can be made for set C. The adaptive mechanisms seem to have a positive effect on tackling instances in sets D, E and F where a number of best-known values were found by the adaptove GAs. In terms of computation time, RCAGADP produced good results with fast convergence on the E instances. AGA performed the best on sets D and F, arguably by enhancing the search through the adaptive mechanism. Overall, VNS found more best-known results but the GAs outperform the MIP decomposition method in terms of solution cost. It should be acknowledged that for several of the small instances the GAs spend considerably more time than the other methods, although still below 15 minutes which can be considered acceptable in practice. In general, the MIP decomposition method obtains good solutions in short computation-times across all sets of instances.

VI. CONCLUSIONS
Tuning parameter values for each problem instance can work well particularly in the initial stages of the evolutionary search. However, adjusting parameter values is known to be required for shifting the focus of the search on different areas [6]. In general, the goal of an adaptive parameter control method is to change the parameter settings during the search and only when necessary. In this paper, the aim was to investigate the effect of using a diversity-based adaptive parameter control technique for enhancing the performance of a genetic algorithm (GA) tailored for tackling the workforce scheduling and routing problem (WSRP). Three elements were included in the adaptive parameter control method: adaptive mutation/crossover rates, dynamic population size and random crossover exchange. As a result, six GA variations were compared in this paper in which the adaptive control is triggered by stagnation of the search. The (not-adaptive) GA produced poor results in comparison to the GA variants incorporating some adaptive mechanism.
Experimental results from solving a set of WSRP instances with the adaptive GA variants indicate that having such adaptive mechanisms is beneficial. Adapting the genetic operator rates produced better results especially for the larger WSRP scenarios. It is argued that the reason for this is that adaptive rates are adjusted faster according to any change in the environment, and therefore avoiding premature convergence of the search. More research is needed on the crossover selection in order to improve the algorithm efficiency by achieving an effective balance between solutions diversity and crossover selection.