Cooperative Double-Layer Genetic Programming Hyper-Heuristic for Online Container Terminal Truck Dispatching

In a marine container terminal, truck dispatching is a crucial problem that impacts the operation efficiency of the whole port. Traditionally, this problem is formulated as an offline optimization problem, whose solutions are, however, impractical for most real-world scenarios primarily because of the uncertainties of dynamic events in both yard operations and seaside loading–unloading operations. These solutions are either unattractive or infeasible to execute. Herein, for more intelligent handling of these uncertainties and dynamics, a novel cooperative double-layer genetic programming hyper-heuristic (CD-GPHH) is proposed to tackle this challenging online optimization problem. In this new CD-GPHH, a novel scenario genetic programming (GP) approach is added on top of a traditional GP method that chooses among different GP heuristics for different scenarios to facilitate optimized truck dispatching. In contrast to traditional arithmetic GP (AGP) and GP with logic operators (LGP) which only evolve on one population, our CD-GPHH method separates the scenario and the calculation into two populations, which improved the quality of solutions in multiscenario problems while reducing the search space. Experimental results show that our CD-GPHH dominates AGP and LGP in solving a multiscenario function fitting problem as well as a truck dispatching problem in a container terminal.

Xinan Chen, Ruibin Bai, Senior Member, IEEE, Rong Qu, and Haibo Dong

I. INTRODUCTION
International maritime transportation has seen significant growth and will continue growing [1]. Such continual growth has stretched (and in some cases, overwhelmed) the capacity of seaport container terminals. However, because of geographical or resource limitations, many terminals cannot quickly expand in size or upgrade their equipment to satisfy the increasing demand. Consequently, container terminals are under pressure to become more efficient, and intelligent algorithms that optimize the use of container terminal resources are promising solutions. To improve operational efficiency at terminals, many port companies have chosen to start by tackling the crucial truck dispatching problem, which closely connects seaside operations with activities in the yard areas.
Although many studies have successfully developed offline optimization approaches to solve the truck dispatching problem, these offline optimization strategies adopt simplified models and are inapplicable to the actual port environment. In particular, most of these models overlook many uncertain factors that are at play in marine container terminals: crane operators and truck drivers have distinctive operating habits and operating times, ships may not dock on time, and equipment may break down. In our field and simulated tests, although offline optimization strategies, such as integer programming, genetic algorithms, and the A* search algorithm, can achieve good results for the first few tasks, the static model processes subsequent tasks in a fairly arbitrary manner when the operation is disrupted by unknown events halfway through the process. Therefore, considering the practical needs of port companies, this article proposes a novel evolutionary optimization method that can handle the uncertainties in a real-world setting.
In one of our previous studies [2], experts were first consulted, and their experience was subsequently used to manually craft heuristics to guide online dispatching, helping the port in our study to significantly reduce the intensive work of truck dispatchers. Despite these desirable results, considerable expertise is required to construct such manually designed heuristics, which account for only some of the problem scenarios. Every time the container terminal changes its equipment, route scheduling, or task scheduling strategy, these heuristics must be refined by experts, demanding extensive extra time and cost. In addition, due to human cognitive limitations, experts can only explore heuristics over a very small fraction of the overall search space, which is far from realizing the full potential of heuristic algorithms. A data-driven genetic programming (GP)-based heuristic method was therefore investigated to generate heuristics from previous operation data [3]. Despite the superiority of this GP heuristic algorithm over manually designed heuristics (see the test results in Section V), we found that arithmetic GP (AGP) without logic operators, such as ">," "<," and "IF-ELSE," cannot fully express stochastic multiscenario problems with a discontinuous solution space. Unfortunately, many problems, such as truck dispatching in terminals, typify these stochastic multiscenario problems, which feature many scenarios across different periods of time.
In the truck dispatching problem, various problem parameters/features have different degrees of influence in different scenarios. For example, the average quay crane (QC) load time and average QC unload time have more influence in scenarios dominated by loading and unloading operations, respectively. An AGP method would struggle to deal with such cases. This can be illustrated by the piecewise-linear function in (1), where the value of x decides the scenario and the corresponding result. To handle such problems, researchers usually choose GP with logic operators (LGP). However, directly adding more operators considerably increases the size of the search space [4], making the search time impractically long and resulting in unsatisfactory solutions due to poor convergence. As shown in Figs. 10 and 11, both AGP and LGP converge to poor local optima instead of the true global optimum after 300 generations. Although LGP performs better than AGP thanks to the logic operators, these operators introduce a significantly larger search space and lead to poor convergence in some scenarios, even though LGP is able to fully express the function in (1).
In this research, we investigate a hierarchical GP encoding structure that takes full advantage of logic operators in GP while avoiding the exponential growth in search space. The proposed algorithm can efficiently handle the complexities and dynamics of real-life seaport terminal truck scheduling. Inspired by research on cooperative coevolution GP [5], [6] and novel GP representations [7], [8], [9], we introduce a cooperative double-layer GP hyper-heuristic (CD-GPHH) that divides scenario grouping and dispatch ordering into two different subpopulations, in an attempt to improve readability and enhance solution quality for complex dynamic vehicle scheduling problems. In the proposed method, GP individuals are separated into two cooperative layers: 1) a high-scenario layer and 2) a normal-calculation layer, each with independent mutation and crossover strategies. Individuals in the high-scenario layer decide which normal-calculation individual to employ for generating solutions in a specific scenario.
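The division of labor between the two layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the piecewise target below is a hypothetical stand-in (it is not the function in (1)), and the names `scenario_rule`, `calc_rules`, and `double_layer_predict` are invented for this sketch rather than taken from the authors' implementation.

```python
from typing import Callable, List

# Hypothetical piecewise-linear target: the sign of x decides the scenario.
def target(x: float) -> float:
    return 2.0 * x + 1.0 if x < 0 else -3.0 * x + 1.0

# High-scenario layer: maps the observed feature to a scenario index.
def scenario_rule(x: float) -> int:
    return 0 if x < 0 else 1

# Normal-calculation layer: one purely arithmetic expression per scenario.
calc_rules: List[Callable[[float], float]] = [
    lambda x: 2.0 * x + 1.0,
    lambda x: -3.0 * x + 1.0,
]

def double_layer_predict(x: float) -> float:
    """Let the scenario layer choose which calculation individual to apply."""
    return calc_rules[scenario_rule(x)](x)

xs = [-2.0, -0.5, 0.0, 1.5]
errors = [abs(double_layer_predict(x) - target(x)) for x in xs]
```

In this toy setting neither arithmetic expression alone can match the target on both sides of zero, whereas the pair selected by the scenario rule fits it exactly; in CD-GPHH both layers would be evolved rather than written by hand.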
The goal of this study is to develop an effective GP approach that automatically evolves high-quality dynamic truck dispatching heuristics to help port companies make decisions in real time (a trained GP expression can generate a dispatch decision almost instantaneously) and thus cope with the challenges of multiple scenarios under uncertainty. We tested and compared AGP, LGP, and our proposed CD-GPHH on both a simple function fitting problem, expressed in (1), and a real-world marine container terminal truck dispatching problem. Our work makes two major contributions. First, we demonstrate the effectiveness of using GPHH in solving a real-world online combinatorial optimization problem faced in many large container ports. Second, we propose a novel bilevel solution framework that can explicitly exploit the structures of scenario switches in many real-world problems and, hence, improve upon the previously proposed GPHH.
The remainder of this article is organized as follows. Section II provides some background of the research and a review of the relevant literature. Section III details the model and mathematical formulation of the dynamic truck dispatching problem for marine container terminals. Section IV describes the details of the proposed algorithm. Section V presents the experimental design, results, and analysis. Finally, Section VI concludes this article and suggests future research directions.

A. Marine Container Terminal Optimization
As illustrated in Fig. 1, a marine container terminal can be divided into three major parts: 1) the berth area; 2) the yard area; and 3) the entry-exit area (the example terminal contains five berths). A fixed number of QCs are deployed along the berths for loading (and unloading) containers onto (and from) vessels. A single rail is built along the berth line to enable QCs to move between different berths, but they cannot cross each other. The yard is a temporary container storage area and is often divided into similar-sized yard blocks. Each yard block is identified by a unique ID (e.g., 55, 5H at the center of the yard area) and has 1-2 yard cranes (YCs) or similar equipment to complete loading and unloading tasks. Both QCs and YCs can only handle one unit-sized container at a time; hence, queues are possible at each operation point. Typically, berths are built in deep water areas and connected to yards by a small road network consisting of bridges across the shallow water regions, road segments, and intersections. Trucks (or, more specifically, inner trucks) are used to transport containers between QCs and YCs via this road network, and strict traffic regulations are enforced to ensure safety and relieve congestion. The entry-exit area consists of gates that control external truck visits.
In most cases, a container terminal primarily aims to improve the efficiency of the sea-side (i.e., QC) operations so that vessel service times can be minimized and the overall turnover can be maximized. However, because of the interrelations between sea-side and land-side operations, QCs, YCs, and trucks are often all considered as schedulable equipment for optimization. The resulting problem becomes extremely challenging due to the following factors. First, the problem is often large in scale because of the number of containers to be handled as well as the number of QCs, YCs, and trucks involved. Second, the problem is highly nonlinear because all of the schedulable equipment (QCs, YCs, and trucks) can handle only one unit-sized container at a time, so queues and waiting become unavoidable. Last but not least, uncertainties at different stages of operations (processing times by QCs, YCs, and trucks) are common, and predefined plans based on deterministic models would mostly fail.
Considerable recent research has focused on the efficiency of QCs, YCs, and trucks to help port companies cope with soaring demand. Schonfeld and Sharafeldien [10], Kim and Park [11], Kaveshgar and Huynh [12], and other researchers [13], [14], [15] have developed different optimization methods to reduce the makespan of QCs. According to their results, QC makespan can be reduced by adjusting the number, position, or operation sequence of QCs. However, the improvements in these studies depend heavily on a sufficient supply of trucks. If the truck supply is insufficient for the QCs, the QCs must stop and wait for available trucks, which greatly decreases QC efficiency. Furthermore, after Lai and Lam [16] provided an overview of several YC schedule modules, Zhang et al. [17], Chen et al. [18], and other researchers [19], [20], [21] discovered that the moving distance and travel time of YCs can be decreased by shuffling the task sequence and changing the container storage location. These studies on QCs and YCs have been useful but are limited to the yard or berth area, and most of them assume an infinite supply of trucks, which is unrealistic and unhelpful for ports in reducing ship dock time. Subsequently, researchers have found that to further improve the port's turnover efficiency, QCs and YCs must be jointly optimized, with trucks as the links that connect all terminal equipment.
In contrast to the optimization of QCs and YCs, the optimization of truck dispatching is required for not only increasing local equipment efficiency in the berth or yard area but also improving the efficiency of the entire port. This is because almost all other equipment in the container terminal must interact with trucks. Many methods have been investigated to tackle the problem, including classical integer programming [22], min-max nonlinear integer programming [23], greedy algorithms [24], novel heuristic algorithms [22], and genetic algorithms [25], [26], [27]. Most studies reported good performance measured by reduction in ship dock time, empty-truck travel distance, or the overall truck travel distance.
In real-world marine container terminals, QCs and YCs have different operating times when handling different containers, task sequences may be switched, and trucks do not travel at a constant speed. Because of the influence of such real-world stochasticity, offline methods, such as integer programming, require additional time to recalculate dispatching plans whenever the environment changes; otherwise, the results could become inapplicable. Some of the successful offline solutions, such as genetic algorithms [28], metaheuristics [29], and simulation-based optimization [30], can tolerate uncertainty to a limited degree. As the level of uncertainty increases, however, these algorithms become very difficult to adapt [31].
Consequently, a previous study [2] attempted to resolve these issues and proposed an online heuristic-based truck dispatching method, which uses heuristics to dispatch a task to each idle truck dynamically according to the real-time values of uncertain variables as well as the states of other key features of the port. Like many greedy algorithms, this heuristic method provides relatively good solutions but offers no guarantee of optimality [32]. Another drawback of manual heuristics designed by domain experts is their lack of adaptivity in multiscenario problems, which has led to considerable research efforts in developing hyper-heuristics that perform well across different scenarios and problem domains.

B. Hyper-Heuristic
The hyper-heuristic was initially proposed as a general-purpose, fast-prototype search methodology. It uses heuristics to select or generate heuristics to, in turn, solve a wide range of problems with acceptable solution quality [33], [34]. Hyper-heuristics differ from metaheuristics in that the search in hyper-heuristics is applied to the space of heuristics instead of the space of solutions. The underlying assumption of hyper-heuristics is that the heuristic space is less problem dependent than the solution space. Thus, a more general search method can be developed by searching the solution space indirectly through the heuristic space. Although the No Free Lunch theorem [35] states that it is impossible to develop a truly general-purpose, universal optimization method for all problems and instances, an adaptive method suitable for problems that share distinct characteristics and structures can be developed. This underpins hyper-heuristics, which use different learning mechanisms (both online and offline) to improve the generality of an algorithm across different problems and scenarios.
As illustrated in Fig. 2, a hyper-heuristic adopts a two-level structure, comprising a high-level heuristic and a set of low-level heuristics that operate on solutions. The high-level heuristic does not interact with problems directly; instead, it either adaptively selects from a set of predefined heuristics to solve the problem at hand or learns to generate new heuristics for the problem. The high-level heuristic typically utilizes experience data collected during problem solving to assist in selecting or generating appropriate low-level heuristics. Studies have reported that hyper-heuristics perform better than their counterparts when solving multiple complex problems, such as educational timetabling [36], [37], 2-D strip packing [38], [39], multiobjective release planning [40], [41], and vehicle routing [42], [43]. Very few studies have used hyper-heuristics to solve the truck dispatching problem in marine container terminals. Furthermore, most of the existing studies have mainly focused on selective hyper-heuristics. Generative hyper-heuristics that support multiscenario dynamics are less frequently reported in the literature.

C. GP
First proposed by Fogel et al. [44], GP is an evolutionary computation method that evolves a population of programs (often encoded as GP trees) through an evolutionary process (selection, crossover, mutation, and replacement). GP has been used in solving many engineering and optimization problems. Nguyen [45] indicated that, compared with other methods such as decision trees, logistic regression, support vector machines, and artificial neural networks, GP has three major advantages when solving optimization problems. First, GP has flexible representations; second, GP has powerful search mechanisms; and finally, GP-generated heuristics are partially interpretable and very efficient in execution, which enhances their applicability in practice. Because of these advantages, GP is considered in this study as a strong candidate solution method for real-world problems.
However, for more complex multiscenario real-world problems, simple GP-generated heuristics cannot meet the requirements. These heuristics adapt to only some of the situations and achieve unsatisfactory results in general. Therefore, some researchers introduced logic operators in GP and proposed using the GP hyper-heuristic (GPHH) to select and generate heuristics simultaneously. In applications such as project scheduling [46], [47], combinatorial bilevel optimization [48], [49], and resource allocation [50], [51], GPHH has demonstrated a strong ability to tackle intricate multiscenario real-world problems, which encouraged us to apply it to the online container terminal truck dispatching problem.
Traditionally, a GP tree uses a function operator in every internal node and an operand in every terminal (leaf) node. This tree structure makes it fairly easy to encode, evolve, and evaluate mathematical expressions. Meanwhile, GP algorithms with nontree structures have also been successfully developed, such as linear GP [52], gene expression programming [53], and stack-based GP [54]. These GP representations are typically efficient in executing genetic operators and have performed well in various problems. However, tree-based GP provides better visualization, which can enhance comprehensibility because the tree structure can readily represent the logic of decision making in complex scenarios and can be translated into the form of a logic tree to interact effectively with real-life decision makers (for example, port operators). Our proposed CD-GPHH method adopts a similar tree-based solution structure but, additionally, explicitly separates decision making into two levels: 1) those that broadly characterize the scenario patterns and 2) those that provide utility guidance. Section IV details our double-layer GPHH structure.
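As a minimal sketch of the tree encoding described above, the following Python snippet stores a function symbol in each internal node and an operand (a feature name or a constant) in each leaf, and evaluates the tree recursively. The `Node` class, the function set, and the feature names `travel_time` and `queue_len` are assumptions made for illustration; they are not the terminal or function sets used in the experiments.

```python
import operator
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed binary function set for internal nodes.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

@dataclass
class Node:
    label: str                                   # function symbol, feature name, or constant
    children: List["Node"] = field(default_factory=list)

def evaluate(node: Node, features: Dict[str, float]) -> float:
    """Recursively evaluate an arithmetic GP tree on one feature vector."""
    if node.label in FUNCTIONS:                  # internal node: apply the operator
        left, right = (evaluate(c, features) for c in node.children)
        return FUNCTIONS[node.label](left, right)
    if node.label in features:                   # leaf holding a feature
        return features[node.label]
    return float(node.label)                     # leaf holding a numeric constant

def depth(node: Node) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    return 1 + max((depth(c) for c in node.children), default=0)

# Encodes (travel_time * 2) + queue_len; both feature names are hypothetical.
tree = Node("+", [Node("*", [Node("travel_time"), Node("2")]), Node("queue_len")])
score = evaluate(tree, {"travel_time": 3.0, "queue_len": 4.0})
```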

III. PROBLEM DESCRIPTION AND FORMULATION
Formally, the problem in this article is defined as follows. An abstract container terminal is represented as a directed graph $G = (A, C)$, where $C = Q \cup Y$ is the set of operation nodes (or points of work operations for all tasks), with $Q$ and $Y$ being the sets of all QCs and YCs, respectively. $A$ is the set of direct driving connections between different nodes. Let $d$ be the truck depot from which all trucks depart at the beginning of the operation and to which they return once all tasks are completed. The set $V = \{v_1, v_2, v_3, \ldots, v_m\}$ represents the $m$ trucks available for assignment. A function $\tau(x, y)$ maps two different operation points $x, y \in C$ ($x \neq y$) to the travel time from $x$ to $y$ while respecting the traffic rules of the actual terminal road network.
The following constraints must be satisfied. First, the containers are consolidated into unit-sized tasks, each of which can be handled by QCs, YCs, and trucks in one operation. In practice, a task consists of two small containers (i.e., twenty-foot equivalent units, TEUs) or one large container (forty-foot equivalent unit). Each task has exactly three operations in sequence. For import containers, the three operations are unloading from the vessel to a truck by a QC, transport from the sea-side to the yard by the truck, and then stacking in the designated yard block by a YC. For export containers, the three operations are a YC loading a container from the yard onto a truck, the truck transporting the container from the yard to the berth, and finally a QC loading it onto the vessel. At any time, a piece of equipment (QC, YC, or truck) can only handle one task. When the required equipment is busy, the operation must wait. Second, each QC will execute one type of operation only (i.e., load or unload but not both) for any given vessel. Third, the tasks are given in the form of a set of work instruction lists $W = \{IL_1, IL_2, IL_3, \ldots, IL_{|Q|}\}$, where $IL_q$ is the list of tasks associated with QC $q \in Q$. For loading-only QCs, the completion of the tasks must follow the order specified in their corresponding instruction lists, while for unloading QCs, the execution order of the tasks may deviate by at most $sn$ positions from their original position in the order. In this article, we set $sn = 3$ as suggested by our collaborator.
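The ordering rule for instruction lists can be checked with a short helper. The sketch below assumes one particular reading of the text, namely that "a maximum sn deviation" bounds the absolute difference between a task's executed position and its original position in the list; the function name `respects_deviation` and its arguments are hypothetical.

```python
from typing import Sequence

def respects_deviation(executed: Sequence[int], sn: int = 3,
                       loading: bool = False) -> bool:
    """Check an execution order against one QC's instruction list.

    executed[k] is the original list position (0-based) of the task that
    was actually executed k-th. Loading QCs must keep the exact order;
    unloading QCs may shift each task by at most sn positions (assumed
    interpretation of the deviation rule).
    """
    if sorted(executed) != list(range(len(executed))):
        return False                       # not a permutation of the list
    limit = 0 if loading else sn
    return all(abs(k - original) <= limit
               for k, original in enumerate(executed))

ok_unload = respects_deviation([3, 0, 1, 2])        # task 3 moved up by 3
bad_unload = respects_deviation([4, 0, 1, 2, 3])    # task 4 moved up by 4
bad_load = respects_deviation([1, 0, 2], loading=True)
```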
Let $W = \{w_1, w_2, w_3, \ldots, w_n\}$ be the set of all tasks that must be completed, where $n$ is the total number of tasks. The container count of task $w_i$ is denoted as $size_i$ (i.e., for a merged task with two small containers, $size_i = 2$; otherwise, $size_i = 1$). The source and destination nodes of each $w_i$ are denoted by $a_i$ and $b_i$, respectively, with $a_i, b_i \in C$. Let $s_i$ be the start time of task $w_i$ at its source node and $e_i$ be its completion time at the destination node. Let $sot_i$ and $eot_i$ be the operation times of $w_i$ at the source and destination nodes, respectively, and let $ot_i$ be their sum. The operation times at both QCs and YCs are assumed to be stochastic and are drawn from probability distributions estimated from historical data.
To model the problem formally, the assignment of tasks to trucks is defined by the binary variable in (2). The auxiliary variable in (3) indicates whether $w_k$ is serviced immediately after task $w_j$ by truck $v_i$. The order of tasks belonging to the same crane $c_i \in C$ is described by (4).
The main goal of a truck dispatching problem for container terminals is to increase the profit of the port company by improving turnover and reducing the waiting time of ships. A variety of metrics can be used to measure the extent to which this goal is achieved. We adopt units per hour as the objective, which is the total number of containers processed per hour by all QCs in the terminal. This is because most port companies use it as the primary indicator to compare their operation efficiency against their competitors. Note that maximizing the units-per-hour metric is equivalent to minimizing the makespan used in many scheduling problems when the set of tasks is fixed. Given these definitions, our truck dispatching problem can be modeled by the objective in (5) subject to the constraints in (6) to (11).
The objective defined in (5) is the average production speed per unit time (hour), where $\max E$ and $\min S$ are the end time of the last completed task and the start time of the first initialized task, respectively. The constraint expressed in (6) ensures that each task is assigned to exactly one truck, and the constraint expressed in (7) ensures that each task is followed by at most one other task, or by nothing if it is the last task for the truck. For each crane, due to the rules governing container terminal transportation, the constraints in (8) and (9) ensure that a task of a given crane cannot start until its preceding task is completed, with the exception of the unload tasks at QCs, for which the operational sequence can be swapped among the $sn = 3$ neighboring tasks. The constraints in (10) and (11) compute the tasks' start times and end times and ensure that a task begins its crane operation only after the crane operations of the previous tasks are completed.
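The units-per-hour objective can be illustrated with a small computation. Since the display form of (5) is not reproduced in this text, the function below is only a reading of the prose description: total container count divided by the elapsed time between the earliest start and the latest completion. Time stamps in seconds and the tuple layout `(size_i, s_i, e_i)` are assumptions of this sketch.

```python
from typing import List, Tuple

def units_per_hour(tasks: List[Tuple[int, float, float]]) -> float:
    """Containers handled per hour over the whole operation.

    Each task is (size_i, s_i, e_i): container count, start time at the
    source node, and completion time at the destination node, in seconds.
    """
    total_units = sum(size for size, _, _ in tasks)
    first_start = min(s for _, s, _ in tasks)       # min S
    last_end = max(e for _, _, e in tasks)          # max E
    elapsed_hours = (last_end - first_start) / 3600.0
    return total_units / elapsed_hours

# Three tasks carrying four containers, finished within half an hour.
tasks = [(2, 0.0, 600.0), (1, 300.0, 1200.0), (1, 900.0, 1800.0)]
uph = units_per_hour(tasks)
```

With a fixed task set the numerator is constant, so raising this value and shortening the makespan are the same goal, which matches the equivalence noted above.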
Studies have reported that the truck dispatching problem for marine container terminals is NP-hard because the vehicle routing problem can be reduced to it [55]; consequently, no polynomial-time exact algorithm is known, and the computational time required to search for the optimal solution grows exponentially with the size of the problem in the worst case. Although previous studies have used various metaheuristics to tackle this problem, these metaheuristics rest on the assumption that the crane operation times $ot_i$ for all tasks in $W$ are perfectly predictable before problem solving, whereas, as discussed earlier, they are highly stochastic and impossible to fully predict in advance.
Consequently, we defined this problem as an online optimization problem and used GP-based generative hyper-heuristics to solve it. We discuss this in the next section.

IV. METHODOLOGIES
We open this section by briefly introducing a dynamic truck dispatching system in a real-world marine container terminal and describing how it interacts with optimization algorithms. Subsequently, we describe four distinct heuristic methods that are designed to solve this problem: the manually crafted heuristics and AGP, which were detailed in our previous study [3], and LGP and CD-GPHH, which are the focus of the present study.
To effectively cope with the dynamically changing business environment and various uncertainties, most existing truck dispatching systems in practice adopt dynamic dispatching methods that must respond to requests within a few seconds. Dynamic dispatching contains two essential parts: 1) a dynamic dispatching system and 2) a dispatch algorithm. By communicating with the terminal operating system (TOS), the dynamic dispatching system obtains the real-time status of the port and, in cooperation with the dispatch algorithm, assigns each truck to the most appropriate task according to that status. The dispatch algorithm plays a crucial role in the operation of the whole system, and the choice of algorithm greatly affects the performance of the dynamic dispatching system. In our previous study, we demonstrated the superiority of the GP algorithm over manual heuristics [3]. However, in practical tests with a port company, we found that the basic GP algorithm fails to meet practical demands, especially when dealing with constant changes of problem scenarios (inbound dominated, outbound dominated, inbound/outbound balanced, etc.). The proposed double-layer encoding scheme within a GP-based hyper-heuristic framework explicitly separates the scenario grouping from the truck dispatching to enhance the performance of the data-driven GP through the incorporation of logic operators, while avoiding a dramatic increase in the size of the search space. The individuals in the scenario grouping layer and the truck dispatching layer share the same fitness but evolve independently with separate crossover and mutation operators.

A. Terminal Dynamic Truck Dispatching System
The terminal dynamic truck dispatching system interacts with the dynamic dispatch algorithm and the TOS. Unlike static dispatching, dynamic dispatching does not generate schedules for all tasks in advance. Rather, the system constantly interacts with the port environment by sending out task assignments to idle trucks in real time and conducts real-time monitoring of the changes in key environmental parameters, such as vehicle distribution, crane operation conditions, and vehicle queuing status. Per the dispatching workflow illustrated in Fig. 3, this system has a circular process flow where environmental information is updated, trucks are dispatched, and environmental information is updated again. In doing so, the system can feed real-time information to heuristic algorithms (such as the traditional heuristic shown in Table V and the proposed GP heuristics) to select proper tasks and support dispatch decision making. In each loop, if unfinished tasks remain, the system dispatches the single most appropriate task to an idle truck based on the recommendation of the dispatch algorithm module. If all tasks have been completed, the system sends out no instruction and waits for further tasks.
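One iteration of this loop can be sketched as follows. The sketch is deliberately simplified and rests on several assumptions: the port state is a plain dictionary, tasks are dictionaries with hypothetical keys `id` and `travel_time`, a lower heuristic score is taken to mean a more preferred task, and `nearest_first` is a toy stand-in for whichever dispatch heuristic (manual or GP-evolved) is plugged in.

```python
from typing import Callable, List, Optional

def dispatch_once(idle_truck: str, pending: List[dict], state: dict,
                  score: Callable[[dict, dict], float]) -> Optional[dict]:
    """Assign the single best pending task to one idle truck.

    Returns None when no task remains, in which case the system simply
    waits for further tasks.
    """
    if not pending:
        return None
    best = min(pending, key=lambda task: score(task, state))
    pending.remove(best)                 # the task is now assigned
    return best

def nearest_first(task: dict, state: dict) -> float:
    """Toy heuristic: prefer the task with the shortest travel time."""
    return task["travel_time"]

pending = [{"id": 1, "travel_time": 120.0}, {"id": 2, "travel_time": 45.0}]
state: dict = {}                         # would hold real-time port features
assigned = []
for truck in ["T1", "T2", "T3"]:         # trucks becoming idle one by one
    task = dispatch_once(truck, pending, state, nearest_first)
    assigned.append((truck, task["id"] if task else None))
```

In a live system the `state` dictionary would be refreshed from the TOS between iterations, so that each decision reflects the latest queue lengths and crane conditions.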

B. Manually Crafted Heuristics
Presently, many container terminals still recruit coordinators to manually optimize dispatching schemes and adjust the number of trucks assigned to each work queue. Relying on the experience of coordinators, most terminals can still maintain a relatively high operating efficiency to fulfill market demands. This indicates that skilled operators have, over the years, developed operational experience and rules that can help the port achieve effective truck dispatching. Therefore, through communication, surveys, and questionnaires, we summarized the experience of these skilled operators into a manually crafted heuristic and applied it to real-world truck dispatching in marine container terminals, detailed in Algorithm 1, as a baseline for evaluating the performance of our current CD-GPHH.
This manually crafted heuristic algorithm uses several user parameters that are set based on coordinators' experience: desired_trucks (the most suitable number of trucks for the QC), priority (the priority of the QC), and truck_limit (the maximum number of trucks for a QC), along with variables observed in real time: truck_num (the number of trucks working for the QC) and travel_time (the travel time from the current truck to the source node of the first (few) task(s) of each QC). These features are then used to calculate a score for each available QC that reflects its preference. The algorithm dispatches the idle trucks to the most preferred QCs in a decision tree-like fashion (see Algorithm 1).

C. AGP
Manually crafted heuristics usually focus on the most common scenarios and tend to dispatch trucks evenly among QCs while neglecting other key parameters, such as the number of remaining tasks, the operation times of the QCs and YCs, and the queues of trucks. The previous data-driven GP heuristic method in [3] was proposed to address these issues. The main idea was to use GP to evolve a heuristic that shares a similar structure with the manual heuristic but whose parameters are trained automatically on a large real-world data set.
Algorithm 2 presents the main steps and components for AGP, LGP, and CD-GPHH. An initial population is first created according to the preset population size. In the evolution process, the fitness of each individual is calculated according to the objective in (5), plus a penalty term for oversized GP trees. Subsequently, a new generation of the population is generated: a genetic operator chosen at random among crossover, mutation, and reproduction is used to generate the offspring of the new population. In this study, we adopted a nonelitist tournament selection method to increase diversity. The evolutionary process is repeated until the maximum number of generations is reached. The following sections present the technical details of these components.
Fig. 4. Crossover operation in AGP and LGP.
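The generation step described above can be sketched as follows. This is a minimal illustration rather than a reproduction of Algorithm 2: the function names, the tournament size k, and the convention that a higher fitness is better are assumptions, and fitness_fn, crossover, and mutate are hypothetical caller-supplied callables.

```python
import random


def tournament_select(population, fitnesses, k, rng):
    """Return the fittest of k individuals drawn at random (higher fitness wins)."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


def next_generation(population, fitness_fn, crossover, mutate, k=3, rng=None):
    """Build one new generation without elitism.

    Assumed (hypothetical) callable signatures:
      fitness_fn(ind) -> float
      crossover(a, b, rng) -> (child1, child2)
      mutate(a, rng) -> child
    """
    rng = rng or random.Random()
    fitnesses = [fitness_fn(ind) for ind in population]
    offspring = []
    while len(offspring) < len(population):
        # One operator is drawn at random for each reproduction event.
        op = rng.choice(["crossover", "mutation", "reproduction"])
        parent = tournament_select(population, fitnesses, k, rng)
        if op == "crossover":
            other = tournament_select(population, fitnesses, k, rng)
            offspring.extend(crossover(parent, other, rng))
        elif op == "mutation":
            offspring.append(mutate(parent, rng))
        else:
            # Reproduction copies the selected parent unchanged.
            offspring.append(parent)
    # Crossover may overshoot by one child, so trim to the population size.
    return offspring[:len(population)]
```

Because no individual is copied over unconditionally, the best individual of one generation is not guaranteed to survive, which is the sense in which the selection is nonelitist.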
1) Crossover: The crossover operation takes two parental individuals selected through tournament selection and produces two offspring using a single-point crossover operation. An example is illustrated in Fig. 4, where a subtree of parent 2 is combined with parent 1 to generate offspring 1, whereas offspring 2 results from merging subtree 1 into parent 2.
2) Mutation: The mutation operation takes one individual as input and generates a new offspring by a slight modification. A mutation point is randomly selected to grow a new randomly generated subtree that keeps the whole tree within the depth limitation, as illustrated in Fig. 5.
3) Depth Restriction: To avoid the bloating problem, researchers usually set a depth limit on the GP trees and discard or prune any result that exceeds the limit. We used two approaches in this study. The first method introduces a penalty term into the fitness function to penalize individuals that are too deep or have too many nodes. The second method sets a maximum depth of subtrees for crossover and mutation operations so that the resulting offspring stay within the required depth limit.
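The three components above can be illustrated with a small sketch. The tree encoding (nested tuples such as ("add", "x", 2)), the linear penalty form, and its weight are assumptions made for illustration only. In particular, falling back to the parent when a child exceeds the depth limit is a simplification that stands in for the subtree depth restriction described above.

```python
import random


def depth(tree):
    """Depth of a tree; a terminal (any non-tuple) has depth 1."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + max(depth(child) for child in tree[1:])


def size(tree):
    """Number of nodes in a tree."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(size(child) for child in tree[1:])


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(p1, p2, max_depth, rng):
    """Single-point subtree crossover producing two children."""
    a = rng.choice(list(paths(p1)))
    b = rng.choice(list(paths(p2)))
    c1 = replace(p1, a, get(p2, b))
    c2 = replace(p2, b, get(p1, a))
    # Assumption: a child that is too deep is replaced by its parent.
    return (c1 if depth(c1) <= max_depth else p1,
            c2 if depth(c2) <= max_depth else p2)


def penalized_fitness(raw, tree, max_depth, max_nodes, weight=0.01):
    """Subtract an assumed linear penalty for excess depth and node count."""
    excess = max(0, depth(tree) - max_depth) + max(0, size(tree) - max_nodes)
    return raw - weight * excess
```

Mutation fits the same helpers: pick a path with paths, generate a random subtree, and call replace, again rejecting results that exceed max_depth.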

D. LGP
Although AGP can address the shortcomings of manually crafted heuristics by evolving parameterized heuristics from historical data, the resulting solution can be extremely complex when handling problems with multiple scenarios (i.e., when the distribution of random variables changes over time). For such problems, the use of some discrete utility functions is more elegant and efficient. By combining logic expressions with the arithmetic trees presented in the previous section, LGP can generate different subtrees for different scenarios, which improves the performance of the algorithm and produces results that are adaptive to complex multiscenario problems. This ability to adapt to different scenarios using a combination of the logic tree with arithmetic trees follows the general framework underlying hyper-heuristics [56], and the approach is thus also referred to as a GPHH.
LGP generates, with a positive probability, several logic trees that sit at the top of an individual and select among a number of arithmetic trees at the bottom (Fig. 6). LGP shares similar crossover and mutation operations with AGP but has additional operators designed for the logic tree, as shown in Table I. Moreover, a loosely typed GP is used in LGP, which allows arithmetic and logical operations to be combined freely. When performing logical calculations, numbers greater than 0 are treated as logically true, and all other numbers as logically false. In this way, after the introduction of the "IF-ELSE" operator, different subtrees can be selected for calculation through the logical values of the preceding decision tree. Therefore, some multiscenario problems that are difficult to encode in a single depth-constrained arithmetic tree can be addressed more effectively in LGP. For example, with the assistance of the IF-ELSE operator and comparison operators, we can evolve a simple LGP tree to represent the discontinuous function in (1) fairly easily.
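A loosely typed evaluation of this kind can be sketched as below. The operator names (add, sub, mul, gt, and, if_else) and the nested-tuple encoding are illustrative assumptions; the actual operator set is the one listed in Table I.

```python
import operator

# Hypothetical arithmetic primitives; Table I defines the real operator set.
ARITH = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def truthy(value):
    """Loose typing: any number greater than 0 counts as logically true."""
    return value > 0


def evaluate(tree, env):
    """Evaluate a nested-tuple tree; strings are variables looked up in env."""
    if isinstance(tree, str):
        return env[tree]
    if not isinstance(tree, tuple):
        return tree  # numeric constant
    op, *args = tree
    if op == "if_else":
        cond, then_branch, else_branch = args
        # Only the selected branch is evaluated.
        chosen = then_branch if truthy(evaluate(cond, env)) else else_branch
        return evaluate(chosen, env)
    left, right = (evaluate(arg, env) for arg in args)
    if op == "gt":
        return 1.0 if left > right else 0.0
    if op == "and":
        return 1.0 if truthy(left) and truthy(right) else 0.0
    return ARITH[op](left, right)
```

Because logical and arithmetic values share one numeric type, an arithmetic subtree such as ("sub", "x", 5) can serve directly as a condition, which is what makes the mixed search space of LGP so large.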

E. CD-GPHH
Although LGP has shown promising performance in handling complex discontinuous functions and adapting to multiscenario problems, these abilities tend to be unreliable, and only a few individuals in an LGP population possess the correct structure with multiscenario abilities. This is because introducing logic operators greatly increases the search space, in which it is extremely difficult for LGP to converge to a good solution within a reasonable computational time. In our previous experiments, LGP produced effective results most of the time but lacked the reliability and consistency demanded in industrial operations. Furthermore, even if the depth of the GP tree is limited, the interpretability of the resulting solutions is unsatisfactory because the logic expressions are mixed with the arithmetic terms in an LGP tree without a clear structure. When we tested the LGP solutions in a real-world container terminal in the Ningbo Port, they were not generally accepted by port operators, who found the results too difficult to understand and thus unsuitable for practical use.
In practical problems, performance metrics are typically separable. Take the truck dispatching problem for ports as an example, where operators usually consider different scheduling strategies based on, for example, task categories, yard conditions, and the number of trucks available for dispatch. In fact, these factors are high-level scenarios that are generally not directly involved in the formulation of task ranking rules; they instead play a role in the selection of different policies. Some researchers exploit these decision variables through grammar-based LGP to generate individuals with scenario-distinguishing ability [7], [57], [58], [59]. However, grammar-based GP only adds grammar filtering to the normal LGP and does not explicitly separate the scenario selection from the calculation layer. As a result, without presetting, these factors do not always appear in the scenario layer to play their decisive role in scenario selection. Instead, they are often embedded in the calculation layer, which can cause unreliable performance.
Consequently, to reduce the size of the search space and to improve performance, it is necessary to separate scenario information from dispatch rules. This leads to our proposed CD-GPHH method, which reduces the search space size by separating arithmetic trees and scenario trees into two different layers. The concept underlying CD-GPHH is to evolve the scenario grouping trees and truck dispatch trees concurrently but at two different layers. More specifically, each individual in CD-GPHH has a scenario layer and a calculation layer (Fig. 7). The scenario layer contains logic trees for scenario clustering purposes, and the calculation layer includes arithmetic trees that share similar structures with those in the AGP method. Each logic tree in the scenario layer is bound to an arithmetic tree from the calculation layer, and the pair is grouped as a rule. In the truck dispatching problem for ports, when a tree in the scenario layer evaluates to true (greater than zero), the corresponding tree in the calculation layer is invoked to compute utility scores for different truck-task assignments. Notably, thanks to this layered structure, our CD-GPHH is also more comprehensible while enhancing the quality of the resulting solutions.
Note that, in CD-GPHH, trees in both the scenario layer and calculation layer are bound into rules for easier computation during implementation. The algorithm first operates on a specific rule before processing the two trees inside each rule. Moreover, since the number of scenarios (number of rules) was introduced as a hyperparameter in CD-GPHH, we extend the traditional mutation operation in GP so that this number can be learned and adjusted automatically. These three operations are detailed as follows.
1) Crossover: The crossover operation takes two individuals (parents) as inputs. As illustrated in Fig. 8, one random rule of each parent is then selected and single-point crossover operations are applied to both layers independently, resulting in two logic trees and two arithmetic trees that can form four different rules. When each of the four rules is inserted back into each of the two parents to replace the previously selected rule, eight new offspring are created. However, to maintain the diversity of the population and prevent the same pair of parents from producing too many offspring, only two randomly selected offspring are retained in the next generation.
2) Mutation: The standard mutation in CD-GPHH is applied to one parent selected by tournament. A rule in the parent is randomly selected for mutation. For example, in Fig. 9, rule 2 is selected for mutation. Either the scenario layer tree or the calculation layer tree or both are modified to form a new rule (using the same tree mutation as in AGP). This new rule is then inserted into a random location of the parent to generate a new individual (hence, this new individual has one more rule than the parent). Meanwhile, to maintain diversity and dynamically adjust the number of rules, between zero and two randomly chosen rules in this new individual are removed. Finally, if the total number of rules in this new individual exceeds the rule count limit, another randomly selected rule is removed.
3) Solution Decoding in CD-GPHH: In CD-GPHH, decoding a solution involves testing all logic subtrees (except the last) in the scenario layer in sequence, until a subtree evaluates to true. The corresponding calculation tree is then chosen to estimate the performance of different truck-task assignments. If none of these logic trees evaluates to true, the last (default) calculation tree is used; the last logic tree itself is never evaluated.
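The three rule-level operations above can be sketched as follows. This is an illustrative reading, not the authors' implementation: an individual is assumed to be a list of (scenario_tree, calculation_tree) pairs, while tree_crossover, mutate_tree, and is_true are hypothetical callables that hide the tree-level details. Keeping at least one rule during removal is also an assumption.

```python
import random
from itertools import product


def cd_crossover(parent1, parent2, tree_crossover, rng):
    """Cross one rule from each parent layer by layer and keep two children.

    tree_crossover(a, b, rng) -> (a2, b2) is a hypothetical single-point
    crossover on two trees of the same layer.
    """
    i = rng.randrange(len(parent1))
    j = rng.randrange(len(parent2))
    (logic1, arith1), (logic2, arith2) = parent1[i], parent2[j]
    new_logic = tree_crossover(logic1, logic2, rng)   # two logic trees
    new_arith = tree_crossover(arith1, arith2, rng)   # two arithmetic trees
    rules = list(product(new_logic, new_arith))       # four candidate rules
    candidates = []
    for rule in rules:                                 # 4 rules x 2 parents = 8
        candidates.append(parent1[:i] + [rule] + parent1[i + 1:])
        candidates.append(parent2[:j] + [rule] + parent2[j + 1:])
    return rng.sample(candidates, 2)


def cd_mutate(parent, mutate_tree, max_rules, rng):
    """Mutate one rule, insert it as a new rule, then trim the rule list."""
    logic, arith = parent[rng.randrange(len(parent))]
    target = rng.choice(["logic", "arith", "both"])
    if target in ("logic", "both"):
        logic = mutate_tree(logic, rng)
    if target in ("arith", "both"):
        arith = mutate_tree(arith, rng)
    child = list(parent)
    child.insert(rng.randrange(len(child) + 1), (logic, arith))
    for _ in range(rng.randint(0, 2)):                 # remove 0 to 2 rules
        if len(child) > 1:                             # assumption: keep one rule
            child.pop(rng.randrange(len(child)))
    while len(child) > max_rules:                      # enforce the rule limit
        child.pop(rng.randrange(len(child)))
    return child


def decode(rules, is_true):
    """Return the calculation tree of the first rule whose scenario tree holds.

    is_true(scenario_tree) -> bool is assumed to evaluate the tree against the
    current state and test whether the result is greater than zero.
    """
    for scenario, calculation in rules[:-1]:
        if is_true(scenario):
            return calculation
    return rules[-1][1]  # default rule; its scenario tree is never evaluated
```

Since decoding is order dependent, inserting the mutated rule at a random position changes which scenarios take precedence, not only which rules exist.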
Notably, the increase in the search space of AGP was not large. This is because AGP has no logic operators, and thus its trees do not contain many complex combinations of logic and arithmetic operators. CD-GPHH achieved tolerable performance by separating each rule into a scenario layer and a calculation layer. This example illustrates the effectiveness of the proposed solution structure in our CD-GPHH algorithm with respect to reducing the search space while retaining the scenario matching capability of LGP.

V. EXPERIMENTS AND RESULTS
We evaluated the performance of our proposed CD-GPHH against a conventional AGP as well as an LGP for solving multiscenario problems. We first compared their performance for the simple multiscenario function fitting problem described in Section I. Subsequently, we conducted extensive experiments for the real-life truck dispatching problem for marine container terminals. For the truck dispatching problem, we also compared our CD-GPHH against the traditional heuristic method used in the real world and a manually crafted heuristic reported in [2]. Since parameter tuning is not the key focus of this article, we simply adopt common settings for all three GP-based algorithms, as listed in Table II. The number of rules in CD-GPHH was set to 1-10.

A. Simple Multiscenario Function Fitting Problem
In this experiment, the task is to fit function (1) as defined in Section I. One input variable x and one constant serve as the two terminals in all GP methods. The variable terminal takes its actual value during the calculation process, whereas the constant terminal is set to an integer between 0 and 10. During each test, 100 randomly pregenerated instances of different values of x were used as the test set.
Fitness was defined as the standard variance between the fitted function and the original function. In other words, a fitness closer to zero indicates a better solution.
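A fitness evaluation of this kind can be sketched as below. Two loud assumptions apply: the target function here is a hypothetical piecewise stand-in and not function (1), and the error measure is taken to be the mean squared deviation over the test points, which is only one possible reading of the definition above. The sampling range of 0 to 10 for x is also assumed.

```python
import random


def target(x):
    """Hypothetical discontinuous stand-in; NOT function (1) from Section I."""
    return x + 1 if x < 5 else 2 * x


def fitting_error(candidate, xs):
    """Mean squared deviation between candidate and target (assumed measure)."""
    return sum((candidate(x) - target(x)) ** 2 for x in xs) / len(xs)


# 100 pregenerated test points; the range [0, 10] is an assumption.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]

perfect = fitting_error(target, xs)           # zero error for an exact fit
partial = fitting_error(lambda x: 2 * x, xs)  # exact only where x >= 5
```

A candidate that matches only one branch of the target, such as the second one here, still receives a nonzero error from the points in the other branch, which is the pressure that rewards scenario-aware individuals.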
Table III presents the statistical results from all three GP-based methods in 30 tests. CD-GPHH performed significantly better than AGP and LGP. As shown in Fig. 10, comparing the best results against the original function, AGP performs the worst, fitting only the case where x is greater than 7, while LGP fits the range from 3 to 10. When x is less than 3, the fitting errors lead to relatively small variance in y (equivalent to small variance in individuals' fitnesses). LGP, therefore, does not fit well in the 0-3 region in Fig. 10, while CD-GPHH has replicated almost 100% of the original function, indicating the potential benefits of a predefined hierarchical structure for multiscenario problems. The simplified best-performing individuals of each method are as follows. The results of AGP and LGP are hard to interpret and differ greatly from the true function in (1). Although LGP evolved the IF-ELSE and relational operators that were in the original expression, the results are rather confusing and do not fit the original function well. In contrast, with the help of its bilevel structure, CD-GPHH obtained a concise and easy-to-understand function that is almost identical to the original one.
Among the best evolution results of the three methods in Fig. 11, CD-GPHH escaped the local optimum trap and converged to the global optimum at the 150th generation, while both AGP and LGP ended up with a poor result. In general, under the evolutionary framework designed in this research, CD-GPHH takes advantage of two cooperative populations and obtains well-performing, concise, and understandable results, confirming its superiority in solving the simple multiscenario function fitting problem.

B. Results for Real-Life Truck Dispatching
The three different GP methods are further evaluated on the truck dispatching problem described in Section III. The 14 features in the proposed CD-GPHH are shown in Table IV. The other settings are given in Table II. To better tackle this complex real-world problem, the four new operators in Table I were added. The performance of the three GP methods is compared with other traditional heuristic methods (Table V) in this section.
To update the environment state values of these features and evaluate the fitness of each individual in all GPs, we built an event-based simulator based on the mathematical model described in Section III. The simulator interacts with the dynamic truck dispatching system (Fig. 3) to provide functions for evaluating the fitness of each individual and generating new environment data after each truck dispatch. It can simulate all events occurring in real-world truck dispatching in marine container terminals, such as vehicle movement, container loading and unloading, and the real-time data exchange with the dynamic dispatching system. The data sets used in this experiment were extracted from the actual historical operational data at the Ningbo Meishan Port. The data sets simulate a typical situation where one container ship berths at the port to load and unload containers, and the port needs to complete the work of the ship as soon as possible to release the ship from the port earlier. Several instances were generated based on this situation, each with one ship at berth and six QCs. The number of trucks is set to the actual number of trucks working during the data extraction time period, between 24 and 48, and the traffic map is shown in Fig. 1.
As mentioned in Section III, due to strict traffic regulations (mainly single-direction road segments), there are very few route options between QCs and YCs. Therefore, the truck travel time is precomputed through the shortest path algorithm on the port road network assuming an average truck speed of 8 km/h. Meanwhile, the operating (load/unload) time of the crane for the container on the truck is uncertain.
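Such a precomputation can be sketched as below. Dijkstra's algorithm is used here as one common choice of shortest path algorithm; the text does not name a specific one. The adjacency-dictionary layout, the node names, and the segment lengths in meters are hypothetical, and only the 8 km/h average speed comes from the text.

```python
import heapq

# 8 km/h converted to meters per second.
SPEED_M_PER_S = 8 * 1000 / 3600


def travel_times(graph, source):
    """Shortest travel time in seconds from source to every reachable node.

    graph maps a node to {neighbor: segment_length_in_meters}. Edges are
    directed, which models single-direction road segments.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, {}).items():
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return {node: d / SPEED_M_PER_S for node, d in dist.items()}


# Hypothetical three-node fragment of a road network (lengths in meters).
roads = {"QC1": {"J1": 100.0, "YC1": 500.0}, "J1": {"YC1": 200.0}}
times = travel_times(roads, "QC1")
```

In this fragment the route through J1 is 300 m long, which corresponds to 135 s at 8 km/h, whereas the direct 500 m segment would take 225 s. Because edges are directed, no travel time from YC1 back to QC1 is produced.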
We extracted ten sets of historical task data from different time periods (ports have different operating scenarios at different times), with five sets for training (sets 1-5) and five sets for testing (sets 6-10). Each set (both training and testing) contains ten instances in the same time period, each with 200 tasks consisting of a mixture of both loading and unloading operations. For each training instance, 30 independent runs with different random seeds were conducted, leading to a total of 10 × 30 = 300 runs on each training set. The average results over these 300 runs for each set are presented in Table VI.
Recall that the objective is to maximize the number of tasks handled per unit time (hour). Therefore, larger values indicate better performance. The best-performing individuals from the three GP algorithms are evaluated on test sets 6-10. Each instance in the test sets was run only once (because the evolved GP trees are deterministic). However, since each set contains ten instances, there are a total of 10 × 5 = 50 test instances that are not seen during GP training. Therefore, they represent a significant robustness test for all the methods. Table VII provides the average results of all the methods across the ten test instances in each set as well as the averages across all five sets.
Below we include the best-performing individuals of AGP, LGP, and CD-GPHH. Note that these trees have been simplified for ease of interpretation. A t-test on the experiment (α = 0.001, p = 0.00) demonstrates the superior performance of all three GP algorithms over the traditional heuristic algorithms, with CD-GPHH achieving the best results. Using the manual and fixed heuristic algorithms as benchmarks, the improvement percentages (Imp. Pct.) of AGP, LGP, and CD-GPHH are about 7%/15%, 11%/19%, and 14%/25%, respectively. By further comparing the listed individuals of the three GP algorithms, we find that CD-GPHH not only produced more efficient heuristics but also produced heuristics that are much more readable thanks to its double-layer structure. In contrast, AGP, and particularly LGP, produced heuristics that are difficult to understand. These heuristics must often be further modified by the operator in practice, while the heuristics produced by CD-GPHH have better usability.
In order to observe the evolution process of the three GP algorithms, one example result on training set 4 is plotted in Fig. 12. It can be seen that for the real-world multiscenario problem, the performance of AGP without scenario grouping is quite limited, while LGP and CD-GPHH can achieve better results. In particular, CD-GPHH did not suffer from the obvious limitations of AGP and performed best in the end.

C. Truck Dispatching Under Special Scenarios
Although problem instances based on real-life data are important for evaluating the practicality of the proposed method, they are less useful for gaining insights due to the real-life complexities and the combined effect of several uncontrolled factors. In this section, we evaluate the performance of different methods under three different scenarios created artificially. In a container terminal, the multiple scenarios are associated mainly with the following dynamically changing factors.
1) The operation times of the load and unload QC tasks along the berth line are known in practice to follow different distributions. The unloading tasks are often less likely to be disrupted by truck delays due to the less strict precedence requirements. On the other hand, loading tasks must follow the predefined sequences exactly, and hence delays can propagate exponentially, causing significant QC waiting. Therefore, different dispatch policies are required for scenarios with tasks dominated by either load or unload tasks.
2) The distribution of operation nodes at YCs for the tasks is also crucial. Clustered operation nodes at a few YCs are more likely to lead to conflicts, as YCs need to support multiple QCs at the same time. Specific policies are required to resolve these conflicts.
3) The available number of trucks for dispatch is also important. When enough trucks are available, the priority should be the reduction of empty truck travel distances; otherwise, the priority is to avoid costly QC waiting.
Following these considerations, we use operation_type, yard_crane_type, and total_truck_num to distinguish these scenarios, respectively. This is also based on the observation that over 75% of the best-performing CD-GPHH individuals use all three features in the scenario selection layer. Therefore, we created three new data sets (sets 11-13), each containing 20 instances with 2 special scenarios based on these three scenario features, respectively. Individuals were trained (30 runs per instance) first with the corresponding scenario features and then without them, to simulate the situations with and without scenario information.
The statistics in Table VIII demonstrate the significantly enhanced performance of CD-GPHH on the special scenario data sets. There is no obvious impact on the performance of AGP without the scenario features; however, the performance of LGP dropped by 2.9% when scenario features were excluded, compared with a drop of 8.1% for CD-GPHH. This justifies the effectiveness and importance of the scenario identification used in CD-GPHH on multiscenario problems.
Finally, it was confirmed that CD-GPHH can indeed produce better results than AGP and LGP in real-life multiscenario problems. The generated results can also be understood and then modified by operators. According to our statistics, the average test time and training time of each generation of an AGP, LGP, and CD-GPHH individual for 100 tasks are 0.02/0.7 s, 0.02/0.73 s, and 0.021/0.78 s, respectively. This shows that the improved performance of CD-GPHH does not require a significantly increased computational cost compared with AGP and LGP. Although CD-GPHH is not yet adopted in a real-life port, our manually crafted heuristic algorithm has been in use at the Ningbo Port for years. Statistical analysis conducted by the port showed that the work efficiency increased by 8.1% and the ship docking time decreased by 2.2%. This well-performing algorithm saved time, allowed operations of more ships, and in turn, increased the profit of the port company significantly. Our next step is to work with the collaborators to fully evaluate and deploy the proposed algorithm in the real world.

VI. CONCLUSION
This research proposed a novel CD-GPHH algorithm that can evolve more efficient, user-friendly, and intuitive heuristics, whose practicality and superiority have been demonstrated on a complex real-world truck dispatching problem at marine container terminals. The proposed cooperative double-layer structure in CD-GPHH can better handle the dynamics of scenario transitions, which are common uncertainties in real life. The separated scenario and calculation layers cooperate to utilize logic and arithmetic operators simultaneously while preventing the search space from growing exponentially. With a limited training time budget, the proposed CD-GPHH gains around 8%-10% improvement compared with existing GP methods (i.e., AGP and LGP). CD-GPHH also greatly enhances the usability and readability of the evolved heuristics with separate logic and arithmetic layers.
Nevertheless, CD-GPHH can be further enhanced in future studies by addressing a few weaknesses, such as generalization issues, relative inefficiency in rules evolution, and the existence of redundant subtrees.In particular, evolution may be better guided by involving human operators in the evolution process.Meanwhile, targeted redundancy removal algorithms can be developed specifically for CD-GPHH.

TABLE VIII
AGP, LGP, AND CD-GPHH RESULTS IN SPECIAL SCENARIOS TRUCK DISPATCHING PROBLEM (UNITS/H)
and hence delays can propagate exponentially, causing significant QC waiting. Therefore, different dispatch policies are required for scenarios with tasks dominated by either load or unload tasks.