Invisible control of self-organizing agents leaving unknown environments

In this paper we are concerned with multiscale modeling, control, and simulation of self-organizing agents leaving an unknown area under limited visibility, with special emphasis on crowds. We first introduce a new microscopic model characterized by an exploration phase and an evacuation phase. The main ingredients of the model are an alignment term, accounting for the herding effect typical of uncertain behavior, and a random walk, accounting for the need to explore the environment under limited visibility. We consider both metrical and topological interactions. Moreover, a few special agents, the leaders, not recognized as such by the crowd, are "hidden" in the crowd with a special controlled dynamics. Next, relying on a Boltzmann approach, we derive a mesoscopic model for a continuum density of followers, coupled with a microscopic description for the leaders' dynamics. Finally, optimal control of the crowd is studied. It is assumed that leaders exploit the herding effect in order to steer the crowd towards the exits and reduce clogging. Locally-optimal behavior of leaders is computed. Numerical simulations show the efficiency of the optimization methods in both microscopic and mesoscopic settings. We also perform a real experiment with people to study the feasibility of the proposed bottom-up crowd control technique.

1. Introduction. Self-organizing systems are not as mysterious as they were when researchers began to deal with them. A number of local interaction rules have been identified, such as repulsion, attraction, alignment, and self-propulsion, and the large-scale outcome triggered by the combination of these local rules is by now sufficiently understood. Good knowledge has also been achieved about the control of self-organizing systems, namely the methods to dictate a target behavior to the agents while keeping the local rules active at all times. Therefore, it seems to be the right time for translating the theoretical results into practical methods. Some attempts have been made on animals, like fish and cockroaches, but a systematic application of the control theory for self-organizing systems to humans is completely missing.
In this paper we are concerned with multiscale modeling, control and simulations of self-organizing agents which leave an unknown area. Agents are not informed about the positions of the exits, so that they need to explore the environment first. Agents are assumed not to communicate with each other and they cannot share information directly. Moreover, they are selfish and do not intend to help the other agents.
The guiding application is that of a human crowd leaving an unknown environment under limited visibility. In that case, we claim that people exhibit two opposite tendencies: on the one hand, they tend to spread out in order to explore the environment efficiently; on the other hand, they follow the group mates in the hope that they have already found a way out (herding effect). Controlling these natural behaviors is not easy, especially in emergency or panic situations. Typically, the emergency management actors (e.g., police, stewards) adopt a top-down approach: they receive information about the current situation, update the evacuation strategies, and finally inform people by communicating directives. It is useful to stress here that this approach can be very inefficient in large environments (where communications to people are difficult) and in panic situations (since "instinctive" behavior prevails over the rational one). Let us also mention that in some cases people are simply not prone to follow the authorities' directives because they oppose them (e.g., in public demonstrations).
In this paper we explore the possibility of controlling crowds adopting a bottom-up approach. Control is obtained by means of special agents (leaders) who are hidden in the crowd, and they are not recognized by the mass as special. From the modeling point of view, this is translated by the fact that the individuals of the crowd (followers) interact in the same way with both the other crowd mates and the leaders.
We first introduce a new microscopic (agent-based) model characterized by an exploration phase and an evacuation phase. This is achieved by a combination of repulsion, alignment, self-propulsion and random (social) force, together with the introduction of an exit's visibility area. We also consider both metrical and topological interactions so as to avoid unnatural all-to-all interactions. A few leaders, not recognized as such by the crowd, are added to the model with a special, controlled dynamics.
Second, relying on a Boltzmann approach, we derive a mesoscopic model for a continuum density of followers, coupled with a microscopic description for the leaders' dynamics. This procedure, based on a binary interaction approximation, shows that, in a grazing collision regime, the microscopic-kinetic limit system is the mesoscopic description of the original microscopic dynamics.
Finally, dynamic optimization procedures with short and long time horizons are considered. It is assumed that leaders aim at steering the crowd toward the exits so as to ease the evacuation and limit clogging effects. Locally-optimal behavior of leaders is computed by means of the model predictive control (MPC) technique and a modified compass search. Several numerical simulations show the efficiency of the control techniques in both the microscopic and mesoscopic settings.
Besides numerical simulations, we performed a real experiment with people to study the feasibility of such a control technique.
Relevant literature. This paper lies at the intersection of several, often independent, lines of research. Concerning pedestrian modeling, virtually every kind of model has been investigated so far and several reviews and books are available. For a quick introduction, we refer the reader to the reviews [8,37] and the books [26,39]. Some papers deal specifically with evacuation problems: a very good source of references is the recent paper [1], where evacuation models both with and without optimal planning search are discussed. The paper [1] itself proposes a cellular automata model coupled with a genetic algorithm to find a top-down optimal evacuation plan. Evacuation problems were studied by means of lattice models [21,32], social force models [43], cellular automata models [1,48], mesoscopic models [2], and macroscopic models [19]. Limited visibility issues were considered in [19,21,32]. A real experiment involving people can be found in [32].
The control of self-organizing agents can be achieved in several ways. For example, the dynamics of all agents can be controlled by means of a single control variable, which remains active at all times [3]. This kind of control is suitable, e.g., to model the influence of television on people. Alternatively, every agent can be controlled by an independent control variable. This is the most effective but also the most "expensive" control technique because of the large number of interactions with the agents that is required [12]. A more parsimonious control technique, called sparse control, can be obtained by penalizing the number of these interventions by means of an $L^1$ control cost, so that the control is active only on a few agents at every instant [10,11,17]. If the existing behavioral rules of the agents cannot be redesigned, the control of the system can be obtained by means of external agents. Here the literature splits into two branches: the case of recognizable leaders, characterized by the fact that the external agents have an influence on normal agents stronger than the influence exerted by group mates (like, e.g., a celebrity) [6,14,29,41]; and the case of invisible leaders, which are instead characterized by the fact that the external agents are completely anonymous and are perceived as normal group mates [24,28,35,36]. Alternative approaches, based on the control of the surrounding environment rather than of the agents themselves, are proposed in [27,38].
Let us focus on the invisible control, also known as soft control. In the seminal paper [24] a repulsion-alignment-attraction model is considered, for which the authors show that a small percentage of informed agents pointing towards a target is sufficient to steer the whole system to it. The papers [35,36] deal with a Vicsek-like (pure alignment) model with a single leader. A feedback control for the leader is proposed, which is able to align the whole group along a desired direction from any initial condition. Invisible control was successfully implemented in real experiments involving animals, where leaders come in the form of disguised robots; see, e.g., [34], which deals with cockroaches, and [15], which deals with zebrafish.
The complexity reduction of microscopic models when the number of interacting agents is large has received deep interest in the last years. The main technique to address the so-called curse of dimensionality is to recast the original microscopic dynamics in the form of a PDE, by substituting the influence that the entire population has on a single agent with an averaged one. Several approaches have been used to derive rigorously this procedure, like the BBGKY hierarchy [33], the mean-field limit [9,16], or the binary interaction approximation [4,42]. Furthermore, only recently this problem has been investigated for optimal control of multi-agent systems [31].
In the case of the control by means of external agents, if the number of leaders is relatively small in comparison to the crowd of followers, it might be convenient to apply the techniques presented above to the followers' population only [13,30]. The resulting modeling setting is described by a coupled ODE-PDE system, where we control the microscopic dynamics in order to steer the macroscopic quantities. Nonetheless, from the computational point of view, one has to face the numerical tractability of the optimal control problem, whose solution requires the iterated evaluation of the macroscopic system. Since the underlying dynamics are typically non-linear and involve the resolution of high-dimensional integro-differential terms, ad hoc approaches have to be developed in order to avoid time-consuming strategies. One possibility is to use a fixed dynamics for the system of ODEs, assigned by the modelers or as feedback control derived by a properly designed functional, see, e.g., [5,22].
Paper organization. The paper is organized as follows. In Section 2 we introduce the main features of the model, valid at any scale of observation. In Section 3 we present in detail the model at the microscopic scale. In Section 4 we derive the associated Boltzmann-type equation and the coupled ODE (micro)-PDE (kinetic) model. In Section 5 we introduce the control and optimization problem and in Section 6 we show the results of the numerical tests. In Section 7 we report the results of a real experiment with pedestrians aimed at validating the proposed crowd control technique. We conclude the paper by sketching some research perspectives.
2. Model guidelines. In this section we describe the model at a general level. Hereafter, we divide the population between leaders, which are the controllers and behave in some optimal way (to be defined), and followers, which represent the mass of agents to be controlled. Followers cannot distinguish between followers and leaders.
First- vs. second-order model. One can notice that walking people, animals and robots are in general able to adjust their velocity almost instantaneously, reducing the acceleration/deceleration phase to a negligible duration. For this reason, a second-order (inertia-based) framework does not appear to be the most natural setting. However, if we include in the model the tendency of the agents to move together and to align with group mates, we implicitly assume that agents can perceive the velocity of the others. This makes the use of a second-order model unavoidable, where both positions and velocities are state variables. See [26, Sect. 4.2] for a general discussion on this point. In our model we adopt a mixed approach, describing leaders by a first-order model (since they do not need to align) and followers by a second-order one. In the latter case, the small inertia is obtained by means of a fast relaxation towards the target velocity.
Metrical vs. topological interactions. Interaction of one agent with group mates is said to be metrical if it involves only mates within a predefined sensory region, regardless of the number of individuals which actually fall in it. Interaction is instead said to be topological if it involves a predefined number of group mates regardless of their distance from the considered agent. Again, in our model we adopt a mixed approach, assuming short-range interactions to be metrical and long-range ones to be topological.
Isotropic vs. anisotropic interactions. Living agents (people, animals) are generally asymmetric, in the sense that they better perceive stimuli coming from their "front" rather than their "back". In pedestrian models, e.g., interactions are usually restricted to the half-space in front of the person, since the human visual field is approximately 180°. Nevertheless, humans can easily see all around them simply by turning their head. In the case we are interested in, people have no idea of the location of their target, hence we expect that they often look around to explore the environment and see the behavior of the others. This is why we prefer to adhere to isotropic interactions.
Let us describe the social forces acting on the agents.
• Leaders.
- Leaders are subject to an isotropic metrical short-range repulsion force directed against all the others, translating the fact that they want to avoid collisions and that a maximal density exists.
- Leaders are assumed to know the environment and the self-organizing features of the crowd. They respond to an optimal force which is the result of an offline optimization procedure, defined so as to minimize some cost functional (see Section 5).
• Followers.
- Similarly to leaders, followers respond to an isotropic metrical short-range repulsion force directed against all the others.
- Followers tend to a desired velocity which corresponds to the velocity they would follow if they were alone in the domain. This term takes into account the fact that the environment is unknown. Followers describe a random walk if the exit is not visible (exploration phase) or a sharp motion toward the exit if the exit is visible (evacuation phase). In addition, we include a self-propulsion term which translates the tendency to reach a given characteristic speed (modulus of the velocity).
- If the exit is not visible, followers are subject to an isotropic topological alignment force with all the others, i.e., they tend to have the same velocity as their group mates. Assuming that the agents' positions are close enough, this corresponds to the tendency to go where group mates go.
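The switch between the exploration and the evacuation phase of a follower can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the names `desired_direction`, `visible` and `sigma` are hypothetical, the visibility area is passed in as an arbitrary predicate, and the relaxation constants and the characteristic-speed term are left out.

```python
import math
import random

def desired_direction(x, target, visible, rng, sigma=1.0):
    """Direction a lone follower relaxes toward (illustrative sketch).

    x, target : sequences of coordinates
    visible   : hypothetical predicate, True if the exit can be seen from x
    rng       : random.Random instance used for the exploration phase
    """
    if visible(x):
        # Evacuation phase: unit vector pointing from x to the target.
        dx = [t - xi for t, xi in zip(target, x)]
        norm = math.hypot(*dx)
        return [c / norm for c in dx] if norm > 0 else [0.0] * len(dx)
    # Exploration phase: random vector with independent normal entries.
    return [rng.gauss(0.0, sigma) for _ in x]
```

For instance, with a circular visibility area of radius 2 around the target (again an assumption made only for this example), a follower inside the area heads straight to the exit, while one outside draws a fresh random direction at every call.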
3. The microscopic model. In this section we introduce the microscopic model for followers and leaders. We denote by $d$ the dimension of the space in which the motion takes place (typically $d = 2$), by $N^f$ the number of followers and by $N^l \ll N^f$ the number of leaders. We also denote by $\Omega \equiv \mathbb{R}^d$ the walking area and by $x^\tau \in \Omega$ the target point. To define the target's visibility area, we consider the set $\Sigma$, with $x^\tau \in \Sigma \subset \Omega$, and we assume that the target is completely visible from any point belonging to $\Sigma$ and completely invisible from any point belonging to $\Omega \setminus \Sigma$.
For every $i = 1, \dots, N^f$, let $(x_i(t), v_i(t)) \in \mathbb{R}^{2d}$ denote position and velocity of the agents belonging to the population of followers at time $t \ge 0$ and, for every $k = 1, \dots, N^l$, let $(y_k(t), w_k(t)) \in \mathbb{R}^{2d}$ denote position and velocity of the agents among the population of leaders at time $t \ge 0$. Let us also define $\mathbf{x} := (x_1, \dots, x_{N^f})$ and $\mathbf{y} := (y_1, \dots, y_{N^l})$.
Finally, let us denote by $B_r(x)$ the ball of radius $r > 0$ centered at $x \in \Omega$, by $B_N(x; \mathbf{x}, \mathbf{y})$ the minimal ball centered at $x$ encompassing at least $N$ agents, and by $N^*$ the actual number of agents in $B_N(x; \mathbf{x}, \mathbf{y})$. Note that $N^* \ge N$. In particular, all the distances $|x_i - x|$, $i = 1, \dots, N^f$, and $|y_k - x|$, $k = 1, \dots, N^l$, must be evaluated in order to find the $N$ closest agents to $x$.
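The two neighborhoods just introduced can be computed as in the following sketch. It is an illustration only: the function names are hypothetical, `agents` is assumed to be the concatenation of follower and leader positions, the metric ball is taken open and excludes an agent sitting exactly at the center, and the topological ball is taken closed, so that ties in distance make the actual count exceed the requested one.

```python
import math

def metric_neighbors(x, agents, r):
    """Indices of the agents inside the ball of radius r centered at x,
    skipping any agent located exactly at x."""
    return [i for i, a in enumerate(agents) if 0.0 < math.dist(x, a) < r]

def topological_ball(x, agents, n):
    """Radius of the smallest ball centered at x that holds at least n
    agents, together with the indices of all agents inside it."""
    if not 1 <= n <= len(agents):
        raise ValueError("n must be between 1 and the number of agents")
    # All distances have to be evaluated to find the n closest agents.
    dists = sorted(math.dist(x, a) for a in agents)
    radius = dists[n - 1]
    inside = [i for i, a in enumerate(agents) if math.dist(x, a) <= radius]
    return radius, inside

agents = [(1.0, 0.0), (0.0, 2.0), (0.0, -2.0), (5.0, 5.0)]
radius, inside = topological_ball((0.0, 0.0), agents, 2)
print(radius, inside)  # two agents are equidistant, so three fall in the ball
```

The example shows why the actual number of agents in the topological ball can be larger than the requested number: agents at the same distance from the center enter the ball together.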
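The microscopic dynamics described by the two populations is given by the following set of ODEs: for $i = 1, \dots, N^f$ and $k = 1, \dots, N^l$,
\[
\begin{cases}
\dot x_i = v_i,\\
\dot v_i = A(x_i, v_i) + \sum_{j=1}^{N^f} H^f(x_i, v_i, x_j, v_j; \mathbf{x}, \mathbf{y}) + \sum_{\ell=1}^{N^l} H^l(x_i, v_i, y_\ell, w_\ell; \mathbf{x}, \mathbf{y}),\\
\dot y_k = w_k = \sum_{j=1}^{N^f} K^f(y_k, x_j) + \sum_{\ell=1}^{N^l} K^l(y_k, y_\ell) + u_k.
\end{cases}
\tag{3.1}
\]
We will consider the case where
• $A$ is a self-propulsion term, given by the relaxation toward a random direction or the relaxation toward a unit vector pointing to the target (the choice depends on the position), plus a term which translates the tendency to reach a given characteristic speed $s \ge 0$ (modulus of the velocity), i.e., where $\theta : \mathbb{R}^d \to [0, 1]$ is the characteristic function of $\Sigma$, $\theta(x) = \chi_\Sigma(x)$, $z$ is a $d$-dimensional random vector with normal distribution $\mathcal{N}(0, \sigma^2)$, and $C_z$, $C_\tau$, $C_s$ are positive constants.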
• The interactions follower-follower and follower-leader coincide and are equal to for given positive constants $C^f_r$, $C_a$, $r$, and $\gamma$, and otherwise, models a (metrical) repulsive force, while the second term accounts for the (topological) alignment force, which vanishes inside $\Sigma$. Note that, once the summations of $H^f$ and $H^l$ over the agents are done, the alignment term models the tendency of the followers to relax toward the average velocity of the $N$ closest agents. With the choice $H^f \equiv H^l$ the leaders are not recognized by the followers as special. This feature opens a wide range of new applications, including the control of crowds not prone to follow the authorities' directives.
• The interactions leader-follower and leader-leader reduce to a mere (metrical) repulsion, i.e., $K^f(x, y) = K^l(x, y) = C^l_r R_{\zeta,r}(x, y)$, where $C^l_r > 0$ and $\zeta > 0$ are in general different from $C^f_r$ and $\gamma$, respectively. Note that here the repulsion force is interpreted as a velocity field, while for followers it was an acceleration field.
• $u_k$ is the control variable, to be chosen in a set of admissible control functions.
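The statement that the summed alignment term relaxes a follower toward the average velocity of its closest agents can be illustrated with a short sketch. The function name and signature are hypothetical, and several simplifying assumptions are made: the agent itself is excluded from its own neighborhood, ties in distance are broken by index, followers and leaders are stored in the same lists (which mirrors the fact that followers treat them identically), and the factor switching the alignment off inside the visibility area is omitted.

```python
import math

def alignment_force(i, positions, velocities, n, c_a=1.0):
    """c_a * (mean velocity of the n nearest other agents - own velocity)."""
    xi, vi = positions[i], velocities[i]
    others = [j for j in range(len(positions)) if j != i]
    # Topological neighborhood: the n closest agents, whatever their distance.
    nearest = sorted(others, key=lambda j: math.dist(xi, positions[j]))[:n]
    dim = len(vi)
    mean_v = [sum(velocities[j][d] for j in nearest) / len(nearest)
              for d in range(dim)]
    return [c_a * (mean_v[d] - vi[d]) for d in range(dim)]

positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
velocities = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
print(alignment_force(0, positions, velocities, n=2))
```

In the example the distant fast agent has no effect on agent 0, because only the two closest agents enter the average.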
Remark 3.2. The leader dynamics do not depend explicitly on the agents' velocities. This allows us to consider a first-order model for the leaders. However, a generalization to fully second-order model is straightforward.
Remark 3.3. The behavior of the leaders is entirely encapsulated in the control term u, except for a short-range repulsion force. The latter should indeed be interpreted as a force due to the presence of the others, and is therefore not controllable.

4. Formal derivation of a Boltzmann-type equation.
As already mentioned, our main interest in (3.1) lies in the case $N^l \ll N^f$, that is, the population of followers exceeds by far that of leaders. When $N^f$ is so large, a microscopic description of both populations is no longer a viable option. We thus consider the evolution of the distribution of followers at time $t \ge 0$, denoted by $f(t, x, v)$, together with the microscopic equations for the leaders (whose number is still small). To this end, we denote by $\rho_f$ the total mass of followers, i.e., $\rho_f(t) = \int_{\mathbb{R}^{2d}} f(t, x, v)\,dx\,dv$, which we shall eventually require to be equal to $N^f$. We introduce, for symmetry reasons, the distribution of leaders $g$ and its total mass. The evolution of $f$ can then be described by a Boltzmann-type dynamics, derived from the above instantaneous control formulation, which is obtained by analyzing the binary interactions between a follower and another follower, and between the same follower and a leader. The application of standard methods of binary interactions, see [20], shall yield a mesoscopic model for the distribution of followers, which shall be coupled with the previously presented ODE dynamics for leaders.
To derive the Boltzmann-type dynamics, we assume that, before interacting, each agent has at its disposal the values x and y that it needs in order to perform its movement: hence, in a binary interaction between two followers with state parameters (x, v) and (x,v), the value of H f (x, v,x,v; x, y) does not depend on x and y. In the case of H f of the form (3.3), this means that the ball B N (x; x, y) and the value of N * have been already computed before interacting.
Moreover, since we are considering the distributions f and g of followers and leaders, respectively, the vectors x and y are derived from f and g by means of the first marginals of f and g, π 1 f and π 1 g, respectively, which give the spatial variables of those distributions. Hence, since no confusion arises, we write H f (x, v,x,v; π 1 f, π 1 g) in place of H f (x, v,x,v; x, y) to stress the dependence of this term on f and g.
We thus consider two followers with state parameters (x, v) and (x,v) respectively, and we describe the evolution of their velocities after the interaction according to where η f is the strength of interaction among followers, ξ is a random vector whose entries are i.i.d. following a normal distribution with mean 0 and variance ς 2 (which shall be related to the variance σ 2 of the random vector z in (3.2) in such a way that we recover from (4.2) a Fokker-Planck-type dynamics for f ), taking values in a set B, and S is defined as the deterministic part of the self-propulsion term (3.2), We then consider the same follower as before with state parameters (x, v) and a leader agent (x,ṽ); in this case the modified velocities satisfy where η l is the strength of the interaction between followers and leaders. Note that (4.4) accounts only for the change of the followers' velocities, since leaders are not evolving via binary interactions.
Since we are interested in studying the problem in the widest possible framework, and avoid sticking only to the previous choice of the functions S, H f and H l , in what follows we assume that they satisfy (Inv) the systems (4.2) and (4.4) constitute invertible changes of variables from (v,v) to (v * ,v * ) and from (v,ṽ) to (v * * ,ṽ * ), respectively.
The time evolution of f is then given by a balance between bilinear gain and loss of space and velocity terms according to the two binary interactions (4.2) and (4.4), quantitatively described by the following Boltzmann-type equation where λ f and λ l stand for the interaction frequencies among followers and between followers and leaders, respectively. The interaction integrals Q(f, f ) and Q(f, g) are defined as where the couples (x * , v * ) and (x * ,v * ) are the pre-interaction states that generate (x, v) and (x,v) via (4.2), and J f is the Jacobian of the change of variables given by (4.2) (well-defined by (Inv)). Similarly, (x * * , v * * ) and (x * ,ṽ * ) are the pre-interaction states that generate (x, v) and (x,ṽ) via (4.4), and J l is the Jacobian of the change of variables given by (4.4). Moreover, the expected value E is computed with respect to ξ ∈ B.
In what follows, for the sake of compactness, we shall omit the time dependency of f and g, and hence of Q(f, f ) and Q(f, g) too.
4.1. Notations and basic definitions. We shall start the analysis of equation (4.5) by fixing some notation and terminology. First of all, for any β ∈ N d we set We denote with T δ the set of compactly supported functions ϕ from R 2d to R such that for any multi-index β ∈ N d we have: is uniformly Hölder continuous of order δ for every x ∈ R d with Hölder bound M , that is for every x ∈ R d and for every v, w ∈ R d it holds denote the set of measures taking values in R 2d . For any two function f and ϕ from R 2d to R for which the integral below is well-defined (in particular, when f ∈ M 0 (R 2d ) and ϕ ∈ T δ ), we set Finally, we say that (f, y 1 , w 1 , . . . , y N l , w N l ) ∈ M 0 (R 2d ) × R 2dN l is admissible if the following quantities exist and are finite: Consequently, we introduce our definition of solution for the combined ODE-PDE system for the dynamics of microscopic leaders and mesoscopic followers.
Definition 4.2. Fix T > 0, δ > 0, and u : [0, T ] → R 2dN l . By a δ-weak solution of the initial value problem for the equation corresponding to the control u and the initial datum (f 0 , y 0 1 , . . . , y 0 and where v * and v * * are given by (4.2) and (4.4), respectively; 6. for every k = 1, . . . , N l , y k satisfies
4.2. The grazing interaction limit. In order to obtain a more regular operator than the Boltzmann operator in (4.5), we introduce the so-called grazing interaction limit. This technique has been deeply studied in [47], and in general it allows a better understanding of the solution behavior for large times, see [6,44]. Moreover, the regularized operator of Fokker-Planck type will directly show the connection with the microscopic model.
In what follows, we shall assume that our agents densely populate a small region but weakly interact with each other. Formally, we assume that the interaction strengths η f and η l scale according to a parameter ε, the interaction frequencies λ f and λ l scale as 1/ε, and we let ε → 0. In order to avoid losing the diffusion term in the limit, we also scale the variance of the noise term ς 2 as 1/ε. More precisely, we set
We now study the Boltzmann equation (4.5) to see if simplifications may occur under the above scaling assumptions. Let us fix T > 0, δ > 0, ε > 0, a control u : [0, T ] → R 2dN l and an initial datum (f 0 , y 0 1 , . . . , y 0 N l ) ∈ M 0 (R 2d ) × R dN l . We consider a δ-weak solution (f, y 1 , . . . , y N l ) of system (4.6) with control u and initial datum (f 0 , y 0 1 , . . . , y 0 N l ), and define g as in (4.1). Let the scaling (4.9) hold for the chosen ε. Following the ideas in [18,23], we can expand ϕ(x, v * ) inside (4.7) in a Taylor series in v * − v up to the second order, to get where the remainder R f ϕ of the Taylor expansion has the form 1]. By using the substitution given by the interaction rule (4.2), i.e.
we obtain having denoted with By taking the expected value w.r.t. ξ ∈ B of (4.10), the term η f ρ f f, ∇ v ϕ · θC z ξ is canceled out, since E (ξ) = 0.
Writing (whenever possible) H f , H l , θ and S in place of H f (x, v,x,v; π 1 f, π 1 g), H l (x, v,x,ṽ; π 1 f, π 1 g), θ(x) and S(x, v), respectively, the same substitution yields for the second order term Notice that if we compute the expected value of the expression above, all the cross terms ξ i ξ j with i ≠ j vanish, since the entries of ξ are independent and have zero mean. Hence we obtain By means of the same computations, we can derive a similar expression for ⟨Q(f, g), ϕ⟩. Indeed, from (4.8) we have having denoted with If we now apply the rescaling rules (4.9), the following simplifications occur Hence, if we let ε → 0 and we assume that, for every ϕ ∈ T δ , holds true, we obtain the weak formulation of a Fokker-Planck-type equation for the followers' dynamics where We defer the details of the proof of the limit (4.11) to Section 4.3.
Since ϕ has compact support, equation (4.12) can be recast in strong form by means of integration by parts. Coupling the resulting PDE with the microscopic ODEs for the leaders we eventually obtain the system (4.13) Let us finally remark that, assuming f to be the empirical measure built from the Dirac masses δ (x i (t),v i (t)) concentrated on the trajectories (x i (t), v i (t)) of the microscopic dynamics (3.1), we recover the original microscopic model itself.
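The role of the three scalings can be checked by simple power counting. The sketch below assumes, for illustration only, that all proportionality constants are equal to one, i.e. $\eta = \varepsilon$, $\lambda = 1/\varepsilon$ and $\varsigma^2 = 1/\varepsilon$ (the precise constants are those fixed in (4.9)), and that the moment of order $2+\delta$ of $\xi$ scales like $\varsigma^{2+\delta}$.

```latex
\begin{align*}
\text{drift (first-order term):}\quad
  & \lambda\,\eta = \tfrac{1}{\varepsilon}\cdot\varepsilon = 1,\\
\text{diffusion (second-order noise term):}\quad
  & \lambda\,\eta^{2}\varsigma^{2}
    = \tfrac{1}{\varepsilon}\cdot\varepsilon^{2}\cdot\tfrac{1}{\varepsilon} = 1,\\
\text{deterministic second-order terms:}\quad
  & \lambda\,\eta^{2} = \varepsilon \to 0,\\
\text{noise part of the Taylor remainder:}\quad
  & \lambda\,\eta^{2+\delta}\varsigma^{2+\delta}
    = \varepsilon^{-1}\,\varepsilon^{2+\delta}\,\varepsilon^{-1-\delta/2}
    = \varepsilon^{\delta/2} \to 0 .
\end{align*}
```

Without the rescaling of the variance, the noise contribution would be of order $\varepsilon$ like the other second-order terms, and the diffusion would be lost in the limit.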

4.3. Estimates for the remainder terms. Motivated by the results in [44], we shall estimate the quantity R f ϕ as follows: Hence we get From the inequality we obtain Analogously for R l ϕ we get H l 2+δ f (x, v)g(x,ṽ) dx dv dx dṽ.

Similar computations performed on the terms Ξ
Remembering that solutions of (4.6) have support uniformly bounded in time and finite constant mass (i.e., $\rho_f(t) = \rho_f(0) < +\infty$), we have proven the following result.
Theorem 4.3. Fix $T > 0$, $\delta > 0$ and $u : [0, T] \to \mathbb{R}^{2dN^l}$, and let, for every $\varepsilon > 0$, $(f^\varepsilon, y^\varepsilon_1, \dots, y^\varepsilon_{N^l})$ be a $\delta$-weak solution of (4.6) corresponding to the control $u$ and the initial condition $(f^0, y^0_1, \dots, y^0_{N^l})$, where the quantities $\eta_f$, $\eta_l$, $\lambda_f$, $\lambda_l$, and $\varsigma^2$ are rescaled w.r.t. $\varepsilon$ according to (4.9). Suppose that:
1. $\mathbb{E}[|\xi|^{2+\delta}]$ is finite,
2. the functions $S$, $H^f$ and $H^l$ are in $L^p_{loc}(\mathbb{R}^{2d})$ for $p = 2, 2 + \delta$.
Then, as $\varepsilon \to 0$, the solutions $(f^\varepsilon, y^\varepsilon_1, \dots, y^\varepsilon_{N^l})$ converge pointwise, up to a subsequence, to $(f, y_1, \dots, y_{N^l})$, where $f$ satisfies the Fokker-Planck-type equation (4.12) with initial datum $(f^0, y^0_1, \dots, y^0_{N^l})$ and, for every $k = 1, \dots, N^l$, $y_k$ satisfies the leaders' ODE of system (4.13).
5. Optimal control of the crowd. In this section we discuss how one can (optimally) control the crowd of followers by means of "invisible" leaders, whose dynamics are obtained by the minimization of certain cost functionals. It is well known that in the case of alignment-dominated models (either pure-alignment models or repulsion-alignment-attraction models at equilibrium) the invisibility of leaders is not a limitation per se, but nothing is known about more complex models. In particular, the coupling between random walk and alignment gives rise to interesting phenomena, cf. [46]. In order to fit real behavior (see also Section 7), the ratio between random walk force and alignment force should be large enough to ensure a complete exploration of the domain, and small enough to make leaders influential. Incidentally, this also narrows the choice of parameters.
To fix the ideas, we plot a typical outcome of the microsimulator when the crowd is far away from the exit. At the initial time, agents are uniformly distributed in a square, see Fig. 5.1. We refer to this situation as Setting 0 (model parameters for this and the following simulations are reported in Table 5).
This makes clear how difficult it is to control the crowd in this scenario, since leaders have to fight continuously against the natural tendency of the crowd to split into subgroups and move randomly, even after a consensus is reached, cf. [36]. A natural question concerns the minimum number of leaders needed to lead the whole crowd to consensus, i.e., to align all the agents along a desired direction. Numerical simulations suggest that this number strongly depends on the initial conditions. Table 5.2 gives a rough idea in the case of multiple runs with random initial conditions.
At the mesoscopic level we consider the control of the Boltzmann-type dynamics (4.6), where we account for small values of the scaling parameter ε, in order to be sufficiently close to the Fokker-Planck model (4.13). In this case, the control through a few microscopic leaders has to face the tendencies of the continuous density to spread around the domain and to locally align with the surrounding mass. Moreover, their action is weakened by the type of interaction considered. Figure 5.2 shows the results corresponding to the kinetic density in Setting 0. Unlike the microscopic case, no splitting occurs where there is no action of the leaders. Instead, the mass smears out around the domain, due to the diffusion term.
Optimization problem. The functional to be minimized can be chosen in several ways. The effectiveness mostly depends on the optimization method which is used afterwards.
The most natural functional is the evacuation time, subject to (3.1) or (4.6) and with u(·) ∈ U adm , where U adm is the set of admissible controls (including for instance box constraints to avoid excessive velocities). Another cost functional, more affordable by standard methods [3,14], is for some positive constants µ f , µ l , and ν. The first term promotes the fact that followers have to reach the exit while the second forces leaders to keep contact with the crowd. The last term penalizes excessive velocities. This minimization is performed at every instant (instantaneous control), or along a fixed time frame [t i , t f ] l(x(t), y(t), u(t)) dt, subject to (3.1) or (4.6).
With regard to the mesoscopic scale, both functionals (5.1) and (5.2) can be considered; however, at the continuous level the presence of a few invisible leaders does not ensure that the whole mass of followers is evacuated. The major difficulty in reaching a complete evacuation of the continuous density is mainly due to the presence of the diffusion term and to the invisible interaction with respect to the leaders. Therefore a more appropriate functional is given by the mass evacuated at the final time T , subject to (4.6).
Remark 5.1. We will not require that leaders themselves reach the exit, but only the followers.
Model predictive control. A computationally efficient way to address the optimal control problem (5.3) is by means of a relaxed approach known as model predictive control (MPC) [40]. We consider a sampling of the dynamics (3.1) every time interval ∆t, and the following minimization problem over N steps,
$$\min_{u(\cdot) \in U_{adm}} \sum_{n=0}^{N-1} l(x(n\Delta t), y(n\Delta t), u(n\Delta t)),$$
generating an optimal sequence of controls {u(0), u(∆t), . . . , u((N − 1)∆t)}, from which only the first term is taken to evolve the dynamics for a time ∆t; the minimization problem is then recast over an updated time frame. Note that for N = 2, the MPC approach recovers an instantaneous controller, whereas for N = (t_f − t_i)/∆t it solves the full time frame problem (5.3). Such flexibility is complemented with a robust behavior: since the optimization is re-initialized at every time step, perturbations along the optimal trajectory can be addressed.
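The receding-horizon loop can be illustrated on a toy problem. The sketch below is only an illustration under strong assumptions: a single scalar state with dynamics x' = u stands in for (3.1), `running_cost` is a hypothetical stand-in for the running cost l, and the inner minimization is a brute-force enumeration over three candidate controls (a crude box constraint) rather than a proper optimizer. All names are illustrative.

```python
from itertools import product


def running_cost(x, u, target=0.0, nu=1e-5):
    # Hypothetical running cost: squared distance to the target
    # plus a small penalty on the control (assumption, not (5.2)).
    return (x - target) ** 2 + nu * u ** 2


def euler_step(x, u, dt):
    # Explicit Euler step of the toy dynamics x' = u.
    return x + dt * u


def mpc_first_control(x0, horizon, dt, candidates=(-1.0, 0.0, 1.0)):
    # Enumerate all control sequences of length `horizon`, sum the
    # running cost along each predicted trajectory, and return only
    # the first control of the cheapest sequence.
    best_cost, best_u0 = float("inf"), 0.0
    for seq in product(candidates, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            cost += running_cost(x, u)
            x = euler_step(x, u, dt)
        if cost < best_cost:
            best_cost, best_u0 = cost, seq[0]
    return best_u0


def run_mpc(x0, horizon, dt, n_steps):
    # Receding horizon: apply the first control, advance one step,
    # then solve the problem again from the new state.
    x, trajectory = x0, [x0]
    for _ in range(n_steps):
        u0 = mpc_first_control(x, horizon, dt)
        x = euler_step(x, u0, dt)
        trajectory.append(x)
    return trajectory


trajectory = run_mpc(x0=2.0, horizon=2, dt=0.1, n_steps=30)
```

In this toy setting, `horizon=2` is the shortest horizon for which the first control affects the predicted cost (through the next state); with `horizon=1` the state cost does not depend on the control and the sketch simply returns the zero control.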
Modified compass search. When the cost functional is highly irregular and the search for local minima is particularly difficult, it can be convenient to move towards random methods such as compass search (see [7] and references therein), genetic algorithms, or particle swarm optimization. In the following we describe a compass search method that works surprisingly well for our problem.
First of all, we consider only piecewise constant trajectories, introducing suitable switching times for the leaders' controls. More precisely, we assume that leaders move at constant velocity for a given fixed time interval and, when the switching time is reached, a new direction is chosen. Therefore, the control variables are the velocities at the switching times for each leader. Note that controlling the velocities directly, rather than the accelerations, makes the optimization problem much simpler, because minimal control variations have an immediate impact on the dynamics.
Starting from an initial guess, at each iteration the optimization algorithm modifies the best control strategy found so far by means of small random variations of its current values. Then, the cost functional is evaluated. If the variation is advantageous (the cost decreases), the variation is kept; otherwise it is discarded. The method stops when the strategy cannot be improved further.
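A minimal sketch of such a random-variation search is given below. It is not the implementation used for the simulations: the toy cost function, the stopping rule (a fixed number of consecutive unsuccessful trials, `max_fail`) and all names are assumptions made for illustration. In the crowd setting, `cost` would run the simulator with the given leaders' velocities at the switching times and return the value of the chosen functional.

```python
import random


def random_compass_search(cost, u0, max_variation=1.0, max_fail=200, seed=0):
    # `u0` is the initial guess: a flat list of control values
    # (e.g. velocity components at the switching times).
    rng = random.Random(seed)
    best_u = list(u0)
    best_cost = cost(best_u)
    fails = 0
    # Assumed stopping rule: give up after `max_fail` consecutive
    # trials that do not decrease the cost.
    while fails < max_fail:
        # Small random variation of every component of the best strategy.
        trial = [u + rng.uniform(-max_variation, max_variation) for u in best_u]
        trial_cost = cost(trial)
        if trial_cost < best_cost:
            # Advantageous variation: keep it.
            best_u, best_cost, fails = trial, trial_cost, 0
        else:
            # Otherwise discard it.
            fails += 1
    return best_u, best_cost


def toy_cost(u, target=(0.5, -0.5, 0.25, 0.0)):
    # Hypothetical smooth cost, used only to exercise the search.
    return sum((a - b) ** 2 for a, b in zip(u, target))


initial_guess = [0.0, 0.0, 0.0, 0.0]
best_u, best_cost = random_compass_search(toy_cost, initial_guess)
```

Because only improving variations are accepted, the returned cost can never exceed the cost of the initial guess; the price is that the result is a local improvement with no optimality guarantee.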

6. Numerical tests.
In what follows we present some numerical tests to validate our modeling setting at the microscopic and mesoscopic level.
The microscopic model (3.1) is discretized by means of the explicit Euler method with a time step ∆t = 0.1. The evolution of the kinetic density in (4.13) is approximated by means of binary interaction algorithms, which approximate the Boltzmann dynamics (4.5) with a meshless Monte Carlo method for small values of the parameter ε, as presented in [4]. We choose ε = 0.02, ∆t = 0.01 and a sample of $N_s = 10000$ particles to reconstruct the kinetic density. This type of approach is inspired by numerical methods for plasma physics, and it allows one to solve the interaction dynamics with a reduced computational cost compared with mesh-based methods and an accuracy of order $O(N_s^{-1/2})$. For further details on this class of binary interaction algorithms see [4,42].
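As a quick check of the orders of magnitude involved, the snippet below evaluates the nominal Monte Carlo error scale $N_s^{-1/2}$ for the sample size used here. The helper name is illustrative, and the constant in front of the rate is ignored, so the numbers are only indicative.

```python
import math


def mc_error_scale(n_samples: int) -> float:
    """Nominal Monte Carlo error scale n_samples**(-1/2), constant factor ignored."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return 1.0 / math.sqrt(n_samples)


scale = mc_error_scale(10_000)     # sample size N_s used in the text
scale_4x = mc_error_scale(40_000)  # four times as many particles
```

For $N_s = 10000$ the scale is $10^{-2}$, and halving it requires four times as many particles, which is the usual trade-off of sampling-based methods.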
Concerning optimization, in the microscopic case we adopt either the compass search with functional (5.1) or MPC with functional (5.3). In the mesoscopic case we adopt the compass search with functional (5.4).
We consider three settings for pedestrians, without and with obstacles, hereafter referred to as Settings 1, 2, and 3, respectively. In Setting 1 we set the compass search switching times every 20 time steps, and in Setting 2 every 50, having fixed the maximal random variation to 1 for each component of the velocity. In Setting 1, the inner optimization block of the MPC procedure is performed via a direct formulation, by means of the fmincon routine in MATLAB, which solves the optimization problem via an SQP method.
6.1. Setting 1. The exit is a point located at E = (30, 10) which can be reached from any direction. We set $\Sigma = \{x \in \mathbb{R}^2 : |x - E| < 4\}$. This simple setting helps to elucidate the role and the interplay of the different terms of our model. Followers are initially randomly distributed in the domain [17, 29] × [6.5, 13.5] with velocity (0, 0). Leaders, if present, are located to the left of the crowd. Parameters are reported in Table 5.1.
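The geometry of this setting is simple enough to be written down directly. The sketch below, with illustrative names and an arbitrary random seed, samples initial follower positions in the box given above and counts how many points fall inside the visibility zone Σ; it only reproduces the geometric definitions and none of the dynamics.

```python
import math
import random

EXIT = (30.0, 10.0)   # exit point E from the text
SIGMA_RADIUS = 4.0    # radius of the visibility zone Sigma


def in_sigma(p, exit_point=EXIT, radius=SIGMA_RADIUS):
    # Strict inequality, as in the definition |x - E| < 4.
    return math.hypot(p[0] - exit_point[0], p[1] - exit_point[1]) < radius


def occupancy(points):
    # Number of agents currently inside Sigma.
    return sum(1 for p in points if in_sigma(p))


def sample_followers(n, seed=0):
    # Uniform initial positions in [17, 29] x [6.5, 13.5]; the seed is arbitrary.
    rng = random.Random(seed)
    return [(rng.uniform(17.0, 29.0), rng.uniform(6.5, 13.5)) for _ in range(n)]


followers = sample_followers(50)
initial_occupancy = occupancy(followers)
```

Since the box extends up to x = 29 and Σ starts at x = 26 along the horizontal axis through E, part of the initial crowd may already lie inside the visibility zone.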
Microscopic model. Figure 6.2 (first row) shows the evolution of the agents computed by the microscopic model, without leaders. Followers having a direct view of the exit immediately point towards it, and some group mates close to them follow thanks to the alignment force. On the contrary, the farthest people split into several cohesive groups with random directions and never reach the exit. Figure 6.2 (second row) shows the evolution of the agents with three leaders. The leaders' control is fixed and equal to the unit vector pointing towards the exit from the current position. Note that the final leaders' trajectories are not straight lines because of the additional repulsion force. As can be seen, the crowd behavior changes completely since, this time, the whole crowd reaches the exit. However, followers form a heavy congestion around the exit, which notably delays the evacuation. This suggests that the strategy of the leaders is not optimal and that it can be improved by an optimization method. The shape of the congestion is circular: this is perfectly in line with the results of other social force models as well as with physical observation, which shows the formation of an "arch" near the exits. The arch is correctly replaced here by a full circle due to the absence of walls.
Here we run both MPC and compass search optimization. In order to compare with the fixed strategy, we include a box constraint $u(\cdot) \in [-1, 1]^{N^l}$. We choose $\mu^f = 1$, $\mu^l = 10^{-5}$, and $\nu = 10^{-5}$. MPC results are consistent in the sense that for N = 2 the algorithm recovers a controlled behavior similar to the application of the instantaneous controller (or fixed strategy). Increasing the time frame up to N = 6 improves both congestion and evacuation times, but the results still remain noncompetitive compared with the whole time frame optimization performed with a compass search. Surprisingly enough, the latter prescribes that leaders divert some pedestrians from the right direction, so as not to steer the whole crowd to the exit at the same time, see Fig. 6.2 (third row). In this way congestion is avoided and pedestrian flow through the exit is increased.
Occupancy of the exit's visibility zone is shown in Fig. 6.3 and exit times are compared in Table 6.1. It can be seen that only the long-term optimization strategy is able to avoid the peak of mass around the exit. This suggests a rather unethical but effective evacuation procedure, namely misleading some people to a false target and then leading them back to the right one when exit conditions are safer. Note that most injuries are actually caused by overcompression and suffocation rather than urgency.
Mesoscopic model. We consider here the case of a continuous density of followers. Figure 6.4 (first row) shows the evolution of the uncontrolled system of followers. Due to the diffusion term and the topological alignment, a large part of the mass spreads around the domain and is not able to reach the target exit.
In Fig. 6.4 (second row) we account for the action of three leaders, driven by a fixed strategy. As in the microscopic case, they move towards the exit. It is clear that also in this case the action of the leaders is able to influence the system and promote the evacuation, but the presence of the diffusive term causes the dispersion of part of the continuous density. The result is that part of the mass is not able to evacuate, unlike the microscopic case.
In order to improve the fixed strategy we rely on the compass search where, differently from the microscopic case, the optimization process uses the objective functional (5.4), i.e., the total mass evacuated at the final time. Figure 6.4 (third row) sketches the optimal strategy found in this way: on the one hand, the two external leaders go directly towards the exit, evacuating part of the density; on the other hand, the central leader moves slowly backward, misleading part of the density, and only later moves forward towards the exit. The efficiency of the leaders' strategy is due in particular to this late movement of the last leader, which is able to gather the followers' density left behind by the others and to reduce the occupancy of the exit's visibility area by delaying the arrival of part of the mass. In Figure 6.5 we summarize, for the three numerical experiments, the evacuated mass and the occupancy of the exit's visibility area Σ as functions of time. The occupancy of the exit's visibility area clearly shows the difference between the leaders' actions: for the fixed strategy, the mass occupying Σ concentrates quickly and the evacuation is partially hindered by the clogging effect, so that 71.3% of the total mass is evacuated. The optimal strategy distributes the arrival of the mass in Σ better, and a higher efficiency is reached, with 85.2% of the total mass evacuated.

6.2. Setting 2.
In the following we test the microscopic model (with compass search optimization) in a more complicated setting, which also has some similarities with the one considered in Section 7. The crowd is initially confined in a rectangular room with three walls. In order to evacuate, people must first leave the room and then search for the exit point. We assume that walls are not visible, i.e., people can perceive them only by physical contact. This corresponds to an evacuation in the case of null visibility (except for the exit point, which is still visible from within Σ). Walls are handled as in [25].
If no leaders are present, the crowd splits into several groups and most of the people hit the wall, see Fig. 6.6 (first row). After some attempts the crowd finds the way out, and then it crashes into the right boundary of the domain. Finally, by chance, people decide, in cascade, to go upward. The crowd leaves the domain in 1162 time steps.
If instead we hide in the crowd two leaders who point fast towards the exit (Fig. 6.6 (second row)), the evacuation from the room is completed in a very short time, but after that the influence of the leaders vanishes. Unfortunately, this time people decide to go downward after hitting the right boundary, and nobody leaves the domain. Slowing down the two leaders helps to keep the leaders' influence for a longer time, although it is quite difficult to find a good choice.
Compass search optimization finds a good strategy for the two leaders which remarkably improves the evacuation time, see Fig. 6.6 (third row). One leader behaves similarly to the previous case, while the other diverts the crowd by pointing SE, then comes back to wait for the crowd, and finally points NE towards the exit. This strategy brings everyone to the exit in 549 time steps, without anyone bumping against the boundary, and avoids congestion near the exit.
6.3. Setting 3. In the following test we propose a different use of the leaders. Rather than steering the mass towards the exit, they can be employed to make the evacuation more fluid just near the exit. This test is inspired by [25, Test 2], where the contribution of the microscopic granularity to a multiscale dynamics is investigated.
We consider again the situation of Setting 1 with N = 200 pedestrians, but this time we nullify the repulsion force of the leaders, see Table 5.1. The initial velocity of the followers is (0.5, 0), thus they immediately point towards the exit's visibility zone and reach the exit without the help of the leaders. However, a strong congestion is formed around the exit. Leaders are initially uniformly distributed in a square of side 2 centered at the exit and move with a random velocity uniformly distributed in [−0.4, 0.4].
In the microscopic case, the average evacuation time (100 runs) with no leaders is 1304 time steps. Adding two leaders near the exit, the evacuation time drops to 1095 (−16%). In the mesoscopic case, we observe a similar but less evident speed-up of the evacuation, which leads to a smaller amount of mass lost because of the diffusion. In a fixed time interval, the total mass evacuated is 90% in the case without leaders, and it increases up to 98% when two leaders are added. Varying the number of leaders slightly yields similar results. Conversely, increasing the number of leaders (>5) or their velocity leads to a deterioration in performance.
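The percentage quoted for the microscopic case follows directly from the two average evacuation times. The short check below, with an illustrative helper name, recomputes it together with the gain in evacuated mass for the mesoscopic case, expressed in percentage points.

```python
def relative_change_percent(before, after):
    # Signed relative change of `after` with respect to `before`, in percent.
    return 100.0 * (after - before) / before


# Average evacuation times (time steps) without and with two leaders.
change = relative_change_percent(1304, 1095)

# Evacuated mass without and with two leaders, in percent of the total mass.
mass_gain_points = 98 - 90
```

Here `change` is about −16.03, which rounds to the −16% reported above.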
We conclude that introducing a small noise around the exit helps to break the symmetries and speeds up the flow. Such symmetries (e.g., crystal configurations, small arches, hesitations due to lack of priorities) naturally arise and are the main cause of delay. Their effect shows up as the plateaux that are clearly visible after the peak in Fig. 6.3-left (fixed strategy). As a by-product, we confirmed once again Braess's paradox, which states that reducing the freedom of choice (e.g., adding obstacles in front of the exit [27,45]) can actually improve the overall dynamics.
7. Validation of the proposed crowd control technique. The crowd control technique investigated in the previous sections relies on the fact that pedestrians actually exhibit herding behavior in special situations, so that the alignment term in the model is meaningful. To confirm this, we have organized an experiment involving volunteer pedestrians.
The experiment took place on October 1, 2014 at the Department of Mathematics of Sapienza - University of Rome, Rome (Italy), at 1 p.m. Participants were chosen among first-year students. Since courses had started just two days before the date of the experiment and the Department has a complex ring-shaped structure, we could safely assume that most of the students were unfamiliar with the environment.
The experiment. Students were informed that they were part of a scientific experiment aiming at studying the behavior of a group of people moving in a partially unknown environment. Their task was to leave classroom II and reach a certain target inside the Department. Students were asked to accomplish their task as fast as possible, without running and without speaking with others. Participation was completely voluntary and not remunerated (only a celebratory T-shirt was given at a later time).
76 students (39 girls, 37 boys) agreed to participate in the experiment. The students were then divided randomly into two groups: group A (42 people) and group B (34 people). The two groups performed the experiment independently, one after the other. The target was communicated to the students just before the beginning of the experiment. It was the Istituto Nazionale di Alta Matematica (INdAM), which is located just upstairs with respect to classroom II. Due to the complex shape of the environment, there are many paths joining classroom II and INdAM. The shortest path requires people to leave the classroom, to go leftward into a little frequented and unfamiliar area (even for experienced students), and to climb the stairs. The target can also be reached by going rightward and climbing other, more frequented, stairs.
In group B there were 5 incognito students, hereafter referred to as leaders, of the same age as the others. Leaders were previously informed about the goal of the experiment and the location of the target. They were also trained in order to steer the crowd toward the target in minimal time. Nobody recognized them as "special" before or during the experiment. Unexpectedly, in group A there was also a girl who knew the target. Therefore, she acted as an unaware leader.
It is important to stress that all the other students continued their usual activities and participants were not officially recorded by the organizers. This choice was crucial to get natural behavior and meaningful results, cf. [26, Sect. 3.4.3]. The price to pay is that we had to extract participants' trajectories from low-resolution videos taken by two observers. Moreover, the area just outside classroom II was rather crowded, introducing a high level of noise in the experiment. Finally, let us mention that some participants broke the rules of the experiment by speaking with others and running (confirming the natural behavior...).
Results. In Figure 7.1 we show the history of left/right decisions taken by the students just outside the two doors of the classroom. Group A used both exits of the classroom. The group of 8 people who used door 2 was uncertain about the direction for a while. Then, once the first two students decided to go rightward, the others decided simultaneously to follow them. Similarly, the first student leaving from door 1 was uncertain for a few seconds, then he moved rightward and triggered a clear domino effect. At t = 20 s the unaware leader moved to the left, inducing hesitation and mixed behavior in the followers. After that, another domino effect arose.
Group B used only door 1. Invisible leaders were able to trigger a domino effect but this time 4 people decided unilaterally not to follow them, although they were not informed about the destination. At t = 20 s, after the passage of the last leader, a girl passed through the door and went leftward. Then, at t = 22 s, she suddenly began to run toward the right. She first induced hesitation, then triggered a new rightward domino effect.
Discussion. Students showed a tendency to go rightward, that part of the Department being more familiar, but this tendency was largely overcome by the wish to follow the group mates in front, regardless of their right/left preferences. This fact was confirmed by the students themselves after the experiment. Indeed, they admitted that they had been influenced by the leaders because of their clear direction of motion. Interestingly, no more than 5 people were influenced by a single leader. This is compatible with the topological alignment term used in our model. The small value of N here is due to the fact that the space in front of the doors was rather crowded and the visibility was reduced.
It is also interesting to note that only 4 students reached the target alone. This confirms the tendency not to remain isolated and to form clusters. Video recordings also show that (badly or well) informed people behaved in a rather recognizable manner, walking faster, overtaking the others and exhibiting a clear direction of motion. No follower overtook other people.
The 5 incognito leaders adopted a good but nonoptimal strategy, being too close to each other, too concentrated in the front line, and too fast (cf. Fig. 6.6 (second row)). This was due to the fact that the organizers expected a clearer environment in front of the doors and consequently a quicker exit of the people. A better distribution in the crowd would lead to better results, although nothing can be done against a quite noticeable badly informed person.
Conclusions and future work. Both virtual and real experiments suggest that a bottom-up control technique for crowds like the one proposed in the paper is actually feasible and deserves further attention. Numerical simulations can give important insights about optimal strategies for leaders, which can then be made ethically acceptable and tested in real situations. The main open issue regards the minimal number of leaders needed to reach the desired goal. For that, further theoretical investigation as well as real experiments with a larger number of participants and a clearer environment are needed. The results obtained in Setting 3 should also be experimentally confirmed.