Designing quantum experiments with a genetic algorithm

We introduce a genetic algorithm that designs quantum optics experiments for engineering quantum states with specific properties. Our algorithm is powerful and flexible, and can easily be modified to find methods of engineering states for a range of applications. Here we focus on quantum metrology. First, we consider the noise-free case, and use the algorithm to find quantum states with a large quantum Fisher information (QFI). We find methods, which only involve experimental elements that are available with current or near-future technology, for engineering quantum states with up to a 100-fold improvement over the best classical state, and a 20-fold improvement over the optimal Gaussian state. Such states are a superposition of the vacuum with a large number of photons (around $80$), and can hence be seen as Schr\"odinger-cat-like states. We then apply the two most dominant noise sources in our setting -- photon loss and imperfect heralding -- and use the algorithm to find quantum states that still improve over the optimal Gaussian state with realistic levels of noise. This will open up experimental and technological work in using exotic non-Gaussian states for quantum-enhanced phase measurements. Finally, we use the Bayesian mean square error to look beyond the regime of validity of the QFI, finding quantum states with precision enhancements over the alternatives even when the experiment operates in the regime of limited data.

Engineering quantum states with specific properties plays a part in all experiments and technologies in quantum physics. But designing optimal experiments to engineer such states can be challenging, in part due to the counter-intuitive nature of the theory. This has prompted a number of recent works allocating the task of experiment-design to artificial intelligence and machine learning [1][2][3][4][5]. Many of the computer-designed experiments have surpassed the best human-designed alternatives.
Building on this approach, we introduce a genetic algorithm, named AdaQuantum [79], that designs quantum optics experiments to produce quantum states with the required properties. The algorithm is flexible and modifiable, so that researchers with a range of requirements and available quantum-optics equipment can use it to design and optimise their experiments, as well as facilitating theoretical research in quantum state engineering and experimental design. In addition, the algorithm has a powerful computational engine, so that it efficiently and effectively searches for the optimal experimental designs in a given setting, and can allow for truncations of the Hilbert space up to 170 photons, enabling even more exotic states to be found. A selection of the experiments designed by AdaQuantum are shown schematically in Fig. 5. * Contact email address: Paul. Knott@Nottingham.ac.uk To run AdaQuantum we must specify two things: first, we specify the toolbox of equipment that is available to construct our experiment, or more specifically, which quantum states, operations, and measurements we would like AdaQuantum to optimise over. Second, we must specify a fitness function, which allows us to quantify how good the quantum state outputted by AdaQuantum is for our purposes. In principle, any fitness function that takes as input a quantum state, and outputs a real number, can be used. Given this, AdaQuantum then performs an automated search to find arrangements and parameter settings of the experiential equipment that produce a state that maximises (or minimises) the fitness function. The flexibility built into AdaQuantum means that different fitness functions can easily be implemented, such as the fidelity with useful quantum states such as Schrödinger-cat states or states used in optical quantum computation [6]; or measures of entanglement, coherence, non-classicality, and so on. A flow chart illustrating the overall structure and usage of our algorithm is shown in Fig. 1.
As a demonstration of the great potential of our algorithm, in this work we focus on fitness functions relevant to the field of quantum metrology, and we will see how its application reveals important findings that will help to design better metrological protocols. We first consider noise-free experiments, and use as our fitness function the quantum Fisher information (QFI) of the (pure) output state. The QFI allows us to quantify how precisely a given state can measure a phase shift in an interfer- ometer -a valuable fitness function for optical quantum metrology [7,8]. Next, we apply the two most dominant noise sources in our quantum optics setting -photon loss at the output, and imperfect heralding measurementsand use the mixed state QFI as our fitness function.
The third fitness function that we optimise is the Bayesian mean square error (BMSE). Although the QFI plays a crucial role in quantum metrology, using the theory associated with it requires a number of assumptions that are not always fulfilled in experiments. For example, it often assumes that a large number of repetitions has taken place and that certain prior information was available [9][10][11][12]. The BMSE, on the other hand, factors in and allows us to control both of these explicitly, and thus gives a reliable measure of the phase-measuring capability of a given state in realistic experimental settings [13].
Our main results obtained by AdaQuantum are threefold: First, in the noiseless case we find a number of methods of producing quantum states with QFIs up to 20 times larger than the optimal Gaussian state, which amounts to a 100-fold improvement over the best classical state. We show that these states can be thought of as Schrödinger-cat-like states, as they contain superpositions of the vacuum with a large numbers of photons (around 80). Second, we find methods of producing states that can still beat the optimal Gaussian state, even when realistic levels of the dominant noise sources (photon loss in the heralding measurements and in the final state) are included. Finally, using the BMSE for the problem of measuring a phase shift in an interferometer, we find states that beat both the optimal Gaussian state and the coherent state when the experiment operates with realistic prior knowledge and a reasonable number of experimental repetitions. This will enable experiments to create exotic states with enhanced phase-measuring capabilities.
FIG. 2: The generic blueprint for the quantum optics experiments that we use an algorithm to optimise. An input state, |ψ in , is acted on by a series of operators,Ôi, before heralding measurements are performed to produce an output state, ρout.

I. USING A GENETIC ALGORITHM TO DESIGN EXPERIMENTS
A. Optical quantum state engineering The quantum optics experiments we design follow the blueprint depicted in Fig. 2. In this scheme, we start with an N -mode state, |ψ in , which consists of independent one-and two-mode states. This state is acted on by up to m operators in sequence, which each act on one or two modes, with the two-mode operations serving to mix and entangle the modes. Finally, in N − 1 of the output modes heralding measurements are performed, modelled by POVMs. When each of these heralding measurements are simultaneously successful, the state in the N th mode is the output state, which we then apply our fitness function to. With appropriate choices of experimental elements (which we introduce below), many of the state engineering schemes in the literature fit into our blueprint illustrated in Fig. 2 [14][15][16][17][18].
Although N is flexible, here we concentrate on N = 2 modes because having more modes is more difficult for both experiments and for our computer simulations (later we outline future work to overcome the computational challenges and enable N > 2 modes). Experimentally, having more modes substantially lowers the overall probability for producing the desired output state, and adds significant additional noise.
To design an experiment using the blueprint in Fig. 2, we need a toolbox of states, operations and measurements -the full list of these elements that we consider is given in Table I. Our focus is on producing experimental designs that are feasible with current or near future technology, and hence our toolbox only contains such elements. Other elements can easily be added to this list, and the list can be customised to match the capabilities of a particular laboratory.
Here we only introduce the most important details of the toolbox; more details can be found in Appendix A. Firstly, the input states we include are the single-mode squeezed vacuum |ζ i the two-mode squeezed vacuum States |n , |α , |ζ i , |ζ ij OperationsD(α),Ŝi(ζ),Ŝij(ζ), exp (inφ),Ûij(T ) Measurements |n n|, Bucket, Multiplex, |x λ x λ | Here, i and j are arbitrary modes. The identity operator is also included in all runs of the algorithm.
(TMSV) |ζ ij , the coherent state |α , and Fock states |n ; the parameters z, α and n are constrained by what is possible experimentally.
Next are the operators, of which the most important is the beam splitterÛ ij (T ), where T is the probability of transmission, which serves to mix and entangle the two modes, enabling more exotic and useful states to be produced when part of the entangled state is measured. The other operators we use are the displacement opera-torD(α), the phase shift exp (inφ), and the single-and two-mode squeezing operators, given byŜ i (ζ) andŜ ij (ζ) respectively.
The first measurement we include is the photon number resolving detection (PNRD), given by |n n|. However, due to the difficulty of implementing a PNRD we include two easier measurements, the bucket detector [19] and the multiplex detector [20]. The latter is achieved by separating the mode you want to measure into several modes via spatial-or time-multiplexing [20][21][22], followed by a bucket detection performed on each of these separated modes. Finally we include the homodyne detector, which is the projection onto a line in phase space, characterised by the projector |x λ x λ |.
We focus on the two most dominant noise sources in optics experiments taking the form of Fig. 2: i) imperfect photon detectors, which can be modelled as photon loss prior to detection, and ii) photon loss on the output state. Noise is also present in the initial state preparation and in implementing the operations, but we do not include these as they are typically smaller than the detection and output-state loss (though future work will also incorporate these). See Appendix A 6 for details of how we simulate noise.

B. Our genetic algorithm
We now introduce and describe our algorithm, AdaQuantum, that designs quantum optics experiments. To run our algorithm, the user first has to specify which input states, operators and measurements they would like the algorithm to search over. This is done via the user interface shown in Fig. 9. As well as selecting the toolbox, the user can specify the ranges of the parameters, the loss rates of the detectors, the loss rate on the final state, and a number of other details that we will introduce below.
Once a user has specified their toolbox, they next tell the algorithm what kind of quantum state they require by specifying a fitness function. The fitness function must take as input a quantum state, and output a real number that we wish to maximise (or minimise). In section II we introduce three different fitness functions that quantify the performance of the state for measuring a phase shift, but in principle any fitness function that can be written in the above-mentioned form can be incorporated into the algorithm (more on this below). Next, we ask the question: How can we best arrange the states, operators and measurements, so that the final quantum state maximises the fitness function? This is a search problem; to perform this search, we use a genetic algorithm, which is a powerful and flexible global search metaheuristic based on evolution by natural selection.
To use a genetic algorithm for our problem, we must first encode each potential experiment into a genome. Our genome is a vector containing a mixture of integers and real numbers. For the most part, the integers in the genome encode which quantum optics elements will make up the experiment and the arrangement of these elements. The real numbers then encode the parameters of the different elements, such as the transmission probability of a beam splitter or the magnitude and phase of a coherent state. Given any (valid) genome, AdaQuantum then simulates the corresponding quantum optics experiment, determines what quantum state will be outputted, and calculates the fitness value by applying the chosen fitness function to the output state. See Fig. 5 for examples of different experiments, each of which corresponds to a different genome.
The task we then set the genetic algorithm is to search for a genome that maximises the fitness value. Genetic algorithms start by creating a collection of genomes, which together are known as the population. Next, the fitness function for each genome in the population is evaluated. The "fittest" genomes -i.e. the genomes with the largest fitness value -are then selected, and a new population of genomes is generated by mixing some of the genomes together (crossover ) and by modifying (mutating) others. This next population should, in principle, be comprised of genomes that are "fitter". This process repeats through a number of generations, until it is unlikely that any more generations will result in improvements. At this stage, if the algorithm has been designed appropriately, the fittest genomes will encode optimised solutions. A flow chart of a genetic algorithm is given in Fig. 3, and a more detailed description of how genetic algorithms work can be found in Appendix. B.

C. Using a 3-stage algorithm
The slowest part of running the genetic algorithm is the simulation of each quantum optics experiment, and this must be done a large number of times in order to complete a thorough search of the search space (here the search space is the space of all possible quantum optics experiments that can be constructed with the selected toolbox). The dominant parameter that determines the speed of the simulation is the truncation of the Hilbert space: we have to truncate each state |ψ (and correspondingly the operators and measurements), such that |ψ = ∞ i=0 c i |i becomes |ψ ≈ T i=0 c i |i , where T is the truncation. The larger the value of T the more accurate the simulation will be, but the time taken for each simulation increases (approximately) exponentially as we increase T .
However, while the "best" quantum states we found required a large value of T in order for the simulation to be accurate (e.g. see Fig. 6), we also found that for a significant proportion of the possible experiments an approximate simulation, using a small value of T , can still provide valuable information to guide the search. To exploit this fact, and overcome the challenge of requiring a large T for accuracy but a small T for speed, we settled upon the following 3-stage algorithm: Stage 1: A large number (around 10 7 for our main algorithm runs) of random genomes are created and evaluated using simulations with a small T , e.g. T = 30. A collection of genomes with the best fitness values are selected for the next stage. While this first stage is only approximate due to the small value of T , it still rules out many ineffective experiments and gives the genetic algorithm in the next stage a much stronger starting population.
Stage 2: We run Matlab's inbuilt genetic algorithm [23] with a medium-sized population (around 10 5 for our main algorithm runs). Here T is larger (e.g. T = 80) and the genetic algorithm runs for a set number of generations, usually 10. Stage 3: In the final stage, the simulation is accurate but slow. In this stage, the fitness function will first simulate the circuit specified by the input genome at a very low truncation (e.g. T = 20), and then repeat this, increasing T on each iteration (in steps of 10), until both the mean number of photons in the final state and the fitness value converge (indicating that increasing T does not change our results), or until the maximum value of T is reached (where the maximum value of T is specified by the user, labeled "Max. Truncation" in Fig. 9). This ensures the results are reliable and accurate, while running much faster than if we had chosen the maximum value of T for each genome. Here the population is smaller (around 10 4 for our main algorithm runs). By the end of this stage (which usually runs for around 40 generations), the algorithm should converge to a genome corresponding to a quantum optics experiment that produces the desired quantum state.
We compared our 3-stage genetic algorithm to Matlab's built-in standard genetic algorithm, pattern search, swarm, and simulated annealing algorithms [23], and our algorithm performed significantly better than all. Our tests also showed that removing any one of the 3 stages described above gave sub-optimal results. In Appendix C we overview the genetic algorithm settings (such as the selection, mutation and crossover functions), and give details of the hyperparameters we use.
We took a number of steps to speed up our algorithm [80]. Most importantly, we utilised the algorithm outlined in [24], and available as a Matlab function at [25], which implements our operations by calculating exp(tA)b without needing to explicitly form exp(tA), where t is a scalar, A is a matrix and b is a vector. This produces a dramatic speed up, and allows us to simulate experiments with a value of T as large as 170 photons (in two modes) in the order of seconds. This is a significantly larger value of T than comparable algorithms in the literature [1,26]. As we will see next, this allows us to generate, among other things, quantum states comprised of a superposition of the vacuum with 80 photons.

D. Designing AdaQuantum for flexibility
A key design focus when constructing AdaQuantum was flexibility, so that researchers with diverse requirements can use it to achieve their goals. The first flexible aspect is the toolbox selection, which is done using the user interface in Fig. 9. Once the user has selected the states, operators, and measurements, AdaQuantum automatically designs a genome that reflects this choice. Therefore, different runs of the algorithm with different toolbox selections will have different genome structuresboth the genome length and the positions of the integers and real numbers will change. The number of modes in the experiment can also be modified, as can the number of operations prior to measurement (see Fig. 2), and again AdaQuantum adapts to this. For example, when we select 4-modes the algorithm automatically knows that the input state can contain at most two 2-mode states, and that we will require three heralding measurements in order to produce a single mode output state.
In addition, when the user chooses a loss-free experiment with either number resolving or homodyne detectors, then the output state will be a pure state, whereas if there is loss (on either the final state or the measurements), or if the measurements have uncertain outcomes (i.e. when using the on/off or multiplex detector), then the output will be mixed. These features allow a huge range of experiments to be searched over, with applications for both experimentalists looking for lab-ready experiments, and theorists looking to push the limits of what is possible in hypothetical experiments.
Another key part of the flexibility is that AdaQuantum has been designed so that elements can be easily added, removed or modified from the toolbox, and similarly for the fitness functions. AdaQuantum is available on GitHub [27], which includes a User Guide that explains how to modify the toolbox and fitness functions. Modifying the toolbox will allow experimental elements that we haven't yet considered to be included, increasing the applicability of the algorithm. Perhaps more importantly, adding more fitness functions will allow AdaQuantum to be used to find quantum states for a wide range of applications. As long as a fitness function can be programmed in a form that takes as input a density matrix, and outputs a real number, then it can be incorporated into the algorithm.
Once a fitness function or toolbox element is added/modified, AdaQuantum automatically updates the user interface ( Fig. 9), and again the genome changes accordingly. Therefore, with simple modifications AdaQuantum can, for example, search for non-Gaussian states [28,29], states for multi-parameter estimation [30][31][32][33], or states with certain non-classical properties [34][35][36]. In this paper we focus on fitness functions for quantum-metrology applications (see next section), and in a spin-off project we have already designed fitness functions -and then used AdaQuantum to design experiments -to produce states with a high fidelity to a range of target states, such as cat states [37].

II. RESULTS
In this section, we present the results of running the algorithm with several different fitness functions and toolboxes, which are all relevant to the field of quantum metrology. In Appendix C we give a detailed discussion of the numerous steps and choices involved in running AdaQuantum to obtain these results.  Table. II for details of all the states. The toolboxes Full, Tool 1, Tool 2, and No PNRD are described in the main text. We see large improvements over the CS and SV, even for the experimentally-viable toolboxes Tool 1, Tool 2, and No PNRD. We also compare our results to the squeezed vacuum state with the largest value of squeezing so far produced in experiments, which, to the best of our knowledge, is the 15dB squeezed state in Ref. [38]. For this state, F Q /n = 66.4, which is illustrated as the horizontal dotted green line.

A. Fitness function 1: Pure state QFI
For our first fitness function we consider only noise-free experiments, and we consider the quantum Fisher information (QFI) of the (pure) output state |ψ for measuring an unknown phase φ. This is given by [7] Here |ψ φ = exp(inφ)|ψ is our state after the phase shift φ is applied, and |ψ φ ≡ ∂ ∂φ |ψ φ . The QFI is useful because it can be used to tell us the precision with which our state |ψ can measure the phase shift φ. This is found using the quantum Cramér-Rao bound (CRB) [7]: where δφ is said precision, and µ is the number of repetitions of the experiment. Note that |ψ φ = in|ψ φ . We can therefore write Eq. (1) independently of φ, for example In this way, we can calculate the QFI directly from the output state of our circuit, |ψ , without needing to perform any differentiation, which improves the performance of our algorithm.
The actual fitness function we use here is the QFI scaled byn (the mean photon number of the output state), F Q /n. If we just used the un-scaled QFI, then the algorithm would just try to make bigger and bigger states, as even a classical state has an unbounded QFI in the large photon-number limit. Such an optimisation would soon produce states that require Hilbertspace truncations that are both too large to simulate, and require unrealistic experimental equipment. Using F Q /n overcomes this problem. (Note that it would also be possible to include a restraint in the fitness function that penalizes the deviation ofn from a given number.) We also use F Q /n to compare our results with alternatives in the literature. While a large QFI does amount to a high precision, it does not give a fair comparison when different states have different intensities (where we define intensity as the mean number of photons in the state). To compare different states the key quantity of interest is the precision that a given state can estimate the phase shift (δφ), but in an attempt to keep the comparison fair we choose to fix the total intensity that each state is allowed to use in the estimation procedure. We label the total intensity (the 'resources') R, which is given by R =nµ, where µ is the number of times the state is sent through the phase shift. By taking R as a constant, the CRB in Eq. (2) can be written as We see that by fixing R, we can use F Q /n to compare the performance of different states.
To judge the success of our results, we will compare the states we produce to a coherent state, which is the optimal classical state, and to a squeezed vacuum state, which is the optimal Gaussian state when there is no noise [39,40]. Fig. 4 shows the comparison of these states with our results for the noiseless experiments found by AdaQuantum. Here we chose to display the squeezed vacuum and coherent states by plottingn against F Q /n. But this is not the only way to compare these states with our results. As an alternative, we can compare the states found by AdaQuantum against a squeezed vacuum state with the largest value of squeezing so far produced in experiments, which, to the best of our knowledge, is the 15dB squeezed state in Ref. [38]. For this state, F Q /n = 66.4 (calculated using [41], with |ζ| = 1.73), which we include in Fig. 4 as the horizontal dotted green line.
In Fig. 4 a different experimental scheme has been found for each of the four different toolboxes we have tested. The most expansive toolbox we studied is named 'Full' in the figure. This includes all of the toolbox elements listed in Table I (and described in Section I A). Note that, for this toolbox, squeezing operators are allowed on arbitrary states. In this toolbox, the strength of the squeezing |ζ| is limited to 1.4 in both the squeezing operators and squeezed states; |α| is limited to 5 in both initial coherent states and displacement operators; and number states are limited to n = 5. The limit to the number of photons resolved by PNRDs is n = 10. The best experiment devised by the algorithm, detailed in Table. II, utilises the squeezing operators and heralds on the detection of 10 photons. The resulting state has a F Q /n approximately 20 times higher than the corresponding squeezed vacuum state, and 100 times higher than the coherent state (and note that the squeezed vacuum state has a larger QFI than both the NOON state and the Holland and Burnett state [41][42][43]). This state even beats the largest squeezed vaccum so far produced in experiments (to the best of our knowledge) [38] by a factor of 6 (though if we had allowed this much squeezing in our runs of AdaQuantum, as opposed to limiting the squeezing to |ζ| = 1.4, then our results would likely be improved significantly).
However, acting with squeezing operations on arbitrary states is extremely challenging at present, so we will move on to toolboxes 'Tool 1' and 'Tool 2'. These are both the same as 'Full' but with the squeezing operations removed. They are very similar to each other, the only difference being that 'Tool 2' has the photon detection limited at n = 10, as in 'Full', and 'Tool 1' has the photon detection limited to n = 6. Running the algorithm with these toolboxes produced states with F Q /n still far higher than the squeezed vacuum and coherent state. Finally, we remove the ideal number measurements to form the toolbox 'No PNRD', which instead includes bucket detectors and multiplex detection consisting of 16 bucket detectors, for up to 6 photon heralding. The resulting experiment found by AdaQuantum uses a heralding measurement on the vacuum |0 0| and produces a state with F Q /n still significantly higher than the squeezed vacuum. The states found by AdaQuantum for all these toolboxes are detailed in Table II, and schematics of a selection of the experiments designed by AdaQuantum are given in Fig. 5.
Why do these states give such large improvements over the alternatives? This can be revealed by looking at the number-distribution of the states. Here we focus on the state labeled 'Tool 2', and leave such an analysis of the other states for future work. By writing the state as in Fig. 6 we plot the log of |c n | 2 against n [81]. The figure reveals that this state has a large contribution of the vacuum, and a small but significant contribution of a huge number of photons, around 60-100. Such a state can be seen as a Schrödinger-cat-like state: the human eye can directly detect less than 100 photons [44], so, arguably, this state can be seen as a superposition of a macroscopic state with the vacuum. More importantly for metrology, such a state has a large QFI because it has a large variance (with respect to the encoding Hamiltonian), whilst These designs produce states that optimise the QFI, as introduced in Section II A. The performance of these states is shown in Fig. 4, and the details of the parameters are given in Table. II. The top left design uses the toolbox Tool 1, whereas top right uses toolbox No PNRD. Bottom left: A design to produce a state that optimises the QFI, after loss is applied to the measurement and final state, as introduced in Section II B. The performance of this state is shown in Fig. 7, where each loss rate requires different parameters. Bottom right: A design to produce a state that optimises the Bayesian mean square error (BMSE) with 1 repetition (µ = 1), as introduced in Section II C. The performance of this state is shown in Fig. 8, and the details of the parameters are given in Table. IV.   Fig. 4 (schematics of a selection of the experiments designed by AdaQuantum are given in Fig. 5). The contents of each of the toolboxes are described in the text, and p(%) refers to the heralding success probability as a percentage. To clarify the notation used here, the state found with toolbox Tool 1, for example, is engineered by first creating a pair of squeezed vacuum states (with the parameters given in the table). Next, displacement operators act on both modes. A beam splitter is then applied, which entangles the two modes, and finally a 10 photon heralding measurement is performed. The probability of this heralding measurement being successful is 1.19%. The algorithm's flexibility means that we could easily re-run it to search for states with a higher heralding probability, if this was desired.
retaining a small mean number of photons. This state can therefore be seen as a realistic version of the so-called ON state, which has already attracted much attention in metrology [45,46]. This state is also useful for quantum computing with continuous variables [47]. cn|n , we plot the log of |cn| 2 against n. We see that this state is a superposition of the vacuum with a large number of photons, and hence can be seen as a Schrödinger-catlike state. To see how this state is made, see Table. II.
this noise is applied the resulting state will be a mixed state ρ. We then apply the mixed-state QFI to this state (again scaled by the mean photon number in the state, n). To calculate the mixed-state QFI, we first need to apply the phase shift to ρ, giving ρ φ , and then find the eigenvalues and eigenvectors of this state by writing it as ρ φ = m q m |ψ m ψ m |. We then use the form of the mixed-state QFI developed in [48]: where F Q i is the pure state QFI of eigenstate |ψ i . We use a similar technique as with the pure-state QFI to write the mixed-state QFI so that it does not involve differentiation. It should be noted that we deal with pure states, in the form of vectors, until the measurement stage, when we switch to the density matrix formalism, as this allows us to use the matrix exponential technique discussed above and in [24].
In the noiseless case, we saw that AdaQuantum found quantum states with dramatic improvements over the squeezed vacuum and coherent state. Noise will inevitably diminish such improvements, but up to now it has been an open question whether it was possible to improve over the squeezed vacuum and coherent state at all when realistic experimental equipment and noise are factored in. Note that the squeezed vacuum and coherent state do not require heralding, so they escape the effects of imperfect heralding.
As discussed above, imperfect heralding can be modeled by applying loss to the state prior to heralding. For i.e. these states are formed by sending two squeezed vacuum states through a beam splitter, followed by 2-photon heralding. Note that, despite the fact that the states found by AdaQuantum have the same form for all loss rates, we optimised AdaQuantum for each loss rate separately. I.e. we first fixed the loss rate, then ran AdaQuantum to find states to maximise the QFI for that rate.
simplicity, we fix the loss to be the same for both the heralding measurements and on the output state. Unlike the noiseless case, here we are specifically focused on experiments that can be realistically performed without specialised equipment. Therefore, the toolbox we use here does not use squeezing operations and has the following limits to the parameters: n = 4 for Fock states; |α| = 5 for coherent states and displacement operator; |ζ| = 1 for squeezed states; and photon number resolving up to n = 6. The results produced for varying values of loss are shown in Fig. 7. The value of F Q /n for each of these results is higher than the squeezed vacuum at the same loss rate -this opens up the possibility for experiments to create these exotic non-Gaussian states, and use them to beat the squeezed vacuum state in phase estimation. The heralding probabilities to produce the desired output states are quite reasonable (around 10-20%) and so the experiments proposed here can be carried out experimentally in reasonable time. Note that this experiment requires a 2-photon heralding measurement (which can be performed with multiplexed bucket-detectors) and single mode squeezed states, which can all be achieved with current or near-future technology (for example see [20-22, 49, 50]).

C. Fitness function 3: Beyond the QFI
The third fitness function that we optimise here is the Bayesian mean square error (BMSE). Despite the importance of the QFI as a method of quantifying the metrological performance of different states, in general its usefulness depends on the possibility of recasting the problem at hand in the language of the local approach to estimation theory [9,[11][12][13]51]. In particular, approaching the CRB in equation (2) typically requires either having certain prior knowledge and repeating the experiment a large amount of times [9,10], or just having a large amount of prior information [11,12]. But in realistic scenarios the number of repetitions of the experiment can be small, and the formalism based on the QFI does not take into account the possibility of having a moderate amount of prior information [13]. The BMSE factors in both of these, and hence gives a reliable measure of the phase-measuring capability of a given state in a realistic experimental setting.
To use the BMSE, unlike when we use the QFI, we need to specify what measurement scheme will be used to extract the information about the phase shift from the probe state. For the results in this paper each run of AdaQuantum produces a single-mode state |ψ , which cannot be used on its own to estimate a phase shift because in experiments we can only access the information about relative phase shifts. One approach we can use is to take a pair of such states, |ψ ⊗ |ψ , and then encode a difference of phase shifts as |ψ θ = exp[−i(n 1 −n 2 )θ/2]|ψ ⊗ |ψ . A measurement can then be performed on both modes, which for simplicity we take to be ideal and given by the POVM elements {|M M |}. Examples of measurements are given below.
The probability of finding the outcome M given the unknown value of the phase is p(M |θ) = || M |ψ θ || 2 , and if the previous preparation and measurement stages are repeated µ times and the outcomes M = (M 1 , . . . , M µ ) are recorded, then their probability is p(M |θ) = i p(M i |θ). Furthermore, imagine that we know in advance that the phase can only be found within an interval of width π/12 centred around zero, a state of knowledge that can be represented by the prior probability p(θ) = 12/π, for θ ∈ [−π/24, π/24], and zero otherwise. We can then combine all these pieces of information using Bayes' theorem, which gives us the posterior probability p(θ|M ) = p is the variance of the posterior probability. We note that since the posterior p(θ|M ) contains the information about the phase, the BMSE can be understood as an optimal way of quantifying the quality of this information on average for a given POVM. In particular,¯ mse is a measure of uncertainty. Further details and the justification of this framework can be found in [9,11,13,52]. The task given to AdaQuantum is, in this case, to find single-mode states |ψ that minimise¯ mse for a given number of repetitions and measurement scheme. Since the calculation of the BMSE is more demanding than finding the QFI (see, e.g., [9]), we have chosen a narrow prior to simplify the calculation of the integrals in equation (8). However, this does not imply that we could have used the QFI instead, since it was argued in [13] that the prior width needs to be 0.1 or smaller if the QFI is to be a suitable figure of merit in a similar scheme, and here the width is π/12 ≈ 0.3. Thus our choice corresponds to the regime of intermediate prior knowledge where the BMSE is useful [13]. In addition, note that, unlike for the QFI, we do not need to rescale the BMSE by the average photon-number because the BMSE is already a direct measure of the estimation performance. To keep the comparison fair, all states in this section contain an average photon number of 1.
We will optimise the BMSE using two different strategies, both using the same toolbox as for the mixed-state QFI above. First we focus on a specific and practicallymotivated POVM: counting photons after the action of a 50:50 beam splitter, and we set the algorithm to optimise the BMSE for µ = 4, µ = 8 and µ = 12 repetitions. This search produces a state that takes the form |ψ = N n|Û 12Ŝ12 |0, 0 for µ = 4 and µ = 12, where N is the normalisation, while for µ = 8 we find |ψ = N n|P 1Û12Ŝ12 |0, 0 , where P 1 is a phase shift in the first mode. Table IV provides the numerical parameters that generate these states. The uncertainty associated with two copies of these probes for each number of trials has been represented in Fig. 8.a (individual points), and we have also included the BMSE of both an interferometer with a squeezed vacuum per port (dashed line) and the configuration that mixes a coherent state and the vacuum with a 50:50 beam splitter (solid line) [82]. This figure shows that the states found by AdaQuantum perform better than these two benchmarks, which constitutes a clear demonstration that AdaQuantum is able to optimise a Bayesian figure of merit beyond the regime where the QFI is useful.
More concretely, we can quantify this improvement by introducing the quantity   (9) to quantify the enhancement of the states found by AdaQuantum with respect to two squeezed vacuums (SV) and a coherent state mixed with the vacuum (CS). The details of the experimental configuration are those indicated in Fig. 8 and in the main text.
where¯ r is the BMSE of any of the two reference states that we are employing and a positive I r indicates that there has been an improvement. Its calculation, whose results are summarised in table III, shows an enhancement of between 2% and 7% with respect to the squeezed vacuum, and between 5% and 15% with respect to the coherent state.
The second strategy is to perform an analytical optimisation over all possible POVMs first, and then set AdaQuantum to find experiments that optimise this strategy. (Note that during AdaQuantum's search, the optimal measurement scheme for estimating the parameters has been selected -AdaQuantum just optimises over the state-engineering part of the experiment, i.e. the arrangement of elements in Fig. 2). Following [13], first we recall that the single-shot BMSE satisfies [53][54][55] where Sρ + ρS = 2ρ, ρ = dθp(θ)|ψ θ ψ θ | andρ = dθp(θ)θ|ψ θ ψ θ |. This bound can always be saturated when the measurement scheme is given by the projections onto the eigenstates of S [53][54][55], where S is an operator that only depends on the transformed state and the prior. Therefore, we can set the algorithm to search for states that minimise¯ mse (µ = 1, |M M | = |s s|), where {|s } are the eigenvectors of S. Here we find another state with the form |ψ = N n|Û 12Ŝ12 |0, 0 but with different parameters (see table IV).
As table III shows, the state found by AdaQuantum for the optimal single-shot measurement is 3% better than two squeezed vacuums measured by their correspondent single-shot POVM, and 4% better than the coherent states. In addition, we note that using this measurement scheme in a sequence of repeated experiments is an appropriate strategy when we cannot or we do not wish to correlate different trials [13]. The performance of this state for the first 20 repetitions of the scheme, which has been represented in Fig. 8.b, shows that the state found by AdaQuantum using the optimal single-shot POVM is better than the benchmarks even when the number of repetitions grows. To summarise, we can say that the combination of AdaQuantum and the methodology introduced in [13] provides a robust method to find practical probe states with a strong performance for those systems that operate in the regime of limited data.  IV: Details of the circuits produced by AdaQuantum using the Bayesian framework. The first two are for a photon counting measurement after the beam splitter and that the last one is for the optimal single-shot POVM.

III. DISCUSSION AND CONCLUSIONS
This paper is not the first to apply techniques from machine learning to the task of designing quantum experiments [1-5] -we give a full comparison to these alternatives in Appendix D.
The results in the final section on the BMSE may have important general consequences for quantum metrology. The formal solution to the optimisation problem posed by a general quantum estimation scheme has been known for some time (see [54]). However, finding an analytical form of this solution for specific problems is challenging and generally not possible, which explains why we usually rely on bounds such as the CRB. And while in a sense the limits generated by the latter can be seen as fundamental [12], it can be argued that this is only useful when the bound can be safely applied to a problem in practice, which is not always the case (because realistic experiments put limits on the number of experiment repetitions and the prior knowledge available) [9,10]. Therefore, the fact that our algorithm is able to find useful metrology protocols with more general fitting functions such as the BMSE or its single-shot optimum in equation (10) allows us to see it as a promising route to design quantum experiments using more reliable and general figures of merit, including, for instance, other cost functions different from the square error, or even multi-parameter systems [33,56].
One clear direction for future work is to extend our runs of the algorithm to more than 2 optical modes. We attempted to do this in this project, but the additional simulation time required for > 2 optical modes meant that our global search was not effective. For 2 optical modes, most of the results presented here were generated by running the algorithm for 96 hours on 16 cores of the University of Nottingham's High Performance Computing facility. Running for such times allowed us to run the genetic algorithm with very large populations (in the order of thousands or tens of thousands). When moving to 3 modes, the exponential slow-down in computing time required to simulate this larger Hilbert space meant that such large populations were not possible. To overcome this, one main focus of our future work will be to significantly enhance the global search, by, among other things, exploring a range of metaheuristic search methods, and by improving the way the genome is encoded. We are confident that this will allow effective searches for quan-tum experiments in at least 3 modes, with the potential of finding quantum states with significant enhancements.
A further direction is to extend the fitness functions optimised by the algorithm. Ongoing work is searching for experiments to produce a range of specific states (such as squeezed cat states [57] and GKP states [6]). Beyond this, there are a broad range of fitness functions that could easily be incorporated, including searching for non-Gaussian states, highly entangled states, and states with a negative Wigner function (our GitHub page [27] includes a user guide that explains how new fitness functions can be added).
Another interesting question for future work is to understand why the experiments we presented here perform so well in maximising their respective fitness functions. This might be especially revealing in the states that are robust to noise: why do these specific experimental arrangements found by AdaQuantum create states that still perform well with photon loss?
In conclusion, we have introduced a genetic algorithm for designing quantum optics experiments. We demonstrated the flexibility of our algorithm by optimising three different fitness functions, and in all three the algorithm found improvements over the commonly used alternatives. Perhaps most notably, the algorithm found a realistic method of producing a Schrödinger-cat-like state comprised of a superposition of the vacuum with a large number of photons (around 80), which displays substantial phase-measuring improvements over the competing strategies. We emphasise here that our algorithm can in principle be used to design experiments to produce optical states with any desired properties, so long as these properties can be quantified and hence optimised over. Our algorithm, together with related work in the literature [1][2][3][4][5], highlights the power of utilising methods from artificial intelligence and machine learning to design and optimise quantum experiments.
tum Technology Hub in Quantum Enhanced Imaging (EP/M01326X/1). PK acknowledges support from the Royal Commission for the Exhibition of 1851.
The first measurements we include is measuring the number of photons in a mode -photon number resolving detection (PNRD) -which is implemented with the POVM element |n n|. However, it can be difficult to precisely measure the number of photons so we also include two more-easily-implementable measurements. The first of these is a bucket detector (aka on/off, threshold, or click detector). This measures whether at least one photon is absorbed by the detector (it 'clicks'), but it is unable to determine the exact number of photons. It is represented by the POVM, {|0 0|, I − |0 0|}, where I is the identity. These are commonly found in laboratories, for example in the form of avalanche photodiodes [19].
If exact PNRDs are not available, an approximate photon number measurement can be implemented by using several bucket detectors in a multiplex detector. This is achieved by separating the mode you want to measure (the target mode) into several modes via spatial-or time-multiplexing. The former can be implemented via a series of beam splitters, as in [21], or by optical fibre splitters, as in [20]. The latter can be implemented using a series of interferometers with paths of varying length, as in [22]. A bucket detection is then performed on each of these separated modes. A perfect PNRD is obtained in the infinite limit of multiplexed bucket detectors, but otherwise there will be a non-zero probability that more than one photon enters the same bucket detector in a given run, resulting in uncertainty in the exact number of photons entering the detector.
We use the following POVM, from [20], to simulate this multiplex detection, which takes into account all possible coincidence detection patterns. Here, r is the number of bucket detection events and c is the actual number of photons in the target mode. The weights w r (c) are non-negative and satisfy r w r (c) = 1. These rely on the number of bucket detectors, d, and on S(c, r), which is the Stirling number of the second kind, which counts the number of ways of partitioning c objects into r non-empty subsets. More details on constructing this POVM are found in the Supplemental Material of [20], which uses methods from [61] to compute the weights. The POVM elements E r are given by In our simulations we use a multiplex detector consisting of either five [21] or sixteen [20] bucket detectors, but again this can be easily altered (the choices to use five and sixteen bucket detectors wasn't due to any fundamental significance to these numbers, but rather because they have been used in experiements [20,21]).

Coherent states and displacement operators
Coherent states are among the most readily available states as they are produced by a stabilised laser operating well above threshold [62]. To use them in our toolbox, we must construct them in the Fock basis. They can be defined by the action of the displacement operator on the vacuum, |α =D(α)|0 , wherê for α ∈ C, and whereâ (â † ) is the annihilation (creation) operator. Alternatively, the following description of a coherent state in the Fock basis can be derived using the normally ordered form of the operator, described in [62]: In our simulation we have to truncate the Hilbert space, and therefore in order to simulate accurate states and operators we must limit the magnitude of |α|; here we generally use a maximum value of |α| = 5.

Squeezed states and squeezing operators
Squeezed states have a lower variance (in some quadrature) than coherent states, which makes them useful in a wide variety of applications, such as optical metrology [8]. The single mode squeezed vacuum state is defined by where ζ ∈ C andâ † i andâ i are the creation and annihilation operators on mode i [62]. Similarly, the two mode squeezed vacuum state can be defined by |ζ ij =Ŝ ij (ζ)|0 i |0 j , whereŜ ij (ζ) is given bŷ where, again, ζ is a complex number which defines the state. As with coherent states, the normally ordered forms of the squeezing operators give rise to a construction of the states in the Fock basis [62]. Both the single mode and the two mode squeezed vacuum states can be created by acting on the vacuum with nonlinear optical elements, such as optical parametric oscillation or four-wave mixing. They are further related because the action of a beam splitter on a two mode squeezed state is to produce two single mode squeezed states [62].
Squeezed states are harder to generate than, for example, coherent states, but they are still feasible and are an important resource in many experiments, so we include these in our toolbox and in most of our runs of the algorithm. We also include squeezing operators in the toolbox, although we omit these from most of our runs, as implementing a squeezing operator on an arbitrary state is difficult and rarely available in laboratories, although it is possible [63].

Beam splitters and phase shifts
The phase shift operation is given by exp (inφ), wheren =â †â is the photon number operator. This is often straightforward to implement as a phase shift can be induced on a mode through changing the length of the optical path it has to travel, when compared to the other modes in the system, for example using a thermo-optic effect [64].
The beam splitter is a vital element of our toolbox as it allows interactions between the modes. They are also readily available, and can be implemented by a partially reflecting mirror (e.g. a half-silvered mirror) or an interface between two glued-together glass prisms, among others [65]. The beam splitter operation is given bŷ where the probability of a photon being transmitted through the beam splitter is T = cos 2 θ [66]. A '50:50' beam splitter, which mixes the two input modes in equal superposition, refers to the case where T = 0.5.

Homodyne detection
To define homodyne detection we first require the generalised quadrature operator [62]: where λ is the quadrature angle. Setting λ = 0 and λ = π/2 in the above expression gives the usual 'position' and 'momentum' quadrature operators, respectively. Homodyne detection is a projection onto the eigenstates of these quadrature operators. However, it is not possible to construct normalised eigenstates of the quadrature operators. Instead we construct states, |x λ , defined by a complex number x λ = |x λ | exp iλ, which obey the eigenvalue equation These are not orthogonal but instead their overlap is given by the Dirac delta function These eigenstates may also be constructed by operating on the vacuum state as follows [62] |x λ = π 1/4 exp − 1 2 Perfect homodyne detection is the projection onto a line in phase space, characterised by the projector |x λ x λ |.

Simulating noisy experiments
Photon loss can be modelled by mixing the noisy mode with the vacuum using a beam splitter of transmissivity T = 1 − γ (and hence loss rate γ) [66]. However, this is computationally expensive as it requires simulating an additional mode, so here we simulate loss using Kraus operators as follows [67] where ρ out is the output state, |ψ in is the input state, and The Kraus operator K k represents the loss of k photons to the environment with loss rate of γ.
We can include loss in the measurements by modifying the POVM elements to [68,69] For the bucket measurement, we use [69,70] where E 0 corresponds to a measurement of no photons and E 1 is the measurement of one or more photons. Finally, for the multiplex detector, we construct a set of POVMs by using Eq. (A1) but with |c c| replaced with E n=c from Eq. (A12).
The reproduction step creates genomes of the next generation ('children') using the parents of the current generation through three methods. The first are elite children, which form only a small percentage of the next generation. These are the parents with the highest fitness scores, which survive to the next generation unchanged. The second are crossover children. Here, two parents are chosen at random and each gene of the child is produced by copying the gene from one of the two parents. The final method to create children is mutation. Here, a single parent is chosen and random changes are made to its genome to produce the child. In Section C 2 we introduce the specific mutation and crossover functions we use.
These children together form the next generation and the process repeats until the stopping conditions are met (see [73] for a visual introduction to genetic algorithms). Possible stopping conditions include meeting a maximum number of generations over which the best fitness score does not change, within tolerance. There is no proof that genetic algorithms must converge [23], but they have been known to perform well in situations where standard, gradientbased, optimisation algorithms have failed. In quantum circuit design, genetic algorithms have been successfully used to design quantum logic gates [74]. In other fields, examples of problems where genetic algorithms have been used include a NASA design of a radio antenna to pick up signals in space [75], computer models for walking for bipedal creatures [76] and optimising the aerodynamics of hypersonic space vehicles [77].
Appendix C: Running AdaQuantum to obtain our results

Parameters that affect the running speed
In general it is not easy to say quantitatively how the speed and performance of the algorithm changes when changing different parameters, such as the number of modes or the size of the selected toolbox (which in turn determine the genome length). The reason for this is that the dominant contribution to the run-time is the truncation. But, as discussed in Section I C, in Stage 3 of our algorithm the truncation changes with each simulation.
With this in mind, increasing the number of allowed operators (labeled m in Fig. 2) in itself would (approximately) linearly increase the run-time. But in addition, having more operators allows for the creation of larger states (e.g. by applying multiple displacement operators), so the point where we truncate in general needs to increase to accommodate the higher-energy states that will be produced. For the most of the results in this paper, we allowed for m = 3 operators. We also ran the algorithm for m = 5 for some of the no-loss runs, but didn't see any improvement. One important point is that we included the identity in all runs of the algorithm, which explains why many of the results in this paper only have 2 operators. Another dominant factor in the simulation speed is the toolbox selection, particularly in the choices of parameters. Allowing for large states and operators that increase the energy slows down the simulation significantly. However, in our runs we always limited the size of the output state to either 1 or 2 photons on average (this can be done in the user interface for selecting the fitness function), which means that states that are too large after the measurement are discarded anyway.
All the runs of AdaQuantum presented here are for two-mode experiments. We attempted three-mode runs but were unable to outperform the two-mode experiments. The reason for this is that the simulation becomes exponentially slower when we increase the number of modes, and furthermore the search space becomes much larger. As elaborated on in Section III, improving the genetic algorithm so that it can effectively search for three-mode experiments will be a central focus of our future work.

Hyperparameters and settings for the genetic algorithm
Mutation and crossover are central to genetic algorithms, and the choices of mutation and crossover functions, as well as the hyperparameters of these functions, plays a large part in determining the effectiveness of the search. In our runs we experimented with three different crossover functions that are all inbuilt into Matlab [23]: scattered, single point and two point crossover. None of the crossover functions exploit the division into states, operators and measurements, for example by mixing the input states for one experiment with the operators and measurements of another. It would be interesting to see whether exploiting such a division improves the results. In general, it isn't intuitively clear why crossover works, given the nature of quantum optics experiments, but our tests demonstrated that it was an important element of the search process, increasing diversity and enabling a larger space to be explored. Despite this, we suspect that more in-depth tests (that we reserve for future work) might reveal that the advantage of crossover comes more from the occasions when it mixes similar experiments with each other. In the next subsection we give the specific values of the genetic algorithm settings and hyperparameters for the results presented in this paper.
FIG. 9: The user interface that opens when first running our program, allowing the user to choose the elements of the toolbox they have available, and the parameter limit for these elements. After pressing "Next", the user is taken to a similar user interface that allows the fitness function to be specified. After this has been chosen the genetic algorithm runs and generates optimised experiment designs.
The mutation functions we used mutated both the experiment arrangements and the parameters, and therefore our search is performed simultaneously over these quite diverse characteristics of an experiment. Matlab did not have a mutation function with enough flexibility for our purposes, so we introduced two mutation functions, power mutation and power selection. Both are based on [78]. In short, power mutation mutates every gene in the genome by a random distance, whose maximum magnitude is determined by the value of a hyperparameter named power, for which power= 1 will mutate each gene to a completely random new value, whereas power= ∞ doesn't mutate at all. power selection only mutates a fraction (given by rate) of the genes. See [27] and [78] for further details.
Throughout we used tournament selection to select the parents for the next generation.

Obtaining the results in this paper
We mainly optimised our search algorithm using the toolbox labelled 'Tool 1' in Fig. 4 and Table. II. The main reason for this being that this allowed us to compare directly with the results in [1]. For our initial runs, using Matlab's inbuilt genetic algorithm with a population size of 200, the search often converged to the same results in [1]. But after increasing the population size, using our 3-stage algorithm, and optimising the hyperparameters, we soon found the greatly-improved results presented in this paper. Having found a variety of genetic algorithm settings and hyperparameters, we then ran AdaQuantum for all the toolboxes presented in this paper. To obtain each result, we ran the AdaQuantum for 96 hours on 16 cores of the University of Nottingham's High Performance Computing facility. For most of the toolboxes and fitness functions, we ran the algorithm twice with different settings/hyperparameters, then selected the better of the two results. The settings and hyperparameters are shown in Table. V. The population sizes in stages 1, 2 and 3 were 10 7 , 10 5 and 2x10 4 , respectively.
It is important to stress here that the focus of this paper is not on optimising the genetic algorithm. Rather, our goal was to create a search algorithm that worked reliably for a number of different fitness functions, and to find a range of hyperparameters and settings such that the algorithm works consistently. Researchers who use AdaQuantum with  a different fitness function will likely have to experiment with different settings/hyperparameters, but our choices in Table. V should serve as a useful guide. In ongoing work we are performing a thorough optimisation of the settings/hyperparameters using a range of toolboxes and fitness functions, and comparing this to a range of different search algorithms. We made a number of observations when running the algorithm. Some of the runs for pure states (in particular for the larger toolboxes) did not fully converge after the 96 hours time, suggesting that improvements over the results here might be possible. Furthermore, for most of the pure-QFI fitness functions, different runs of the algorithm often produced different experiments with similar fitness values, suggesting there are whole classes of states with large QFI values. In contrast, for the mixed state runs and the BMSE the algorithm often converged quickly (long before the 96 hour running time) to the same or similar experiments, even for a range of hyperparameters/settings. cannot be directly applied to their research, and vice versa, though there is no fundamental reason why future work cannot apply global search algorithms to their setting, and reinforcement learning to ours.
Finally, in Refs. [4] and [5] a machine learning algorithm can be used to optimise the parameters in an optics experiment. [4,5] have the advantage that they use gradient decent to learn the parameters, rather than our stochastic algorithm, but this seems to come at the expense of the speed of their simulation (we have not done a full comparison, but the online version of the algorithm they use, Strawberry Fields [26], has to truncate the Fock basis at 20 photons). Furthermore, we optimise over the arrangement of elements in addition to the parameters; our algorithm is specifically designed so that the fitness function can be easily modified; and we have realistic experimental setups in mind (though we believe these latter two features could be incorporated into [26]).