Extreme values of CUE characteristic polynomials: a numerical study

We present the results of systematic numerical computations relating to the extreme value statistics of the characteristic polynomials of random unitary matrices drawn from the Circular Unitary Ensemble (CUE) of Random Matrix Theory. In particular, we investigate a range of recent conjectures and theoretical results inspired by analogies with the theory of logarithmically-correlated Gaussian random fields. These include phenomena related to the conjectured freezing transition. Our numerical results are consistent with, and therefore support, the previous conjectures and theory. We also go beyond previous investigations in several directions: we provide the first quantitative evidence in support of a correlation between extreme values of the characteristic polynomials and large gaps in the spectrum, we investigate the rate of convergence to the limiting formulae previously considered, and we extend the previous analysis of the CUE to the C$\beta$E which corresponds to allowing the degree of the eigenvalue repulsion to become a parameter.


Introduction and background
Questions related to quantifying statistical properties of high and extreme values taken by the characteristic polynomials of random matrices have recently attracted considerable attention [2,11,22,23,25,28,31,36]. The main motivation was the suggestion of a close analogy [22,23] between the statistics of the (logarithm of) the modulus of characteristic polynomials of large random matrices [26] (originally, from the circular unitary ensemble, or CUE, of random matrix theory) and an important class of log-correlated random processes and fields, namely those characterised by having a logarithmic singularity on the diagonal of the covariance kernel, which have been the focus of considerable attention in the past few years. Such processes and fields appear with surprising regularity in many different contexts, ranging from the statistical mechanics of branching random walks and polymers on trees [9,13] and disordered systems with multifractal structure [10,17] to models of random surfaces underlying the probabilistic description of two-dimensional gravity [32,39]. In particular, the extremal values of log-correlated Gaussian processes have been the subject of intensive study, leading to considerable progress ranging from non-rigorous [18,19,21], through tending-to-rigorous [34,35], to fully rigorous [14,38,40,41] analysis.
The characteristic polynomials of random matrices are of considerable interest in their own right, but in addition they underpin an influential model [29] for the statistical properties of the Riemann zeta function on the critical line [28]. One of the main points of the articles [22,23] is that on the global and mesoscopic scales, one should think of the logarithm of the zeta function on the critical line as behaving statistically like a log-correlated field. Large values of the Riemann zeta on the critical line are of considerable and long-standing interest [28], and this new perspective has attracted a good deal of attention [1,3,31,37].
It is of course important to make clear that every step of the translation should be taken with appropriate adjustment and caution. Log-correlated fields are necessarily random generalised functions (distributions), and establishing in which way the logarithm of the modulus of a characteristic polynomial tends to such a highly singular object is a non-trivial task (see, for example, [26]). Such studies necessarily involve regularisations, and one of the most natural rigorous frameworks seems to be provided by the theory of multiplicative chaos, which has its origins in informal ideas of Mandelbrot [33] which were developed into a comprehensive, mathematically rigorous theory in works by Kahane [27], and have been further extended in recent years [6,39]. The convergence of characteristic polynomials and the Riemann zetafunction on the critical line, at the appropriate scale, to the Multiplicative Chaos measures was established in several recent papers [7,42]. Together with other related results, e.g. [24,36] this provides strong support for the correspondence in question.
Despite these developments, up to now only a few preliminary attempts have been made to investigate the predictions of the theory by direct numerical simulations of large random matrices [22,23,25,28]. In this paper we present the results of an extensive numerical investigation of large CUE matrices when the matrix dimension N = 2 M is large. In our experiments we diagonalised numerically more than 10 7 matrices for M = 2, 3,4,5,6,7,8,9 and somewhat fewer matrices for M = 10, 11 and M = 12 (more than 10 6 for M = 10, more than 10 5 for M = 11, and 50 000 for M = 12), and extracted the relevant information from their spectra.
Our first goal is to check manifestations of the main mechanism underlying the extreme value statistics of log-correlated processes, the so-called 'freezing transition' [9,18,38]. In fact, some features of this phenomenon are seen already at the level of sequences of independent (rather-than log-correlated) random sequences. Indeed, it was first discovered at the level of the so-called uncorrelated random energy model (REM) introduced originally by Derrida [12], which is a toy model for such a freezing transition. In REM the energy values {E n } N n=1 , are taken to be i.i.d. random variables with a Gaussian distribution where J > 0 is a global energy scale. Defining for a temperature T > 0 the REM partition function as Z N = N n=1 e −En/T one then can study the associated free energy F = −M −1 T log Z N , and show that its mean value becomes independent of T below some finite critical temperature T = T c .
Although the log-correlated models share the existence of a freezing transition with the simple REM, finer features implied by freezing are quite different, and the two models actually belong to different universality classes. The difference is reflected, in particular, in much broader fluctuations in the number of points in the process exceeding a high threshold in the log-correlated case [20], and eventually in the statistics of extreme values. From that angle, we always compare the results for CUE simulations with the corresponding numerical simulation of the simple REM at comparable ensemble sizes (and beyond). Our investigations shed some light on the convergence properties of the extremal value distribution, the distribution of partition function moments, and the associated mean free energy.
Arguably, the most well studied, and, in a sense, paradigmatic example of a 1D processes with logarithmic correlations is the so-called Gaussian circular-logarithmic model suggested originally in [18]. The process has a simple representation as a (formal) Fourier series given by the real part of ∞ n=1 v n e −int √ n , with i.i.d. complex random Gaussian coefficients v n with zero mean and unit variance. The same object can be, with due interpretation, viewed as a 2D Gaussian free field defined on a disc with Neumann boundary conditions and sampled along a circle of unit radius, see [4,40]. As such, it has intimate connections with conformal field theory, and appears in that context very naturally [4]. This line of research very recently provided the first rigorous proof [40] of the conjectured explicit distribution of its extreme values, see equation (13) below. Fortunately, it is precisely the model which is expected to represent the limiting statistics of the log-mod of CUE characteristic polynomial, so will serve as a benchmark for our numerics.
Let us finally mention that, given the widespread interest in the family of β−ensembles of symmetric three-diagonal random matrices 5 introduced by Dumitriu and Edelman [15] and their circular analogues (CβE) [30], it seems natural to ask if the large-N statistics of the extreme values for characteristic polynomials in this family will have similar properties to the β = 2 case for any fixed β > 0. The first steps in this direction were taken by Chhaibi et al [11] who proved that, up to a simple rescaling of parameters, the first two (non-random) terms in the asymptotics of the maximum value, see equation (15), are indeed common to all members of the family. In appendix B we extend to arbitrary β > 0 heuristic computations given in [23] for the CUE. On this basis we conjecture that the distribution equation (13) of the first nontrivial random term should be also universal with respect to changes of the parameter β.
The structure of the paper is as follows. We introduce some notation and definitions in section 1.1. We present our numerical results for the extreme value statistics in section 2. In section 3 we present data supporting the conjectured freezing transition for the free energy. We summarise our conclusions and outlook in section 4. The paper has two appendices. In the first appendix we compare our numerical data with some recent exact formulae for the moments of the partition function. In the second appendix we present a heuristic calculation for the CβE which generalises that given in [23] for the CUE.

Setting: extreme values of the characteristic polynomial, moments and free energy for CUE and REM
Let U N ∈ U(N) be a unitary N × N matrix taken at random from the CUE, i.e. uniformly with respect to the Haar measure on U(N). Denoting its eigenvalues by {e iφn } N n=1 , its characteristic polynomial is given by (1 − e i(φn−θ) ). (2) We will be interested in the statistical distribution of |p N (θ)| with a focus on its extremal (maximal) value Let us express the square of the maximum as |p N | 2 max = e −aN +bN y where a N and b N are coefficients that depend only on the matrix size N which will be discussed later in detail. The limiting distribution, as N → ∞ of the variable y has been in the centre of recent conjectures [22,23] that will be described more fully in section 2. For a detailed description of the technical background we refer the reader to [23]. In the present paper we will just describe the setting, the main definitions and the main conjectural predictions without much detail how these were obtained. We then focus on the detailed description of the comparison of numerical data with these predictions.
The distribution of absolute values |p N (θ)| of the characteristic polynomial and its extremal values may be characterised in terms of the moments where V N (θ) = −2 log |p N (θ)|. In analogy to the partition function in statistical physics we refer to β as the inverse temperature and introduce the normalised free energy The additional normalisation with log N is to ensure the existence of a finite limiting value as N → ∞. The extreme values are obtained in the low temperature limit β → ∞ of the free energy via Our numerical investigation focuses on the statistical properties of log |p N | max and the moments Z N (β). Both properties only depend on the spectrum of CUE matrices. Numerically we used the fact that the eigenvalues of CUE matrices have the same joint probability distribution as explicitly known ensembles of banded (5-diagonal) unitary matrices [30] which are much easier to construct than full CUE matrices. We used standard octave/matlab routines for random number generators and numerical diagonalisation.
As mentioned above we compare our results to REM in order to show the characteristic differences between the two universality classes (and for benchmarking). In REM we set the energy scale to throughout this manuscript. This choice ensures that the critical inverse temperature for the freezing transition in REM is β crit = 1 in coincidence with the critical inverse temperature in CUE.
We will compare the statistics of extreme values of the characteristic polynomial of random unitary matrices in the form −2 log |p N | max to the statistics of ground state energies E min = min N n=1 E n for realisations of the REM (due to the symmetry of the model this is equivalent to comparing the maximal energy with 2 log |p N | max ). For convenience (in order to have the same notation for CUE and REM) we set log |p N | max = −E min /2 when considering REM.
Analogously we compare moments Z N (β) for CUE with partition sums with the normalised free energy 6 Our choice of parameters and normalisation implies that the theoretical prediction for the expected free energy in CUE and REM follow the same curve

The distribution of extreme values
The logarithm of the maximal values |p N | 2 max for REM is known to be distributed according to a Gumbel distribution, see e.g. [8]. To be more specific, after an appropriate rescaling the integrated probability distribution for the random variable y converges to In the case of the CUE, appropriate rescaling leads to an integrated distribution [18,22,23] where K 1 (x) is the modified Bessel function of second kind and order one. The Gumbel and the CUE distributions are related by a simple convolution In other words if y 1 and y 2 are two independent Gumbel-distributed random numbers then their sum y = y 1 + y 2 follows the CUE distribution. The scaling parameters obey (15) in both cases. The constant c however takes different values c = 1 2 for REM; 3 2 for CUE.
The different value of this constant is a key signature of the long range correlations in the CUE model. We tested this numerically by rescaling the data according to (11) using a N ≡ a fit and b N ≡ b fit as fitting parameters such that the rescaled data has the same mean and variance as the Gumbel distribution (for REM) or the CUE distribution (for CUE). From the fitted parameters we evaluated the quantities From (15) we see that c fit should converge to c with corrections of order o(1/ log log N) while d fit should be of order O(1). In figure 1 we plot c fit and d fit . The plots are consistent with the expected behaviour. Indeed, allowing deviations of order O(1/ log log N) the data is consistent with c fit → 3/2 for CUE and c fit → 1/2 for REM (the difference between O(1/ log log N) and o(1/ log log N) is too delicate to be resolved numerically). A more detailed analysis of the distribution may be obtained by comparing the integrated distribution functions directly. For this we use the rescaled variable which has vanishing mean and unit variance. In figure 2 we plot the integrated distribution from the REM and CUE data against the predicted REM (Gumbel) and CUE distributions. The latter have been rescaled accordingly to have vanishing mean and unit variance and, with minor abuse of notation, we will write One can see in figure 2 that the predicted curves and the curves obtained from numerical data are all very close to each other. Zooming into the details of the curves reveals that neither the REM curve has fully converged to the REM prediction (at N = 2 20 ) nor has the CUE curve converged to the CUE prediction (at N = 2 12 ). We have analysed the convergence of the integrated distributions by considering the differences of the data (I N (x)) for REM and CUE and the three predicted curves. In figure 3 we plot these differences for increasing values of the size N of the eigenvalue spectrum. For REM the trend in these differences is clearly consistent only with the REM prediction. For CUE the picture is less clear because it was not feasible to diagonalise matrices larger than N = 2 12 . We 'only' diagonalised 50 000 matrices of that size N = 2 12 which gives much larger noise levels compared to smaller values of N (at N = 2 8 we diagonalised 10 7 matrices). Nonetheless the numerics remains consistent with the CUE prediction.
A more detailed analysis of the convergence may be obtained by plotting the maximal dif- figure 4. This confirms again that the REM data is consistent only with the REM prediction. For CUE the available data indicates that the difference to the REM prediction saturates at a finite value. The CUE data shows very slow convergence and the limited data at N = 2048 and N = 4096 results in a large error term-nonetheless the data is consistent with convergence of the CUE data to the conjectured prediction for CUE. Based on the data plotted in figure 4 one may estimate that one may require N × N matrices with N = 2 20 or even larger in order   Difference between predicted integrated density distributions for log |p N | max and numerically obtained distributions for various values of the size N of the eigenvalue spectrum. The data is rescaled to vanishing mean and unit variance using the variable x (see equation (18)). The left panels ((a) and (b)) show the difference between the two predictions and CUE data (for N = 2 4 , 2 8 , and 2 12 ). The right panels ((c) and (d)) show the difference between the two predictions and REM data (for N = 2 4 , 2 8 , 2 12 , 2 16 , and 2 20 ).
to get a clearer support for the conjectured convergence. In order to get a sufficiently smooth integrated distribution function one has to fully diagonalise about 10 6 to 10 7 matrices. While there are specialised algorithms for sparse or banded matrices that obtain a fraction of the spectrum quite quickly we here need the full spectrum and computing the required amount of data is well beyond our limits.
We would like to mention that the extreme value distribution for characteristic polynomials of the Gaussian unitary ensemble (GUE) ensemble has also been discussed recently [25]. GUE spectra have the same logarithmic correlations as CUE spectra and the corresponding distribution of extreme values of characteristic polynomials shows similar deviations from the non-correlated REM spectra as CUE. The predicted curve I GUE (x) is different but very close to the predicted curve I CUE (x) (closer than to the REM curve I REM (x)). The origin of the difference is well understood-finite GUE spectra have Gaussian tails not present in CUE (see for [25] for further details). We have checked whether our CUE data is able to distinguish between the GUE and CUE predictions-however the two are too close to be resolved.

Correlations of the position of the maximal modulus of the characteristic polynomial and the spectrum
The spectrum {e iφn } N n=1 of the unitary matrix U gives the zeros of the characteristic polynomial |p N (θ)|. One may expect that the position θ max where |p N (θ max )| = |p N | max statistically occurs preferably in large intervals that are free of zeros. This kind of correlation may be measured in various ways. Most directly one may consider the level spacing distribution at θ = θ max . For any unitary matrix of dimension N the mean level spacing over its complete spectrum is (trivially) 2π/N . Numerically we find that typical values of level spacings at θ max for CUE matrices are about twice as large for the matrix sizes we used. This is a clear indication for correlation between the position θ max and large spacings for which we will give a more precise description in the following. Indeed our numerical analysis allows for a more detailed analysis of this deviation by considering the full statistical distribution of level spacings at θ max for CUE matrices. While it is very hard to make strong analytical predictions about the properties of this distribution (and we are not aware of any relevant results) it is numerically straightforward from the large amount of CUE spectra that we have computed. For each spectrum we have obtained the scaled levels spacing s = N (φ n+1 − φ n ) /(2π) where φ n+1 > θ max > φ n. The integrated level spacing distribution I ls (s; θ max ) is then the ratio of the number of spectra where the rescaled level spacing is smaller than s over the total number of spectra. In figure 5 we plot this quantity for N = 2 8 , 2 10 and 2 12 and compare it to the Wigner surmise. If θ max was a typical point in the spectrum one would expect to see a distribution close to the Wigner surmise and the expectation value of the level spacing would be close to unity. The plots in figure 5 show however strong deviations from the Wigner surmise and the expected level spacings are much larger than unity (and growing slowly with N). This is evidence for strong correlations between the position θ max of the maximum of the modulus of the characteristic polynomial and the spectrum close to this value.
From the correlations between the position θ max and the increased level spacings at this point one may expect further correlations between the maximal values of the characteristic polynomial and the spectrum. Let θ = θ − θ max ∈ [0, 2π). The spectral counting function may be written as In this form it counts the number of states above θ max and it directly relates the spectrum of U to the characteristic polynomial. The fluctuations in the spectral counting function The three thick (blue) curves give the CUE data.
are expressed in terms of the argument of the characteristic polynomial p N (θ). One may expect that the position of the maximum is correlated with large fluctuations. Numerically we obtain the values θ max and θ min where N fluct (θ) takes its maximal and minimal values. In figure 6 we plot the integrated density I fl (θ) of these values as a function of θ . In absence of correlations one expects a straight line θ /(2π). However, the plots show deviations from a straight line that imply strong correlations between the position of the maximum of the modulus of the characteristic polynomial and the positions of the extrema in the fluctuations of the spectral counting function. The plots are consistent with the previous observation of large level spacings at θ = 0. At the beginning (end) of a large level spacings one may expect that N fluct (θ) has a statistical tendency to be positive (negative). At θ = 0 we have found exceptionally large spacings and thus may expect a statistical correlation that a maximal positive fluctuation occurs before and that a maximal negative fluctuation (i.e. its minimum) occurs just above θ = 0. This is clearly shown in the plotted integrated densities. In addition these plots show that these correlations are long ranged and certainly do not decay on the scale of the mean level spacing 2π/N (the scale for spectral n-point correlation functions in CUE). The correlations we have found numerically point to interesting effects that are currently not understood on a theoretical level. Analytical approaches to these kinds of correlations would be highly desirable.

The free energy and its distribution
In section 1.1 we introduced the partition sum Z N (β) and the free energy F N (β) for CUE and REM. Numerically they are straightforwardly obtained from the eigenvalue spectra that we have obtained for both models (by numerical integration over the spectral angle θ). In figure 7 we plot the expected free energy E[F N (β)] as a function of the inverse temperature β for CUE and REM and some values of N. In both cases we see freezing but the convergence to the predicted curve for N → ∞ above the freezing transition β 1 is quite slow. In the freezing regime the free energy is dominated by the maximal value of the modulus of the characteristic polynomial. In the previous section we have confirmed the prediction that the latter obey 2 log |p N | 2 max ∼ 2 log N − c log log N + o (1) where c = 3/2 for CUE and c = 1/2 for REM (see (16)). For the free energy this implies convergence at a slow rate log log N/ log N in the freezing regime. We test this estimate numerically by considering the difference of the numerically obtained free energy at β = 3 to the theoretical value E[F(3)] = −2. We compare this to a fitted curve of the form where the additional parameter g fit is fitted to the data at N = 2 12 for CUE where g fit ≈ 0.69 and at N = 2 20 for REM where g fit ≈ −0.67. In figure 8 we plot the difference of the conjectured expectation value −E[F(3)] → 2 for N → ∞ and the numerical expectation value at finite values of N. In the plots we have rescaled the difference by a factor log N/ log(log N)). For both models the plots are consistent with a saturation at the value c = 3/2 or c = 1/2 for CUE and REM with higher order deviations ≈ gfit log log N . Next we consider the distribution of the partition function Z N (β) for the CUE model (we do not show that data for the well-understood REM model, because it does not give additional insight to what we have learned from the comparison so far). For inverse temperatures below the freezing transition β < 1 an analytical prediction is available for the complete distribution. For this purpose one rescales the partition function where Z e (β) = N 1+β 2 is the Barnes G-function. In the limit N → ∞ the prediction for the integrated probability distribution of the rescaled partition function is In figure 9 we plot the numerically obtained integrated distribution function I N,β (z) for CUE for β = 0.5 and β = 0.8. For β = 0.5 one sees clear and quick convergence to the predicted curve. For β = 0.8, somewhat closer to the freezing transition, convergence is slower but consistent with the prediction. In the freezing regime β > 1 a theoretical prediction for the distribution as N → ∞ of the partition function is known only in the form of a Laplace transform (see equation (33) in [18]) to the leading order Here the parameter sets the overall scale and ν red (the reduced parameter) is of order O(1). This implies that the appropriately rescaled partition functioñ has a finite limiting distribution I β (z). The Laplace transform of the latter is obtained from (26) by replacing ν → ν red , i.e.  . We plot the curves as a function of z 1/β 2 where z is the appropriately rescaled partition function (see Equation (24)). This makes the predicted limiting distribution independent of β. The different plots are for N = 16, 64, 256, 1024, and 4096 (with increasing strength of color). The dashed black curve is the predicted curve for N → ∞.
scale ν red is not known theoretically and it is not straightforward to extract numerically from data at finite N because of the generally slow convergence of the model and the fact that the distribution at finite N depends not only on the scale factor ν but also on the shape, which is currently not known theoretically. As a practical way to obtain a value for ν red we consider the theoretical expectation value of logz as a function of β and ν red . Numerically the simple scaling implies that one can extract the full dependence of E[logz] on the parameter ν red for any Figure 10. Integrated distribution of the rescaled partition function for inverse temperatures at the critical value β = 1 (green curves) of the freezing transition and in the freezing regime at β = 2 (orange curves) and β = 3 (red curves). We plot the curves as a function of z 1/β where z is the appropriately rescaled partition function (see equation (28)). The different plots are for N = 256, 1024, and 4096 (with increasing strength of color). The dashed lines give the theoretical prediction given by (26) where the overall scale parameter ν red has been fitted by the procedure outlined in the main text. given value of β by obtaining the inverse Laplace transform at ν red = 1. At β = 3 the scaling just gives We compare this to the numerical fit (23) to the data for the free energy shown in figure 8 at the same value β = 3 where the factor 3 on the right hand side is β. This results in an approximate value ν red ≈ e 0.2 ≈ 1.22. We do not claim that our practical approach gives the correct value as N → ∞. We will use this scale for comparing the integrated distributions of z.
In figure 10 we plot the numerically obtained integrated distributions I N,β (z) of z at the critical point (β = 1) and in the freezing regime (β = 2 and β = 3). The curves in the freezing regime are consistent with the existence of a limiting distribution on this scale though the convergence to the predicted limiting distribution may be very slow. Note that the predicted curves contain one fitting parameter ν red . In figure 10 we have used the value ν red = 1.22 obtained from the procedure outlined above. Setting ν red = 1 leads to a limiting curve where the ordinate is stretched by a factor ν 2 red ≈ 1.5. While this looks much closer to the data at the finite values of N we should reiterate that no theoretical predictions about the leading corrections of the shape of this curve are available and one should expect these deviations to decay slowly (e.g. as 1/ log N or even slower). Furthermore note that our numerics indicates that at the critical temperature β = 1 the scaling (28) is no longer valid. This is consistent with the prediction that a different power of log N is expected at the transition point (see equation (45b) of [9]).
From (26) one may deduce that the tails (at large values of the partition function) behave like where the logarithm is related to the strong correlation in CUE. In figure 11 we plot the ratio of the numerically obtained tails 1 − I N,β (z) and the predicted behaviour for β > 1. In order to confirm the prediction one should see the appearance of a saturation at a finite value as z increases. While the available numerical data at finite N cannot confirm this prediction the numerics is consistent with the appearance of such a saturation as N → ∞. Indeed the curves for β = 2 and β = 3 indicate that the behaviour in the tails for finite N may be of the form 1 − I β (z) ∝z −1/β−ν β (N) logz where the additional exponent ν β (N) > 0 decays to zero as N → ∞.

Conclusions and outlook
The numerical results we have presented here are consistent with the conjectures put forward in [22,23]. Taken together with recent proofs of some of these conjectures, there is growing evidence supporting the underlying philosophy that the characteristic polynomials of random matrices behave statistically like logarithmically-correlated Gaussian fields. However, the convergence to the limiting formulae predicted is clearly very slow, and much more extensive numerical experiments will be required in order to examine the details of the various conjectures. The rates of convergence we have found in our computations are suggestive. It would be extremely interesting if they could be verified by a more refined asymptotic analysis than that carried out in [22,23].
We believe the correlations we have found between the extreme values of characteristic polynomials and eigenvalue spacings to be interesting and worthy of theoretical study. There has for some time been a folklore belief that these correlations should exist, but as far as we understand ours is the first quantitative study of them. It may be possible to analyse these by extending the heuristic asymptotic analysis of [22,23] to mixed moments involving both the modulus and the argument of the characteristic polynomial (as in the moment calculations of [29]). It should also be possible to do a similar quantitative analysis for the Riemann zetafunction, where extensive data exist.
It should be clear that the computations we have described here are first steps in what we believe may be a worthwhile new line of research. We hope that that they will inspire further numerical studies as well as new theory.
be the kth moment of the partition function. A conjecture in [22,23] states M N,β,k ∼ N 1+β 2 k 2 . If β and k are positive integers one may calculate these moments directly for arbitrary matrix dimension from the autocorrelation function of order k for the characteristic polynomials [5]. It is striking that exact formulae can be written down for this quantity, especially at the point of the freezing transition. As a benchmark for our numerics we compare some low moments with the analytically known values. Analytically one finds [5]  We have here introduced as well the reduced moments M red N,β,k . The reduced moments obey M red N,β,k → 1 as N → ∞ and are of order unity for finite values of N. The following tables compare the known exact values of the reduced moments M red N,β,k with the corresponding reduced moments Z N (β) k red obtained from the numerical data. In each case the given errors are one statistical standard deviation. We only present data that has sufficiently converged (i.e. the standard deviation is sufficiently small). The statistical errors are consistent with the scaling with N of the prediction for higher moments (which determine standard deviations) ∝ N β 2 k 2 −1/2 .

Appendix B. On the maximum of characteristic polynomial for β-circular ensemble
Consider the circular β− ensemble with j.p.d. of real variables θ i ∈ [0, 2π), i = 1, . . . , N given by [30]  where q is the inverse temperature (we can not use β here for the inverse temperature as that is reserved for the Dyson index). Integer moments of the partition function are then given by    If the parameter β is rational (that is: β = 2s/r where s and r are relatively prime) the paper [16] conjectured the generalisation of Fisher-Hartwig formula for N → ∞, which in the case of (B.7) reads (see (3.13)-(3.14) in [16] with identification R = n and q j = 2q/β, ∀j) as well as c n = 0, ∀n): of [16]. Substituting the above to (B.3) and defining q = q 2 β we get:  where we introduced the 'typical value' for the partition function We see that the 'freezing temperature' is now given by the condition q = 1 so that q = β 2 . The 'free energy' is given by, to the leading order for N → ∞: (B.12) Correspondingly, the leading order of the maximum of the characteristic polynomial is given by 2 β log N. To find the subleading order one can follow the same procedure as for β = 2 and find that the typical measure of 'high points' , that is those points where 2 log |p n (θ)| > 2x log N is given by and equating this to N −1 we find the 'threshold of high values' to be log log N log N (B.14) which implies two first terms of the maximum to be 2 β log N − 3 4 log log N , in full agreement with [11]. Finally, the correction term of the order of unity will be obviously given by the same distribution as for β = 2, as moments of the partition function are the same, up to a trivial rescaling by 2 β whenever necessary. We therefore come to the conclusion that as N → ∞ the characteristic polynomials of circular ensemble for any β > 0 is essentially described by the same random Gaussian logarithmically correlated process rescaled by the parameter 2 β .