Eigenvalue and Eigenvector Statistics in Time Series Analysis

The study of correlated time-series is ubiquitous in statistical analysis, and the matrix decomposition of the cross-correlations between time series is a universal tool to extract the principal patterns of behavior in a wide range of complex systems. Despite this fact, no general result is known for the statistics of eigenvectors of the cross-correlations of correlated time-series. Here we use supersymmetric theory to provide novel analytical results that will serve as a benchmark for the study of correlated signals for a vast community of researchers.


Introduction
The theory of complex systems ultimately deals with the identification of patterns of simple behaviours accounting for the emergence of universal dynamics in the time series measured in a vast range of disciplines, including condensed matter physics, medicine, finance, signal transmission, biology, and more recently computational social sciences [1]. A time series is a series of values scanned over time of a given observable of a system [2] such as the sea level [3], the temperature of a lake [4], the neuron activity in electroencephalography (EEG) [5,6], the response in a unit of volume of a magnetic resonance imaging experiment [7], the gross domestic product of a country [8], the price or return of a stock [9,10], the volume of an order in the market [11,12], the infected individuals in a region affected by an epidemics [13], and the online activity of a user [14].
The basic analysis that is ubiquitously performed when dealing with multiple time series are covariance and correlation analysis, especially with the aim of identifying the main factors accounting for time variability and parsimoniously representing the state space of the system, through denoising and dimensionality reduction. The generality of this statistical approach constitutes the basis for Principal Component Analysis (PCA) [15,16], clustering analysis, and many other data mining algorithms [17]. In these techniques one distinguishes between eigenvalue and eigenvector statistics and both of them carry important information as we know, for example, from the theory of quantum disordered systems. Therefore it is even more surprising that only few results are available for the cross-statistics between eigenvalues and eigenvectors, when dealing with the covariance and correlation matrices of noisy time series.
The spectral density of the eigenvalues is up to now the major quantity where the theory provides robust and general results [18][19][20][21]. For instance, the Marcenko-Pastur distribution (MPD) [22] usually serves as a blueprint for describing the influence of white noise in the time series on the spectral density. Any deviation from the MPD, for instance outliers, can be considered as system specific information so that the MPD serves as a filter. However, some eigenvalues encoding relevant information might be obscured by the bulk of the spectrum described by the MPD. Then PCA may remove relevant data that should be taken into account. To distinguish those system specific eigenvalues from the eigenvalues of the MPD one needs to take into consideration the eigenvector statistics. An important step in this direction is made in the present Letter. We derive an analytical formula for the first moment of a fixed eigenvector component conditioned to a chosen eigenvalue. Moreover, we state a conjecture on their general moments and distributions for a correlation matrix of noisy time series. Our results provide insights and pave the way for a much more informative spectral decomposition in time series analysis, allowing not only to focus on the spectral density but also on the individual contribution of each component to the spectrum, leading to a much deeper understanding of a system's dynamics.
Random Matrix Model Specifically, we study the statistics of the eigenvectors and the eigenvalues of the matrix C = W W T , with W ∈ R p×n representing p time series of length n or, in the case of PCA, p descriptors with n variants, and W T being the transpose of W . Thus C can be interpreted as the covariance matrix between the time series aggregated in W or the covariance between the descriptors respectively. The real rectangular matrix W in our model is composed of four matrices where W 0 ∈ R p×n is a deterministic real matrix and W 1 ∈ R p×n is a Gaussian random matrix distributed by (2) The two real symmetric matrices C L = C T L ∈ R p×p and C R = C T R ∈ R n×n are positive definite and represent a spatio-temporal correlation between the various time series. Here, the matrix C L can be identified with a time arXiv:1904.05079v1 [cond-mat.stat-mech] 10 Apr 2019 correlation, the matrix C R with the spatial correlations, and additionally, at difference with many common models, we include an offset W 0 . Hence W is a non-centred and doubly correlated Gaussian random matrix. This form allows the model to capture in detail the case of factor models ubiquitous in statistics and econometrics.
Though our model is quite general, it is still not the most general Gaussian random matrix model. We assume that the spatio-temporal correlations of the multivariate time series factorize in the two matrices, C L and C R . Therefore time-dependent spatial correlations, like the two epoch model [23], are not considered here. The random matrix model defined above can be also considered as a simple deformation of the standard real Wishart ensemble of random matrices, in which the orthogonal invariance is broken in several ways. Such non-invariant deformations of the standard random matrix ensembles were introduced and studied in different contexts including wireless communication [24], vibration analysis [25], signal processing [26] and neural networks [27]. There is a growing interest to the statistical properties of the eigenvectors in these ensembles. While there are some recent results about the statistics of the eigenvectors in the deformed Gaussian Orthogonal and Unitary ensembles [28][29][30][31][32][33][34], we are not aware of similar results for the Wishart ensemble except for Ref. [32], in which the ergodicity of the eigenvectors was proven for the special case C L = 1 1 p , C R = 1 1 n .
In the following, we will not simply focus on the computation of the spectral density of the eigenvalues, analysed in [35,36] with the same supersymmetric (SUSY) approach as in the present work, but also calculate a detailed eigenvector statistics of the matrix W W T = U ΛU T , whose eigenvalues represented by the diagonal matrix Λ = diag (λ 1 , . . . , λ p ) and the eigenvectors represented by the columns of the matrix U = {U ab } ∈ O(p). The full information about the statistics of the eigenvector components is contained in the conditional density where b = 1, . . . , p refers to a particular eigenvector component and is the mean density of the eigenvalues and . stands for the ensemble average over the distribution of W 1 . In the case of a factorisation of the eigenvector and eigenvalue statistics, as in the Wishart ensemble, one finds the Porter-Thomas distribution [37] which is independent of the component b and the eigenvalue λ due to the Haar distributed eigenvectors. This simplification cannot be expected to hold in our nontrivial model as well as in a realistic situation. The computation of (3) or its arbitrary moments where q is a positive integer, is technically a very challenging problem. In this Letter we focus on the analytical derivation of the first moment I 1,b (λ) and make a conjecture about an arbitrary moment I q>1,b (λ) and I b (µ|λ) in the conclusions. The moments of the eigenvectors are also a standard tool to characterise properties of complex quantum systems and are used to distinguish different phases in condensed matter physics [38]. Hence, we expect that it may give valuable insights for time series as well.
Before we start with the analytical calculation of I 1,b , we want to point out that the eigenvector components U ab are basis dependent. Thus the conditional distribution I b (µ|λ) strongly depends on the reference frame. In this work such a frame is chosen as the eigenbasis of C L , allowing us to investigate the broadening of the eigenvectors due to the white noise W 1 and its strength σ. Another natural and valuable reference frame could be the eigenbasis of Eigenvector Statistics with SUSY The first moment of the eigenvectors, see (6) for q = 1, can be computed by taking the imaginary part and the limit of a regularization → 0 of the quantity where λ + = λ + iε. Defining the (p + n)-dimensional unit vector e b with unity at the position b and zero otherwise, this quantity can be generated by differentiating with respect to the auxiliary parameter α, at α = 0. α is chosen to be real to guarantee convergence later on. Following the standard steps of the SUSY method [35,36], we represent first the generating function Z b (λ) by the supersymmetric Gaussian integral, average over the random matrix W 1 and finally apply the Hubbard-Stratonovich transformation [39]. In this way, we derive the following supersymmetric representation for I 1,b (λ) (see the Supplemental Material [40] for details), where symmetric in the boson-boson block and self-dual in the fermion-fermion block and their eigenvalues run along complex contours that are detailed in the Supplemental Material [40]. The supersymmetric Green function G has the form with J = diag (1 1 2 ; τ 2 ). The representation (9) is exact, but rather involved and technical. An expression for the mean level density can be obtained by summing over b = 1, . . . , p and should be compared with the corresponding result in [35,36,41]. The above expression simplifies a lot in the limit n, p → ∞, which is considered next.
Macroscopic level density and limiting statistics In most applications, one is interested in the limit n, p → ∞. In this limit the integral in Eq.(9) can be evaluated using the saddle-point approximation. To derive the saddlepoint equation, it is convenient to introduce the supermatrices S = T + − iT − + σ 2 L and R = T + + iT − + σ 2 L, which can be considered as independent. The saddlepoint solution contributing most to the integral is given by the diagonal matrices S 0 = s 0 1 1 2|2 and R 0 = r 0 1 1 2|2 with the complex parameters s 0 and r 0 that satisfy the coupled equations [40] The mean level density is up to a normalisation constant given by where we assume p ≤ n without loss of generality. The case p > n only yields an additional Dirac delta function at the origin. The formula (12) reduces to the MPD [22] in the case of the Wishart ensemble, i.e., C L = 1 1 p , C R = 1 1 n and W 0 = 0. We illustrate the result for ρ(λ) in Fig. 1 for the one-factor model, which is described in the next subsection.
The result for I 1,b (λ) can be expressed in terms of the same matrix Q and reads which constitutes the main result of the present Letter. The normalisation is fixed by the condition p b=1 I 1,b (λ) = 1. We note that for a Haar distributed vector one has I (Haar) 1,b (λ) = 1/p. One-factor model To illustrate our findings we apply our general results to the one-factor model supplemented with Gaussian noise. Specifically, we set W 0 = wx T , where w and x are column vectors of length p and n, respectively. The correlation matrices are chosen to be diagonal C L = diag (l −1 1 , . . . , l −1 p ) and C R = 1 1 n . The vector x represents a common factor, e.g. the market mode in financial time series analysis, and the component w j quantifies the relative weight of the common factor on the jth time series, before normalization.
We plug the matrices of the one-factor model into the saddle-point equation (11) and simplify the resulting expression via the Sherman-Morrison identity for the inverse matrices [42], i.e. (A+uv This leads to the coupled equations Solving these saddle-point equations we can derive the spectral density and the moments of the eigenvectors, simply by plugging the following matrix elements in Eqs. (12)-(13) We illustrate these results in Figs Conclusions The general result in Eq. (13) provides a powerful analytical methodology to quantify the expected value of the square of specific components in a given eigenvalue interval for a wide range of random matrices. We tested numerically these analytical results in detail for the one-factor model (see Figs. 1-2). Our general formulation allows an arbitrary number of factors to be added in the matrix W 0 . Although our analytical results were derived in the limit n, p → ∞, they show a very good agreement with the results of numerical simulations at finite n and p. The rate of convergence to the limiting statistics will generally depend on the input W 0 , C L , and C R .
In the present work we derived analytically a closed result only for the first moment I 1,b (λ) of an eigenvector under the condition of a fixed eigenvalue. However we conjecture that all higher moments are related to the first moment as follows: which corresponds to a locally rescaled Porter-Thomas A similar result has been also found for the conditioned eigenvector statistics of the deformed Gaussian Unitary Ensemble (GUE) in [29,30]. The only difference is the prefactor in (16), which is equal to (2q)!/(2 q q!) in our case and given by q! for the complex eigenvectors in the deformed GUE [29,30]. These numerical values result from the averaged moments of real and complex normalized vectors, respectively. We have tested this conjecture numerically for q = 2 and found a nice agreement, see Fig. 3.
We are confident that our analytical results are of general relevance for the spectral decomposition of time series and could lead to unprecedented understanding of the full statistics of the eigen-components in signal analysis. A strong deviation of the moment I q,b (λ) from the constant (2q)!/(2 q q!) hints at an eigenvector-eigenvalue pair that contains system specific information. This knowledge can improve PCA and other techniques to reduce highly dimensional data without loosing relevant information.
ACKNOWLEDGMENTS MK acknowledges financial support by the German research council (DFG) through CRC 1283: "Taming uncertainty and profiting from randomness and low regularity in analysis, stochastics and their applications". PB acknowledges support from the London Institute for Mathematical Sciences (LIMS).

I. DERIVATION OF THE SUPERSYMMETRIC INTEGRAL REPRESENTATION FOR THE MOMENTS OF THE EIGENVECTORS
The quantity I 1,b defined in Eq. (7) can be computed by differentiating the generating function (8) with respect to iα 2 and setting α = 0. The normalization is given as lim λ→∞ Z b (λ) = 1. In order to construct a representation of Z b (λ) in terms of the supersymmetric integral we use the identity where we employed the matrix ψ which is an (n + p) × 2 dimensional matrix of real Grassmann variables and φ is an (n + p) × 2 dimensional ordinary real matrix. The two matrices are introduced in order to cancel the resulting determinants from the Gaussian integral. To ensure integrability we have introduced the constant matrix J = diag (1 1 2 ; τ 2 ), where τ 2 is the second Pauli matrix.
To simplify the notation, we define the diagonal (2|2) × (2|2) supermatrix L = diag (−1 1 2 ; 1 1 2 ) and the (n + p) × (2|2) rectangular supermatrix E b = (αe b , 0; 0, 0). Moreover, we rearrange the matrices ψ and φ in the p × (2|2) supermatrix V L and the n × (2|2) supermatrix V R as follows Both matrices are two real rectangular supermatrices V L = V * L and V R = V * R with dimensions p × (2|2) and n × (2|2) respectively. The first two columns of V L and V R are real variables while the last two columns are Grassmann variables. In this way, we find The average over W 1 yields Since the action contains a quartic term in the matrices V L and V R , the next step is to perform the Hubbard-Stratonovich transformation, which allows one to decouple such terms. Up to the normalization the result reads The parametrization of the two (2|2)×(2|2) supermatrices T ± needs to be chosen carefully to guarantee the convergence of the integral. They are given by equipped with the flat Berezinian measure The ordinary matrices B 1 and B 2 are negative definite and symmetric and can be diagonalized with orthogonal matrices O 1 , O 2 ∈ O(2) as follows with b 1 , b 2 two positive definite diagonal matrices. The matrix C(B 2 ) has the form The matrices F 1 and F 2 are Hermitian self-dual matrices and η 1 and η 2 are two 2 × 2 rectangular matrices whose entries are independent real Grassmann variables. The shift of B 1 in T + by the imaginary part 1 1 2 + B 2 2 solves a convergence problem in the Gaussian terms in (S6). In particular the Gaussian integrals over the supermatrices V L and V R are absolutely convergent and yield where G is defined as in (10). Hence, G µa,νb has four indices with µ, ν = 1, . . . , 4 and a, b = 1, . . . , n + p. To fix the normalization we take λ → ∞ and notice that G becomes approximately λ −1/2 + 1 1 n+p ⊗ LJ. Therefore we end up with the intermediate result (S12) Coming back to our original problem we notice that we are interested in the first derivative with respect to iα 2 at α = 0. In particular, the quantity I 1,b (λ) is given by which coincides with Eq.(9).

II. SADDLE-POINT EQUATION
For deriving the saddle-point equation we only need to consider the exponential function and the superdeterminant in the integral (S13). The term G 1b,1b is only a polynomial prefactor which does not influence the saddle-point solution.
It is easier to study the saddle-point by introducing the supermatrices S = T + − iT − + σ 2 L and R = T + + iT − + σ 2 L, which can be considered to be independent. Then the action, i.e. the function that need to be minimised, is 1 2σ 2 (n + p) Str SR + 1 2(n + p) Str ln (S14) Differentiating it with respect to S and R yields two coupled equations =0. (S15) The operator tr 1 is the partial trace over the first tensor space which is here the space of ordinary n × n and p × p matrices, respectively. The saddle-point equation is rotation invariant, i.e., when (S 0 , R 0 ) is a solution then this is also true for (R 0 S 0 R −1 0 , R 0 ) as well as (S 0 , S 0 R 0 S −1 0 ) and any kind of combination. This can be seen by multiplying both equations from the left and the right with R and R −1 , which is equivalent to replacing S by RSR −1 . Assuming that the saddle-point solution (S 0 , R 0 ) is unique, we conclude then that S 0 and R 0 must commute. The uniqueness of the solution should follow from the fact the contour of integration, which was shifted by the term i , can't cross the poles and the fact that the Berezinian (the Jacobian in superspace), that is |b is not suppressed only when the multiplicity of the eigenvalues in the Fermion-Fermion blocks is equal to those in the Boson-Boson block. The Fermion-Fermion blocks are doubly degenerate due to their Hermitian self-duality. Thus also the Boson-Boson blocks are doubly degenerate, which implies for (2|2) × (2|2) supermatrices that we can diagonalize S and R simultaneously and the solution has to be diagonal and degenerate, i.e., S 0 = s 0 1 1 2|2 and R 0 = r 0 1 1 2|2 . Substituting this ansatz into Eq. (S15) we derive Eq. (11), which is (S16) The regularization only determines which saddle-point has to be chosen, especially which sign the imaginary part carries. Assuming the correct sign of the imaginary part we neglected this regularization in Eq. (11).