Evolution of moments and correlations in non-renewal escape-time processes

The theoretical description of non-renewal stochastic systems is a challenge. Analytical results are often not available or can only be obtained under strong conditions, limiting their applicability. Also, numerical results have mostly been obtained by ad-hoc Monte--Carlo simulations, which are usually computationally expensive when a high degree of accuracy is needed. To gain quantitative insight into these systems under general conditions, we here introduce a numerical iterated first-passage time approach based on solving the time-dependent Fokker-Planck equation (FPE) to describe the statistics of non-renewal stochastic systems. We illustrate the approach using spike-triggered neuronal adaptation in the leaky and perfect integrate-and-fire model, respectively. The transition to stationarity of first-passage time moments and their sequential correlations occur on a non-trivial timescale that depends on all system parameters. Surprisingly this is so for both single exponential and scale-free power-law adaptation. The method works beyond the small noise and timescale separation approximations. It shows excellent agreement with direct Monte Carlo simulations, which allows for the computation of transient and stationary distributions. We compare different methods to compute the evolution of the moments and serial correlation coefficients (SCC), and discuss the challenge of reliably computing the SCC which we find to be very sensitive to numerical inaccuracies for both the leaky and perfect integrate-and-fire models. In conclusion, our methods provide a general picture of non-renewal dynamics in a wide range of stochastic systems exhibiting short and long-range correlations.


I. INTRODUCTION
A general property of diverse systems, ranging from superconducting quantum interference devices (SQUIDs) [1], to lasers [2] to excitable cells [3][4][5][6] is that time intervals between specific events are not statistically independent. The theoretical description of such non-renewal stochastic processes [7] poses a significant challenge, as it implies that the present state of the system depends, in general, on the whole past evolution or parts of it, and not just on the previous state. Analytical approximations to tackle such memory effects have included the assumption of stationarity [8], small stochasticity [9] and time-scale separation [10][11][12] between stochastic and deterministic parts of the dynamics. Even if these approximations allow for some insight into the parameter dependence of e.g. serial correlations and can be used to understand experimental data, as exemplified in [9,[13][14][15][16] in the context of excitable systems, it is desirable to understand the statistics of model systems without making simplifying assumptions. Regarding stationarity, real systems rarely operate in a stationary state due to transients that arise from deterministic or random perturbations. A prominent example are cortical neurons. An average cortical neuron receives random inputs from approximately 10 4 other neurons, whose activity is modulated by non-stationary sensory and other inputs, resulting in transient neuronal dynamics [17] that * wilhelm.braun@cantab.net only become stationary after a certain time. It is therefore important to understand how statistical properties of inter-event times evolve and become invariant following a transient regime due to internal dynamics and external inputs. Keeping with illustrations from neural dynamics, it is well known that physiologically relevant processes underlying neural coding rarely only have one welldefined timescale [18,19]. This has lead researchers in various theoretical fields to consider multiple time-scale dynamics [20]. An important example of a system with multiple time scales is neuronal adaptation, where a neuron's firing rate adjusts in response to a stimulus. Adaptation with multiple timescales, or even no time scale as in the case of power-law adaptation [21][22][23], is now known to be biophysically relevant, and even optimal for some tasks [24]. Recently, it was also shown that a neuron model with adaptive firing thresholds exhibiting multiple timescales is the optimal choice for the prediction of spike times in cortical neurons [25,26]. Therefore, a theoretical description of adaptation without a single well-defined timescale is an important goal. In this paper, we show how to describe two-dimensional non-renewal dynamics by an iterated first-passage time (iFPT) approach. This approach allows us to determine stationary statistical properties of the system as well as providing a description of the transition to stationarity. We furthermore show how to compute serial correlations in the time series generated by the firing times of the system. While our approach is general and applicable to any system where first-passage times [27] play a role, we illustrate its versatility with two important examples, namely spike-triggered neuronal adaptation with a sin-gle exponential current and a power-law current without an intrinsic timescale, respectively. Using the underlying time-dependent FPE to describe the system, we only need to apply mathematically convenient standard absorbing boundary conditions to obtain stationary distributions, e.g. that of the adaptation current upon firing. Moreover, the methods developed here can easily be extended to models with correlation-generating deterministic input currents as recently considered in [28].

II. MODEL
We consider a stochastic differential equation (SDE) driven by an external signal s(t): dX(t) = µ(X(t))dt + φ(X(t))dW (t) − s(t)dt . (1) X is defined on the domain (−∞, x th ]. If X reaches x th , the system is said to have generated and event, and X is instantaneously reset to 0. For all examples in this study, we chose the Ornstein-Uhlenbeck process (OUP) given its prominence in the field of stochastic systems. For the OUP, we fix the correlation time τ m = 1 γ , bias current I 0 and noise intensity σ as follows: µ(X(t)) = γ(I 0 − X(t)), φ(X(t)) = σγ. W (t) is a standard Brownian motion and we set x th = 1. Given that the OUP is the basis for integrate-and-fire (IF) neuron models, which are among the most popular neuron descriptions [29], we refer to events as spikes and to s(t) as a time-dependent adaptation current in the present study. The general dynamics of s(t) obeys a single autonomous ordinary differential equation and s is increased by a fixed amount κ when X = x th : s → s + κ, which is the mechanism for spike-triggered adaptation [30]. When s(t) is also reset to its starting value s(0), the model is a renewal model and its firing statistics may be studied using standard techniques, see e.g. [31] for a recent review.
Here we focus on two forms of the adaptation current. The first one is power-law adaptation, for which This ODE has the general solution s(t) = t α + 1 . Therefore, the current s in this case has a power-law time dependence with no intrinsic time scale [21]. The second adaptation current is given by a single exponential decay with time scale τ a : which has the general solution s(t) = s(0)e − t τa . The time to the first spike event is the following firstpassage time (FPT):  In the non-renewal case we are studying here, subsequent firing times will in general not have the same distribution as T 1 . We define the kth interspike interval (ISI) as The first moment of the kth ISI is given by τ 1 k = E(T k ). The second moment of the kth ISI will be denoted by τ 2 k = E((T k ) 2 ) and the kth firing rate is given by the inverse of the corresponding mean ISI r k = 1 The kth standard deviation m 2 (k) is then given by The values of the peak adaptation current after the kth firing are defined for k ≥ 1 as where t − indicates that we take the left-sided limit. For simplicity, we choose s to be started from a point (s (0) 0 = κ), instead of from a biophysically more realistic initial distribution. However, the methods we are going to describe in the following are also valid when s is initially started from a distribution. The central challenge is to obtain the distributions G k and F k for k ≥ 1, which are the distributions of s (k) 0 and T k defined by Eqs. 7 and 5, respectively. An example realization for the case of power-law adaptation is shown in Fig. 1. The knowledge of these distributions is key to understanding the non-renewal dynamics, as they form a hidden Markov model of the underlying non-Markovian dynamics [15,16,32]. Therefore, once these distributions are known, the non-renewal dynamical system breaks up into coupled renewal dynamical systems, which are much more tractable mathematically. This gives rise to the iFPT approach which we now explain.

III. THE IFPT APPROACH
Being a diffusion process, the system given by Eqs. 1 and 2 is governed by a two-dimensional time-dependent FPE [33]. The FPE for the probability density function p(t; x, s)dxds = P( where A is the diffusion matrix and F the drift vector, which can be obtained in a straightforward way from the SDE for X, Eq. 1, and the ODE for s, Eq. 2. Explicitly, we have and The IF property of X entails that we have an absorbing boundary at X = x th for all times t: p(t; x = x th , s) = 0.
With this boundary condition, we can compute the cumulative distribution function (CDF) of the first-passage time T 1 , CDF 1 (t) = t 0 F 1 (λ)dλ, by time evolution of the FPE on a computational domain Ω ⊂ R 2 , which we choose to be a rectangle extending to sufficiently negative values in the x-direction [34]: where p 1 is the solution to Eq. 8 with the initial condition . For the computation of F 1 , we then only need to differentiate Eq. 10.
To describe adaptation, one needs to compute the statistics of the peak adaptation currents, as defined by Eq. 7. Hence, we need to characterize the hidden Markov model generated by the ISIs T k and the peak adaptation currents s (k) 0 . Whereas the dynamics of s and X as a whole is non-Markovian, the distribution of the ISI T k is completely determined by the distribution G k−1 , and the se- Knowing the values of T k and s  is fixed, and the distribution of the ISI T k+1 can be obtained by solving an FPT problem with s (k) 0 as initial condition for s. The central observation now is that for the second ISI, X again starts from 0, whereas s starts from a distribution G 1 . This is because X evolves stochastically, and therefore reaches the threshold x th at different times T 1 , corresponding to different values of s(t = T 1 ) + κ immediately after the firing event. To compute the second ISI, we therefore need to know G 1 . This can be iterated: to compute the distribution F k of the kth ISI, we need the distribution G k−1 . This is the central idea of the iFPT approach. To set up the iFPT approach, we first observe that between threshold crossings of X, s evolves deterministically. Therefore, when we know the PDF of the first FPT, by conservation of probability, we also know the distribution of s after the first firing event: where t(s) is the inverse function of s. The support of G 1 is shifted, because of the jump of size κ that s undergoes when X reaches its threshold x th . For the second ISI T 2 , s is started from the distribution G 1 (s) instead of a point, whereas X is started from a point again (Fig. 1). This means that to obtain F 2 , the FPE is started from a distribution: . This generalizes to values of k larger than 1. For the kth ISI distribution F k (t), we therefore must choose We show how to obtain the distributions G k for k > 1 in Section V. Linear splines are used to create a mesh function approximating Eq. 12 on the computational domain Ω. Due to this approximation, Eq. 12 then has to be normalized appropriately, so that Ω p k (0; z)dz = 1.
The FPE is then solved again, and the distributions F k (t) are obtained analogously to Eq. 10: CDF k (t) = 1 − Ω p k (t; z)dz, i.e. by timestepping the FPE to obtain the CDF of the kth ISI followed by a numerical differentiation. This constitutes the iFPT approach.
To quantify the accuracy of our numerical methods, we also compute the relative disagreement ∆ between results obtained by the iFPT approach and direct MC simulations. It is defined for a quantity Z by We performed MC simulations for two different simulation setups, the first one without any boundary correction (plain MC), and the second one with a boundary correction according to Giraudo and Sacerdote (MC-GS) [31,35,36]. This boundary correction is applied to reduce the systematic overestimation of FPTs when using the Euler-Maruyama scheme. We compute the relative disagreement given by Eq. 13 using either MC simulations with or without boundary correction; we observed that the order of the relative disagreement is unchanged, but in general, the plain MC algorithm gives rise to larger disagreements than MC-GS. A decrease in the relative disagreement is expected, because the GS correction method should yield an improved weak error of O(h) [37], in contrast to the plain MC simulation, which has a weak error , where h is the time step for the discretization of the SDE, Eq. 1. In the following, the timestep for MC simulations is chosen to be h = 10 −3 and we choose M = 10 6 independent realizations. The plain Euler-Maruyama scheme then gives rise to a weak error of O(h 1 2 ) ≈ 3 · 10 −2 when estimating moments of first passage times, which is one order of magnitude larger than the MC error proportional to 1 √ M = 10 −3 . Therefore, in our simulations, the plain MC error is negligible in comparison to the error introduced by the finite-time discretization of the SDE (Eq. 1).
For the numerical solution of the FPE (Eq. 8), we choose a finite element discretization method [39] and evolve the system using either a stabilized Crank-Nicolson (CN) scheme [40] in Fig. 2 or an Euler timestepping scheme [39] in Fig. 3.
The relative disagreement between MC simulations and finite-element solutions stays largely constant across different lags k when we use the CN scheme instead of the Euler scheme as can be seen by comparing the lower panels of Figs. 2 and 3; the sizes of the disagreement are comparable in magnitude. This suggests that the remaining small discrepancy between MC simulations and PDE results can be largely explained with the errors associated with the MC simulation method. In particular, note that for the examples we show, the MC-GS weak error of size O(h) is comparable in magnitude to the numerator of ∆(Z) (Eq. 13), i.e. the absolute disagreement. We will see what effects this has on the computation of correlations in Section VI.

IV. TRANSITION TO STATIONARITY
We show the evolution of the rate and standard deviation of ISIs in Figs. 2 and 3. The rates decreases, whereas the standard deviation increases until both quantities reach a stationary value. Given that these quantities are derived from moments of the ISI distributions, the distributions also converge towards a stationary form. The convergence towards stationarity of ISI and peak adaptation current distributions (F k and G k , respectively) is shown in Figs. 4 and 5. As expected for adapting models, the ISIs (whose distributions are shown in Fig. 4) increase with higher k, which means that the rate r k decreases. This is well captured by the iFPT approach, with a maximal relative disagreement smaller than 3% in Fig. 2 and smaller than 2% in Fig. 3. Also, the width of both the ISI distributions and the peak adaptation current distributions ( Fig. 5) increases, which is reflected by the increase of the variances of F k and G k shown in Fig. 2  the speed of adaptation can be controlled by adjusting the bias current I 0 and the noise level σ as well as the adaptation strength (size of the kick κ and, in the case of single exponential adaptation, the timescale τ a ). We have carried out additional MC simulations (not shown) to obtain insight into how these parameters influence the speed of the transition to stationarity. A higher bias current and a higher noise level will in general lead to a less rapid transition to stationarity. This can be understood as follows: X is driven to threshold more rapidly, causing the inhibition to build up quickly, reaching values that are higher than those typically found around the peak of the stationary distribution. This slows down the transition to stationarity, because s needs to decay first. Moreover, a large kick size κ and a rapidly decaying adaptation current will cause a quick transition. For the latter case, this is easily understood as we are then nearly dealing with a renewal system: after a short initial period, the effect of the adaptation current on the firing time statistics is negligible. For the former case, we note that the larger the kick size κ, the more pronounced the inhibitory effect of adaptation within one ISI, which means that X takes longer to reach threshold before a large out-of-equilibrium average value of the current s (a   Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation. Both panels show results for plain MC simulations. value that is larger than those typically found around the mode of the stationary distribution) can build up. The system reaches stationarity rapidly because it is quasideterministic as the dynamics of s dominates the system, and the stochastic fluctuations of X will only cause small perturbations. For both power-law and single exponential adaptation currents, it is possible to reach the stationary regime already after one or two firing events as in Fig. 2, or to have a long transient regime as in Fig. 3. The initial condition s 0 for the adaptation current can also be chosen to control the speed of the transition. If it is placed far away from the mean of the stationary distribution, the transition will take a longer time; also, it is possible to obtain a non-monotonic behaviour of the rate as a function of the interval number when s (0) 0 is placed far above the aforementioned mean. The first mean ISI will then be the longest statistically, in contrast to the examples we show in Figs. 2 and 3.

V. STATISTICS OF THE kTH ISI
The iFPT approach can be iterated beyond the first two firing events to obtain the distribution for the third ISI, F 3 (t). However, for the computation of the third ISI, no equation similar to Eq. 11 can be used to obtain G 2 because s was started from a distribution to obtain F 2 . Indeed, for one fixed time T 2 , there are many different starting values s (1) 0 due to the stochastic dynamics of X. Importantly, the ISI T 2 and the initial condition s 0 , a large ISI T 2 is more probable and vice versa) so that we can obtain the value that s reaches after the second firing by the following observation (focusing on power-law adaptation): given that T 2 = λ and s  0 (see Fig. 1). This emphasizes that once two values in the triplet (T 2 , s 0 ) are fixed, the third one is determined. In the following, we again use λ to denote a fixed FPT and ν to denote a fixed initial value of the adaptation current s. Analogously, we generally have for s whereas for exponential adaptation (Eq. 4), we have An alternative way is to fix the value of s The function h is defined by solving the equation f (λ, ν) = θ for λ. To actually compute the density G k , FIG. 6. Conditional FPT densities H(λ, ν) given by Eq. 17 relating the initial value of the adaptation current to the distribution of the following ISI. Computed using numerical solutions to the FPE Eq. 8. Left: Power-law adaptation, computed using a CN timestepping scheme. Right: Exponential adaptation, computed using an Euler timestepping scheme. Parameter values as in Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation.
we need to relate the above observations to densities that we can compute with the iFPT approach. To that end, we now define the conditional density H: We have used T 1 and s in the definition Eq. 17 to stress that, for the purpose of the computation of H, we only need to solve the FPT problem for T 1 using different values of the initial condition s 0 . It will become apparent below that we only need to compute H once, because it does not depend on the firing index k. For a fixed value of ν, H(λ, ν) is an FPT probability density. Our notation emphasizes that H is a function of two variables. ν sets the level of initial inhibition, i.e. the starting value of s. We show the function H for both power-law and exponential adaptation in Fig. 6. We see that with increasing starting value ν, the mode of the FPT distribution shifts to larger times. For power-law adaptation, the shape of the FPT distributions does not change much, whereas for exponential adaptation, the distributions become broader with increasing ν. With this at hand, we now show how to practically compute the distributions G k for k > 1. We can obtain the CDF of s The function h defined in Eq. 16 ensures that for a fixed value of ν, we collect all times λ so that s (k) 0 ≤ θ, which ensures that f (λ, ν) ≤ θ for fixed values of θ and ν. T max is chosen so that H(T max , ν) ≈ 0 ∀ν ∈ supp (G k−1 ). This means that T max should be chosen in the tail of the FPT distribution. Note that for the iFPT approach, one only has to compute H(λ, ν) over the support of G k for k ≥ 1 once [41], and then multiply it with the adaptation current distribution of the previous iteration G k−1 . This function then needs to be integrated according to Eq. 18, and the PDF G k can be obtained by numerical differentiation. We show results for G 2 and G 3 using Eq. 18 in Fig. 5. The agreement between MC simulations and Eq. 18 is excellent. Iterating these ideas into the stationary regime, the ideas of the previous paragraph can be used to directly compute the stationary density of the peak adaptation current. We assume that stationarity is reached after k * − 1 firing events. Then s have the same distribution. Consequently, the stationary density of the peak adaptation current after firing satisfies the twodimensional integral equation (cf. Eq. 18) where Q(θ) ≡ P s (k * ) 0 ≤ θ denotes the CDF of the stationary peak value for the adaptation current s (Eq. 2). We have checked that this equation is indeed satisfied by the stationary distributions for the peak adaptation current obtained from MC simulations (data not shown). Therefore, Eq. 20 can serve as a tool to check whether a given distribution for the peak adaptation current is stationary, or alternatively as a way to compute Q(θ) directly if the function H is known.

VI. CORRELATIONS BETWEEN INTERSPIKE INTERVALS
We now show how to compute serial correlations with the iFPT approach. We define the SCC [15,42] between the nth ISI T n and the (n + k)th ISI T n+k according to where and Q 2 (n, k) = m 2 (k)m 2 (n + k) = Var(T n )Var(T n+k ) .
Here, Var(T n ) denotes the variance of the nth ISI distribution, and m 2 is the standard deviation given by Eq. 6. Note that the definition Eq. 21 does not make use of the notion of stationarity, so that the SCC depends on both the position n of the ISI in the spike train as well as on the lag k between ISIs. Since we have already computed the distributions of the kth ISI, we can readily compute the variances and means in Eq. 21, i.e. the terms given by Eqs. 22 and 23. It is slightly more complicated to compute the first term in the numerator, E (T n T n+k ), because we need the joint density p 2 (T n , T n+k ) of T n and T n+k . In the present study, we focus on k = 1. By definition, we have where as previously F n (T n ) is the density of the nth ISI and p 1 (T n+1 |T n ) is the conditional density of T n+1 given T n . Because T n+1 is statistically determined only by s (n) 0 , we can define this conditional density as where p(T n+1 , y|T n ) denotes the joint density of T n+1 and s (n) 0 = y conditioned on the previous ISI T n . We can rewrite this as follows: = p(T n+1 |y, T n )p(y|T n ).
Now, as we have previously shown, the statistics of T n+1 is completely determined when s (n) 0 = y is fixed, hence p(T n+1 |y, T n ) = p(T n+1 |y) ≡ H(T n+1 , y). Therefore, we have This can be further simplified by noting that p(y|T n ) = p(y,Tn) Fn(Tn) and therefore E(T n T n+1 ) = dT n dT n+1 dy T n T n+1 H(T n+1 , y)p(y, T n ) .
For n = 1, Eq. 24 can be simplified because s   Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation.
, where f is defined by Eqs. 14 and 15. Hence, we have for n = 1 We have shown in Section V how to obtain the conditional FPT density H. To evaluate Eq. 24 for general n, we still need to compute the joint density p s and T n are generated by the system. We show an example of these densities in Fig. 7. The most notable feature is an inverse proportionality between s (n) 0 and T n . The longer e.g. T 2 , the less likely it is for the value of s after the second firing, s 0 , to attain a high value. Eq. 24 is formally correct, but not very practical for actual computations. This is because to apply the iFPT approach, it is desirable to obtain all quantities needed for the SCC using solutions of the FPE only, and no MC simulations. These, however, are required to obtain an approximation for the joint density p(s (n) 0 , T n ) in Eq. 24. We therefore propose an approximation to compute E (T n T n+1 ) using the available densities F, G and H only. To that end, we note that p 3 (T n+1 , T n , y) = p(T n+1 , T n |y)p(y) .
If we now assume that T n and s (n) 0 are independent, we can approximate this as follows: This results in an alternative, approximative expression for the expectation E (T n T n+1 ): Eq. 27 is therefore equivalent to Eq. 24 if p(y, T n ) = G n (y)F n (T n ). Although Fig. 7 demonstates that p does not factorise (the joint density is negatively sloped), we show below that Eq. 27 approximates Eq. 24 very well. Given that Eq. 27 does not require additional MC simulations, the small error introduced by Eq. 27 is well offset by the large reduction in computational cost. There exists a third alternative expression for the expectation of the product of ISIs suggested for a different, but related, model, in [43]. It reads in our notation [44] E (T n T n+1 ) = dT n dT n+1 dy T n T n+1 H(T n+1 , f (T n , y))H(T n , y)G n−1 (y) , where f is given by Eq. 14 or Eq. 15. The term H(T n , y)G n−1 (y) that appears in Eq. 28 is the same as the one on the right-hand side in Eq. 18, which we used to obtain the CDF of s (n) 0 . For n = 1, Eq. 28 reads , because s is started from a point s (0) 0 , so that formally G 0 (y) = δ y − s (0) 0 , which collapses the integration over y in Eq. 28. Thus, for n = 1, Eq. 24 and 28 coincide. This is also true for higher values of n. A proof for this equivalence is presented in Appendix A. Eq. 28 only makes use of the quantities H and G, which can be computed using the iFPT approach as explained in the previous section. We show comparisons between MC simulations and the three expressions for E (T n T n+1 ), Eq. 24, Eq. 27 and Eq. 28, in Fig. 8. The results presented in Fig. 8 are in agreement with the observation that the two expressions Eq. 24 and Eq. 28 are equivalent. We find that the agreement of Eq. 28 with MC simulations is comparable to Eq. 27, particularly for exponential adaptation. Interestingly, the formally correct Eq. 24 and the approximate Eq. 27 give comparable results; Eq. 24 slightly deviates from MC simulations and Eq. 27 when n gets larger. The maximal relative disagreement between MC and iFPT results is less than 2% (Fig. 8, bottom panels). We will see below that the SCC is best approximated by using the exact result Eq. 24 (or equivalently Eq. 28), as we expect.
We attribute the discrepancy between MC simulations and the exact result Eq. 24 to the error caused by the numerical integration over the MC approximation of the joint density p(y, T n ). We checked that applying a kernel density estimation [45] to the MC results for p(y, T n ) did not alter these results. Similar results for Q 1 and Q 2 (Eqs. 22 and 23) are shown in Figs. 9 and 10. The agreement is good, with the maximal relative disagreement always less than 5%. The relative disagreement for the statistics of the product of two adjacent ISIs, Q 1 (n, 1), is in general larger than the error for the moments, as can be seen by comparing Figs. 2 and 3 with Fig. 9. Indeed, for the case of power-law adaptation, we observe an increase of roughly one order of magnitude in the relative error even when the more accurate CN scheme is used (see e.g. left panels of Fig. 9). An exception is the computation of the joint expectation, shown in Fig. 8, where, depending on which methods are compared, the relative disagreement is comparable in size to the one for the computation of the moments shown in Figs. 2 and 3. This increase of the relative disagreement makes the computation of the SCC using the iFPT approach hard, because the two expressions in the numerator of Eq. 21 are quite close to one another for the parameter values we have chosen here, meaning that the numerator is small and indeed of the same magnitude or even smaller as the relative disagreement, e.g. −2.6·10 −2 in the left panel and −8.6 · 10 −4 in the right panel of Fig. 8 for n = 1 for the MC-GS simulation method. The increase in the relative discrepancy is caused by error propagation, because for the second-order statistics, one has to multiply two quantities that both come with an individual error. Therefore, whereas the iFPT approach can in principle also be used to compute serial correlations present in the spike train, obtaining reliable results can in general be a computational challenge. When the negative serial correlations are stronger, so that the difference in the numerator of Eq. 21 is larger, the iFPT approach should give more accurate results. We stress that the dominant source of error is not the computation of the joint expectation E (T n T n+1 ) of ISIs, but the product of the expectation of ISIs and the variances, which can be seen by comparing the lower panels of Fig. 8 with those of Figs. 9 and 10. Finally, we show the SCC at lag 1 obtained by MC simulations and PDE numerics in Fig. 11. The agreement is worse than for all previously considered quantities, but still reasonable. To verify the MC simulations, we checked that our MC simulation setup was able to reproduce known analytical results for the SCC obtained in [15] for certain limiting cases. The anti-correlations between adjacent ISIs (SCC(n, 1) < 0) strengthen until they reach a stationary value. Thus, we see that MC and PDE results for the SCC in general do not agree as well as one would expect from the good agreement of the expecations E(T n T n+1 ) in Fig. 8. The deviation is likely more pronounced for parameters that lead to small negative SCCs, which we have for both The PDE results were obtained using a CN scheme (powerlaw adaptation) or an Euler timestepping scheme (exponential adaptation). The vertical error bars show the MC error for a > 99.99% confidence interval. Bottom: Relative disagreements defined by Eq. 13, where for power-law adapation, the GS boundary corrected MC algorithm was used, and for exponential adaptation, the plain MC algorithm was used. For the iFPT quantity, Eq. 24 was used. Parameter values as in Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation. models considered in this section. In the next Section, we will compare this with results for the perfect integrateand-fire model, where parameter values are chosen so that the SCCs are more negative and hence the agreement is better. This is because the two terms in the numerator of Eq. 21 are close to each other for small SCCs, and hence a small error in them impacts the accuracy of the SCC computation quite dramatically.
We show in the next section that our methods reproduce known stationary analytical results for the SCC when we consider the perfect integrate-and-fire model with single exponential adaptation in a parameter regime where we have large negative correlations, thus demonstrating that our methods are sound, but SCC calculations are very sensitive to numerical inaccuracies. Filled circles: Eq. 22, CN timestepping scheme for power-law adaptation and Euler timestepping scheme for exponential adaptation. Bottom: Relative disagreement defined by Eq. 13, a plain MC algorithm was used to obtain the relative disagreement. Parameter values as in Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation.

A. The perfect integrate-and-fire model
The adapting perfect integrate-and-fire (PIF) model driven by white Gaussian noise and a single exponential adaptation current is one of the simplest models for spike-triggered adaptation. For small noise intensity, analytical expressions for the stationary SCC exist. We here study this stationary limit case and compare analytical formulas to results obtained with the iFPT approach. The model reads (we follow the notation of [46] and [47]) The adaptation mechanism works in analogy to the previous model (Eq. 2): whenever X reaches the threshold X = 1, s receives a kick of size ∆ ≡ ∆ τa and X is instantaneously reset to 0. Filled circles: Eq. 23, CN timestepping scheme for power-law adaptation and Euler timestepping scheme for exponential adaptation. Bottom: Relative disagreement defined by Eq. 13, a plain MC algorithm was used to obtain the relative disagreement. Parameter values as in Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation. Parameter values as in Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation. where Thus, we can compute the SCC in closed analytical form as a function of the system parameters. This formula serves as an important benchmark for our numerical results. In particular, we expect that after the described transition to stationarity, the SCC given by Eq. 21 will approach the stationary SCC given by Eq. 31. This is confirmed in Fig. 12. In particular, the agreement between MC simulations and the exact formula Eq. 24 is very good (the relative disagreement between PDE numerics and the analytical result is less than 6% for the   Fig. 2 for power-law adaptation and as in Fig. 3 for exponential adaptation. Even if the joint expectations shown in Fig. 8 agree well, this does not imply that the SCC will be well approximated; the small correlation values (i.e. the difference between E(TnTn+1) and Q1(n, 1)) for the two examples lead to large discrepancies in the SCCs, which are especially significant for the case of power-law adaptation.
stationary value); the agreement of MC simulations with the approximation Eq. 27 is a bit worse, but still reasonable. Thus, we conclude that our methodology can be used more generally to compute the evolution of the moments and SCCs. However, as seen in the previous section, to obtain a good agreement between MC simulations and PDE numerics, the computational effort might be rather large. In particular, we note that the PIF example shown in Fig. 12 gives rise to stronger negative SCCs, which means that the error propagation has less of an effect, but is still present, even when moments of firing times between MC and PDE numerics disagree by less than 1% (data not shown). We finally note that it is also possible to analytically compute the stationary SCC at higher lags and for different models (e.g. the leaky integrate-and-fire model in the presence of weak noise or for small adaptation currents) using the approach described in [47], or, using a different approach, in [15].

VII. SUMMARY AND CONCLUSIONS
In this paper, we have developed a numerical method for the computation of moments and correlations in general two-dimensional non-renewal escape time processes. Our approach relies on the numerical solution of a twodimensional time-dependent FPE with initial conditions obtained from marginal distributions of previous states of the system. Crucially, the computation scheme presented in this study is general insofar as it can be applied to any stochastic process with a known reset condition (Eq. 1) and any deterministic signal (Eq. 2). As an important application, we have described the transition to stationarity in a stochastic IF neuron model with spike-triggered adaptation, which causes non-trivial ISI correlations. A different mechanism for introducing positive correlations between ISIs has recently been reported in [28] and can equally well be analyzed with the presented methodology. Moreover, our approach enables us to determine the non-trivial timescale of transition to a stationary adapted state by counting the number of intervals needed for this transition. Experimentally, the transition to stationarity is often characterized by the behaviour of the instantaneous firing rate [17,18] [48]. The instantaneous firing rate is usually obtained by averaging the neuronal activity bin-wise for a fixed time. This differs from the firing rate used here as given by the inverse of the mean ISI (Eq. 5). In other words, while the instantaneous firing rate is measured in real time, our firing rate relates to interval numbers. This entails that for a given time t, the firing rate contains contributions from, in general, past firing events that may have occurred at any point k in the spike train. Knowing the joint distributions of all ISIs T k , it is at least in principle possible to reconstruct the instantaneous firing rate, whereas given the instantaneous firing rate, we cannot reconstruct the joint distributions of the individual ISIs T k . Despite the difference in the definition of the firing rate, it might be an interesting topic for further study to classify the time scales of the transition to stationarity both experimentally and based on the theory presented here. The computation of ISI moments using the iFPT approach is computationally inexpensive, giving rise to small relative disagreements between solutions of the FPE and direct MC simulations. In contrast, the computation of correlations is harder. We observed that we lost one order of magnitude in accuracy compared to the simulation of the moments for the quantities Q 1 (Eq. 22) and Q 2 (Eq. 23), which makes the reliable computation of SCCs a computationally challenging task. We conclude that even a relative disagreement of ISI moments between Monte Carlo estimations and PDE solutions of the order 10 −3 is not enough to reliably estimate the SCC using PDE numerics only (but this might be specific for the examples we have considered), indicating that more refined numerical methods or larger computational ressources, or indeed both, are needed. When the difference between the two terms in the numerator of Eq. 21 is large, the small error made by the numerical solution of the PDE should have a less detrimental influence on the final result. The need for more refined numerical methods is further substantiated by the fact that the more accurate asymptotically stable CN timestepping scheme did not result in a significant decrease in the relative disagreement between PDE results and both plain MC and MC-GS simulations, for both moments of firing intervals and the SCC. In this paper, we have only discussed the error associated with MC simulations, because it is readily available. The numerical solution of the FPE is of course also subject to numerical errors and future work will likely benefit from a discussion about how to systematically reduce these errors. In this context, it might be beneficial to compare the finite-element methods used here to other methods for solving PDEs, such as finite difference and finite volume methods [49]. A systematic error estimation study might be made more difficult by the fact that the diffusion matrix (Eq. 9) is not positive definite [50,51].
There is an alternative method to compute the ISI distributions given the distributions of the peak adaptation currents using the formula H(t, y)G k−1 (y)dy .
This is an integral equation frequently used in the context of randomized FPT problems [52,53], where usually F k and the kernel H are given, and one tries to find a matching distribution G k−1 of starting points. Using Eq. 32, we do not have to solve a time-dependent PDE for each ISI, but must compute H once as the solution of a time-dependent FPE with varying initial conditions for s, similar to the computation of F 1 . The averaging that the iFPT approach amounts to is particularly clear in this formulation. The densities G k are obtained as discussed above (see Eq. 18). The approach relying on Eq. 32 might be computationally less expensive, but we found that it is not as exact as solving a time-dependent FPE for each ISI, especially at larger times. This is likely caused by errors when computing H, as the numerical integration in Eq. 32 can be performed accurately and efficiently. However, Eq. 32 could be useful for analytical explorations when H is known. We finally emphasize that our approach did not use the complicated boundary conditions for stationary IF models, where the probability flux at threshold gives rise to a discontinuity of the probability flux at reset [9,54]. In contrast, our approach allows for the computation of transient and stationary distributions of the adaptation dynamics in an iterative fashion, requiring the solution of a two-dimensional time-dependent PDE. The only boundary condition that has to be taken into account is an absorbing boundary condition for the probability density at the threshold x th . This makes the problem tractable using finite-element approximation methods for time-dependent PDEs, resulting in a general description of two-dimensional IF models with spike-triggered adaptation. The approach we have described in this paper can in principle also be used to gain analytical insight into these system, however, quantities such as H and the solution of a two-dimensional time-dependent PDE seem to be unavailable in closed analytical form except in the most simple cases.
where y = s (n−1) 0 and we have replaced H(T n , y)G n−1 (y) = p(y, T n ). Note that in Eq. A1, p is the joint density of s . We need to assume that f is invertible with respect to the second argument, which is the case for both power-law (Eq. 14) and exponential adaptation (Eq. 15) considered in this paper. The integral then becomes dT n dT n+1 ds (A6) Therefore, Eq. 24 and Eq. 28 are equivalent.