Extraction of Synaptic Input Properties in Vivo

Knowledge of synaptic input is crucial for understanding synaptic integration and ultimately neural function. However, in vivo, the rates at which synaptic inputs arrive are high, so that it is typically impossible to detect single events. We show here that it is nevertheless possible to extract the properties of the events and, in particular, to extract the event rate, the synaptic time constants, and the properties of the event size distribution from in vivo voltage-clamp recordings. Applied to cerebellar interneurons, our method reveals that the synaptic input rate increases from 600 Hz during rest to 1000 Hz during locomotion, while the amplitude and shape of the synaptic events are unaffected by this state change. This method thus complements existing methods to measure neural function in vivo.


Introduction
Neurons typically receive a barrage of thousands of excitatory and inhibitory events per second. As these inputs determine to a large extent the spiking activity of the neuron, it is important to know the properties of synaptic input and how it changes, for example, with behavioral state (e.g. sleep, attention, locomotion), with plasticity, or with homoeostasis. Consider a neuron receiving synaptic input while the total current 1 is being measured in voltage-clamp, Fig.1a. While in vitro, or in cases where activity is artificially lowered, individual excitatory and inhibitory inputs can be resolved (top), in vivo the rates are typically so high that this is impossible. Instead, the total synaptic current trace is wildly fluctuating and single event extraction methods will fail.
Nevertheless, information can still be extracted from the statistical properties of the recorded in vivo currents. Although individual synaptic events might not be distinguishable in the observed current trace, the trace will still bear signatures of the underlying events. Intuitively, the mean current should be proportional to the product of the synaptic event size and the total event frequency. But it is possible to extract other information as well. For instance, when the synaptic events have short timeconstants, the observed current trace will have more high frequency content than when the synaptic time-constants are slow. Similarly, when the input is composed of many small events, the variance of the current trace will be smaller than when it is composed of a few large events.
An early application of these ideas was used at the neuro-muscular junction (Katz and Miledi, 1972) and in the retina to measure visually evoked transmitter release (Ashmore and Falk, 1982). Other earlier methods have estimated of both the excitatory and inhibitory conductances using across trial averages of different magnitude current injections (Borg-Graham, Monier, and Frégnac, 1996;Häusser and Roth, 1997;Anderson, Carandini, and Ferster, 2000;Wehr and Zador, 2003;Rudolph et al., 2004;Greenhill and Jones, 2007). More recently, conductances have been estimated from a single trace by applying a diverse range of probabilistic inference methods. In some of those studies the size of the excitatory and inhibitory inputs is assumed to be identical, fixed, and known a priori, while the synaptic inputs were assumed to be δ-functions, with instantaneous rise and decay time and Poisson statistics (Kobayashi, Shinomoto, and Lansky, 2011). Some of the assumptions were relaxed in Paninski et al. (2012), where the number of inputs in a time window followed either an exponential or truncated Gaussian distribution, but the synaptic decay time constant had to be known a priori. Finally, Lankarany et al. (2013) further generalized the distribution of the number of inputs in a time window by making use of a mixture of Gaussians. This method allowed a good estimation of the conductance traces even when the distribution of synaptic amplitudes has long tails.
These methods typically attempt to recover the global excitatory and inhibitory conductances. Here instead we focus on recovering the statistics of the constituent synaptic events. Specifically, we introduce a method that aims to infer the event rate, synaptic time-constants, and distribution of synaptic event amplitudes from the power 2 spectral density and statistical moments of the observed current trace. We applied our method to voltage-clamp traces of electrotonically compact interneurons recorded in the cerebellum of awake mice. We find that during voluntary locomotion, the excitatory input rate increases from 600 to 1000 Hz, while the synaptic event amplitudes remain the same. Our method thus provides a novel way to resolve synaptic event properties in vivo. Semi-automated single event analysis Figure 1: Inference of synaptic properties. a) A neuron receives input from a number of synapses under a Poisson rate assumption. The events have identical shape, but the amplitude a varies between events. Right: For in vitro experiments synaptic events rates are typically low and the individual events can be extracted and quantified. However, for in vivo experiments, rates are high and individual events are not distinguishable. b) Analysis based on semi-automated single event extraction produces incorrect results when the total rate exceeds 500 HZ. From left to right: estimated event frequency, estimated mean event amplitude, estimated standard deviation of the event amplitude. Model parameters: a k = 50pA. rise-time τ 1 = 0.3 ms and the decay-time τ 2 = 2 ms.

Results
A common method to extract synaptic properties is to identify and analyze isolated events from current traces, but in vivo this fails because the events will overlap, Fig.1a.
To demonstrate the problem explicitly we simulated a neuron randomly receiving excitatory synaptic events (see below for model details). For illustration purposes we assume momentarily that the amplitude of all events is identical (50pA). From the total current recorded in voltage clamp we attempt to reconstruct the frequency of events and the distribution of their amplitudes.
We used single event dectection software (see Methods) to find putative post-synaptic currents (PSCs). At low input frequencies (50Hz), most of PSCs were correctly identified and the resulting estimation of the synaptic input amplitude distribution was correct. However, at higher frequencies, when the event interval became shorter than the synaptic decay time, the event frequency was grossly underestimated and reached a plateau, Fig. 1b had to be manually post-processed to correct mistakes in event detection. This manual processing is time consuming -even an experienced researcher spent more than 1 hour to analyze a 10 second trace. Thus at high input frequencies single event analysis is not only incorrect, it is also time consuming. While somewhat better result might be achievable when the PSCs are identified by their rising phase, such methods will still fail at high rates.

Generative model for the observed current trace
Unlike the in vitro situation, the synaptic properties are not directly accessible from in vivo recordings. Instead, the data indirectly and stochastically reflects the synaptic properties. We therefore use a generative model to couple the data, in particular the statistics of the current trace, to the underlying synaptic properties. We define the generative model as follows: the synaptic inputs are assumed to arrive according to a Poisson process with a rate ν, Fig.1a (also see Discussion). The synaptic events are 4 modelled with a bi-exponential time-course as this can accurately fit most fast synapses (e.g. Roth and van Rossum, 2009) with rise-time τ 1 and decay-time τ 2 . While we initially assumed that all PSCs have the same time constants, the effect of heterogeneous time-constants is studied below. The total current is where t k denotes the time of event k, and a k is the amplitude of that event. Unlike the schematic example above, the event amplitudes were drawn from a synaptic amplitude distribution P (a) (with a ≥ 0). This distribution captures the spread of amplitudes across the population of synapses, as well as variation in single synapse event amplitudes due to randomness and non-stationarities such as short-term plasticity.
Although our method is general and not restricted to any specific distribution of synaptic amplitudes, we consider for concreteness the amplitudes to be distributed as either: 1) a log-normal distribution (LN ) with raw moments a n ≡ ∫ ∞ 0 P (a)a n da = e np 1 +n 2 p 2 2 /2 . Or, 2) a stretched exponential distribution (SE ) with moments a n = p n 1 p 2 Γ ((1 + n)/p 2 ) /Γ(1 + 1/p 2 ) where Γ(·) is the Gamma function.
Or, 3) a zero-truncated-normal distribution (TN) where h = −p 1 /p 2 , and ϕ(·) and Φ(·) are the density of a normal distribution with zero-mean and unit variance and its cumulative. The mean µ a = p 1 + p 2 ρ and variance (for higher moments see e.g. Horrace, 2013).
These three probability distributions (examples are shown in Fig.2) are commonly used in the experimental and theoretical literature (Song et al., 2005;Barbour et al., 5 2007;Buzsáki and Mizuseki, 2014). The stretched exponential distribution has a maximum at zero amplitude, while the two other distributions have an adjustable mode, but differ in the heaviness of their tails. The LN and the SE distribution(for p 2 < 1) are heavy-tailed, while the TN distribution is not. Conveniently all these distributions are characterized by the two parameters p 1 and p 2 . These parameters determine the mean and variance of the events but, as can be seen from the equations for the moments above, the exact relations are different for each of the three candidate distributions.

Moments of the synaptic current
Next, we calculated the current trace I(t) that results from the random inputs. The statistics of the current follow from the distribution of synaptic event amplitudes and the time-course of the events according to Campbell's theorem (Rice, 1954;Bendat and Piersol, 1966;Ashmore and Falk, 1982). The cumulants κ n of the current probability distribution P (I) follow from the event distribution and the synaptic time-course as In this equation the raw moments a n of the synaptic event amplitude distribution P (a), are given above for the different candidate distributions. Furthermore, for the biexponential synaptic kernel f (t) (Eq. 1) the integrals are ∫ ∞ 0 [f (t)] n dt = n!τ 1 Γ(n τ 1 τ 2 )/Γ(1+ n + n τ 1 τ 2 ). Finally, the moments of the current trace M I are expressed in the cumulants κ n . In practice we use the first four moments of the current distribution, We can thus express the statistical moments of the distribution of the observed current trace, Eq.7, in the underlying model.

Power spectrum of the synaptic current
Also the power spectral density (PSD) of the current I(t) can be expressed in the model parameters. The current is the convolution a Poisson process, which has a flat power spectrum, with the synaptic kernel. As a result, the PSD is the magnitude of  Figure 2: Bayesian network representing the dependencies between the variables. Orange nodes represent variables that have to be inferred from the data, green nodes stand for variables that are measured directly from the data. The blue nodes are additional contributions to the current in typical experiments. The top left graph shows the PSD fit (red line is the fit with Eq. 8) and the bottom right graph is the probability distribution of I(t), used to calculate the observed moments M I . All variables are described in Table 1. the Fourier transform of the synaptic kernel Eq.1 and for non-zero frequencies equals (Puggioni, 2015) Note that being a second order statistic, the PSD depends on the mean and variance of the amplitude distribution P (a) only.

Inference procedure
Now that we have expressed both the PSD and the moments of the current distribution in the model parameters, one could proceed using classical fitting techniques, such as least square fitting, to find the synaptic parameters that best fit the data. However, we use a probabilistic approach that yields the distribution of parameters that best fit the data. A probabilistic approach is advantageous because: 1) We expect strong correlations between the model parameters, this can cause traditional fitting to fail (see e.g. Costa, Sjostrom, and van Rossum, 2013).
2) The probabilistic approach naturally yields the probability distribution of possible fit parameters.
3) The probabilistic approach  We first present an idealized model, which ignores some distortions typical of in vivo recordings. Fig. 2 shows the Bayesian network and the dependencies among the variables (nodes). The green nodes stand for variables that are measured directly from the data: the PSD and the first four moments of the current M I = [µ I , σ I , skew I , kurtosis I ].
Together the data are succinctly denoted D. The orange nodes represent variables that are to be inferred. The 5 parameters of the model are the rate ν, the mean synaptic amplitude µ a , its variance σ a , synaptic rise-time τ 1 and decay time τ 2 , as well as the type of distribution S a , Table 1. The set of parameters is denoted θ.
Written formally, the joint probability of the Bayesian network in Fig. 2 is where we introduced the vector of moments of the synaptic amplitude distribution, M a (see below).
From this equation the parameter distribution given the data P (θ|D) follows as P (θ|D) ∝ P joint (θ, D). We now describe P joint and the probabilistic dependencies among the nodes term by term. The first two terms infer the values for the synaptic 8 time-constants from the PSD. Since we cannot obtain an analytic expression of the likelihood of the PSD, we use empirical Bayes to set the prior on the time constants of the post-synaptic current (Casella, 1985). We fit the shape of Eq. 8 with a least square method to the PSD to find τ 1 and τ 2 (see top left inset in Fig. 2; note that the values ν, µ a , σ a are not needed to perform this fit). Since we found the relative cross-terms of the Hessian matrix between τ 1 and τ 2 to be very small (< 0.005), we model the time constants with independent Gaussian distributions with mean and variance given by the mean and the Hessian of the PSD fit. A common criticism of empirical Bayes is that it uses data for both prior and inference, thus double counting the data. Here however, the PSD data is used to set the prior on the time constants, but it is not used as evidence in the inference process, Fig. 2.
The next term in Eq. 9 is f (M a |µ a , σ a , S a ). This is a deterministic function, because the moments of the synaptic amplitude distribution M a are fully determined by µ a , σ a and the type of amplitude distribution type, see Eqs. 3-5. The parameters µ a , σ a and ν are given uninformative uniform priors spanning a reasonable and positive range of values.
The final term in Eq. 9, the likelihood of the moments of the current P (M I |τ 1 , τ 2 , M a , ν) cannot be calculated analytically. Although Eq. 6 gives the expected value, M I is a stochastic quantity that due to the Poisson process is different on each run and thus requires simulation. However, below we present a method to speed up its calculation.

Inclusion of in vivo variability and other experimental confounds
In vivo voltage clamp recordings show a number of effects that need to be included in the model via additional parameters. The first additional feature is the baseline current (i 0 ) of the voltage clamp that has to be subtracted from the current. In in vitro situations one can estimate it by finding the baseline of the current trace, but due to the high input rates this is challenging for in vivo recordings. Instead a prior probability of P (i 0 ) was included. It was normally distributed with mean and variance estimated with an informed guess, reflecting the uncertainty in the value of i 0 .
The second feature is the inclusion of high frequency noise coming, for instance, from the recording set-up or from the stochastic opening and closing of ion channels.
Its standard deviation σ H is measured experimentally and we model it as a zero mean Ornstein-Uhlenbeck (OU) process where B t is a Wiener process and the cut-off frequency is 1/(2πτ H ) = 600 Hz.
Finally we include low frequency fluctuations typically present in in vivo synaptic activity (e.g. Schiemann et al., 2015). We relax the constant rate assumption by adding a modulation term to the Poisson rate, which is modeled as an OU process with power σ 2 L and cut-off frequency f L = 1/(2πτ L ) of 5 Hz As a result in the expression for the PSD, Eq. 8, the rate ν is replaced by (ν + P SD OU (f )), where the power spectrum of the OU process is given by To find the variance of this slow noise, we fit the PSD with Eq.8 in a range above f L and calculate its integral σ th (the theoretical standard deviation of the modulation-free trace). The value of f L was set heuristically as the minimal value that that resulted in a good fit. Since the observed variance of the signal σ 2 obs is the sum of σ 2 th , σ 2 H and σ 2 L (the slow component is independent from the underlying process), it follows that σ 2 L = σ 2 obs − σ 2 H − σ 2 th . These three additional features are depicted by the blue nodes in Fig. 2. The joint probability for the full model becomes P joint =P (τ 1 |PSD)P (τ 2 |PSD)P (M a |µ a , σ a , S a )P (µ a )P (σ a )P (ν)× P (i 0 )P (σ H )P (σ L |PSD)P (M I |τ 1 , τ 2 , M a , ν, i 0 , σ H , σ L ) (12)

Description of the sampling algorithm
In the Bayesian framework, the posterior probabilities of the parameters of the model can be estimated by sampling from P joint , for instance using a suitable Markov chain Monte Carlo algorithm. However, this approach is very slow, because the likelihood does not have a closed form and has to be estimated with multiple simulations after each MCMC sample. As the estimation of the likelihood takes about 1 minute on a standard PC, a typical MCMC run of ∼ 100000 samples would take approximately 2 months.
We introduce a speed up that can be used whenever a likelihood can only be obtained by sampling from the generative model, but its means can be calculated analytically.
The idea is to fit the likelihood with a kernel density estimate (KDE). Assuming that the shape of the likelihood does not depend much on the parameter values, the same KDE can be exploited to approximate the likelihood for different parameter values. As a result we can keep the shape fixed, but we translate it to a new location determined + + Analytically calculated mean  To perform the inference, we first initialize the parameters {τ 1 , τ 2 , M a , ν} by Least-Square fitting Eq. 7 to the observed moments and Eq. 8 to the observed PSD. Next, we run the generative model multiple times to calculate the shape of P (M I |τ 1 , τ 2 , M a , ν) using an exponential KDE. Finally, during the main MCMC run where we sample P joint , we keep the shape fixed but at each step we translate it to the location of the analytically calculated average moments (Eq. 6, red crosses in Fig 3).

Validation on simulated data
To validate the method we simulated 10s current traces with known parameters and we apply our inference method to recover their values. One parameter at a time was in vivo PSCs in cerebellar interneurons (Szapiro and Barbour, 2007, and below). We first assumed that the shape of the synaptic amplitude distribution (LN, SE, or TN) is a priori known. Fig. 4a compares the estimated parameters vs. their true value. The inference works well in a physiologically plausible range and the true value is almost always within the confidence interval. The largest error bars occur when either the mean event amplitude is small or the std dev. is large, i.e. the CoV is large.
The approach also yields the inferred joint distribution for a given parameter setting.
The posterior distribution of the parameters contains the true values in the region of maximal density, Fig. 4b. Unlike single point estimates (e.g. maximum a posteriori, MAP estimates), one can also evaluate the dependencies between the parameters. In particular we observe a strong anti-correlation between event rate and event size (bottom left panel). In other words, the model compensates for changes in the rate by changing the estimate for the event size; their product is approximately invariant.

Model selection
Next, we tested whether the method is able to recover the correct amplitude distribution (LN, SE, or, TN) when it is not known a priori. The Bayesian framework offers straightforward tools to assess the likelihood of a model, such as the Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002). The higher is the DIC, the less likely is the model suitable to describe the data, and this would be the simplest way to choose the most likely distribution. However, the DIC value is a random variable that varies from trial to trial. Thus rather than selecting the lowest DIC, we use Bayesian model comparison based on the distribution of the DIC values. We generated 100 traces using a given amplitude distribution and run the inference algorithm assum- where in the second line we assumed that each amplitude distribution is a priori equiprobable. Thus, for each point in the space (∆ LE , ∆ LT ), we select the distribution TruncNormal Exponential LogNormal a b c Figure 5: Inference of the underlying weight distribution of simulated data. a) The distribution of DIC differences for the three simulation weight distributions. As the shapes of the distributions differ, we used Bayesian model selection. The contour lines are linearly spaced. b) The resulting maximum likelihood solution that tells which underlying distribution is most likely. c) Performance of the algorithm to recover the correct weight distribution (expressed as fraction correct, based on 100 runs).
which has the highest probability according to Eq.13, see Fig. 5b. This method is able to correctly identify the amplitude distribution with ∼ 90% accuracy, Fig. 5c.

Robustness of method
We examined the robustness of the method in a number of ways. First, we explored how the posterior of the parameters depends on the length of the trace. Longer traces should lead to less uncertainty and yield narrower, more precise distributions, because more statistics are collected. However, short intervals are preferable, because they allow the analysis of shorter periods in in vivo traces and allows one to see more rapid modulation in the synaptic inputs. Indeed, longer traces lead to less uncertainty on the parameters, Fig. 6a. The analysis shows that 10 second long recordings are in general enough to obtain a reasonable estimation of the parameters.
Next, in vivo PSCs rise-and decay-times might vary across synapses as different synapses may have different kinetic properties and may be subject to different amounts of dendritic filtering (Williams and Mitchell, 2008). To test whether our model performs well when the shape of the PSCs varies, both time constants that determine the PSC b) Robustness to heterogeneity in the synaptic time-constants as expressed in the CV of the rise-and decay-time constants, that were both independently drawn from a truncated normal distribution. c) Robustness to in vivo variability when an inhomogeneous low frequency (< 5 Hz) component is added to the Poisson rate. The parameters' estimation is plotted against the contribution (in percentage) of the low frequency modulation to the total standard deviation. i) Result after correcting the PSD at low frequency (text). ii) Without the correction, substantial biases arise.
shape were independently drawn from truncated normal distributions for each PSC.
When the heterogeneity of the time-constants were modest (CV≲ 0.3), the model correctly extracted µ a , σ a and ν, but at larger values the frequency estimate in particular diverges, Fig. 6b cut-off frequency. When we applied the correction described above, the model performs well even in presence of considerable fluctuations in the synaptic input rate, Fig. 6c.i.
Including the correction is important, as without it large biases arise, Fig. 6c.ii.

Inference method applied to cerebellar in vivo data
We applied our inference method to in vivo recordings obtained from cerebellar interneuronsin the molecular layer (basket and/or stellate cells). These neurons are ideal to test our method as they are electronically compact (Kondo and Marty, 1998), although some cable filtering can be observed in older animals (Abrahamsson et al., 2012). The voltage clamp held neurons at -70mV to isolate excitatory inputs. The head-restrained mice displayed bouts of self-paced voluntary locomotion on a cylindrical treadmill, Fig. 7a.
All traces (n = 8) were 90 seconds long and contained at least 10 seconds of movement. Locomotion modulates subthreshold and spiking activity in a large number of brain regions (Dombeck, Graziano, and Tank, 2009;Polack, Friedman, and Golshani, 2013;Schiemann et al., 2015). In cerebellar interneurons, locomotion is associated with increased excitatory input drive, Fig. 7b. In particular we were interested in what underlies this increased drive. For instance, it could be caused by increased frequency, increased amplitude as an effect of neuromodulation, or recruitment of a distinct set of synapses.
To apply our method we extracted the PSD and distribution from the current trace, Fig. 7b. We corrected for the low frequency modulation as described above, while the high frequency noise was measured directly from the system. The subsequent inference showed that the increase in excitatory synaptic current is associated with an increased input frequency, shown for a representative trace in Fig. 7c Observed current distribution in the moving and quiet periods. Note that due to the high input frequency, periods with zero current are very rare. Right: Samples of the recorded Power Spectral Density. c) Posterior distribution of the input parameters of a representative interneuron (under Log-Normal assumption, which was the most likely distribution for this neuron). While the estimation of µ a and σ a during movement, has less uncertainty, their maximum likelihood value is hardly changed. d) Inference of the synaptic input parameters across 8 recordings displaying an increase in the input frequency during movement but not in the mean or variance of the event amplitude. e) Classification of the synaptic event amplitude distribution. In both conditions both Log-normal and Stretched exponential distributions were observed. The truncated normal was inferred only once. Error bars denote the (min, max) range, boxes the 25th-75th percentile, horizontal bar the median.
frequency roughly doubles, from 585 to 1006 Hz. The synaptic time constants found by fitting the power spectrum of the current, were τ 1 = 0.25 ± 0.04 ms and τ 2 = 1.56 ± 0.21 ms (mean ± standard error), comparable with the 20-80% rise time of 0.41 ± 0.14 ms and the 1.85 ± 0.52 ms decay reported in slice (Szapiro and Barbour, 2007).
Across the population the MAP estimates of µ a , σ a and ν during quiet wakefulness and movement show a similar pattern, Fig. 7d and Table 2. Note that given the small changes between quiet and moving state, the power of the test calculated from the data is low, but 10% changes would be detected with high probability.
Next, we applied our inference method to each trace using the LN (log-normal), SE (stretched exponential), and TN (truncated normal) distribution and determined which synaptic amplitude distribution was the most likely, where it should be kept in mind that the statistical power of the data is limited. In general, we found that both during quiet periods and movement the most likely distributions were heavy-tailed being either LN or SE (with exponent on average 0.8, range 0.7 -1.2), Fig. 7e. In particular, during active periods the LN distribution (the most common) was significantly more likely than the TN (p=0.046), but the SE distribution was not significantly less likely (p=0.37).
Thus while this suggests that the distribution is stretched, the current data can not distinguish between the LN and SE types. Next, we wondered if the shape change using a binomial test. For instance, if a recording yielded 5 times the LN distribution out of 8 data traces during the quiet period, and did so 6 out of 8 times during the active period, there is no evidence for a change. We found no evidence for a change in the distribution shape between quiet and active period (LN,p=0.78;SE,p=0.96;TN,p=0.71).
Finally, we compared our estimates to a standard single event extraction method (see Methods). Because the event extraction method fails at frequencies higher than ∼ 500 inputs per second, the frequency of the synaptic inputs is underestimated by a factor two, due to the misclassification of overlapping events.

Discussion
In the last decade, numerous studies have been published using voltage-clamp data from anesthetized animals to investigate the contribution of excitation and inhibition to the V m dynamics, with recordings from auditory cortex (Wehr and Zador, 2003;Poo and Isaacson, 2009;Liu et al., 2010), visual cortex (Liu et al., 2010;Haider, Hausser, and Carandini, 2012), and pre-frontal cortex (Haider et al., 2006). However, in these experiments only the total excitatory or inhibitory contributions can be extracted, therefore 18 they are unable to distinguish properties of single synapses and changes therein. We proposed a novel probabilistic method to infer the synaptic time-constants, the mean and variance of the synaptic event amplitude distribution, and the synaptic event rate from in vivo voltage-clamp traces. Moreover, the method accurately recovers the shape of the distribution of synaptic inputs. The inference is robust to slow fluctuations of synaptic input rate, experimental noise, and to heterogeneity in the time constants of the PSCs.
The extracted distribution reflects the amplitude of events received by the neuron. It therefore includes not only variations across synapses, but also variation due to synaptic unreliability and heterogeneity from effects like short-term synaptic plasticity (Szapiro and Barbour, 2007;Abrahamsson et al., 2012). Furthermore, the contribution of each synapse is weighted by its own input rate: synapses receiving inputs at higher rates will contribute more to the estimated amplitude distribution than synapses receiving low rates. Our method thus captures the effective distribution of synaptic inputs in an in vivo recording and thereby complements techniques that infer the amplitude distribution either anatomically from spine size or from paired recordings in vitro, and that are not weighted by the input rate.
Applied to voltage-clamp recordings from cerebellar interneurons of awake mice, we found that the excitatory synaptic amplitude distribution is either a stretched exponential or log-normal. This means that the probability for large events is larger than for a Gaussian with same mean and variance. Such heavy-tailed distributions have been observed in a number of systems (Sayer, Friedlander, and Redman, 1990;Song et al., 2005;Barbour et al., 2007;Ikegaya et al., 2013) and are believed to be an important characteristic of neural processing (Koulakov, Hromádka, and Zador, 2009;Roxin et al., 2011;Teramae, Tsubo, and Fukai, 2012). While any distribution can be tested (although for efficiency reasons the moments should ideally be available analytically), a future goal is to reconstruct the amplitude distribution directly, for instance by reconstructing it from it moments. However, there are currently no fully satisfactory mathematical methods to achieve this.
Furthermore we found no evidence that the synaptic amplitude distribution changes in these neurons when the animal is moving. Instead the increase of the excitatory current during movement is due to the higher frequency of the inputs. The most parsimonious explanation is that all inputs, big and small, increase their rates similarly during movement. However it is important to remember that the method is based on the ensemble of inputs. While our findings are inconsistent with a case where only large inputs become more active, and inconsistent with a case where all single synaptic events become stronger by, say, neuro-modulation, we can however not rule out that for instance a second population of inputs with an identical amplitude distribution becomes active during movement.
We summarize restrictions of the method. First, as in most methods, the in vivo traces need to be stationary over a period long enough to accumulate sufficient statistics.
The second assumption is that the neurons are electronically compact such that a good 'space clamp' can be achieved, which is problematic for Purkinje and pyramidal neurons (Williams and Mitchell, 2008). It would be of interest to examine how robust our method is toward deviations from this (Abrahamsson et al., 2012), e.g. using compartmental simulations.
The third assumption is that synaptic inputs are uncorrelated and follow a Poisson distribution. Experimental measurements of correlations in the brain are contradictory and largely depend on what time-scale is considered, reviewed in Cohen and Kohn (2011). Notably, slow correlations are visible in the PSD, adding a component with a different time-constant (Moreno-Bote, Renart, and Parga, 2008). When fitting the PSD of in vivo data, we observed a bump in activity in the low frequencies (f < 10 Hz), that could correspond to spike correlations on time-scales 15ms. Such correlations are included in our model. The method would not be able to identify spike-correlations on the order of the synaptic time-constants (τ 1 and τ 2 ), because they would contribute to the PSD in the same frequency range. However, it is generally believed that spike count correlations on a short time scale (∼ 2ms) are small, normally < 0.03 (Smith and Kohn, 2008;Helias, Tetzlaff, and Diesmann, 2014;Grytskyy et al., 2013;Renart et al., 2010;Ecker et al., 2010), and thus the inference would likely still give correct results. Recent experimental evidence shows that high frequency firing Purkinje cells contact interneurons directly (Witter et al., 2016), which could lead to strongly correlated input. We did not observe bumps in the interneuron powerspectum at around 200-250 Hz, which is the typical cerebellar oscillation frequency (de Solages et al., 2008). Nevertheless, when applying this method one should be aware of the possibility of such effects. Finally, in these population measurements truly instantaneous correlations, where multiple events arrive simultaneously, can in principle never be distinguished from altered distributions.
However, the error associated to this effect is likely limited. Consider a neuron that receives inputs of equal amplitude a at a rate ν. If the inputs have correlation c = 0.05, it means that every 100 events, as a first approximation one will observe on average only 95 events, 90 of size a and 5 of size 2a. In general, for a given correlation c, the observed frequency is ν obs = ν true (1 − c) and the observed average amplitude a obs = a true /(1 − c).
Thus, even assuming c = 0.05, the error in the estimate would be ≤5%.
While the robustness of the method can further be tested using (multi-compartment) simulations, physiological validation is much harder as even with optogenetics there is no obvious way to generate high frequency Poisson-like input trains.
The method used the first four moments of the measured current. While in order to infer the distribution shape, one needs at least three moments, we found that using only the first three led to a consistent overestimate of the event amplitude with some 10 pA (not shown). In contrast, higher moments are difficult to estimate with brief recordings. Thus four seems a good middle ground for the recordings analyzed here.
In summary, commonly used methods to analyse in vivo voltage clamp data can not infer the single event statistics at all or introduce large errors. Instead the proposed method represents an important step to extract such information from in vivo intracellular recordings.

Acknowledgments
We are grateful to M. Nolan, P. Latham, and L. Acerbi for helpful discussions.

Implementation details
We implemented the model in PyMC2, a python package to perform Bayesian computation (Patil, Huard, and Fonnesbeck, 2010), using a Metropolis Hastings sampler, with normal proposal distribution and standard deviation in each dimension equal to 1 over the absolute value of the parameters. Usually, the auto-correlation of the chains was about 300 − 500 samples and the burn-in phase was about 10 effective samples. To construct the posterior, we generated 150,000 samples yielding ∼ 400 effective samples and assessed the mixing by using the Geweke method provided by the PyMC package. The computational analysis tools and data are available at https: //github.com/ppuggioni/invivoinfer. To compare our method to traditional single event detection methods, we employed TaroTools, a freely available IgorPro package (see sites.google.com/site/tarotoolsregister/) to detect putative post-synaptic currents (PSCs).
The experimental data is described in detail elsewhere (Jelitai et al., 2016). Briefly, whole-cell patch clamp recordings of molecular layer interneurons (basket and stellate cells) were obtained from awake behaving but head-restrained mice at a depth of 100-300 µm from the pial surface of the cerebellum, using a Multiclamp 700B amplifier (Molecular Devices, USA). The signal was filtered at 6 -10 kHz and acquired at 10 -20 kHz using PClamp 10 software in conjunction with a DigiData 1440 DAC interface (Molecular Devices). Patch pipettes (5-8 MΩ) were filled with internal solution (285-295 mOsm) containing (in mM): 135 K-gluconate, 7 KCl, 10 HEPES, 10 sodium phosphocreatine, 2 MgATP, 2 Na2ATP, 0.5 Na2GTP and 1 mg/ml biocytin (pH adjusted to 7.2 with KOH). External solution contained (in mM): 150 NaCl, 2.5 KCl, 10 HEPES, 1.5 CaCl2, 1 MgCl2 (adjusted to pH 7.3 with NaOH). While biocytin was included in the pipette for histological identification, to allow for off-line classification interneuron type, our recovery rate was relatively low (<10%) Thus, we were unable to further differentiate between different interneuron subtypes in this study.
To detect movement, the animals were filmed using a moderate frame rate digital camera (60 fps) synchronized with the electrophysiological recording. We defined a region of interest (ROI) covering the forepaws, trunk and face and calculated a motion index between successive frames (as in Schiemann et al., 2015). All movements (positioning, grooming and locomotion) were included.