Spacings Around An Order Statistic

We determine the joint limiting distribution of adjacent spacings around a central, intermediate, or extreme order statistic $X_{k:n}$ of a random sample of size $n$ from a continuous distribution $F$. For the central and intermediate cases, normalized spacings in the left and right neighborhoods are asymptotically i.i.d. exponential random variables. The associated independent Poisson arrival processes are independent of $X_{k:n}$. For an extreme $X_{k:n}$, the asymptotic independence property of spacings fails for $F$ in the domain of attraction of the Fr\'{e}chet and Weibull ($\alpha \neq 1$) distributions. This work also provides additional insight into the limiting distribution of the number of observations around $X_{k:n}$ in all three cases.


Introduction
Let X 1:n ≤ X 2:n ≤ · · · ≤ X n:n be order statistics of a random sample X 1 , . . . , X n from a continuous cdf F. For 1 ≤ k ≤ n, we examine the clustering of data around the order statistic X k:n . This is done by an investigation into the limiting properties of the right and left neighborhoods formed by the adjacent spacings (X k+1:n − X k:n , . . . , X k+r :n − X k+r −1:n ) and (X k:n − X k−1:n , . . . , X k−s+1:n − X k−s:n ) for fixed r and s. We let n → ∞ and consider three scenarios: (i) Central case where k/n → p, 0 < p < 1; (ii) Intermediate case where k, n − k → ∞ and k/n → 0 or 1; (iii) Extreme case where k or n − k is held fixed. In the first two cases we show that, under some mild assumptions, these (r + s) spacings appropriately scaled with a common scale parameter converge weakly to a set of i.i.d. standard exponential random variables (rvs). In the extreme case, this conclusion holds only when F is in the domain of attraction of the Gumbel cdf G 3 , or the Weibull type cdf G 2;α with α = 1. A direct and useful consequence of such a result is that order statistics around a selected one arrive as in a homogeneous Poisson process.
Neighborhoods around a selected order statistic have been investigated by several authors in recent years. Almost all these results, starting with X_{n:n}, have concentrated on the distribution of counts around it. We cite a few works relevant to our results from an extensive list: Balakrishnan and Stepanov (2005); Dembińska et al. (2007); Pakes and Steutel (1997); Pakes (2009); Dembińska and Balakrishnan (2010). These authors typically consider neighborhoods of the form (X_{k:n} − d, X_{k:n}) or (X_{k:n}, X_{k:n} + d), where the lengths of the intervals may or may not depend on n; in some papers, the d's are induced by the quantile function F^{−1} or are chosen to be random. While these approaches are beneficial from a technical perspective, it is more natural and practical to consider neighborhoods that are on the scale of the data collected. This is our motivation for considering the joint distribution of adjacent spacings. Our approach allows us to characterize the process governing the distribution of counts and provides additional insight into the asymptotic properties of the sizes of clusters around a specified order statistic.
Section 2 contains preliminaries that explore the properties of uniform and exponential order statistics; it introduces the von Mises conditions and the associated extreme value distributions. Section 3 is concerned with the joint distribution of a central order statistic and spacings adjacent to it on its right and left neighborhoods. The Poisson arrival process of adjacent order statistics is established there. Assuming von Mises conditions, Sect. 4 reaches a similar conclusion for the neighborhood of an intermediate order statistic. Section 5 displays the distributional structure of the extreme spacings assuming that F is in the domain of attraction of an extreme value distribution. Section 6 applies our results and describes the limiting distribution of the counts of observations around an order statistic. Section 7 discusses further applications of our results and contains concluding remarks.
Let f (x) denote the pdf and F −1 ( p), 0 ≤ p ≤ 1, be the quantile function associated with F(x), where F −1 ( p) = inf{x : F(x) ≥ p} for 0 < p ≤ 1, F −1 (0) = sup{x : F(x) = 0}. We interchangeably use x p and F −1 ( p) as the pth quantile. It is well known that if F is differentiable at x p with finite and positive pdf f (x p ), F −1 is differentiable at p with derivative 1/ f (x p ). Standard uniform and exponential rvs are, respectively, denoted by U and Z . An exponential rv with rate parameter λ will be denoted by Exp(λ), and Poi(λ) represents a Poisson rv with mean parameter λ. The sum of r i.i.d. standard exponentials is a Gamma rv, to be denoted as Gam(r ). A Weibull rv with shape parameter δ will be denoted by Wei(δ). Further, a standard normal rv will be denoted by N (0, 1) and its pdf by φ(·). The Z i 's and Z * i 's are i.i.d. Exp(1) rvs. The symbol ∼ indicates asymptotic equivalence.
The rv U i:n , 1 ≤ i ≤ n, is the ith order statistic from a random sample of size n from a standard uniform population. The distributional equivalence, X i:n d = F −1 (U i:n ), for any collection of order statistics from an arbitrary cdf F is helpful in our investigations.

Spacings near a uniform order statistic
The key to our approach is the following well-known exchangeability property of the uniform order statistics. Let U_{0:n} = 0 and U_{n+1:n} = 1, and define the uniform spacings

Δ_{i,n} = U_{i+1:n} − U_{i:n}, i = 0, 1, …, n. (1)

Then, it is well known that the Δ_{i,n}'s are exchangeable, and for any fixed r and constants v_i ≥ 0, i = 1, …, r, with r ≤ n and Σ_{i=1}^r v_i ≤ 1, the joint survival function of Δ_{1,n}, …, Δ_{r,n} (and hence of any collection of r Δ_{i,n}'s) is given by (see, e.g., David and Nagaraja 2003, p. 135)

P(Δ_{1,n} > v_1, …, Δ_{r,n} > v_r) = (1 − v_1 − · · · − v_r)^n.

This means

P(nΔ_{1,n} > v_1, …, nΔ_{r,n} > v_r) = (1 − (v_1 + · · · + v_r)/n)^n → exp{−(v_1 + · · · + v_r)}. (2)

That is, the nΔ_{i,n}'s behave asymptotically like an i.i.d. Exp(1) sequence Z_i as n → ∞. The convergence is fast; Problem P.5.19 of Reiss (1989, p. 201) notes that there exists a constant C such that for every positive integer n and r ≤ n,

sup_{B ∈ B} |P((nΔ_{1,n}, …, nΔ_{r,n}) ∈ B) − P((Z_1, …, Z_r) ∈ B)| ≤ Cr/n,

where B denotes the family of all Borel sets. We record the implications of (2) and the exchangeability of the Δ_{i,n}'s as a lemma given below; it uses the fact that i.i.d. Exp(λ) interarrival times are a defining property of a homogeneous Poisson process with rate λ.
Lemma 1 Let U_{i:n} denote the ith order statistic from a random sample of size n from a standard uniform distribution, and assume n → ∞. Then, for any k such that n − k → ∞ and any fixed r,

(n(U_{k+1:n} − U_{k:n}), …, n(U_{k+r:n} − U_{k+r−1:n})) d→ (Z_1, …, Z_r),

and for any k → ∞ and any fixed s,

(n(U_{k:n} − U_{k−1:n}), …, n(U_{k−s+1:n} − U_{k−s:n})) d→ (Z*_1, …, Z*_s),

where the Z_i's and Z*_i's are all mutually independent Exp(1) rvs. That is, the interarrival times of successive order statistics in the right and left neighborhoods of the kth uniform order statistic, upon scaling by n, produce asymptotically independent homogeneous Poisson processes if n, k, and n − k approach infinity. If k [resp. n − k] is bounded, only the right [resp. left] neighborhood produces a Poisson process in the limit.
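As a numerical illustration of Lemma 1 (our own sketch, not part of the formal development; the function name and all parameter choices are ours), the following Python snippet simulates the scaled right-neighborhood spacings n(U_{k+j:n} − U_{k+j−1:n}) and checks that their mean and tail frequency match those of an Exp(1) rv:

```python
import random

def scaled_uniform_spacings(n, k, r, n_rep, seed=0):
    """Simulate n(U_{k+j:n} - U_{k+j-1:n}), j = 1, ..., r, n_rep times.

    Per Lemma 1, these scaled spacings are asymptotically i.i.d. Exp(1).
    (Illustrative sketch; names and parameters are our choices.)
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_rep):
        u = sorted(rng.random() for _ in range(n))
        # u[k - 1] is U_{k:n}; u[k + j - 1] - u[k + j - 2] is the jth
        # right-neighborhood spacing (0-based indexing)
        out.extend(n * (u[k + j - 1] - u[k + j - 2]) for j in range(1, r + 1))
    return out

vals = scaled_uniform_spacings(n=1000, k=500, r=3, n_rep=2000)
mean_val = sum(vals) / len(vals)                     # Exp(1) mean is 1
tail_val = sum(v > 1.0 for v in vals) / len(vals)    # P(Z > 1) = e^{-1} = 0.3679
```

With these arbitrary choices of n, k, and r, the sample mean lands near 1 and the tail frequency near e^{−1} ≈ 0.368, as the lemma predicts.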

Spacings near an exponential order statistic
When F is standard exponential, it is well known (Rényi's representation) that

X_{k:n} d= Σ_{i=1}^{k} Z_i/(n − i + 1), 1 ≤ k ≤ n, (3)

where the Z_i's are i.i.d. Exp(1) rvs. From this representation, it follows that the scaled spacings (n − k − j + 1)(X_{k+j:n} − X_{k+j−1:n}), j = 1, …, r, for r ≤ n − k turn out to be i.i.d. Exp(1) rvs. Only in this scenario do we need finite and distinct scaling constants for the spacings to transform them into i.i.d. exponential rvs for every n, and hence asymptotically as well.
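The exactness of this scaling is easy to verify by simulation; the sketch below (ours; all names are illustrative) draws exponential samples and checks that the scaled spacings (n − k − j + 1)(X_{k+j:n} − X_{k+j−1:n}) have the Exp(1) mean even for a modest n:

```python
import random

def scaled_exp_spacings(n, k, r, n_rep, seed=1):
    """(n - k - j + 1)(X_{k+j:n} - X_{k+j-1:n}), j = 1..r, exponential parent.

    By the representation (3), these are i.i.d. Exp(1) for every n, not
    just asymptotically. (Simulation sketch; names are ours.)
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_rep):
        x = sorted(rng.expovariate(1.0) for _ in range(n))
        # x[k + j - 1] is X_{k+j:n} (0-based indexing)
        out.extend((n - k - j + 1) * (x[k + j - 1] - x[k + j - 2])
                   for j in range(1, r + 1))
    return out

vals = scaled_exp_spacings(n=50, k=20, r=3, n_rep=4000)
mean_val = sum(vals) / len(vals)  # should be near 1 even for n = 50
```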

Extremes and von Mises conditions
Suppose there exist sequences of constants a_n and b_n > 0 such that P{(X_{n:n} − a_n)/b_n ≤ x} converges to a nondegenerate cdf G(x) corresponding to a rv W. Then we say F is in the domain of maximal attraction of G and write F ∈ D(G). It is known that G is necessarily one of the three types given below:

G_{1;α}(x) = exp{−x^{−α}}, x > 0, α > 0 (Fréchet);
G_{2;α}(x) = exp{−(−x)^α}, x < 0, α > 0 (Weibull type);
G_3(x) = exp{−e^{−x}}, −∞ < x < ∞ (Gumbel). (4)
The following are necessary and sufficient conditions on the right tail of F in order that F ∈ D(G). The first two are due to Gnedenko (1943) and the last one is due to de Haan (1970).
(a) F ∈ D(G_{1;α}) iff x_1 (= F^{−1}(1)) is infinite and, for every t > 0, (1 − F(tx))/(1 − F(x)) → t^{−α} as x → ∞.
(b) F ∈ D(G_{2;α}) iff x_1 (= F^{−1}(1)) is finite, and the following condition holds for every t > 0: (1 − F(x_1 − tx))/(1 − F(x_1 − x)) → t^{α} as x → 0+.
(c) F ∈ D(G_3) iff the following hold: E(X | X > c) is finite for some c, and for all real t, (1 − F(x + t m(x)))/(1 − F(x)) → e^{−t} as x → x_1, where m(x) = E(X − x | X > x).
Our approach for the intermediate case assumes the following sufficient conditions that are applicable to absolutely continuous cdfs. The first two are due to von Mises (1936), and the last one is due to Falk (1989) and is weaker than the corresponding von Mises condition that assumes differentiability of the pdf f (see, e.g., David and Nagaraja 2003, p. 300).
(a) F ∈ D(G_{1;α}) if f(x) > 0 for all large x and, for some α > 0, x f(x)/(1 − F(x)) → α as x → ∞. (5)
(b) F ∈ D(G_{2;α}) if x_1 is finite, f(x) > 0 in a left neighborhood of x_1, and (x_1 − x) f(x)/(1 − F(x)) → α as x → x_1−. (6)
(c) F ∈ D(G_3) if f(x) > 0 in a left neighborhood of x_1 and f(x)m(x)/(1 − F(x)) → 1 as x → x_1, where m(x) = E(X − x | X > x). (7)
The family of limiting distributions for the normalized minimum X_{1:n} corresponds to that of −W, where W has one of the above three types of cdfs; parallel necessary and sufficient, and sufficient, conditions exist that impose conditions on the left tail of F.

Joint distribution of spacings
For 0 < p < 1, X_{k:n} is a central order statistic if k/n → p. For such an X_{k:n}, Smirnov (1952; Theorem 3, p. 12) has shown (as pointed out by a reviewer) that

X_{k:n} → x_p almost surely (8)

if the condition F(x) = p has a unique solution x_p. Since, for any fixed j,

n(X_{k+j+1:n} − X_{k+j:n}) d= [(F^{−1}(U_{k+j:n} + Δ_{k+j,n}) − F^{−1}(U_{k+j:n}))/Δ_{k+j,n}] · nΔ_{k+j,n}, (9)

the limiting joint distribution of the spacings from an arbitrary cdf F can be linked to that of a collection of i.i.d. standard uniform rvs provided the first factor on the right in (9) converges in probability to a nonzero constant. From (8) applied to the uniform parent, it follows that Δ_{k+j,n} = U_{k+j+1:n} − U_{k+j:n} (defined in (1)) converges almost surely to 0. The first factor on the right in (9) converges almost surely to 1/f(x_p), (10), if the following condition holds: f is positive, finite, and continuous at x_p. (11)
This conclusion follows from the definition of the derivative of F^{−1} and its assumed continuity at p. Upon using (10), (9), Slutsky's Theorem, and Lemma 1, we conclude that if (11) holds then, jointly over any finite collection of j,

n f(x_p)(X_{k+j+1:n} − X_{k+j:n}) d→ Z_j,

where the Z_j's are i.i.d. Exp(1) rvs. We can weaken the continuity assumption for f in (11) with a condition, (12), on the increments of the quantile function F^{−1} in a neighborhood of p, where 0 < p < 1. This assumption is similar to (17) in Dembińska et al. (2007) (given as (38) in Sect. 6 later). The condition (11) implies that (12) holds, since the latter is satisfied upon dividing the numerator and denominator by h and taking the double limit; the converse is not true.
On the other hand, we can weaken the requirement of a finite nonzero f(x_p) by modifying a condition on F used by Chanda (1975) [see also Ghosh and Sukhatme 1981]. We assume that, for some θ > 0,

(F^{−1}(p + h) − F^{−1}(p))/(sgn(h)|h|^θ) → M(p, θ), a finite positive limit, as h → 0. (13)

If f is indeed finite and nonzero at x_p, then the above condition is satisfied with θ = 1. Whenever f(x_p) is finite and positive or (13) holds, there is a unique solution to F(x) = p and (8) holds. Based on the above discussion, we can now formally state the result for the central case.
Theorem 1 Let k/n → p ∈ (0, 1), and let r and s be fixed positive integers.
(a) If condition (11) holds, or if (12) holds and f(x_p) is finite and positive, then

(n f(x_p)(X_{k+1:n} − X_{k:n}), …, n f(x_p)(X_{k+r:n} − X_{k+r−1:n}), n f(x_p)(X_{k:n} − X_{k−1:n}), …, n f(x_p)(X_{k−s+1:n} − X_{k−s:n})) d→ (Z_1, …, Z_r, Z*_1, …, Z*_s),

where the Z's are i.i.d. Exp(1) rvs. Thus, the two counting processes defined by setting the jth event to occur, respectively, at times n f(x_p)(X_{k+j:n} − X_{k:n}) and n f(x_p)(X_{k:n} − X_{k−j:n}) converge weakly to two independent homogeneous Poisson processes with unit intensity.
(b) Assume (12) and (13) hold. Then,

(n^θ(X_{k+1:n} − X_{k:n}), …, n^θ(X_{k+r:n} − X_{k+r−1:n}), n^θ(X_{k:n} − X_{k−1:n}), …, n^θ(X_{k−s+1:n} − X_{k−s:n})) d→ M(p, θ)(Z_1^θ, …, Z_r^θ, Z*_1^θ, …, Z*_s^θ).

That is, the counting processes defined by setting the jth event of the process to occur at times n^θ(X_{k+j:n} − X_{k:n}) and n^θ(X_{k:n} − X_{k−j:n}) converge to i.i.d. renewal processes with Wei(1/θ) renewal distribution. They reduce to homogeneous Poisson processes with unit intensity only when θ = 1 and f(x_p) is finite and positive.
Proof To prove part (a), we need to show that (10) holds whenever (11) holds, or whenever (12) holds and f(x_p) is finite and positive; we then use (10), (9), Slutsky's Theorem, and Lemma 1. We have shown earlier that (10) holds whenever (11) is satisfied. If (12) holds and f(x_p) is finite and positive, the left-side expression in (10) can be written as a product in which the first factor converges to 1 and the second factor converges to 1/f(x_p), both almost surely. Thus, (10) is established. For (b), the idea is similar. We note that n^θ(F^{−1}(U_{k+j:n} + Δ_{k+j,n}) − F^{−1}(U_{k+j:n})) can be written as

[(F^{−1}(U_{k+j:n} + Δ_{k+j,n}) − F^{−1}(U_{k+j:n}))/Δ_{k+j,n}^θ] · (nΔ_{k+j,n})^θ.

Assumption (13) coupled with (12) ensures that the first factor above converges almost surely to M(p, θ). Since P(nΔ_{k+j,n} > w) → exp{−w} for all w > 0, (nΔ_{k+j,n})^θ converges in distribution to Z^θ, a Wei(1/θ) rv; part (b) follows. When θ = 1, M(p, 1) = 1/f(x_p) has to be positive and finite, and the limiting arrival process would be Poisson.
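Theorem 1(a) can be illustrated numerically. In the sketch below (ours, not part of the original development), we take F standard normal with p = 1/2, so that x_p = 0 and f(x_p) = 1/√(2π), and check that n f(x_p)(X_{k+1:n} − X_{k:n}) behaves like an Exp(1) rv:

```python
import math
import random

def central_spacing_scaled(n, n_rep, seed=2):
    """n f(x_p)(X_{k+1:n} - X_{k:n}) for N(0,1) samples with k = n//2.

    Here p = 1/2, x_p = 0, f(x_p) = 1/sqrt(2*pi); Theorem 1(a) gives an
    Exp(1) limit. (Illustrative sketch; parameter choices are ours.)
    """
    rng = random.Random(seed)
    fxp = 1.0 / math.sqrt(2.0 * math.pi)
    k = n // 2  # k/n -> 1/2
    out = []
    for _ in range(n_rep):
        x = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        # x[k] - x[k - 1] is X_{k+1:n} - X_{k:n} (0-based indexing)
        out.append(n * fxp * (x[k] - x[k - 1]))
    return out

vals = central_spacing_scaled(n=1501, n_rep=1500)
mean_val = sum(vals) / len(vals)                    # near 1
tail_val = sum(v > 1.0 for v in vals) / len(vals)   # near e^{-1} = 0.3679
```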

Remark
The condition (13) does not imply (12); nor does it ensure that f (x p ) is finite and positive. Consider the pdf This is a corrected version of the pdf given in Chanda (1975), and discussed in Ghosh and Sukhatme (1981) (we thank a reviewer for noticing the error). The associated quantile function is given by This quantile function fails to satisfy the condition in (12) when p = 0.5 and η is a positive even integer, but satisfies (13) with θ = 1/(η + 1). Here,

Asymptotic independence of a central order statistic and spacings in its neighborhood
We now assume k/n = p + o(n −1/2 ) and establish the independence of X k:n and spacings around it.

The uniform parent
Using the (well-known) joint pdf of the consecutive standard uniform order statistics U_{k−s:n}, …, U_{k:n}, …, U_{k+r:n}, we first obtain the joint pdf of the appropriately normalized U_{k:n} and the vector of scaled adjacent spacings, and then determine the limiting form of the joint pdf. The joint pdf of U_{k−s:n}, …, U_{k+r:n} is given by

[n!/((k − s − 1)!(n − k − r)!)] u_{k−s}^{k−s−1}(1 − u_{k+r})^{n−k−r}, 0 < u_{k−s} < · · · < u_{k+r} < 1.

With t_n = √(p(1 − p)/n), consider the transformation v_0 = (u_k − p)/t_n, v_j = n(u_{k+j} − u_{k+j−1}), j = 1, …, r, and v*_j = n(u_{k−j+1} − u_{k−j}), j = 1, …, s; the Jacobian is ∂u/∂v = t_n/n^{r+s}. Using Stirling's approximation for the factorials and the expansion log(1 + x) = x − x²/2 + O(x³), the joint pdf of the transformed vector converges, for fixed r and s, to

φ(v_0) exp{−(v_1 + · · · + v_r)} exp{−(v*_1 + · · · + v*_s)}.

The conclusion of the above discussion is summarized below.
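The asymptotic independence of the normalized central order statistic and its adjacent spacing can be checked numerically; the following sketch (ours; helper names are illustrative) estimates the correlation between the normalized U_{k:n} and the adjacent scaled spacing, which should be near zero:

```python
import math
import random

def center_and_spacing(n, n_rep, seed=3):
    """Pairs (sqrt(n)(U_{k:n} - p)/sqrt(p(1-p)), n(U_{k+1:n} - U_{k:n})), p = 1/2."""
    rng = random.Random(seed)
    p, k = 0.5, n // 2
    pairs = []
    for _ in range(n_rep):
        u = sorted(rng.random() for _ in range(n))
        v0 = math.sqrt(n) * (u[k - 1] - p) / math.sqrt(p * (1.0 - p))
        v1 = n * (u[k] - u[k - 1])
        pairs.append((v0, v1))
    return pairs

def sample_corr(pairs):
    """Plain sample correlation coefficient of a list of pairs."""
    m0 = sum(a for a, _ in pairs) / len(pairs)
    m1 = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - m0) * (b - m1) for a, b in pairs) / len(pairs)
    s0 = math.sqrt(sum((a - m0) ** 2 for a, _ in pairs) / len(pairs))
    s1 = math.sqrt(sum((b - m1) ** 2 for _, b in pairs) / len(pairs))
    return cov / (s0 * s1)

rho = sample_corr(center_and_spacing(n=1000, n_rep=3000))  # near 0
```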

Arbitrary parent
By establishing density convergence under the assumption that k/n = p + o(n^{−1/2}), we have shown above that the normalized U_{k:n} and the scaled adjacent uniform spacings are asymptotically independent, with N(0, 1) and i.i.d. Exp(1) limits, respectively. (16) The conclusion in (16) also follows from Ghosh (1971), who has shown that if f(x_p) is positive and finite, √n(X_{k:n} − x_p)f(x_p) and √n(U_{k:n} − p) have the same limiting behavior. We have shown in Sect. 3.1 that when k/n = p + o(1), n f(x_p)(X_{k+j+1:n} − X_{k+j:n}) d→ Z_j if condition (11) holds or if (12) holds and f(x_p) is finite and positive, and n^θ(X_{k+j+1:n} − X_{k+j:n}) d→ M(p, θ)Z_j^θ if (12) and (13) hold.
In view of Lemma 2, assuming k = np + o( √ n), we have established the asymptotic independence of the normalized spacings (X k+ j:n − X k+ j−1:n ) introduced in Theorem 1 and appropriately normalized X k:n under the conditions stated there. This discussion leads to the following result.
Theorem 2 Let k = np + o(√n), and let r and s be fixed positive integers.
(a) If condition (11) holds, or if (12) holds and f(x_p) is finite and positive, then

(√n f(x_p)(X_{k:n} − x_p)/√(p(1 − p)), n f(x_p)(X_{k+1:n} − X_{k:n}), …, n f(x_p)(X_{k+r:n} − X_{k+r−1:n}), n f(x_p)(X_{k:n} − X_{k−1:n}), …, n f(x_p)(X_{k−s+1:n} − X_{k−s:n})) d→ (N(0, 1), Z_1, …, Z_r, Z*_1, …, Z*_s).

(b) Assume (12) and (13) hold. Then the corresponding result holds with the spacings scaled by n^θ, with limit components M(p, θ)Z_j^θ and M(p, θ)Z*_j^θ, and with the appropriately normalized X_{k:n} as the first component. In both cases, the Z_i's and Z*_i's are Exp(1) rvs, and the r + s + 1 components in the limit vector are mutually independent.
Remarks-the central case

Siddiqui (1960) considered higher order spacings around a central order statistic and showed that, when F is twice continuously differentiable and f(x_p) is finite and positive, the rvs n f(x_p)(X_{k+r:n} − X_{k:n}) and n f(x_p)(X_{k:n} − X_{k−s:n}) are asymptotically independent when k/n → p ∈ (0, 1) with r/n and s/n tending to zero; further, asymptotically, these higher order spacings are Gam(r) and Gam(s), respectively. We have proved a more refined result here with fewer assumptions on the properties of F, but have taken r and s to be fixed. Pyke's (1965) classic paper on spacings shows (Theorem 5.1) that n f(x_{p_1})(X_{i:n} − X_{i−1:n}) and n f(x_{p_2})(X_{j:n} − X_{j−1:n}), with i/n → p_1 and j/n → p_2 where 0 < p_1 ≠ p_2 < 1, are asymptotically i.i.d. Exp(1) rvs. The key difference is that the spacings considered there are far apart, while our focus is on adjacent spacings around X_{i:n}.
The asymptotic half-normal distribution of the normalized central order statistic under the conditions of part (b) of Theorem 2 is comparable to Chanda's (1975) conclusion; our condition (13) is on F −1 , whereas his comparable condition (given as (6) there) is on F.

The intermediate case
Here, we lean on the work of Falk (1989) and directly examine the convergence of the joint pdf of an intermediate order statistic X_{k:n} and the spacings around it. We assume k → ∞ and k/n → 1 as n → ∞ in such a way that n − k → ∞, and we assume that one of the von Mises sufficient conditions stated in (5)-(7) holds. Theorem 2.1 of Falk (1989) shows that

(X_{k:n} − a_n)/b_n d→ N(0, 1), where a_n = F^{−1}(k/n) and b_n = √(n − k)/(n f(a_n)). (17)

This is established by showing that the pdf of (X_{k:n} − a_n)/b_n at x converges to φ(x) for all real x. Consider the joint pdf of X_{k−s:n}, …, X_{k:n}, …, X_{k+r:n},

[n!/((k − s − 1)!(n − k − r)!)] F(x_{k−s})^{k−s−1}(1 − F(x_{k+r}))^{n−k−r} ∏_{i=k−s}^{k+r} f(x_i), x_{k−s} < · · · < x_{k+r},

and consider the transformation y_0 = (x_{k:n} − a_n)/b_n, y_1 = (x_{k+1:n} − x_{k:n})/c_n, …, y_r = (x_{k+r:n} − x_{k+r−1:n})/c_n, y*_1 = (x_{k:n} − x_{k−1:n})/c_n, …, y*_s = (x_{k−s+1:n} − x_{k−s:n})/c_n, so that x_{k:n} = a_n + b_n y_0; x_{k+j:n} = a_n + b_n y_0 + c_n(y_1 + · · · + y_j), j = 1, …, r; x_{k−j:n} = a_n + b_n y_0 − c_n(y*_1 + · · · + y*_j), j = 1, …, s.

Lemma 3 Suppose one of the von Mises conditions stated in (5)-(7) holds, and let a_n and b_n be as in (17). Then:
(a) For any real y_0, (1 − F(a_n + b_n y_0))/(1 − F(a_n)) = n(1 − F(a_n + b_n y_0))/(n − k) → 1.
(b) If c_n = b_n/√(n − k), then for any real y_0 and y_1, f(a_n + b_n y_0 + c_n y_1)/f(a_n) → 1.
Proof (a) From Theorem 2.1 of Falk (1989), it follows that (17) holds under the conditions we have assumed, and the limit distribution is N(0, 1). From Theorem 1 of Smirnov (1967) [Remark (ii) of Falk (1989)], it then follows that [n − k + 1 + n(F(a_n + b_n y_0) − 1)]/√(n − k + 1) → y_0 for all real y_0. Thus,

√(n − k + 1) · [1 − (1 − F(a_n + b_n y_0))/(1 − F(a_n))] → y_0,

since 1 − F(a_n) = (n − k)/n. This implies that (1 − F(a_n + b_n y_0))/(1 − F(a_n)) → 1 for any real y_0. (b) In the proof of his Theorem 2.1, Falk establishes that whenever one of the sufficient conditions stated in (5)-(7) holds, for any real y for which F(a_n + b_n y) → 1 (or equivalently a_n + b_n y → x_1) as n → ∞,

f(a_n + b_n θy)/f(a_n) → 1 uniformly for θ ∈ (0, 1), (27)

where a_n and b_n are given in (17). Part (a), just proved, implies that for any real y_0, F(a_n + b_n y_0) → 1 as n → ∞. Thus, from (27), it follows that f(a_n + b_n θ y_0)/f(a_n) → 1 for all real y_0. Using (27) with y = 2y_0, we conclude that f(a_n + 2y_0 θ b_n){f(a_n)}^{−1} → 1 uniformly for all 0 ≤ θ < 1. For large n − k and real y_1, a_n + b_n y_0 + c_n y_1 = a_n + b_n(y_0 + y_1/√(n − k)) lies in (a_n, a_n + 2y_0 b_n) if y_0 > 0 and in (a_n + 2y_0 b_n, a_n) if y_0 < 0. Hence, f(a_n + b_n y_0 + c_n y_1)/f(a_n) → 1 for all real y_0 ≠ 0 and real y_1. When y_0 = 0, for any real y_1, f(a_n + b_n(1/√(n − k))y_1)/f(a_n) → 1 since 1/√(n − k) ∈ (0, 1) and (27) holds. This completes the proof of the claim in (b).
With y = y*_1 + · · · + y*_s > 0, consider the following component of τ_2 in (22):

F(a_n + b_n y_0 − c_n y)/F(a_n + b_n y_0) = 1 − [F(a_n + b_n y_0) − F(a_n + b_n y_0 − c_n y)]/F(a_n + b_n y_0) = 1 − [f(a_n + b_n y_0 − θ* c_n y)/F(a_n + b_n y_0)] · c_n y = 1 − y d_{k,n}/k,

where the second form above follows from the mean value theorem, θ* ∈ (0, 1), and

d_{k,n} = [1/F(a_n + b_n y_0)] · [f(a_n + b_n y_0 + θ* c_n(−y))/f(a_n)] · (k/n),

where we have used the fact that c_n = 1/(n f(a_n)). From part (a) of Lemma 3, the first factor of d_{k,n} above converges to 1; from part (b), the second factor approaches 1; and from our assumptions about k and n made in the intermediate case, the third factor also approaches 1. Hence, d_{k,n} → 1 as n, k → ∞. Thus, upon recalling (22), we obtain

τ_2 ∼ (1 − (y*_1 + · · · + y*_s)d_{k,n}/k)^{k−s−1} → e^{−(y*_1 + · · · + y*_s)}.

With y = y*_1 + · · · + y*_j, the jth term in the product representing τ_3 in (23) equals [(k − j)/n] · [f(a_n + b_n y_0 + c_n(−y))/f(a_n)] · [1/F(a_n + b_n y_0)].
Using Lemma 3 as we did in proving d_{k,n} → 1 as n and n − k → ∞, we conclude that the jth factor of τ_3 tends to 1 for all j, and so does τ_3. With y = y_1 + y_2 + · · · + y_j, the jth term in the product representing τ_4 in (24) equals [(n − k − j + 1)/(n(1 − F(a_n)))] · [f(a_n + b_n y_0 + c_n y)/f(a_n)] · [(1 − F(a_n))/(1 − F(a_n + b_n y_0))].
Since F(a_n) = k/n, the first factor above converges to 1, and Lemma 3 shows that the other two factors also approach 1 as n and n − k → ∞. Thus, τ_4 → 1. Finally, with y = y_1 + · · · + y_r, consider the following component of τ_5 in (25):

[1 − F(a_n + b_n y_0 + c_n y)]/[1 − F(a_n + b_n y_0)] = 1 − [F(a_n + b_n y_0 + c_n y) − F(a_n + b_n y_0)]/[1 − F(a_n + b_n y_0)] = 1 − [f(a_n + b_n y_0 + θ* c_n y)/(1 − F(a_n + b_n y_0))] · c_n y (from the mean value theorem) = 1 − y · [f(a_n + b_n y_0 + θ* c_n y)/f(a_n)] · [(1 − F(a_n))/(1 − F(a_n + b_n y_0))] · [1/(n(1 − F(a_n)))],

where θ* ∈ (0, 1) and we have used the fact that c_n = 1/(n f(a_n)). Lemma 3 implies that the first two factors multiplying y converge to 1 as n, n − k → ∞, and the denominator of the last factor, n(1 − F(a_n)), equals n − k. Hence,

τ_5 ∼ (1 − (y_1 + · · · + y_r)/(n − k))^{n−k−r} → e^{−(y_1 + · · · + y_r)}, y_1, …, y_r > 0,

as n, n − k → ∞. Thus, we have formally proved the following theorem.

Theorem 3 Suppose one of the von Mises conditions in (5)-(7) holds, and let k → ∞ with n − k → ∞. Then, with a_n and b_n as in (17), c_n = 1/(n f(a_n)), and fixed r and s,

((X_{k:n} − a_n)/b_n, (X_{k+1:n} − X_{k:n})/c_n, …, (X_{k+r:n} − X_{k+r−1:n})/c_n, (X_{k:n} − X_{k−1:n})/c_n, …, (X_{k−s+1:n} − X_{k−s:n})/c_n) d→ (N(0, 1), Z_1, …, Z_r, Z*_1, …, Z*_s),

where the r + s + 1 components of the limit vector are mutually independent and the Z_i's and Z*_i's are Exp(1) rvs.

Remarks-the intermediate case
When F ∈ D(G_{1;α}), a_n f(a_n)/(1 − F(a_n)) = F^{−1}(k/n)/(c_n(n − k)) → α, and hence F^{−1}(k/n)/(α(n − k)) can be chosen as c_n. When F ∈ D(G_{2;α}), the von Mises condition implies that (x_1 − F^{−1}(k/n))/(α(n − k)) can be used as c_n. When F ∈ D(G_3), n f(a_n)m(a_n)/(n − k) → 1, and we can use m(a_n)/(n − k) as our c_n. From Theorem 3, it follows that, as in the central case, any two spacings, possibly of higher order, formed by nonoverlapping collections of order statistics around an intermediate order statistic X_{k:n} are asymptotically independent, and the collection is independent of X_{k:n}.
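For the exponential parent, which lies in D(G_3) with mean residual life m(x) ≡ 1 so that c_n = 1/(n − k), the intermediate-case scaling can be checked directly. The following sketch is ours; by the representation (3), the result is in fact exact for this parent:

```python
import random

def intermediate_spacing_scaled(n, n_rep, seed=4):
    """(X_{k+1:n} - X_{k:n})/c_n for exponential samples, k = n - floor(sqrt(n)).

    Exp(1) is in D(G_3) with m(x) = 1, so c_n = m(a_n)/(n - k) = 1/(n - k);
    the scaled spacing should be (here, exactly is) Exp(1).
    (Simulation sketch; names and parameters are ours.)
    """
    rng = random.Random(seed)
    k = n - int(n ** 0.5)  # an intermediate rank: k, n - k both grow
    out = []
    for _ in range(n_rep):
        x = sorted(rng.expovariate(1.0) for _ in range(n))
        out.append((n - k) * (x[k] - x[k - 1]))
    return out

vals = intermediate_spacing_scaled(n=2500, n_rep=2000)
mean_val = sum(vals) / len(vals)  # Exp(1) mean is 1
```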
Teugels (2001) has introduced a family C* of cdfs F with the following property: F has an ultimately positive pdf f and, for all real y, f(x + y h(x))/f(x) → 1 whenever h(x) → 0 as x → x_1. He states that the condition F ∈ C* 'slightly generalizes' Falk's (1989) version of the von Mises conditions (i.e., (5)-(7)). Assuming F ∈ C*, Teugels shows that, upon the normalization described above, (i) X_{k:n} is asymptotically normal and (ii) X_{k:n} − X_{k−s:n} is asymptotically Gam(s). Their joint distribution and the asymptotic independence are not discussed there.

The upper extreme case
We now assume that k → ∞ with n − k held fixed. It is well known that when F ∈ D(G) for G given in (4),

((X_{n:n} − a_n)/b_n, …, (X_{k:n} − a_n)/b_n) d→ (W_1, …, W_{n−k+1}), (28)

where, for any finite fixed j, the vector (W_1, …, W_j) has the same joint distribution as

(Z_1^{−1/α}, (Z_1 + Z_2)^{−1/α}, …, (Z_1 + · · · + Z_j)^{−1/α}) if G = G_{1;α}, (29)
(−Z_1^{1/α}, −(Z_1 + Z_2)^{1/α}, …, −(Z_1 + · · · + Z_j)^{1/α}) if G = G_{2;α}, (30)
(−log Z_1, −log(Z_1 + Z_2), …, −log(Z_1 + · · · + Z_j)) if G = G_3, (31)

and if G = G_3, also as (W_1, …, W_j) with

W_i = γ − Σ_{l=1}^{i−1} 1/l + Σ_{l=i}^{∞} (Z_l − 1)/l, i = 1, …, j, (32)

where the Z_i's are i.i.d. Exp(1) rvs and γ is Euler's constant. The first three representations above are from Nagaraja (1982), who also shows that the limiting joint distribution of (W_1, …, W_j) is identical to the joint distribution of the first j lower record values from the cdf G. The representation in (32) is due to Hall (1978) and is more convenient when G = G_3. Thus, whenever F ∈ D(G), the limiting joint distribution of the normalized spacings and the concerned extreme order statistics can be described as follows:

((X_{n:n} − X_{n−1:n})/b_n, …, (X_{k+1:n} − X_{k:n})/b_n, (X_{k:n} − a_n)/b_n, (X_{k:n} − X_{k−1:n})/b_n, …, (X_{k−s+1:n} − X_{k−s:n})/b_n) d→ (W_1 − W_2, …, W_{n−k} − W_{n−k+1}, W_{n−k+1}, W_{n−k+1} − W_{n−k+2}, …, W_{n−k+s} − W_{n−k+s+1}), (33)

where the W_j's have one of the forms given in (29)-(32). We now specialize our results for each of the three domains.

The Fréchet domain
In this case, a n can be chosen to be 0 and b n to be F −1 (1 − 1/n) (= x 1−n −1 ). The representation in (33) for the limiting joint distribution along with (29) suggests that an extreme spacing is not asymptotically exponential, and the adjacent spacings are neither independent, nor identically distributed in the limit. The asymptotic independence of the spacings and the extreme order statistic also fail. Hence, when F ∈ D(G 1;α ), the asymptotic distributional structure for the extreme spacings differs from that for the central and intermediate cases.
From (28) and (29), we conclude that when F ∈ D(G_{1;α}), the normalized higher order spacing

(X_{n:n} − X_{n−j:n})/b_n d→ Z_1^{−1/α} − (Z_1 + S_j)^{−1/α},

where the sum S_j = Z_2 + · · · + Z_{j+1} is a Gam(j) rv that is independent of Z_1. This distributional representation complements the work of Pakes and Steutel (1997), who have given an expression for the cdf of the limiting rv (p. 192). They comment that this expression for the cdf 'does not seem susceptible to simplification for any choice of the parameter α'; for the other two domains, they provide explicit distributional representations that are equivalent to ours (see below).
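The representation above can be checked by simulation. The sketch below is ours; the Pareto parent with α = 2 is an arbitrary illustrative choice. It compares the scaled top spacing with draws from the limit representation:

```python
import random

def pareto_top_spacing(n, n_rep, alpha=2.0, seed=5):
    """(X_{n:n} - X_{n-1:n})/b_n for the Pareto cdf F(x) = 1 - x^{-alpha}, x >= 1.

    Here F is in D(G_{1;alpha}) and b_n = F^{-1}(1 - 1/n) = n^{1/alpha}.
    (Simulation sketch; alpha = 2 is our illustrative choice.)
    """
    rng = random.Random(seed)
    bn = n ** (1.0 / alpha)
    out = []
    for _ in range(n_rep):
        # inverse-cdf sampling: X = (1 - U)^{-1/alpha}
        x = sorted((1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n))
        out.append((x[-1] - x[-2]) / bn)
    return out

def limit_draws(n_rep, alpha=2.0, seed=6):
    """Draws of Z_1^{-1/alpha} - (Z_1 + S_1)^{-1/alpha}, with S_1 = Z_2 ~ Gam(1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_rep):
        z1, z2 = rng.expovariate(1.0), rng.expovariate(1.0)
        out.append(z1 ** (-1.0 / alpha) - (z1 + z2) ** (-1.0 / alpha))
    return out

a = pareto_top_spacing(n=400, n_rep=8000)
b = limit_draws(n_rep=8000)
pa = sum(v > 0.5 for v in a) / len(a)  # empirical tail of the spacing
pb = sum(v > 0.5 for v in b) / len(b)  # empirical tail of the representation
```

The two empirical tail probabilities agree up to sampling error, consistent with the stated limit.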

The Weibull domain
Here, x_1 (= F^{−1}(1)) is finite and can be chosen as our a_n, and the scaling constant b_n can be chosen as x_1 − x_{1−1/n}. From (33) and (30), we can conclude that when F ∈ D(G_{2;α}), the normalized adjacent spacings are asymptotically i.i.d. exponential iff α = 1; otherwise, they are all dependent. When α = 1 and k < n, the joint asymptotic distributional structure is given in (34) below. Thus, when α = 1, X_{k:n} is asymptotically independent of the spacings in its left neighborhood, but is symmetrically dependent on the ones on its right. This conclusion is formalized in the following result.

Theorem 4 When F ∈ D(G_{2;α=1}), for each fixed n − k and s, the asymptotic joint distribution of ((X_{n:n} − X_{n−1:n})/b_n, …, (X_{k+1:n} − X_{k:n})/b_n, (X_{k:n} − a_n)/b_n, (X_{k:n} − X_{k−1:n})/b_n, …, (X_{k−s+1:n} − X_{k−s:n})/b_n) has the representation

(Z_2, Z_3, …, Z_{n−k+1}, −(Z_1 + · · · + Z_{n−k+1}), Z*_1, …, Z*_s), (34)

where the Z_i's and Z*_i's are i.i.d. Exp(1) rvs.

The standard uniform distribution is in D(G_{2;α}) with α = 1 and hence has asymptotically i.i.d. extreme spacings. We had reached this conclusion earlier in Lemma 1 (recall x_1 − F^{−1}(1 − 1/n) = 1/n). But we now have a more general result that describes the symmetric dependence of the right-neighborhood spacings on X_{k:n} and is applicable to all F ∈ D(G_{2;1}). In fact, with n − k fixed, given (X_{k:n} − x_1)/b_n = u (< 0), the rvs (X_{n:n} − X_{n−1:n})/b_n, …, (X_{k+1:n} − X_{k:n})/b_n behave asymptotically like the spacings from a random sample of size n − k from a uniform distribution over (u, 0). For the form of the joint distribution, see, e.g., David and Nagaraja (2003; Sec. 6.4).
From (28) and (30), we conclude that when F ∈ D(G_{2;α}), the normalized higher order spacing

(X_{n:n} − X_{n−j:n})/b_n d→ (Z_1 + S_j)^{1/α} − Z_1^{1/α},

where the sum S_j = Z_2 + · · · + Z_{j+1} is a Gam(j) rv that is independent of Z_1. This is the conclusion of Theorem 7.2 in Pakes and Steutel (1997).

The Gumbel domain
Using (33) along with the representation for W_j in (32), we conclude that when F ∈ D(G_3), for each fixed n − k and s,

((X_{n:n} − X_{n−1:n})/b_n, …, (X_{k+1:n} − X_{k:n})/b_n, (X_{k:n} − a_n)/b_n, (X_{k:n} − X_{k−1:n})/b_n, …, (X_{k−s+1:n} − X_{k−s:n})/b_n) d→ (Z_1/1, Z_2/2, …, Z_{n−k}/(n − k), W_{n−k+1}, Z_{n−k+1}/(n − k + 1), …, Z_{n−k+s}/(n − k + s)), (35)

where W_{n−k+1} is given by (32) and the Z_i's there are the same as those appearing in the spacing components.
The representation in (35) shows that while X_{k:n} is independent of spacings in its right neighborhood, it is correlated with the spacings in its left neighborhood, and this correlation decreases at the rate 1/(n − k + j), j ≥ 1, as one moves away from it. This is in contrast with the situation when F ∈ D(G_{2;1}). Weissman (1978) has considered the limit distribution of ((X_{n:n} − X_{n−1:n})/b_n, …, (X_{k+1:n} − X_{k:n})/b_n, (X_{k:n} − a_n)/b_n), given by the first (n − k + 1) components of the vector in (35). He has also noted the independence of these spacings and W_{n−k+1} (in his Theorem 2). The representation in (35) also shows that, using varying scaling sequences for the spacings, we can obtain i.i.d. standard exponential distributions in the limit. In particular, we have the following: (j(X_{n−j+1:n} − X_{n−j:n})/b_n, j = 1, …, n − k + s) d→ (Z_1, …, Z_{n−k+s}).
From (28) and (32), or from the representation in (35), we can conclude that when F ∈ D(G_3), the normalized higher order spacing

(X_{n:n} − X_{n−j:n})/b_n d→ Σ_{i=1}^{j} Z_i/i d= Z_{j:j}.

The last equality above follows from the representation for exponential order statistics given in (3). This is the conclusion reached in Theorem 7.1 of Pakes and Steutel (1997).
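For the exponential parent (where b_n = 1), the higher order spacing identity holds exactly for every n, which gives a direct numerical check (a sketch of ours; parameter choices are illustrative):

```python
import random

def top_gap_exponential(n, j, n_rep, seed=7):
    """X_{n:n} - X_{n-j:n} for standard exponential samples (b_n = 1).

    By (3) this equals Z_1/1 + Z_2/2 + ... + Z_j/j in distribution,
    i.e., Z_{j:j}, the maximum of j i.i.d. Exp(1) rvs, for every n.
    (Simulation sketch; names are ours.)
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_rep):
        x = sorted(rng.expovariate(1.0) for _ in range(n))
        out.append(x[-1] - x[-1 - j])
    return out

vals = top_gap_exponential(n=200, j=3, n_rep=8000)
frac_le_1 = sum(v <= 1.0 for v in vals) / len(vals)
# P(Z_{3:3} <= 1) = (1 - e^{-1})^3 = 0.2526
```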

Cox processes
Whenever F ∈ D(G), the arrival process of order statistics in the left neighborhood of an upper extreme order statistic is asymptotically a Poisson process that is independent of its value only when F ∈ D(G_{2;α}) with α = 1. The arrival processes on both sides of the extreme order statistic are pure birth processes when F ∈ D(G_3); only the arrival process on the right side is independent of the order statistic. Hashorva and Hüsler (2000) have shown that the arrival processes in the left neighborhood of the sample maximum are special Cox processes when G is of Weibull (G_{2;α}) or Gumbel (G_3) type. Cox processes are mixed Poisson processes where the time-dependent intensity λ(t) is itself a stochastic process (Daley and Vere-Jones 2003; Sec. 6.2). Hashorva and Hüsler consider the counting process N_n(·) of observations in the left neighborhood of X_{n:n} and show that its limit is a Cox process with stochastic intensity function λ(t) = e^{t−W} in the Gumbel case, where W has cdf G_3, and λ(t) = α(t − W)^{α−1} in the Weibull case, where W has cdf G_{2;α}. The cdfs G_{2;α} and G_3 are given in (4). The representations given in (30), and in (31) or (32), provide another characterization of the resulting Cox processes in terms of the distribution of interarrival times of order statistics below the maximum.

Higher order extreme spacings
The representation for the special higher order extreme spacing involving the sample maximum (X_{n:n} − X_{n−j:n}, discussed above) can be extended to other extremes. From (28)-(30) and (32), we conclude that as n → ∞, for fixed 1 ≤ i < j, (X_{n−i+1:n} − X_{n−j+1:n})/b_n converges in distribution to W_i − W_j, where

W_i − W_j d= (Z_1 + · · · + Z_i)^{−1/α} − (Z_1 + · · · + Z_j)^{−1/α} if F ∈ D(G_{1;α}),
d= (Z_1 + · · · + Z_j)^{1/α} − (Z_1 + · · · + Z_i)^{1/α} if F ∈ D(G_{2;α}),
d= Σ_{l=i}^{j−1} Z_l/l d= Z_{j−i:j−1} if F ∈ D(G_3). (36)

Here, the last distributional equality follows from (3). The above representations are extremely helpful in deriving the asymptotic distribution theory for the number of order statistics around a specified extreme order statistic. This will be illustrated in the next section, where all cases (central, intermediate, and extreme) are considered.

Counts of observations around an order statistic
Consider the following count statistics that track the number of observations in the left and right neighborhoods of X_{k:n}: K−(n, k, d) = #{j : X_j ∈ (X_{k:n} − d, X_{k:n})} and K+(n, k, d) = #{j : X_j ∈ (X_{k:n}, X_{k:n} + d)}. Clearly,

{K−(n, k, d) ≥ j} = {X_{k:n} − X_{k−j:n} < d} and {K+(n, k, d) ≥ j} = {X_{k+j:n} − X_{k:n} < d}, (37)

and thus the asymptotic distribution theory for spacings developed here can be directly applied to determine the limit distributions of the count statistics for appropriately chosen d that depends on n. Pakes and Steutel (1997) have used the link in (37) in the reverse direction in the extreme case: they derive the limit distribution of K−(n, n, d_n) first and use it to determine the limit distribution of the spacing X_{n:n} − X_{n−k:n}. As noted in the introduction, the literature on the limit distributions of K− and K+ is substantial. Poisson limits are generally obtained when d = d_n is nonrandom but depends on the behavior of F around the concerned order statistic. We now discuss the implications of our results on spacings for the asymptotic distribution of counts and compare them with the most relevant results in the literature.

The central and intermediate cases-the Poisson counts
We have seen in Theorems 1 and 3 that the (X_{k+i:n} − X_{k+i−1:n})/c_n are asymptotically i.i.d. standard exponential for any fixed (positive or negative) integer i, where c_n = 1/(n f(x_{p_n})) with p_n ≡ p = lim(k/n) ∈ (0, 1) in the central case, and, in the intermediate case, p_n = k/n → 0 or 1 such that, respectively, k or n − k → ∞. In other words, K−(n, k, λ_1 c_n) and K+(n, k, λ_2 c_n) are asymptotically independent Poi(λ_1) and Poi(λ_2) rvs, respectively. This conclusion matches that of Pakes (2009). Dembińska et al. (2007) assume a limiting condition, (38), on the behavior of the cdf F around x_p. This is comparable to our condition (12), but there are differences in these conditions and their implications. While (12) specifies the behavior of the quantile function F^{−1} around p, (38) puts a similar condition on the property of the cdf F around x_p. Dembińska et al.'s neighborhood is determined by d_n = F^{−1}(p + λ/n) − x_p, a quantity dependent on the behavior of F^{−1} at (p + λ/n). In contrast, our d_n = λ/(n f(x_p)) depends only on the value f(x_p). When f is continuous around x_p and f(x_p) is positive and finite, it follows from L'Hospital's rule that (38) is readily satisfied. Under (38) and a similar condition on the left neighborhood, Dembińska and Balakrishnan (2010) have established the asymptotic independence of K− and K+. This independence readily follows from our Theorems 2 and 3. The technical conditions used in Dembińska et al. (2007) in the intermediate case (in their Theorem 6.1, for example) appear difficult to verify, whereas the familiar von Mises conditions needed here are known to hold for many common distributions. In addition, our results show that counts in disjoint intervals are independent Poisson rvs, and also that these counts are independent of the location of X_{k:n}. These finer conclusions on the limiting structure of the neighborhood cannot be reached using any of the currently available results in the literature on count statistics for the central and intermediate cases.
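The Poisson limit for the counts can be illustrated with a uniform parent at p = 1/2, where f(x_p) = 1 and c_n = 1/n. The sketch below is ours; λ = 2 is an arbitrary choice:

```python
import random

def right_count(n, lam, n_rep, seed=8):
    """K+(n, k, lam * c_n) for a standard uniform parent, k = n//2.

    Here x_p = p = 1/2, f(x_p) = 1, c_n = 1/n, so the count of
    observations in (X_{k:n}, X_{k:n} + lam/n) is approximately Poi(lam).
    (Simulation sketch; names and parameters are ours.)
    """
    rng = random.Random(seed)
    k = n // 2
    counts = []
    for _ in range(n_rep):
        u = sorted(rng.random() for _ in range(n))
        lo = u[k - 1]          # X_{k:n}
        hi = lo + lam / n
        counts.append(sum(lo < v < hi for v in u[k:]))
    return counts

counts = right_count(n=2000, lam=2.0, n_rep=2000)
mean_c = sum(counts) / len(counts)                            # near lam = 2
var_c = sum((c - mean_c) ** 2 for c in counts) / len(counts)  # near 2 as well
```

The near-equality of the sample mean and variance is consistent with the Poi(λ) limit.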

The upper extremes: non-Poisson and Poisson counts
Asymptotic distributions of $K^-(n, k, d)$ and $K^+(n, k, d)$ have been investigated by many authors when $k$ or $n - k$ is held fixed, starting from the work of Pakes and Steutel (1997), who looked at $K^-(n, n, d)$. Assuming $k$ is held fixed, Pakes and Li (1998) showed that $K^-(n, n - k, d)$ is asymptotically negative binomial, and Balakrishnan and Stepanov (2005) showed that $K^+(n, n - k, d)$ is asymptotically binomial. The success probability in these distributions is given in terms of $F$, where the upper endpoint $x_1$ is assumed to be infinite. Pakes (2009) has considered the limit distribution of $K^+(n, n - k, c b_n)$ with $k$ fixed, assuming that $F$ is in the domain of attraction of either the Fréchet or the Gumbel distribution, where the $b_n$'s form the associated scaling sequence. When $F \in D(G_{1;\alpha})$, he shows that the limit distribution of $K^+(n, n - k, b_n)$ is mixed binomial with parameters $k$ and a random success probability that is a function of a $\mathrm{Gam}(k+1)$ rv (his Theorem 5, part (a)). When $F \in D(G_3)$, $K^+(n, n - k, \lambda b_n)$ is shown to be asymptotically a binomial rv with parameters $k$ and success probability $1 - e^{-\lambda}$ (his Theorem 4).
We now examine the consequences of the representations in (36) and the relations in (37). When $k$ is fixed and $F \in D(G_3)$, for any $j$, $1 \le j \le k$, $P(K^+(n, n - k, \lambda b_n) < j) = P(X_{n-k+j:n} - X_{n-k:n} > \lambda b_n)$, and the limit of the right side as $n \to \infty$ follows from (36). Since the maximum value attainable is $k$, the limit distribution is binomial, a result noted above. Further, $P(K^-(n, n - k, \lambda b_n) < j) = P(X_{n-k:n} - X_{n-k-j:n} > \lambda b_n)$, resulting in a negative binomial limit distribution, a result shown by Pakes and Li (1998). They also derive the limit distribution of $K^-(n, n-k, \lambda b_n)$ in other cases; the representations in (36) along with (37) yield the same results. Of these, a commonly known distribution is obtained only when $F \in D(G_{2;1})$, in which case $K^+(n, n - k, \lambda b_n)$ has a $\mathrm{Poi}(\lambda)$ distribution censored on the right at $k$; this conclusion was reached in Theorem 4.1 of Dembińska et al. (2007) under a set of technical conditions similar to the one given in (38). Further, $K^-(n, n - k, \lambda b_n)$ will have a $\mathrm{Poi}(\lambda)$ limit distribution, and these two statistics are asymptotically independent. Whenever $F \in D(G_{1;\alpha})$ or $F \in D(G_{2;\alpha})$ with $\alpha \neq 1$, we can obtain the asymptotic distributions of $K^-$ and $K^+$ using (36) and (37) directly. For example, when $F \in D(G_{1;\alpha})$, we can use the corresponding representation in (36) to obtain the cdf of $K^+(n, n - k, b_n)$ in terms of Gamma rvs (in contrast with the mixed binomial representation of Pakes (2009) mentioned earlier). While a closed-form expression for this cdf may not be available, the needed probabilities reduce to tractable univariate integrals with gamma-type integrands that are easily evaluated numerically. The link between the Gamma and Poisson cdfs comes in handy in this simplification.
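The $D(G_{2;1})$ case above can be illustrated by simulation (our sketch; the paper proves the limit). For a Uniform(0,1) parent, $F \in D(G_{2;1})$ with scaling $b_n = 1/n$, and the left count $K^-(n, n-k, \lambda b_n)$ should be approximately $\mathrm{Poi}(\lambda)$. The particular $n$, $k$, $\lambda$, and seed are our choices.

```python
import numpy as np

# Left count near an upper extreme of a Uniform(0,1) sample:
# K^-(n, n-k, lambda*b_n) ~ Poi(lambda) in the limit, with b_n = 1/n.
rng = np.random.default_rng(1)
n, k, lam, reps = 2000, 3, 1.0, 4000
b_n = 1.0 / n

x = np.sort(rng.random((reps, n)), axis=1)
center = x[:, n - k - 1][:, None]   # X_{n-k:n} in each replication
# observations strictly below X_{n-k:n} within distance lambda*b_n of it
k_minus = np.sum((x >= center - lam * b_n) & (x < center), axis=1)

# Poi(1) has mean 1 and variance 1
print(round(k_minus.mean(), 2), round(k_minus.var(), 2))
```

Replacing the uniform with, say, a Pareto sample (Fréchet domain) and rerunning the same count makes the non-Poisson behavior in the other domains visible.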

Discussion
We now illustrate further applications of our results to distribution theory and inference.

Examples
Our examples thus far were the uniform and exponential populations, but our results are widely applicable since the conditions imposed here are satisfied by several common distributions. In the central case, we need only positivity and continuity of the population pdf at $x_p$ to obtain independent Poisson arrival processes in both the right and left neighborhoods. The von Mises conditions are satisfied by the common distributions that are in the domain of attraction of an extreme value cdf $G$ (given in (4)), and thus the intermediate case also leads to independent Poisson arrival processes for these distributions. The extreme case does not require the von Mises conditions, and provides interesting examples of situations where we do not get Poisson processes. For example, for $F \in D(G_{1;\alpha})$, a property satisfied by the Pareto and loggamma distributions, the arrival process is no longer Poisson. Tables 3.4.2-3.4.4 of Embrechts et al. (1997) contain a good list of distributions in the domain of attraction of each of the three extreme value distributions, along with the norming (scaling) constants needed for the application of our results in the extreme case.
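A small simulation makes the Fréchet-domain contrast concrete. For a Pareto parent with tail index $\alpha$, $X_{n:n}/b_n \to \Gamma_1^{-1/\alpha}$ with $b_n = n^{1/\alpha}$, where $\Gamma_j$ denotes a $\mathrm{Gam}(j)$ rv (the $j$th arrival of a unit-rate Poisson process), so the normalized top spacing converges to $\Gamma_1^{-1/\alpha} - \Gamma_2^{-1/\alpha}$, which is not exponential. This sketch is ours: we take $\alpha = 3$ (so the limit has finite variance), for which the limit mean is $\Gamma(2/3) - \Gamma(5/3) = \Gamma(2/3)/3 \approx 0.451$, not $1$.

```python
import numpy as np
from math import gamma

# Top spacing of a Pareto(alpha = 3) sample, scaled by b_n = n^{1/3}.
rng = np.random.default_rng(3)
alpha, n, reps = 3.0, 2000, 2000
b_n = n ** (1 / alpha)

u = rng.random((reps, n))
x = np.sort(u ** (-1 / alpha), axis=1)   # Pareto(alpha) via inverse cdf
top_spacing = (x[:, -1] - x[:, -2]) / b_n

limit_mean = gamma(2 / 3) / 3            # mean of Gamma_1^{-1/3} - Gamma_2^{-1/3}
print(round(top_spacing.mean(), 2), round(limit_mean, 3))
```

Because the two limiting spacings share the arrival $\Gamma_2$, adjacent spacings are also asymptotically dependent in this domain, unlike the i.i.d. exponential structure of the central and intermediate cases.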
Our intermediate and extreme case discussions focused on the upper end of the sample. Parallel results hold for the lower end, and upper-end and lower-end spacings can exhibit different types of clustering processes. For example, in the exponential (Exp(1)) parent case, the upper extremes are in the Gumbel domain, while the lower extremes are in the Weibull domain with $\alpha = 1$. Thus, for the lower extremes we have a homogeneous Poisson arrival process in the right neighborhood, whereas for the upper extremes we have a pure birth process in the left neighborhood of the order statistic concerned.

Inferential implications
Theorem 2(a) can be used in the central case to provide (asymptotically) distribution-free estimates of $x_p$ and $f(x_p)$, as noted by Siddiqui (1960) when he studied the joint distribution of $X_{k:n}$, $X_{k+r:n} - X_{k:n}$, and $X_{k:n} - X_{k-s:n}$ [see Sect. 3.3]. It follows from Theorem 2 that $n f(x_p)(X_{k+r:n} - X_{k-s:n})$ is asymptotically $\mathrm{Gam}(r+s)$, and this fact can be used to provide point estimates of, and confidence intervals for, the population pdf at the $p$th quantile. The asymptotic independence of $n f(x_p)(X_{k+r:n} - X_{k-s:n})$ and $\sqrt{n} f(x_p)(X_{k:n} - x_p)$, together with their familiar limit distributions, can be used to find the distribution of the pivotal quantity
$$\frac{X_{k:n} - x_p}{\sqrt{n}\,(X_{k+r:n} - X_{k-s:n})}.$$
From Theorem 2, it follows that this rv behaves asymptotically as the ratio of a standard normal rv and an independent gamma rv (or a scaled Chi-square rv), and this limit distribution is free of $f(x_p)$. It easily leads to an asymptotically distribution-free confidence interval for $x_p$. A similar application of Theorem 3 would provide asymptotically distribution-free inference for the intermediate population quantile $F^{-1}(k/n)$ and the pdf value $f(F^{-1}(k/n))$ when one of the von Mises conditions holds.
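The confidence interval construction can be sketched as follows. This is our own illustration, not code from the paper: from the normalizations above, we take the pivot's limit to be an $N(0, p(1-p))$ rv divided by an independent $\mathrm{Gam}(r+s)$ rv, obtain its quantiles by Monte Carlo (they are free of $F$ and of $f(x_p)$), and check the coverage on exponential data, whose true median is $\ln 2$. All tuning constants ($r$, $s$, sample sizes, seed) are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
p, r, s = 0.5, 4, 4

# Quantiles of the limiting pivot N(0, p(1-p)) / Gam(r+s), by Monte Carlo
z = rng.normal(0.0, np.sqrt(p * (1 - p)), 200_000)
g = rng.gamma(r + s, 1.0, 200_000)
lo_q, hi_q = np.quantile(z / g, [0.025, 0.975])

# Coverage check: 95% CI for the median of an Exp(1) population (= ln 2)
n, reps, true_med = 2000, 1000, np.log(2.0)
k = n // 2
hits = 0
for _ in range(reps):
    x = np.sort(rng.exponential(1.0, n))
    width = np.sqrt(n) * (x[k + r - 1] - x[k - s - 1])   # sqrt(n)(X_{k+r:n}-X_{k-s:n})
    ci_lo, ci_hi = x[k - 1] - hi_q * width, x[k - 1] - lo_q * width
    hits += ci_lo < true_med < ci_hi
coverage = hits / reps
print(round(coverage, 3))
```

No estimate of $f(x_p)$ enters the interval, which is the practical payoff of the pivot being asymptotically free of $f(x_p)$.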
In the extreme case, we have seen that the limit distributions of the top $k$ order statistics depend on the domain of attraction. Weissman (1978) has discussed in detail inference on tail parameters (extreme quantiles and the tail index $1/\alpha$) based on these limit results.

Concluding remarks
It is interesting to note that the norming/scaling constants for $X_{k:n}$ and the adjacent spacings are of the same order ($b_n$) only in the extreme case; there the limiting distributions are similar as well (functions of Exp(1) rvs). For the central and intermediate cases, the spacings and $X_{k:n}$ are scaled differently and their limit distributions differ: the spacings are related to the exponential, whereas $X_{k:n}$ relates to the normal. For the extreme and intermediate cases, our sufficient conditions ensuring nondegenerate limit distributions for $X_{k:n}$ and for the adjacent spacings coincide. In the central case, asymptotic normality of $X_{k:n}$ requires $k = np + o(\sqrt{n})$ (actually a slightly weaker restriction on $k$ would work), whereas the asymptotic independence property of the spacings holds whenever $k = np + o(n)$.
We have focused here on the neighborhoods of a single selected order statistic; this work can easily be extended to multiple neighborhoods. In the case of two or more central order statistics and their neighborhoods, we obtain a multivariate normal limit distribution for the selected order statistics and independent Poisson processes around them. Such a setup is considered in Theorem 3.1 of Dembińska and Balakrishnan (2010), where the independence of the Poisson counts in the right and left neighborhoods of multiple central order statistics is derived. In other cases (for example, one upper extreme and one lower extreme, as considered in Theorem 2.1 of Dembińska and Balakrishnan (2010)), the resulting counting processes again turn out to be independent.