Properties of the Power Envelope for Tests Against Both Stationary and Explosive Alternatives: The Effect of Trends

This article details a precise analytic effect that inclusion of a linear trend has on the power of Neyman–Pearson point optimal unit root tests and thence the power envelope. Both stationary and explosive alternatives are considered. The envelope can be characterized by probabilities for two related sums of Chi-square random variables. A stochastic expansion, in powers of the local-to-unity parameter, of the difference between these loses its leading term when a linear trend is included. This implies that the power envelope converges to size at a faster rate, which can then be exploited to prove that the power envelope must necessarily be lower. This effect is shown to be, analytically, greater asymptotically than in small samples and, numerically, far greater for explosive than for stationary alternatives. Only a linear trend has a specific rate effect on the power envelope; however, other deterministic variables will have some effect. The methods of the article lead to a simple direct measure of this effect, which is then informative about power in practice.


INTRODUCTION
The power envelope is a fundamental measure of how effectively we can discriminate between false null hypotheses and specified alternatives. Every new unit root test, whether testing against stationary or explosive/bubble alternatives, must have its power characteristics compared with this envelope. Despite this, the analytic properties of the unit root power envelope are generally unknown. The focus has instead been on the stochastic properties of tests and estimators, capitalizing on the pioneering methods of Phillips (1987a, 1987b) and Chan and Wei (1987).
This article seeks to capture the precise effect, on the power envelope, of the inclusion of a linear trend. For economic data the unit root remains one of the most tested hypotheses, and the inclusion or otherwise of a linear trend has a profound effect on both the theoretical and observed properties of unit root tests; see both Elliott et al. (1996) and Nielsen (2008). To emphasize the importance of this, as measured via numerical resolution of the asymptotic power envelope of the former article and in the context of an autoregressive parameter $\rho_T = 1 + c/T$, tests can have 50% power against a local alternative value $c = -7$ with no linear trend, but not until $c = -13.5$ if there is. The net effect of a linear trend on power is equivalent to a practitioner discarding 48% of their data. There is no other context in econometrics where the effect of a single regressor is so profound.
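The 48% figure can be recovered with a short calculation. The sketch below assumes (this is not spelled out in the text) that the comparison holds the autoregressive parameter $\rho$ fixed, so that $c = T(\rho - 1)$ scales linearly with the sample size. The labels $T_{\mu}$ and $T_{\tau}$ are introduced here purely for illustration and denote the sample sizes at which 50% power is reached without and with a linear trend.

```latex
\begin{align*}
T_{\mu}(\rho - 1) &= -7, \qquad T_{\tau}(\rho - 1) = -13.5, \\
\frac{T_{\mu}}{T_{\tau}} &= \frac{7}{13.5} \approx 0.519,
\qquad 1 - \frac{7}{13.5} \approx 0.481 .
\end{align*}
```

Under this reading, a sample of size $T_{\tau}$ analysed with a linear trend is about as informative as a sample roughly 52% as large analysed without one, which corresponds to discarding about 48% of the observations.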

Since there is no uniformly best invariant (UBI) test against either stationary or explosive alternatives, the power envelope is constructed via the union of the powers of the continuum of point optimal tests. For each, a critical value is first required to fix size under the null, before its power is evaluated under the alternative. Therefore, for every value under the alternative two probabilities must be considered. In this article these are characterized via probabilities for two related weighted sums of Chi-squared random variables, similar to the original representations in Dickey and Fuller (1979). These two weighted sums generally have a stochastic difference, near $c = 0$, of order $O_p(c^2)$. When there is a linear trend, this falls to $O_p(c^4)$, asymptotically.
This induces a change in the rate of convergence of the power envelope itself to the chosen size. Specifically, up to arbitrarily small positive constants, the order of this rate is strictly higher when there is a linear trend than when there is not. This step change can be exploited to formally prove that the powers of linear trend invariant tests are necessarily lower. Intuitively, this arises because the covariance and its derivative are proportional when there is a unit root. The algebraic mechanism by which this occurs can also be used to construct a simple measure of the impact of regressor invariance on any hypothesis on the covariance structure of data. In the current context this measure correlates very well with power. The next section presents the main results: two lemmas (proved in Appendix S1 (Supporting information) to this article) and a theorem detailing the analytic effects of a trend, which is proved in the Appendix. Section 3 discusses the implications of these results, utilizing numerical results also presented in tables in Appendix S1.

CHARACTERIZATION OF THE POWER ENVELOPE AND ITS PROPERTIES
The Gaussian power envelope is constructed from the powers of each point optimal test; see, for example, King (1980) and King and Sriananthakumar (2015). However, as is clear from Elliott et al. (1996) and Marsh (2011), the asymptotic distribution of these tests is the same under far more general assumptions. Let $(y_t)_{t=1}^{T}$ be generated from
$$y_t = x_t'\beta + u_t, \qquad u_t = \rho_T u_{t-1} + \varepsilon_t, \qquad (1)$$
where $x_t$ is a $k \times 1$ deterministic regressor, $\beta$ a $k \times 1$ unknown parameter, $\varepsilon_t$ is a zero mean error process and we put $\rho_T = 1 + c/T$. We will consider tests of $H_0: c = 0$ against both stationary (S) and explosive (E) alternatives, $H^S_1: c < 0$ and $H^E_1: c > 0$. Elliott et al. (1996), under their Condition A, provide representations of the power envelope against $H^S_1$ in two cases: first when $d_T = o(T^{1/2})$ (their equation (4)) and second when $d_t = \beta_1 + \beta_2 t$ (their equation (8)). Here we will denote those two size $\alpha$ envelopes by $\Pi_{\mu}(c)$ and $\Pi_{\tau}(c)$, respectively. Full expressions for these are also provided in Appendix S1. Although originally provided only for tests against $H^S_1$, power envelopes for $H^E_1$ can also be generated using the results of Phillips (1987b) and Chan and Wei (1987); see, for example, Harvey and Leybourne (2014).
Let $L$ be a $T \times T$ lower triangular matrix with 1's on the first lower diagonal and 0's elsewhere, let $\Delta_{\rho} = I - \rho_T L$ (so that $\Delta_1 = I - L$) and let $W = \Delta_1 X$. Put $n = T - k$, define the $T \times n$ matrix $C$ by $CC' = I_T - W(W'W)^{-1}W'$ and $C'C = I_n$, and let $\lambda_i$, $i = 1, \ldots, n$, be the ordered eigenvalues of $A$, the covariance (up to scale) of $C'\Delta_1 y$. Finally, let $z = (z_1, \ldots, z_n)' = C'\Delta_1 y$ and define the two statistics $Q_{0,n}(c)$ and $Q_{1,n}(c)$ of (4). Lemma 1 provides alternative characterizations of the asymptotic power envelopes as well as a stochastic expansion of the limits of the two statistics defined in (4). Both the general assumptions under which it applies and its algebraic demonstration are given in Appendix S1.
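The construction above can be sketched numerically. The following is a minimal illustration, not the article's own code: it assumes the near unit root covariance is $(\Delta_{\rho}'\Delta_{\rho})^{-1}$ (a zero initial value), takes $A_c = C'\Delta_1\Sigma_c\Delta_1'C$, and obtains $C$ from a complete QR decomposition. The function name `invariance_matrices` is hypothetical.

```python
import numpy as np

def invariance_matrices(T, c, X):
    """Build C, W and A_c for sample size T, local parameter c and T x k regressors X.

    Assumption: the 'near unit root' covariance is (Delta_rho' Delta_rho)^{-1},
    i.e. the error process starts from a zero initial value.
    """
    k = X.shape[1]
    rho = 1.0 + c / T
    L = np.eye(T, k=-1)                    # 1's on the first lower diagonal
    delta_1 = np.eye(T) - L                # Delta_1 = I - L
    delta_rho = np.eye(T) - rho * L        # Delta_rho = I - rho_T L
    sigma_c = np.linalg.inv(delta_rho.T @ delta_rho)
    W = delta_1 @ X
    # Complete QR: the last T - k columns of Q form an orthonormal basis of the
    # orthogonal complement of col(W), so C'C = I_n and CC' = I - W(W'W)^{-1}W'.
    Q, _ = np.linalg.qr(W, mode="complete")
    C = Q[:, k:]
    A_c = C.T @ delta_1 @ sigma_c @ delta_1.T @ C
    return C, W, A_c

T = 50
trend = np.arange(1, T + 1, dtype=float)
X = np.column_stack([np.ones(T), trend])          # constant and linear trend
C, W, A_c = invariance_matrices(T, -7.0, X)
eigenvalues = np.sort(np.linalg.eigvalsh(A_c))    # ordered eigenvalues of A_c
```

Under these assumptions $A_0 = I_n$ for any choice of $X$, which is a convenient check on an implementation.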

Lemma 1. (i) Let $\{z_i\}_{i \in \mathbb{Z}}$ denote a sequence of IID $N(0, 1)$ random variables. Then the asymptotic power envelope at size $\alpha$ for testing either $H^S_1$ or $H^E_1$, for any set of explanatory variables $X$, can be characterized as in (5), where the critical value is defined by (6).
(ii) Denote the $j$th derivative of $A$ with respect to $c$, evaluated at 0, by $D_j$, and consider the eigenvalues of the matrices $D_1$, $D_2$ and $D_1 D_2$, together with sequences of independent Chi-square variables. Then, in a neighbourhood of $c = 0$, the limits $Q_0(c)$ and $Q_1(c)$ admit stochastic expansions in powers of $c$ whose terms are weighted sums of these Chi-square variables, with the eigenvalues as weights.
Lemma 1 provides a representation for the asymptotic power envelope in terms of an (infinite) weighted sum of Chi-square random variables. Given that there is no UBI test, the properties of the power envelope can only be explored by directly comparing $Q_1(c)$ with $Q_0(c)$. Via the stochastic expansions presented in Lemma 1(ii) we can establish the rate of convergence of the asymptotic power envelope to the chosen size, as in Lemma 2.
where k > 0 is an arbitrarily small constant.
Now denote the column space of $X$ by $\mathcal{C}(X)$ and the linear trend by $\tau = (t)_{t=1}^{T}$. Suppose now that a linear trend is included in the regressors, that is, $\tau \in \mathcal{C}(X)$. Theorem 1, proved in the Appendix, demonstrates that in this case the stochastic difference between $Q_0(c)$ and $Q_1(c)$ falls to $O_p(c^4)$. As with Moon et al. (2007), the effect manifests itself as an order of magnitude step change in the order of convergence, although here in the parameter itself. These results hold in a neighbourhood of $c = 0$; however, by exploiting the analytic properties of $\Pi(c)$ these findings can be continued to demonstrate that inclusion of a linear trend necessarily implies the power envelope is strictly lower for any finite value of $c$.
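(ii) At the significance level $\alpha$, the power envelope converges to size at a faster rate than that of Lemma 2.
(iii) Let $X$ be a set of regressors with $\tau \notin \mathcal{C}(X)$, so that the power envelope is $\Pi(c)$. If we add the column $\tau$ to $X$, then we obtain the power envelope $\Pi_{\tau}(c)$, which satisfies $\Pi_{\tau}(c) < \Pi(c)$ for all finite $c \neq 0$.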
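The ordering in Theorem 1 can be illustrated in finite samples by simulation. The sketch below is not the article's procedure: it assumes Gaussian errors with a zero initial value and uses the standard scale-invariant point optimal statistic $w'A_c^{-1}w / w'w$ (rejecting for small values), written in the eigenbasis of $A_c$ so that both the null and the alternative distributions become ratios of weighted sums of Chi-square variables. The function names `eigenvalues_A` and `point_optimal_power` are hypothetical.

```python
import numpy as np

def eigenvalues_A(T, c, X):
    """Ordered eigenvalues of A_c = C' Delta_1 Sigma_c Delta_1' C."""
    k = X.shape[1]
    L = np.eye(T, k=-1)
    delta_1 = np.eye(T) - L
    delta_rho = np.eye(T) - (1.0 + c / T) * L
    sigma_c = np.linalg.inv(delta_rho.T @ delta_rho)   # assumed zero initial value
    Q, _ = np.linalg.qr(delta_1 @ X, mode="complete")
    C = Q[:, k:]                                       # basis orthogonal to Delta_1 X
    return np.linalg.eigvalsh(C.T @ delta_1 @ sigma_c @ delta_1.T @ C)

def point_optimal_power(T, c, X, alpha=0.05, reps=20000, seed=0):
    """Monte Carlo power at c of the scale-invariant test that is optimal against c."""
    lam = eigenvalues_A(T, c, X)
    rng = np.random.default_rng(seed)
    z0 = rng.standard_normal((reps, lam.size)) ** 2    # Chi-square(1) draws, null
    z1 = rng.standard_normal((reps, lam.size)) ** 2    # Chi-square(1) draws, alternative
    # w'A_c^{-1}w / w'w in the eigenbasis of A_c; the test rejects when it is small.
    stat_null = (z0 / lam).sum(axis=1) / z0.sum(axis=1)
    stat_alt = z1.sum(axis=1) / (z1 * lam).sum(axis=1)
    critical_value = np.quantile(stat_null, alpha)     # fixes size under c = 0
    return float(np.mean(stat_alt < critical_value))

T, c = 100, -7.0
constant = np.ones((T, 1))
with_trend = np.column_stack([np.ones(T), np.arange(1, T + 1, dtype=float)])
power_constant = point_optimal_power(T, c, constant)
power_trend = point_optimal_power(T, c, with_trend)
print(power_constant, power_trend)
```

Evaluating `point_optimal_power` over a grid of values of `c`, negative or positive, traces out a simulated finite sample envelope for each regressor set.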

ANALYSIS AND CONCLUSIONS
(i) Parts (i) and (ii) of Theorem 1 apply only in a neighbourhood of $c = 0$. However, the power envelope (via (5) and (6)) is a function of both the regressor set $X$ and the local parameter $c$ through the eigenvalues of the matrix $A_c$. Since these eigenvalues are analytic in $c$, so is the power envelope. That $\Pi_{\tau}(c)$ is smaller than $\Pi_{\mu}(c)$ at some local value of $c$ can therefore be analytically continued to all finite values of $c$. This finding links directly to Nabeya and Tanaka (1990), who show that there is no Locally Best Invariant test of a unit root when there is a linear trend, although that article contains no explicit results for the power envelope itself. Equally, Theorem 1 explains how the precise finding of Marsh (2007) for the null $c = 0$ has, in fact, an impact for any finite value of $c$ under the alternative. In the absence of any formal distribution theory, asymptotic or otherwise, for trend invariant estimators or tests, Theorem 1 offers the only analytic demonstration of the power loss caused by such trends, hitherto observed only experimentally.
(ii) Table S1a in Appendix S1 presents outcomes of the power envelope for a variety of simple choices of $d_t$ (a constant, a linear trend and trends involving the logarithm, square root, square and exponent of time) in (1) for $T = 250$. It is worth noting that not all trends are associated with low power; exponential trends imply powers similar to those of the constant case. In Table S1b the power envelopes are approximated using stochastic expansions of $Q_{0,n}(c)$ and $Q_{1,n}(c)$ to order $O_p(c^3)$. As is evident from comparing entries across Tables S1a and S1b, simulation of just the leading terms of these statistics captures the envelope almost entirely.
(iii) Although the results in Theorem 1 are asymptotic, their proof yields the insight that the effect of a linear trend can be greater asymptotically than in finite samples.
From the proof of Theorem 1(i), when there is a trend, the $O_p(c^3)$ term in the stochastic difference between $Q_{0,n}(c)$ and $Q_{1,n}(c)$ converges in probability to 0 only as $T \to \infty$. This indicates a differential relative effect that a linear trend has on the finite sample and asymptotic envelopes. To illustrate, Table S2 contains the ratios of the power envelopes evaluated for $d_t = \beta_1 + \beta_2 t$ and $d_t = \beta_1$ for values of $c$ from 1.25 to −5.0, for significance levels $\alpha = 0.01, 0.05, 0.10$ and for sample sizes $T = 50, 250, 500$. The effects are clear and significant, particularly when $c$ is small. This difference in the behaviour of the asymptotic and finite sample envelopes has significance for the choice of unit root tests in practice. As Francke and de Vos (2007) note, tests designed to have power close to the asymptotic power envelope may not have power functions close to the finite sample one in the presence of trends. This can only be explained via the quantitative difference between them found in this article. It also suggests that new tests ought to be compared to both finite sample and asymptotic envelopes to justify their properties.
(iv) The mechanism by which the power envelope is reduced on inclusion of a linear trend is algebraic. Specifically, as in the proof of Theorem 1(i), let $\Sigma_c = (\Delta_{\rho}'\Delta_{\rho})^{-1}$, with $\Delta_{\rho} = I - \rho_T L$, be the covariance of a pure 'near unit root' process. To construct invariant tests we first let $w = C'\Delta_1 y$, which removes dependence on $\beta$.
When $X$ contains a linear trend, the derivative $D_1$ of $A_c$ at $c = 0$ satisfies $D_1 = -T^{-1}I_n = -T^{-1}A_0$, and hence $A_c = (1 - c/T)A_0 + O(c^2)$. That is, $\sigma^2 A_c$ is proportional to $\sigma^2 A_0$ up to and including the $O(c)$ term when there is a linear trend. Since we also require scale invariance, this heuristically captures the effective cause of the dramatic loss of power. Algebraically, this proportionality is exact in the unit root/linear trend problem. Generically, suppose we wish to test a hypothesis on a covariance matrix $A_c$. If the derivative of $A_c$ at $c = 0$ is $D_1$, then we would expect low power if $A_0$ and $D_1$ are proportional. A simple measure of the proportionality of two matrices is the variation in the ratio of their respective ordered eigenvalues, $\lambda_i$ (of $A_0$) and $\delta_{1,i}$ (of $D_1$). To proceed, let $\Lambda^2_X$ be the measure defined in (7), where $\emptyset$ denotes the null set, that is, $X$ is empty and no invariance is required in the construction of $w$. $\Lambda^2_X$ thus measures the relative variation in eigenvalues for a given choice of $X$ compared to the case of no regressors, that is, when only scale invariance is required. In the linear trend case $\Lambda^2_X = 0$. For the cases enumerated in Appendix S1, with $T = 250$, the ranking of these outcomes matches perfectly the power envelopes given in Table S1a. This measure could be adapted for any (simple) hypothesis test on a covariance matrix when invariance with respect to the mean is required. It provides a simple measure of the sensitivity of power to the choice of deterministics, similar in spirit to the analysis of Leamer (1985).
(v) Bykhovskaya and Phillips (2018) explore tests involving functional local alternatives where the local parameter depends on time. For example, $H_0: c_t = 0$ vs. $H^F_1: c_t = c(t/T)/T$, so that only the initial value has a unit root. Although invariance with respect to neither the mean nor scale is pursued in that article, it is trivial to apply the framework here to such cases, as well as to functional stationary alternatives, where $c < 0$. Define $\tilde{\Sigma}_c$ as the covariance of the process under $H^F_1$, whose construction uses the indicator function.
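In this case the covariance of $w = C'\Delta_1 y$ is $\tilde{A}_c = C'\Delta_1\tilde{\Sigma}_c\Delta_1'C$ and has slope $\tilde{D}_1 = C'\Delta_1\tilde{\Sigma}^{1}_{0}\Delta_1'C$, where $\tilde{\Sigma}^{1}_{0}$ is the derivative of $\tilde{\Sigma}_c$ at $c = 0$. Even in the case that $X$ contains a linear trend, $\tilde{A}_0$ is not proportional to $\tilde{D}_1$.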
Calculating the eigenvalue variation defined in (7), we find $\Lambda^2_X = 0.997$ for the constant case and $\Lambda^2_X = 0.993$ for the linear trend case, so the relative impact of a linear trend is extremely small when testing against functional alternatives. This is borne out in the ratios of the power envelopes presented in Table S3, which repeat the experiments reported in Table S2 but for the functional alternatives, $H^F_1$, in both stationary and explosive directions.
(vi) The focus thus far has been on the theoretical implications for the testing problem of the inclusion of a linear trend. Harvey et al. (2009) detail practical procedures which account for uncertainty over whether or not a trend is required. Numerically, their tests are shown to have power curves close to $\Pi_{\mu}(c)$ when there is no trend, and close to $\Pi_{\tau}(c)$ when there is. Marsh (2009) characterizes this uncertainty in terms of a Bernoulli mixture of the trend and no-trend cases. The results of this article demonstrate, unequivocally, the necessity of the Harvey et al. (2009) pre-test or union of rejections based tests. Specifically, under such uncertainty, any other test must either be inefficient when no trend is present (its power will be bounded by $\Pi_{\tau}(c) < \Pi_{\mu}(c)$) or inconsistent when it is.
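The proportionality mechanism of remark (iv) can be checked numerically. The sketch below approximates $D_1$ by a central finite difference and, because the definition in (7) is not reproduced here, uses a stand-in measure that is only an assumption: the variance of the eigenvalues of $D_1$ for a given $X$ divided by the same variance with no regressors (with $A_0 = I_n$, the eigenvalue ratios reduce to the eigenvalues of $D_1$). The names `A_matrix`, `slope_at_zero` and `eigenvalue_variation` are hypothetical, and the covariance $(\Delta_{\rho}'\Delta_{\rho})^{-1}$ assumes a zero initial value.

```python
import numpy as np

def A_matrix(T, c, X):
    """A_c = C' Delta_1 Sigma_c Delta_1' C; X may have zero columns (no invariance)."""
    k = X.shape[1]
    L = np.eye(T, k=-1)
    delta_1 = np.eye(T) - L
    delta_rho = np.eye(T) - (1.0 + c / T) * L
    sigma_c = np.linalg.inv(delta_rho.T @ delta_rho)   # assumed zero initial value
    if k == 0:
        C = np.eye(T)                                  # no regressors to remove
    else:
        Q, _ = np.linalg.qr(delta_1 @ X, mode="complete")
        C = Q[:, k:]
    return C.T @ delta_1 @ sigma_c @ delta_1.T @ C

def slope_at_zero(T, X, h=1e-4):
    """Central finite-difference approximation to D_1 = dA_c/dc at c = 0."""
    return (A_matrix(T, h, X) - A_matrix(T, -h, X)) / (2.0 * h)

def eigenvalue_variation(T, X):
    """Stand-in proportionality measure (an assumption, not the article's (7)):
    variance of the eigenvalues of D_1 relative to the no-regressor case."""
    d_x = np.linalg.eigvalsh(slope_at_zero(T, X))
    d_none = np.linalg.eigvalsh(slope_at_zero(T, np.empty((T, 0))))
    return float(np.var(d_x) / np.var(d_none))

T = 50
ones = np.ones((T, 1))
trend = np.arange(1, T + 1, dtype=float).reshape(-1, 1)
X_trend = np.hstack([ones, trend])
D1_trend = slope_at_zero(T, X_trend)
measure_constant = eigenvalue_variation(T, ones)
measure_trend = eigenvalue_variation(T, X_trend)
print(measure_constant, measure_trend)
```

With a linear trend in $X$ the finite-difference slope should agree with $-T^{-1}I_n$ up to numerical error, so the stand-in measure is numerically zero, while for a constant alone it stays close to the no-regressor benchmark.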

PROOF OF THEOREM 1
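Part (i). Suppose that $X$ contains a linear trend. Let $e = (1, 1, 1, \ldots, 1)'$ be the constant vector, so that a linear trend is defined by $\tau = \Delta_1^{-1}e$, where $\Delta_1 = I - L$. If $\mathcal{C}(X)$ is the column space of $X$, then $\tau \in \mathcal{C}(X)$ implies $C'e = 0$, where $C$ is defined above. Evaluating the first derivative of $A_c$ at $c = 0$ (see also the proof of Theorem 1 in Marsh (2007)), we find that, when and only when $\tau \in \mathcal{C}(X)$, the eigenvalues $\delta_{1,i}$ of $D_1$ and $\delta_{12,i}$ of $D_1 D_2$ satisfy $\delta_{1,i} = -T^{-1}$ and $\delta_{12,i} = -T^{-1}\delta_{2,i}$, where $\delta_{2,i}$ are the eigenvalues of $D_2$. Substituting these into the definitions of the expansion terms in the statement of Lemma 1, we find that the leading term of the stochastic difference vanishes.
Part (ii). Repeating the argument for Lemma 2 with this smaller stochastic difference yields the faster rate of convergence instead.
Part (iii). Since point optimal tests are unbiased and the power envelope is monotone, for any $c \neq 0$ both $\Pi(c) > \alpha$ and $\Pi_{\tau}(c) > \alpha$. Suppose first that $c < 0$; then the difference in rates implied by part (ii) implies that there exists some value $c^* < 0$ such that $\Pi_{\tau}(c^*) < \Pi(c^*)$.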
Similar to the proof of Theorem 1 in Marsh (2011), let $F_{1,n}(\cdot)$ denote the finite sample counterpart of the envelope, defined through $R_1(\cdot)$, the mean cumulant generating function of $\sum_{i=1}^{n}\lambda_i z_i^2$. Note that $\Pi(c) = \lim_{n \to \infty} F_{1,n}(\cdot)$, that is, the asymptotic distributions are defined as the limit of the finite sample ones, consistent with the set-up of Lemma 1.
Since $A$ is an analytic function of $c$, so are the $\lambda_i$ and hence so is $F_{1,n}(\cdot)$, and its limit, through $R_1(\cdot)$. Consequently both $\Pi(c)$ and $\Pi_{\tau}(c)$ are analytic in $c$, since they are functions only of the eigenvalues of $A$. Thus $\tilde{\Pi}(c) = \Pi(c) - \Pi_{\tau}(c)$ is analytic in $c$. Note that $\tilde{\Pi}(c) \geq 0$, since adding an additional invariance requirement, in this case for a linear trend, cannot increase power. Let $\mathbb{R}^{-}$ denote the set of negative real numbers and let $U$ be any closed subset of $\mathbb{R}^{-}$. A fundamental property of bounded analytic functions is that if $\tilde{\Pi}(c) = 0$ for some $c \in U$ then $\tilde{\Pi}(c) = 0$ for all $c \in U$. Since this is not true for $c = c^*$, it cannot be true for any $c$ satisfying $0 > c > -\infty$; consequently it must be that $\tilde{\Pi}(c) > 0$ for all finite $c < 0$. The proof for the case $c > 0$ is identical.
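The derivative used in Part (i) can be sketched as follows. This is a reconstruction under stated assumptions rather than a quotation of the original display: it takes $\Sigma_c = (\Delta_{\rho}'\Delta_{\rho})^{-1}$ with $\Delta_{\rho} = I - \rho_T L$ and $\rho_T = 1 + c/T$, and $A_c = C'\Delta_1\Sigma_c\Delta_1'C$. It uses the identities $L\Delta_1^{-1} = \Delta_1^{-1} - I_T$ and $\Delta_1^{-1} + (\Delta_1')^{-1} = ee' + I_T$, which hold because $\Delta_1^{-1}$ is the lower triangular matrix of ones.

```latex
\begin{align*}
\left.\frac{d}{dc}\,\Delta_{\rho}'\Delta_{\rho}\right|_{c=0}
  &= -\frac{1}{T}\left(L'\Delta_1 + \Delta_1' L\right), \\
\left.\frac{d\Sigma_c}{dc}\right|_{c=0}
  &= \frac{1}{T}\,\Sigma_0\left(L'\Delta_1 + \Delta_1' L\right)\Sigma_0,
  \qquad \Sigma_0 = \Delta_1^{-1}(\Delta_1')^{-1}, \\
\Delta_1 \left.\frac{d\Sigma_c}{dc}\right|_{c=0} \Delta_1'
  &= \frac{1}{T}\left((\Delta_1')^{-1}L' + L\Delta_1^{-1}\right)
   = \frac{1}{T}\left(\Delta_1^{-1} + (\Delta_1')^{-1} - 2I_T\right)
   = \frac{1}{T}\left(ee' - I_T\right), \\
D_1 &= \frac{1}{T}\left(C'ee'C - I_n\right).
\end{align*}
```

Under these assumptions $D_1$ is proportional to $I_n = A_0$ exactly when $C'e = 0$, that is, when $e = \Delta_1\tau$ lies in the column space of $W = \Delta_1 X$, which is the case $\tau \in \mathcal{C}(X)$ and gives $D_1 = -T^{-1}I_n$.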