Does daily climate variation have an effect on species’ elevational range size?

In their recent paper published in Science (2016, 351, 1437–1439), Chan et al. analysed 137 montane gradients, concluding that they found a novel pattern—a negative relationship between mean elevational range size of species and daily temperature variation, which was claimed as empirical evidence for a novel macrophysiological principle (Gilchrist's hypothesis). This intriguing possibility was their key conceptual contribution. Unfortunately, as we show, the empirical evidence was flawed because of errors in the analyses and substantial sampling bias in the data. First, we re‐ran their analyses using their data, finding that their model should have been rejected. Second, we performed two additional re‐analyses of their data, addressing biases and pseudoreplication in different ways, both times again rejecting the evidence claimed to support Gilchrist's hypothesis. These results overturn the key empirical findings of Chan et al.'s study. Therefore, the “macrophysiological principle” should be regarded as currently remaining unsupported by empirical evidence.

Species' distributional ranges determine broad-scale species richness patterns, and assessing the mechanisms driving species' distributional ranges is central to ecology. Because a disproportionately large amount of biodiversity occurs in mountainous regions (Heywood, 1995), understanding how species' elevational range sizes (i.e. the range of elevations occupied by each species) are driven by environmental factors can provide insights into the mechanisms driving global patterns of range size and species richness. One important body of theory (the "climatic variability hypothesis") proposes that temperature variability through time drives elevational range sizes of species (Janzen, 1967; McCain, 2009; Stevens, 1992), with larger elevational range sizes resulting from greater variability. The reasoning is that species that can tolerate changes in temperature in one place can also tolerate equivalent changes in temperature associated with higher or lower elevation. This theory has been tested almost exclusively with respect to seasonal temperature variability. However, it has been suggested that shorter-term temperature variability may select for thermal specialists, and thus smaller elevational ranges (Gilchrist, 1995). Gilchrist (1995) explained this reversal, to negative elevational range size–temperature variability relationships at shorter temporal scales of temperature variability, by distinguishing between within-generation and between-generation temperature variation. Chan et al. (2016) used a global-extent dataset (though lacking latitudes poleward of 40°S or N) of 137 montane gradients to relate mean elevational range size of species to their measures of seasonal temperature range and diurnal temperature range simultaneously.
They claimed that they found a novel pattern in their study: diurnal temperature range negatively affects mean elevational range size (Figure 1b). They considered this pattern their most important finding and interpreted it as supporting their extension of Gilchrist's (1995) model, namely that between-generation temperature variation favours thermal generalists but within-generation temperature variation favours thermal specialists. This conclusion is interesting and represents the key conceptual advance of their paper. Unfortunately, as we show, the empirical pattern on which it is based results from flaws in their analyses, and sampling bias. The "best model" of Chan et al. (2016), on which their empirical conclusions were based, should have been rejected by any standard criteria, and by their own criteria. We now explain in more detail. Chan et al. (2016) analyzed 137 montane gradients obtained from McCain (2009). In the dataset, the diurnal temperature range and mean elevational range size variables are not correlated with each other (r = −.039, p = .651; Figure 1a). Chan et al. constructed 29 path models, selecting as "best" one that generates a weak (R² = .06; p = .012) direct effect of diurnal temperature range on mean elevational range size (Figure 1c); note that the "R = −.25" they state on p. 1437 is the standardized path coefficient within their structural equation model (SEM), which is a partial correlation coefficient, controlling for effects of both seasonal temperature range and precipitation on mean elevational range size. They based their conclusions on this "best model", but when we used their data to rerun their model, we found several errors in their reported results, as follows.
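Before turning to those errors, the difference between a near-zero raw correlation and a clearly negative partial coefficient can be illustrated with a minimal sketch. It uses synthetic data only, not the McCain (2009) or Chan et al. (2016) data, and it controls for a single covariate rather than the two in the SEM. The variable names, coefficients and sample size are all assumptions chosen for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def residuals(y, x):
    """Residuals from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]


def partial_corr(y, x, z):
    """Correlation between y and x after removing the linear effect of z."""
    return pearson(residuals(y, z), residuals(x, z))


# Synthetic, hypothetical variables (NOT the published data):
# z stands in for a covariate, x for a predictor of interest, y for a response.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
# y rises with z but falls with x; the coefficients are chosen so that
# the two effects cancel in the raw x-y covariance.
y = [zi - 0.5 * xi + rng.gauss(0, 1) for zi, xi in zip(z, x)]

zero_order = pearson(x, y)       # close to zero by construction
partial = partial_corr(y, x, z)  # clearly negative once z is controlled for
print(f"zero-order r = {zero_order:.3f}, partial r = {partial:.3f}")
```

In this constructed case the two variables are almost uncorrelated overall, yet the partial coefficient is negative. The example shows only that such a pattern is arithmetically possible; it says nothing about whether the corresponding model fits the data.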
Crucially, while the key result of Chan et al.'s analysis (a negative diurnal temperature range→mean elevational range size effect) was significant within their "best model" (Figure 1b), this model should have been rejected. Their stated procedure was to first reject any of their 29 SEMs that failed to meet all of the following criteria for model-fit statistics: root mean square error of approximation (RMSEA) < 0.08, comparative fit index (CFI) > 0.95 and standardized root mean square residual (SRMR) < 0.1. For models meeting these criteria they then selected the model with the lowest SRMR (even though SRMR does not penalize model complexity; Hooper, Coughlan, & Mullen, 2008). According to their Table S2, 16 of their 29 SEMs meet their criteria, including their "best model" (Figure 1b; model 28 in their Table S2). However, in the case of the "best model" the RMSEA value was incorrectly reported as 0.062 when actually RMSEA = 0.178 (Figure 1b,c), which makes their "best model" unacceptable by their criterion (note also that the 90% confidence interval for the RMSEA does not include 0.08). The actual value is also far in excess of other commonly used RMSEA thresholds for model acceptability (e.g. 0.10, 0.06, 0.05; Browne & Cudeck, 1993; Hu & Bentler, 1999; Shipley, 2000).
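The two-step selection procedure described above can be sketched as follows. This is an illustrative sketch, not the code used by Chan et al. (2016) or by us. Only three numbers are taken from the text: the reported and corrected RMSEA for model 28 (0.062 and 0.178) and the SRMR of model 3 (0.0416). All other fit values, and the names `FitStats`, `passes_criteria` and `select_best`, are hypothetical placeholders.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class FitStats:
    name: str
    rmsea: float
    cfi: float
    srmr: float


def passes_criteria(m: FitStats) -> bool:
    """Step 1: thresholds stated by Chan et al. (all three must hold)."""
    return m.rmsea < 0.08 and m.cfi > 0.95 and m.srmr < 0.1


def select_best(models: List[FitStats]) -> Optional[FitStats]:
    """Step 2: among acceptable models, choose the one with the lowest SRMR."""
    acceptable = [m for m in models if passes_criteria(m)]
    return min(acceptable, key=lambda m: m.srmr) if acceptable else None


# RMSEA = 0.062 (as reported) and SRMR = 0.0416 come from the text.
# The CFI values, model 28's SRMR and model 3's RMSEA are HYPOTHETICAL
# placeholders, chosen only so that both models pass the other thresholds.
model_28_reported = FitStats("model 28", rmsea=0.062, cfi=0.99, srmr=0.030)
model_3 = FitStats("model 3", rmsea=0.050, cfi=0.98, srmr=0.0416)

# Same model with the RMSEA value obtained on re-running it (0.178, from the text).
model_28_corrected = replace(model_28_reported, rmsea=0.178)

best_reported = select_best([model_28_reported, model_3])
best_corrected = select_best([model_28_corrected, model_3])
print(best_reported.name, "->", best_corrected.name)
```

With the corrected RMSEA, model 28 fails the first step and is never considered for selection, whatever its SRMR.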
For their "best model" only, Chan et al. (2016) also reported the result of a χ² test (testing discrepancy between the data and the model), a standard test of acceptability of an SEM. Models for which the data and the model are significantly different (p < .05) should be rejected before considering model-fit statistics such as RMSEA or SRMR (Grace, 2006; Shipley, 2000). Very importantly, Chan et al. We are unable to meaningfully improve on the analysis of this dataset that was published by McCain (2009), so we do not attempt to provide a new "best model". We do note, however, that of the remaining 15 SEMs reported by Chan et al. (2016; their Table S2) as meeting their criteria of RMSEA < 0.08, CFI > 0.95 and SRMR < 0.1, the model that their selection criteria would choose as "best" is model 3 (SRMR = 0.0416). This SEM only includes latitude and precipitation, and therefore does not include diurnal temperature range.
Thus, their reported results and selection criteria suggest a model that rejects their own findings. However, we hesitate to conclude much here because we cannot replicate the results reported for model 3 in Chan et al. (2016), nor those for many of the other models reported in their Table S2.
Another key criticism of Chan et al.'s (2016) analysis is that it suffers from bias and pseudoreplication, with respect to taxon sampling and geographical distribution of samples. Unlike McCain (2009), they did not attempt to reduce these problems before analysing the data. The first bias problem is that montane gradients in dry climates are substantially over-represented in the dataset. Only ~30% of the world's land surface outside the Antarctic/polar deserts is under arid climates (Hess & McKnight, 2013), but 47% of the 137 montane gra- northern Africa, in latitudes higher than most other montane gradients used).
We re-ran Chan et al.'s model after attempting to address the over-representation of dry montane gradients in their data. Specifically, we first divided the 137 montane gradients into two subsets: "dry" or "arid" according to McCain (2009; N = 64), and the remaining samples ("humid mountains"; N = 73). Next, we re-ran Chan et al.'s "best model" on each subset, finding a diurnal temperature range effect on mean elevational range size only for dry mountains (Figure 2a,b) and only a weak one (Figure 2b).

Figure 3. Partial residual plot of the modelled relationship between diurnal temperature range and mean elevational range size. This is the equivalent of figure 1d in Chan et al. (2016), but here using a dataset that excludes rodents (see text for explanation). The two influential points discussed in the text are indicated. Both represent reptile groups from the same study in the same study site, with identical values for all the environmental variables. Removing either makes the negative relationship non-significant.