Global urbanicity is associated with brain and behaviour in young people

Urbanicity is a growing environmental challenge for mental health. Here, we investigate correlations of urbanicity with brain structure and function, neuropsychology and mental illness symptoms in young people from China and Europe (total n = 3,867). We developed a remote-sensing satellite measure (UrbanSat) to quantify population density at any point on Earth. UrbanSat estimates of urbanicity were correlated with brain volume, cortical surface area and brain network connectivity in the medial prefrontal cortex and cerebellum. UrbanSat was also associated with perspective-taking and depression symptoms, and this was mediated by neural variables. Urbanicity effects were greatest when urban exposure occurred in childhood for the cerebellum, and from childhood to adolescence for the prefrontal cortex. As UrbanSat can be generalized to different geographies, it may enable assessments of correlations of urbanicity with mental illness and resilience globally. Xu et al. show that satellite-measured urbanicity (living in a densely populated area) is correlated with brain volume, cortical surface area and brain network connectivity in a sample of 3,867 people from China and Europe.

behaviour, and may in a subsequent step enable the identification and ranking of the individual features of the physical, social and service environment, and their interactions, that contribute most to the observed relation.
Traditionally, characterization of urbanicity was carried out using census data, which are ascertained infrequently in different ways and at different times in different countries 5 . Thus, census data are less useful for comparing urbanicity across different countries. More recently, the Global Human Settlement Layer (GHSL) has provided globally standardized human settlements, including urbanization and urbanicity 9 . GHSL data, however, are only available at large and infrequent intervals, namely 1975, 1990, 2000 and 2015 (ref. 9 ).
To facilitate global comparative analyses of the overall relations of urbanicity with brain and behaviour and to identify potential susceptibility periods, dense quantitative and longitudinal environmental measures that can be obtained from different geographies are required. Remotely sensed satellite data provide globally standardized quantitative environmental measures enabling the tracing of environmental features going back nearly 50 years (ref. 10 ). Population density is a well-established and quantifiable general measure of urbanicity that is frequently applied for neighbourhood classification and used around the globe 9 . Here, we aimed to use population density as a general measure of urbanicity to investigate whether the urban environment is correlated with brain and behaviour, and if these correlations are comparable in China and Europe. Specifically, we developed a satellite-based measure of population density termed 'UrbanSat' , and applied it in China and Europe to investigate the relation of population density, as a proxy of urbanicity, with brain structure, function and behaviour in two neuroimaging datasets of young people: as an exploration dataset, we used the Chinese CHIMGEN cohort (www.chimgen.tmu.edu.cn) 11 , and as the replication dataset, we used the European longitudinal IMAGEN cohort (www.imagen-europe.com) 12 . While we did not have any a priori hypothesis, we were interested in investigating whether: (i) UrbanSat is associated with brain structure, function and behaviour; (ii) brain features associated with UrbanSat mediate the association between UrbanSat and behaviour; (iii) correlations of UrbanSat with brain and behaviour are similar in Chinese and Europeans; Furthermore, we were interested in (iv) identifying susceptibility periods for the relations of UrbanSat during child and adolescent development with brain and behaviour. A schematic summary is shown in Fig. 1.

Results
Demographics. We recruited young-adult participants with lifetime residential geographies from CHIMGEN (n = 3,306) and IMAGEN second follow-up (FU2) (n = 561). Detailed inclusion and exclusion criteria are presented in Supplementary Tables 1 and 2. Demographics of the samples used in statistical analyses and sample attrition are described in Supplementary Table 3  UrbanSat: a satellite-based measure of urbanicity. To develop a satellite-based measure of urbanicity, we selected information from nine types of satellite registrations relevant for detecting and characterizing urban settlements, including night-time light (NL), normalized difference built-up index (NDBI), normalized difference water index (NDWI), normalized difference vegetation index (NDVI) and five measures derived from land cover mapping (built-up%, cropland%, grassland%, forest% and water body%) (Supplementary Table 7). We used multiple imputation by chained equations (MICE) to impute the nine annual satellite registrations and generate ten complete datasets. The multiple training-test splits of the dataset were performed to ensure that the estimates of generalizability we obtain are unbiased throughout the imputation (Supplementary Table 8, Extended Data Fig. 2 and Methods). We then carried out a tenfold cross-validation to optimize the confirmatory factor analysis (CFA) models, and to predict annual UrbanSat scores of each participant from birth to age of recruitment (Extended Data Fig. 2 and Methods). UrbanSat was generated by the optimized CFA model consisting of NL, built-up%, cropland% and NDVI, which best captured variation of urban features while maximizing goodness of fit. UrbanSat in CHIMGEN and IMAGEN-FU2 had a Tucker-Lewis index and comparative fit index > 0.95, root-mean-square error of approximation (RMSEA) < 0.06 and standard root-mean-square residual (SRMR) < 0.08, indicating excellent model fit (Supplementary Table 9 and Methods). UrbanSat was robust across time and geographies, as validated by its correlations with ground-level population density from GHSL-population grid (GHSL-POP) 9 for China and Europe for the years 1990, 2000 and 2015 (Fig. 2). Histograms of the distribution of UrbanSat score in each centre are shown in Extended Data Fig. 3. UrbanSat showed higher correlations with population density in different residential categories (rural, town, city and overall), countries (Asia and Europe) and years (1990, 2000 and 2015) than any individual satellite measures (Fig. 2).
Correlations of UrbanSat with brain structure. Voxel-wise multiple regression of mean UrbanSat before age 18 years with brain grey matter volume (GMV) was performed in CHIMGEN (n = 2,176). We controlled for age in all analyses, thus accounting for the older and wider age spread in CHIMGEN (age 23.54 ± 2.33 years) compared with IMAGEN (age 18.89 ± 0.66 years). We also controlled throughout for gender, education, site, body mass index (BMI), genetic population stratification and socioeconomic status (SES) (Supplementary Tables 10 and 11). Total intracranial volume was controlled in all imaging analyses, except for the analyses of cortical thickness (CT) and surface area (SA), where mean CT and total SA were controlled, respectively. Parental history of mental illness was an exclusion criterion for CHIMGEN and controlled for in IMAGEN. Uncorrected statistical maps of the association of UrbanSat with brain GMV in CHIMGEN adjusting for confounding covariates under parametric testing and non-parametric permutation testing are shown in Fig. 3a and Extended Data Fig. 4. We found negative correlation of UrbanSat with medial prefrontal cortex (mPFC) volume (peak Montreal Neurological Institute (MNI) coordinate: x = −1.5, y = 60, z = 0; 294 voxels; peak t value −5.63; Fig.3b) and a positive correlation with cerebellar volume (peak MNI coordinate: x = 10.5, y = −51, z = −18; 440 voxels; peak t value 6.43; Fig. 3b) (parametric testing P c < 0.05, family-wise error (FWE) corrected for voxel numbers, imaging modalities and data categories; see Methods). We confirmed the results with non-parametric permutation testing (threshold-free cluster enhancement (TFCE)-FWE, P c < 0.05, Supplementary Methods and Extended Data Fig. 4). Potential imputation bias was ruled out by multiple sensitivity analyses (Methods, Extended Data Fig. 2 and Supplementary Results). Uncorrected and adjusted vertex-wise correlation maps of UrbanSat with whole brain CT and SA are shown in Extended Data Fig. 4 (n = 2,164). The mPFC region of interest (ROI) from GMV analyses was projected onto fsaverage surface of Freesurfer v5.3.0 (Extended Data Fig. 4). UrbanSat was correlated with mean SA (ρ = −0.07, P = 0.002) but not mean CT (ρ = −0.02, P = 0.381) of the mPFC cluster (Fig. 3b and  Supplementary Table 12). Using voxel-wise multiple regression of individual satellite measures with GMV, we found significant correlation with mPFC-GMV being driven by NL and built-up%, and correlation with cerebellar-GMV being driven by NL and crop-land%. We found no correlation of NDVI with either mPFC-or cerebellar-GMV (Extended Data Fig. 5). We observed similar results   Fig. 1 | Schematic summary of study design. a, Geographic distribution of recruitment sites and number of participants collected in each city of China and in each country of europe (top), including five sites in Tianjin, two sites each in Zhengzhou, Nanjing, Hefei and Wenzhou and one site in the remaining cities. Lifetime residential geopositions of each participant collected from 1986 to 2018 (bottom). On the basis of the individual geoposition data, the annual values of nine satellite measures of urbanicity are obtained for each participant, including NL, NDbI, NDWI, NDVI and five measures derived from land cover mapping (built-up%, cropland%, grassland%, forest% and water body%). b, Tenfold cross-validation of multiple imputation model building and CFA model optimization for urbanSat construction. The optimized CFA model includes NL, built-up%, cropland% and NDVI. The mean urbanSat scores before 18 years showed stronger correlation with ground-level population density from GHSL than any individual satellite measures in both CHIMGeN and IMAGeN-Fu2. c, Investigation of the cumulative relations of urbanSat with brain and behaviour. d, Identification of susceptibility periods of lifetime urbanSat on brain and behaviour using distributed lag models. CFA, confirmatory factor analysis; GHSL, global human settlement layers; IMAGeN bL-Fu2, IMAGeN bL-Fu2 measures brain structural and functional change rate between bL of 14 years and Fu2 of 19 years; IMAGeN Fu2, IMAGeN second-follow-up-assessment acquired at 19 years of age; NDbI, normalized difference built-up index; NDVI, normalized difference vegetation index; NDWI, normalized difference water index; NL, night-time light; S, satellite measures of urbanicity; Sub, subjects; y, years old.
in GHSL ground-level population density data, thus validating the relation of UrbanSat and GMV (Extended Data Fig. 5).
To exclude possible scanner and site effects, we performed separate analyses for each acquisition site of CHIMGEN and IMAGEN-FU2, carrying out a meta-analysis with an inverse variance-weighted random-effects model (Methods and Supplementary Methods). UrbanSat remained significantly negatively correlated with mPFC-GMV and SA, and positively correlated with cerebellar-GMV (Extended Data Fig. 6 and Supplementary Table 13). Heterogeneity of effect sizes was low to moderate for all regions (I 2 range 0.05-60.21%; Supplementary Table 13). Thus, the observed correlation between UrbanSat and brain structure is robust across geographies and socio-cultural conditions.
We applied distributed lag models (DLMs) to identify susceptibility periods of lifetime UrbanSat on GMV and SA (Methods). We observed a negative association of UrbanSat with mPFC-GMV from age 4 to 15 years (Fig. 3c) and SA from age 5 to 7 years with brain structure. a, uncorrected statistical maps in the voxel-wise multiple regression of mean urbanSat before 18 years with brain GMV under parametric testing in CHIMGeN (n = 2,176). b, In CHIMGeN, urbanSat is negatively (blue) correlated with mPFC-GMV (left) and positively correlated with cerebellar-GMV (right) (FWe P c < 0.05). The correlation of urbanSat with mPFC-GMV is driven by SA rather than CT, and these correlations are replicated in IMAGeN-Fu2 (n = 415); urbanSat is correlated with brain volumetric change in mPFC (n = 340) in IMAGeN bL-Fu2, which is driven by the change of SA rather than CT (n = 325). Dashed red lines indicate the threshold of P = 0.05. c, Susceptibility period analysis of brain structure using DLM. In CHIMGeN, we observed a negative association of urbanSat with mPFC-GMV during childhood and adolescence (age 4-15 years) and mPFC-SA during childhood (age 5-7 years) as well as a positive association with cerebellar-GMV during childhood (age 1-10 years). The y axis represents the changes of brain structural features associated with an increase of interquartile range of urbanSat; the x axis is urbanSat lag in years of ages. Grey areas indicate 95% CI. A susceptibility window is identified for the ages where the estimated 95% CI (shaded area) does not include zero. d, Numbers of participants migrating to city at different ages. Inset, urbanSat is highest in participants with life-long city living (n = 562), medium in participants moving to city before 14 years (n = 229) and lowest in participants moving to city after 14 years (n = 1,385). e, Participants who were born in the city or migrated to an urban environment at an earlier age showed smaller mPFC-GMV and mPFC-SA as well as greater cerebellar-GMV than those with later exposure. error bars show mean and s.e.m. bL, IMAGeN baseline assessment acquired at 14 years old; bL-Fu2, IMAGeN bL-Fu2 measures brain structural changes rate between bL of 14 years and Fu2 of 19 years; CT, cortical thickness; DLM, distributed lag model; Fu2, IMAGeN second follow-up assessment acquired at 19 years old; L, left; r, right; SA, surface area. *P < 0.05. (Fig. 3c), indicating a susceptibility period during childhood and adolescence, driven by NL, built-up%, cropland% and NDVI (Extended Data Fig. 7). Correlation of UrbanSat with cerebellar-GMV was significant from age 1 to 10 years, indicating a susceptibility period during childhood (Fig. 3c), driven by NL, built-up% and cropland% (Extended Data Fig. 7).
To measure the relation between age of migration and brain structure, we split CHIMGEN participants into those who migrated to the city before age 14 years (n = 229, mean age at migration 8.24 ± 4.86 years), after age 14 years (n = 1,385, mean age at migration 17.17 ± 2.68 years) and life-long city-dwellers (n = 562) (Fig. 3d). We found that participants born in the city or early migrants showed smaller mPFC-GMV (P = 0.032) and SA (P < 0.001) as well as greater cerebellar-GMV (P < 0.001) than those with later exposure ( Fig. 3e and Supplementary Table 14).

No correlation of UrbanSat with white matter microstructure.
Using tract-based spatial statistics (TBSS) analysis of diffusion tensor imaging (DTI) data, we did not find significant correlation of UrbanSat with brain fractional anisotropy (FA) in either CHIMGEN or IMAGEN-FU2 (TFCE-FWE, P c < 0.05).
Correlations of UrbanSat with resting-state functional network connectivity. Using group-independent component analysis of estimated 30 independent components (Supplementary Methods), we identified 17 resting-state networks (RSNs) related to cognitive and sensory-motor processes 13 in both CHIMGEN and IMAGEN (n = 2,156) (Extended Data Fig. 8). For each RSN, we tested the relation between mean UrbanSat and within-network functional connectivity (WNFC). A voxel-wise multiple regression analysis controlling for all confounders revealed a negative correlation of UrbanSat with WNFC in the mPFC of the anterior default mode network (aDMN) (peak MNI coordinate: x = −3, y = 69, z = 0; 142 voxels; peak t value −6.33), and positive correlations in the cerebellar vermis of the cerebellar network (CN) (peak MNI coordinate:  Table 12). Only the aDMN and CN results were replicated in voxel-wise analyses in IMAGEN-FU2 (Extended Data Fig. 9).
The correlations of UrbanSat with WNFCs and BNFCs were stable in a meta-analysis of all CHIMGEN and IMAGEN sites (Extended Data Fig. 6 and Supplementary Table 13). Brain localization (Fig. 4b,f) and susceptibility periods (Fig. 4d,h) of WNFCs and BNFCs in CHIMGEN and IMAGEN were consistent with those observed for brain structure (Extended Data Figs. 7 and 9), except for non-significance during adolescence. The changes in WNFCs and BNFCs between 14 and 19 years in IMAGEN were correlated with UrbanSat ( Fig. 4b,f, Supplementary Table 12 and Supplementary Results). In CHIMGEN, WNFCs and BNFCs were correlated with age of migration to the city (Fig. 4c,g and Supplementary Results).

Correlations of UrbanSat with behaviour. We investigated whether
UrbanSat is related to measures of cognition and mental health, that is, depression and anxiety. In CHIMGEN (n = 2,148), the social cognition measure 'perspective-taking' , perceiving a situation from an alternative point of view 16 , was positively correlated with UrbanSat (reaction time for perspective-taking: ρ = −0.16, P c < 0.05, Bonferroni corrected for data categories and 21 behavioural assessments; Methods) and replicated in IMAGEN-FU2 (ρ = 0.14, P c < 0.05) ( Table 1, Extended Data Fig. 10 and Supplementary Table  15). A negative correlation between UrbanSat and reaction time for perspective-taking performance was observed from 12 to 22 years in CHIMGEN (Fig. 5a).
UrbanSat was correlated with depression symptoms assessed by Beck Depression Inventory (BDI) in CHIMGEN (n = 2,170) (ρ = 0.15, P c < 0.05) ( Table 1) with a susceptibility period from 3 to 12 years (Fig. 5a). As BDI was not available in IMAGEN, we validated this association using an instrument measuring core features of depression, the Ruminating Scale Questionnaire (RSQ) (ρ = 0.14, P c < 0.05) ( Table 1 and Supplementary Methods).
In CHIMGEN and IMAGEN, increased NL and built-up% and decreased NDVI and cropland% were significantly correlated with enhanced perspective-taking performance and increased depression symptoms ( Table 1). The susceptibility periods for individual satellite measures were similar to UrbanSat in CHIMGEN (Extended Data Fig. 7). Although most correlations of UrbanSat with brain and behaviour were consistent between males and females, some correlations, especially with brain development in IMAGEN BL-FU2, were sex specific (Supplementary Tables 16 and 17).

Discussion
Using UrbanSat, a remote-sensing satellite measure, we characterized the relation of population density, a proxy of urbanicity, with brain structure, function and behaviour during childhood and adolescence in large datasets in China and Europe. We provide converging evidence for association of UrbanSat during childhood and adolescence with GMV and SA of mPFC as well as WNFCs and BNFCs of aDMN. No significant association was observed with CT and FA. mPFC and aDMN mediate the correlation between UrbanSat and improved perspective-taking and increased depression symptoms. We also found positive correlations of UrbanSat during childhood with cerebellar volume, which mediated the association with perspective-taking and depression symptoms. We extend previous observations reporting an association of depression symptoms with urban settings 17 by demonstrating the stability of this observation in different geographical and sociocultural regions, and by discovering possible underlying brain mechanisms and susceptibility periods during childhood and adolescent development.
Our results suggest that urban living has both beneficial and adverse correlations with health: enhanced social cognition (perspective-taking) and increased depression symptoms, in contrast to previous studies, which mainly reported adverse aspects of urbanicity 18 . The mPFC, the core brain area of the aDMN, has been implicated in a variety of social cognition and affective functions commonly compromised in psychiatric disorders 19 . The susceptibility of mPFC to urban environment is supported by the greater sensitivity of mPFC to urbanicity-related risk factors including chronic stress 20 and air pollution 21 . While our findings are consistent with reports of an association between urbanicity and mPFC in smaller European samples 22 , they differ from these studies as we found associations with GMV and SA rather than CT, and an absence of sex specificity.
We found a positive correlation of UrbanSat with cerebellar volume, a mediator for the association of UrbanSat with perspective-taking and depression symptoms. The functional network connectivity of the cerebellum also mediates the association of UrbanSat with perspective-taking. Cerebellar lesions cause the cerebellar cognitive affective syndrome, characterized by impairments in executive function and memory, as well as affect 23 . Animal studies extend these findings to stress-dependent depressive affect 24 and impairment in social behaviour 25 . It is tempting to speculate that these pathways may connect to brain regions involved in perspective-taking and depression symptoms 26 . Imaging features related to cerebellum showed susceptibility periods to high population density at age 1-10 years, during which cerebellum and cortex are increasing in volume [27][28][29] .
While previous studies focused on the effect of mean exposure to urban living on brain and mental health 30 , we identified neurodevelopmental periods with increased susceptibility to urban living. Consistent with observations of susceptibility periods of non-affective psychosis to residential mobility during childhood and adolescence 31 , we found that structure and function of mPFC, as well as depression symptoms, have pronounced susceptibility to high population density during childhood and adolescence, a period more sensitive to social stress 32 . Perspective-taking was more sensitive to high population density during adolescence and young adulthood, implying a time window for neurobehavioural interventions targeting social cognition.
Our results are suggestive of a cumulative relation of urbanicity with brain and behaviour, whereby participants born or migrating to the city at an earlier age had more pronounced effects than those who become city-dwellers later. Given that CHIMGEN participants were students who moved to cities for their studies, we do not have any data on people who, after spending some years in the city, moved back to the countryside. We also do not have data to distinguish possible short but extreme exposure to urban life, in utero or during susceptibility periods from moderate continuous exposure.
We found several shared relations of high population density in urban settings with brain structure and function as well as on social cognition and mental health in both CHIMGEN and IMAGEN,  indicating their generalization to other sociocultural conditions and geographies. The relation of urban living with brain development during adolescence was confirmed by exploring the correlation of UrbanSat with brain structural and functional changes from age 14 to 19 years in IMAGEN. Taking into account normative references [27][28][29] , our observations are consistent with an accelerated development in densely populated urban areas of cerebellum during childhood and mPFC during childhood and adolescence. We also found inconsistent results between CHIMGEN and IMAGEN: only 4 of 49 BNFCs correlating with UrbanSat in CHIMGEN were replicated in IMAGEN. The more extensive relations of urban living with BNFCs in CHIMGEN may reflect the more drastic changes in urbanization in China compared with Europe 6 , but may also relate to confounding factors beyond the covariates controlled in our study 33 . UrbanSat was correlated with GMV, SA and functional connectivity, but not with FA and CT, indicating different sensitivities of brain properties to residential environments. UrbanSat showed positive correlation with cerebellar volume, negative correlation with mPFC volume, but non-significant correlation with volumes of other regions, suggesting different spatial sensitivities to residential environments. The cerebellum was sensitive to urban residential environments during childhood, whereas the mPFC was sensitive during both childhood and adolescence, indicating different temporal sensitivities to residential environments. This framework of different spatial and temporal sensitivities to urban residential environments may help to understand the association of urban living with brain and mental health.
High population density, a general measure of urbanicity, can cause increased social stress and air pollution, both of which affect brain structure in young people 34,35 . A recent study observed an association of urbanicity with brain activity in regions linked to social stress processing 30 . Such brain changes may mediate the well-established relation of urbanicity with mental health, including mood disorders 36 and social cognition 18 . Stress in childhood can accelerate brain development and lead to faster maturation of certain brain regions during adolescence, including cerebellum and mPFC 37 . Faster brain maturation results in enhanced cognitive development 38 and may account in part for the positive correlation of urbanicity and perspective-taking observed in our study. However, faster maturation of mPFC and cerebellum may come at a cost of decreased plasticity, including of fear extinction mechanisms (mPFC), which may contribute to increased vulnerability to anxiety and depression 39 . Air pollution induces neuroinflammation in the brain, leading to damage and loss of neural tissue in prefrontal cortex 35 , and may provoke depression symptoms 40 . Thus, urban upbringing may cause affective and anxiety symptoms by way of both increased social stress and pollution. Remotely sensed satellite data play a critical role in monitoring the Earth's surface to track environmental conditions that are intimately related to human health 10 . Satellite data are applied to map urbanization, poverty, climate change and pollution, as well as the spread of infectious diseases 10 . Our study extends the application of remote-sensing satellite data and provides a method to characterize and monitor spatial and temporal patterns of risk for mental disorders. In the optimized CFA models, the four satellite measures contributing to UrbanSat showed different factor loadings. NL with the highest factor loading can capture the physical environmental features of urbanicity, such as patterns of human settlements 41 , urban expansion 42 and population counts 43 , as well as information about social-environmental features of urbanicity, such as economic activity 44 . Built-up% and cropland% with medium factor loadings mainly reflect physical-environmental features of urbanicity. NDVI with the smallest factor loading measures residential greenness and has been used extensively to record the distribution of green spaces in urban settings 45 . UrbanSat mainly reflects the physical-environmental features and indicates the social-environmental features of urbanicity only indirectly.
For privacy reasons, our satellite measures were obfuscated to a spatial resolution of 1 km, preventing the capture of important aspects of urban life, such as daily mobility paths. Future studies will investigate the integrated effect of urban physical and social environment, and their interaction with genetics and relation to brain and behaviour. MICE is a flexible tool for multiple imputation of missing data, which properly accounts for uncertainty in the downstream estimates by combining subsequent estimates from different imputed datasets. This approach also accommodates the multiple training-test splits of the dataset to ensure that the estimates of generalizability we obtain are unbiased throughout the imputation. The disadvantage of MICE is the difficulty in combining multiple imputed datasets with voxel-wise neuroimaging statistics. Although several sensitivity analyses were applied after the multiple imputation and led to identical conclusions, we are still not in a position to fully exclude the potential bias derived from missing data. This study is not epidemiological but neurobiological, aiming to identify brain mechanisms by which urbanicity influences behaviour. How representative the identified mechanisms are among the general population is a different task, for future epidemiological studies.
Our findings were made possible due to recent advancements in remote-sensing satellite technologies which were leveraged to measure the relation of urbanicity with brain and behaviour. We were able to (1) apply a general measure of urbanicity, that is, population density, which is not dependent on census definitions of urban areas that might be conflated by densely populated rural areas or sparsely populated areas within urban settlements, or that may vary between nations; 5 (2) obtain a high spatial and temporal resolution; 10 (3) use a measure applicable anywhere on Earth from the 1970s to the present day. Thus, UrbanSat provides a unique opportunity to identify the cumulative effects and susceptibility periods of urbanicity on brain and behaviour. However, we note that UrbanSat cannot unravel environmental pathways and their interactions that cause the aversive effects of urban living. This is a task for subsequent studies with access to sufficient ground-level data for comprehensive characterization of causal environmental pathways that underlie the observed correlations.
In the current work, we provide proof of principle establishing the use of satellite data to inform the relation between urban environment, brain and behaviour. As our approach can be extended and generalized to other geographies and is easy to implement even in the absence of detailed or directly comparable ground-level data, it may be relevant for public health, policy and urban planning globally.

Methods
Ethical approval. The Chinese Imaging Genetics (CHIMGEN) project was approved by the ethics committee of 31 centres in China 1 , and written informed consent was obtained from each participant 11 . The IMAGEN study was approved by local research ethics committees at each research site. Informed consent was sought from all participants and a parent or guardian of each participant 12 .
Participants. The CHIMGEN project collected genetic, transcriptomic, environmental, neuroimaging and behavioural data of 7,306 healthy Chinese Han participants aged 18-30 years recruited from 31 centres of 21 cities in Mainland China 11 . At the time of data analysis (January 2018), data were available for 5,425 participants. The IMAGEN project is the first European multisite and longitudinal study 12 , and comprehensive genetic, transcriptomic, epigenetic, environmental, neuroimaging and neurocognitive data were collected from more than 2,000 14-year-old adolescents in 2009. Brain imaging measures were longitudinally assessed at age 14 years (baseline, BL) and 19 years (second follow-up, FU2). Most of the neurocognitive and mental health outcomes were longitudinally assessed at BL, FU1 (16 years) and FU2. The approach for sample selection in CHIMGEN and IMAGEN is described in the Supplementary Methods and is shown in Extended Data Fig. 1.

Data collection. Residential geographies.
For each CHIMGEN participant who had consented to provide residential information, we recorded the precise residential addresses in each year from birth to recruitment and the category of each place (1 for rural, 2 for town or 3 for city) as determined according to the National Bureau of Statistics of China (Supplementary Methods). However, since residential histories were not obtained prospectively, we were not in a position to fully exclude the possibility of recall bias. Finally, 3,336 participants who provided their lifetime residential geographies were included in the further analysis. The remaining 2,089 participants were excluded because they only provided their residential addresses at the time point of recruitment but refused to provide their residential addresses at any other time points since birth.
At the time of the second follow-up, 561 IMAGEN participants provided their precise residential addresses for each year from birth to recruitment and the category of each place (1 for rural, 2 for town or 3 for city) (Supplementary Methods). To protect the anonymity of participants in CHIMGEN and IMAGEN, these addresses have been obfuscated to 1 km scaled longitude and latitude based on the Google Earth Engine (GEE) coordinate system using code (https://github. com/crickfan/geo-anonymization).
Confounding covariates data. In CHIMGEN and IMAGEN, we controlled for age, gender, education, site, BMI, genetic population stratification, total intracranial volume, mean corticle thickness, total surface area and SES in the correlation of satellite-based measure of urbanicity with brain and behaviour. Parental history of mental illness was an exclusion criterion for CHIMGEN, but not in IMAGEN, where this variable was controlled for in IMAGEN data analyses (Supplementary Methods).
Neuroimaging data. Among participants with lifetime satellite measures and confounding covariates, 2,176 participants from CHIMGEN and 482 participants from IMAGEN-FU2 with at least one modality of neuroimaging data were included in the analysis. T1-weighted imaging, DTI and resting-state functional imaging were acquired using 3.0-T MRI scanners from 28 sites of CHIMGEN and 6 sites of IMAGEN-FU2 (Supplementary Tables 19-23). Brain GMV, CT and SA were calculated from the T1-weighted imaging, FA from the DTI, as well as WNFC and BNFC from the resting-state functional MRI. The detailed preprocessing and calculation methods are described in Supplementary Methods and http://chimgen.tmu.edu.cn/en/index.php?c=article&id=2034. Finally, in CHIMGEN, qualified GMV data were available in 2,176 participants, CT and SA data in 2,164 participants, FA of TBSS data in 2,158 participants as well as WNFC and BNFC data in 2,156 participants.
In IMAGEN, among the remaining 482 participants, 415 (FU2) were included in GMV analysis, 420 (FU2) in CT and SA analysis, 436 (FU2) in TBSS analysis and 351 (FU2) in WNFC and BNFC analyses (Supplementary Methods). For the 340 participants with qualified structural imaging data at both BL (14 years) and FU2 (19 years) in IMAGEN, structural imaging data were pre-processed using the pairwise longitudinal tool implemented in SPM12 for longitudinal voxel-based morphometry analysis 53 . Finally, we obtained year-averaged GMV change maps for 340 participants and year-averaged CT and SA change maps for 325 participants. In 83 participants with qualified fMRI data at both BL (14 years) and FU2 (19 years), we calculated WNFC and BNFC for each participant at each stage, then obtained the WNFC and BNFC change maps of the 83 participants (Supplementary Methods).
Behavioural data. Among the 2,176 participants with at least one type of qualified MRI data in CHIMGEN, 2,173 were finally included in the analysis of verbal learning memory, 2,063 in working memory, 2,139 in information processing speed, 2,148 in social cognition (Extended Data Fig. 10) To test selection bias, we compared differences in demographic and behavioural variables between the included sample (n = 3,306 for CHIMGEN; n = 561 for IMAGEN-FU2) and the excluded sample (n = 2,119 for CHIMGEN; n = 850 for IMAGEN-FU2) using Wilcoxon rank sum test and chi-square analysis. While we found subtle but significant differences in a minority of variables that did not relate to the main behavioural results (perspective-taking and depressive symptoms) (Supplementary Tables 5 and 6), we cannot fully rule out an influence of selection bias on our findings. We controlled for demographic variables in the main analyses, including the significantly different variables age, gender, education, BMI and SES.
UrbanSat calculation. Tenfold cross-validation of multiple imputation for missing values. To calculate annual UrbanSat scores for each participant (n = 3,306 for CHIMGEN) from birth to age of recruitment, we required satellite data from 1986 to 2018. However, the available satellite data did not cover the entire time period. For example, NL data were only available from 1992 to 2013 (Supplementary Table 7), indicating that 7,560 of 78,315 spatiotemporal points were missing (data missing rate 9.65%). Thus, we needed to impute the missing values for the nine satellite measures. The numbers of imputed years for each satellite-based measure and the number of CHIMGEN participants with missing data are presented in Supplementary Table 8. The imputation strategy is shown graphically in Extended Data Fig. 2.
We used MICE as implemented in the 'mice' R package 54 to impute the nine annual satellite registrations and generate ten complete datasets. The data were input as 'wide' format. The imputation models of predictorMatrix with default settings and predicted mean matching approach in the 'mice' R package were used in the multiple imputation process. We used 20 iterations of Gibbs sampling in the 'mice' R package.
Multiple training-test splits of the dataset were performed to ensure that the estimates of generalizability we obtain are unbiased throughout the analytic pipeline. Specifically, the nine satellite measures of 3,306 participants with missing data were randomly and equally divided into ten groups. Then, tenfold cross-validation was used to build the imputation diagnostic models in the training datasets and to predict missing values of satellite measures in the test datasets using MICE. For each fold, 90% of participants were used as a training dataset to build the imputation diagnostic model, then this diagnostic model was used to predict the missing values in the other 10% of participants in the test dataset. This process was applied in ten folds to predict all missing satellite data of 3,306 participants (Extended Data Fig. 2).
The IMAGEN participants born between 1994 and 1998 were recruited from 2009 to 2010 at baseline and followed up from 2015 to 2016. Thus, we have almost all true exposure values of the nine satellite measures for these participants. The only exception is the lack of NL data for 2014, 2015 and 2016. The same imputation methods were applied in these three years.
Tenfold cross-validation of CFA. NL, NDVI, NDBI, NDWI, cropland%, forest%, grassland%, water body% and built-up% have all been reported to measure urban environment (Supplementary Methods). As we hypothesize that each individual satellite-based measure contributes to the measurement of urban environment, we directly apply CFA but not exploratory factor analysis. The aim of CFA in this study is to construct a latent variable (UrbanSat) to comprehensively reflect urban environments, which was carried out using the R package 'lavaan' (https:// cran.r-project.org/web/packages/lavaan) 55 . The same ten cross-validation splits as applied in the imputation process were used to optimize the CFA models and to predict annual UrbanSat scores of each participant from birth to age of recruitment. For each fold, 90% of participants were used to build the CFA model, and the optimized CFA model was used to calculate the UrbanSat scores for the other 10% of participants. For each fold, two criteria were used to optimize the CFA model by selecting appropriate satellite-based measures of urbanicity. The first criterion was the goodness of fit of the CFA model, which can be assessed by Tucker-Lewis index, comparative fit index, chi square, RMSEA and SRMR. The criteria for excellent model fit were TSI > 0.95, comparative fit index > 0.95, RMSEA < 0.06 and SRMR < 0.08 (refs. [56][57][58]. The second criterion was the inclusion of satellite-based measures as much as possible to better reflect different aspects of urbanicity. Specifically, in the first imputed satellite dataset, we initially constructed a CFA model by including all nine satellite-based measures of urbanicity in the training dataset of each fold. Based on the factor loadings of the nine satellite-based measures, we removed the satellite measure with the smallest factor loading and repeated the CFA modelling. These steps were iterated until the resulting CFA model satisfied our criteria for excellent model fit in the training dataset. The factor loadings of the optimized CFA model were used to calculate UrbanSat scores in the test dataset. This process was applied in ten folds to predict all out-of-sample lifetime UrbanSat scores of 3,306 participants. And the same process was applied from the first imputed satellite datasets to the tenth ones to generate ten datasets of lifetime UrbanSat scores of 3,306 participants (Extended Data Fig. 2).
The statistical estimate combination between voxel-wise neuroimaging analyses and the multiple imputed UrbanSat is not supported by the current version of Statistical Parametric Mapping (SPM12) software implemented in MATLAB R2018a (http://www.fil.ion.ucl.ac.uk/spm) during the model estimation. Therefore, we applied Rubin's rules 59 to combine the ten datasets of UrbanSat scores derived from MICE imputation. The combined UrbanSat scores were related to the further voxel-wise whole brain analysis and susceptibility period analysis. Subsequently, we performed several sensitivity analyses as explained below.
Sensitivity analysis. The combined UrbanSat scores according to Rubin's rules may understate the uncertainty in the downstream statistical estimates since they do not propagate uncertainty about the multiple imputation process. To test the potential uncertainty caused by the multiple imputation of satellite data, we combine this with multiple sensitivity analyses including the statistical comparison of subsequent mass-univariate voxel-wise neuroimaging analyses using multiple versions of the imputed UrbanSat and the estimation of ROI-level analyses where the multiple imputed UrbanSat were combined in a statistically principled fashion. Firstly, voxel-wise multiple regression analysis of each imputed UrbanSat with whole brain GMV adjusted for all confounding covariates in 2,176 participants was performed (FWE P c < 0.05). Secondly, the significant brain ROI was firstly identified from the voxel-wise multiple regression between the combined UrbanSat score following Rubin's rules 59 and whole brain GMV after controlling confounding factors (FWE P c < 0.05). The 'mice' R package 54 was applied to pool the statistical estimates between ten imputed UrbanSat scores and brain ROI GMV value after controlling for confounding factors.
To test the potential bias caused by the imputation of satellite data, sensitivity analysis was additionally performed in participants with complete satellite and neuroimaging data. Among the 3,306 CHIMGEN participants with complete residential information, 1,460 participants had nine complete satellite measures from birth to age of recruitment and brain imaging data. This subset of participants was used to construct the out-of-sample CFA models and to calculate the annual UrbanSat scores before 18 years for each participant. Then the sensitivity analysis was performed to identify voxel-wise analysis between GMV and UrbanSat adjusted for all confounding covariates in 1,460 participants (FWE P c < 0.05).
In addition, the 'mice' R package 54 was also applied to pool the multiple statistical estimates between ten imputed datasets of UrbanSat scores and behaviours after controlling confounding factors.
Correlation analyses of mean UrbanSat with brain imaging measures. The voxel-wise multiple regression of mean UrbanSat before 18 years with brain GMV was performed in CHIMGEN (n = 2,176) using Statistical Parametric Mapping (SPM12) implemented in MATLAB R2018a (http://www.fil.ion.ucl. ac.uk/spm) (Supplementary Methods). Statistical significance of the voxel-wise multiple regression models in relation to mean UrbanSat with neuroimaging data was assessed by parametric testing FWE correction, where we corrected for voxel numbers, six imaging features (GMV, CT, SA, FA, WNFC and BNFC) and two data types (neuroimaging and behavioural data). We therefore set a significance threshold of FWE-corrected P c < 0.05 (equal to an uncorrected P < (1.25 × 10 −6 /6/2) = 1.01 × 10 −7 ) in brain structure analysis in CHIMGEN.

Meta-analysis.
To exclude possible scanner and site effects, we repeated the ROI-based correlation analyses of UrbanSat with neuroimaging measures in each site (both CHIMGEN and IMAGEN) and performed meta-analysis to integrate the results. The meta-analyses pooled each centre's effect size of correlation coefficient between UrbanSat and neuroimaging measure of each ROI, using an inverse variance-weighted random-effects model as implemented in the R package 'metafor' (version v2.1-0) 61

Correlation analyses of mean UrbanSat with behaviour.
For the behavioural analysis, partial correlation analysis was firstly applied to test the correlation of UrbanSat with each neuropsychological domain and mental health in CHIMGEN under Bonferroni-corrected P c < 0.05 after controlling for the same confounding covariates. Bonferroni correction for the two data types and 21 items (Table 1  and Supplementary Table 15) was applied in CHIMGEN. We therefore set a significance threshold of Bonferroni-corrected P c < 0.05 (equal to an uncorrected P < (0.05/2/21) = 0.001) in CHIMGEN. All the significant results would be replicated in IMAGEN at Bonferroni-corrected P c < 0.05 (equal to an uncorrected P < (0.05/4) = 0.013), where we only include four measures of behaviour (IRI, RSQ, DAWBA-GA and CIDI-AS).
Identification of susceptibility periods using a distributed lag model. To account for within-subject autocorrelation and consider the delayed effect of longitudinal UrbanSat, we used a DLM to investigate susceptibility periods of lifetime UrbanSat on brain and behaviours by creating an UrbanSat lag space 62 (1-30 years in CHIMGEN). The DLM is defined through a 'cross-basis' function, which allows the simultaneous estimation of a linear exposure-response association and nonlinear lag-response association across lags. Specifically, we fitted the DLM as the following formula: Here UrbanSat ij is the out-of-sample predicted UrbanSat score in age j of lifetime year n, X 1i to X pi are the same confounding covariates adjusted before for participant i. To account for collinearity among the yearly UrbanSat, we fitted constrained DLMs that assume these effects α j are a smooth function of j year. Therefore, the DLM model simultaneously integrates the data from all time points and assumes that the association between the UrbanSat and brain/behaviour at a given time point, controlling for UrbanSat at all other time points, varies smoothly as a function of time. The smooth function of lag structure was modelled using a natural cubic spline with five degrees of freedom, setting the knots at equally spaced values on the lag scale (1-30 years). The number of knots was chosen based on the AIC. A susceptibility period is identified when the estimated pointwise 95% confidence intervals do not include zero.

Multiple mediation analysis.
In the well-replicated UrbanSat-behaviour correlations in both datasets, multiple mediation analysis, an extension of mediation analysis 63,64 , was applied to formally test whether the UrbanSatbehaviour relationship can be mediated by brain structure and function while controlling confounding covariates and other mediators (Supplementary Methods).
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All the data are available from the authors upon reasonable request and with permissions of the CHIMGEN and IMAGEN consortia.

Code availability
Custom code that supports the findings of this study is available from the corresponding author upon request. Corresponding author(s): Gunter Schumann Last updated by author(s): Jul 20, 2021 Reporting Summary Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see our Editorial Policies and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted

Software and code
Policy information about availability of computer code Data collection In CHIMGEN and IMAGEN datasets, Google earth engine was applied to collect the physical environment data (nightlight, NDVI,NDBI,NDWI,global land cover mapping) based on the subjects' geolocation. T1-weighted imaging, diffusion tensor imaging and restingstate functional imaging were acquired using 3.0 Tesla MRI scanners at each center. And paper-based questionnaire and E-Prime 2.0 software was used to collect some of behavioral data. Data analysis R package lavaan, R package mice, R package meto, MATLAB and SPSS was used to analysis remote sensing data for physical environment and behavioral data. And Statistical Parametric Mapping software package (SPM12), FMRIB's Software Library (FSL v6.0.1) toolbox , Freesurfer, Data Processing Assistant for Resting-State fMRI (DPARSFA) and Group ICA Of fMRI Toolbox (GIFT) were used to analysis neuroimaging data.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A list of figures that have associated raw data -A description of any restrictions on data availability The raw remote sensing data are free access from Google earth engine. The raw neuroimaging data and behavioral data that support the findings of this study are available from the corresponding author upon reasonable request. These data were used to generate images in Table 1 ,Figs. 1-5, all supplementary Tables and Figs. Data exclusions CHIMGEN: (1) Excluding participants without lifetime residential information Among the 5425 participants, 3336 participants had provided lifetime residential geographies. The remaining 2089 participants were excluded because they only provided their residential addresses at the time point of recruitment but they refused to provide their residential addresses at any other time points since birth. From the 3336 participants, we successfully extracted satellite-based measures of urbanicity of 3306 participants. The other 30 participants were excluded because extracting satellite measures failed in more than three years during their lifetime.
(2) Excluding participants without confounding covariates Potentially confounding covariates including age, gender, education, site, body mass index (BMI), genetic population stratification, socioeconomic status (SES), total intracranial volume (TIV), mean cortical thickness (MCT) and total surface area (TSA) were corrected in the correlation analyses of satellite based-measure of urbanicity with brain and behavior. Complete information of confounders was available in 2176 participants, with 1130 participants being excluded from the 3306 participants with lifetime geopositioned data.
(3) Excluding participants without qualified neuroimaging data For each neuroimaging measure, we had to exclude participants with unqualified raw imaging data and participants failed to pass the quality control (QC) during imaging data preprocessing. In the 2176 participants, 2176 participants were included in the voxel-based morphometry (VBM) analysis of gray matter volume (GMV) and 2164 participants in the surface-based morphometry (SBM) analysis of cortical thickness (CT) and surface area (SA) based on T1-weighted neuroimaging data ; 2158 participants in the Tract-based Spatial Statistics (TBSS) analysis of fractional anisotropy (FA) based on diffusion tensor imaging (DTI) data; and 2156 participants in the within-network (WNFC) and betweennetwork (BNFC) functional connectivity analyses based on resting-state functional MRI (rsfMRI) data.
(4) Excluding participants without qualified behavioral assessments For each behavioral measure analysis, we had to exclude participants without qualified behavioral assessment. In the 2176 participants with at least one type of the qualified MRI data, 2173 participants were finally included in the analysis of verbal learning memory, 2063 in working memory, 2139 in information processing speed, 2148 in social cognition, 2024 in cognitive control, and 2170 in mental health. IMAGEN: (1) Excluding participants without lifetime residential information Among the 1411 participants of IMAGEN-FU2, 561 participants provided lifetime residential geographies. All participants's satellite-based measures of urbanicity at each year have been extracted successfully.
(2) Excluding participants without confounding assessments From these 561 participants, we excluded 79 participants without the confounding assessments (SES, parental history of mental illness and genetic population stratification) and the remaining 482 participants were included in the further analysis.

Replication
The results from CHIMGEN could be replicated in IMAGEN datasets