Two‐year follow‐up of infant and maternal outcomes after planned early delivery or expectant management for late preterm pre‐eclampsia (PHOENIX): A randomised controlled trial

Abstract
Objective: We evaluated the best time to initiate delivery in late preterm pre‐eclampsia in order to optimise long‐term infant and maternal outcomes.
Design: Parallel‐group, non‐masked, randomised controlled trial.
Setting: Forty‐six maternity units in the UK.
Population: Women with pre‐eclampsia between 34+0 and 36+6 weeks of gestation, without severe disease, were randomised to planned delivery or expectant management.
Main outcome measures: Infant neurodevelopmental outcome at 2 years of age, using the Parent Report of Children’s Abilities – Revised (PARCA‐R) composite score.
Results: Between 29 September 2014 and 10 December 2018, 901 women were enrolled in the trial, with 450 women allocated to planned delivery and 451 women allocated to expectant management. At the 2‐year follow‐up, the intention‐to‐treat analysis population included 276 women (290 infants) allocated to planned delivery and 251 women (256 infants) allocated to expectant management. The mean composite standardised PARCA‐R scores were 89.5 (SD 18.2) in the planned delivery group and 91.9 (SD 18.4) in the expectant management group, with an adjusted mean difference of −2.4 points (95% CI −5.4 to 0.5 points).
Conclusions: In infants of women with late preterm pre‐eclampsia, the average neurodevelopmental assessment at 2 years lies within the normal range, regardless of whether planned delivery or expectant management was pursued. With the lower than anticipated follow‐up rate there was limited power to demonstrate that these scores did not differ, but the small between‐group difference in PARCA‐R scores is unlikely to be clinically important.


| INTRODUCTION
Pre-eclampsia complicates between 2% and 3% of pregnancies in high-income settings, 1 and is a leading cause of iatrogenic preterm birth. 2 It is a multisystem disorder characterised by placental and maternal vascular dysfunction and is associated with severe complications for both mother and infant. 3 Potential adverse consequences include maternal and perinatal death, maternal stroke, renal and hepatic injury and fetal growth restriction. Current management of pre-eclampsia in most high-income settings involves the close monitoring of maternal and fetal condition, with delivery recommended at 37 weeks of gestation, or sooner, if there is evidence of severe maternal or fetal compromise. 4,5 At 37 weeks of gestation, previous trials have shown that the initiation of delivery benefits the woman without any additional perinatal risk. 6 In women with pre-eclampsia between 34+0 and 36+6 weeks of gestation, without severe features of the disease necessitating delivery, there is less evidence to guide the optimal timing of birth. 6 At this gestation, any maternal or perinatal benefit offered by early delivery must be balanced against the potential short- and long-term impacts of late prematurity on the infant. The PHOENIX trial showed that a policy of routine planned early delivery between 34+0 and 36+6 weeks of gestation significantly reduces short-term adverse maternal outcomes. 7 This was accompanied by an increase in neonatal unit admissions, but the indicators of short-term neonatal morbidity were similar between groups. Before making firm recommendations to guide clinical practice based upon these findings, it is important to fully evaluate the impact of planned delivery in this group on longer-term infant outcomes. Planned delivery may improve neurodevelopmental outcomes, as the disease process itself will be stopped, thereby limiting the continuing placental dysfunction associated with fetal growth restriction and other morbidities.
However, the consequences of the intervention (planned delivery resulting in an earlier gestational age by 3-5 days, compared with expectant management) could also adversely impact neurodevelopmental outcomes. Thus, there remains a clinical dilemma about the best time to plan delivery, in order to optimise short- and long-term infant outcomes.
The aim of this follow-up study was to evaluate the primary infant outcomes of the PHOENIX trial at 2 years, comparing neurodevelopmental outcomes for infants of women with late preterm pre-eclampsia randomised to planned early delivery or to expectant management. Additionally, we evaluated the impact of the intervention on secondary maternal outcomes (health-related quality of life) and will report on the health economic evaluation separately.

| Study design and participants
The PHOENIX trial was a parallel-group, non-masked, multicentre randomised controlled trial across 46 maternity units in the UK. The published trial protocol and short-term co-primary outcomes described the trial methodology in detail, 7,8 and therefore a brief summary is provided here. There were no substantial changes to the published study design, methods or outcomes after the start of the trial. The trial was approved by the South Central - Hampshire B Research Ethics Committee (no. 13/SC/0645). We compared planned delivery with expectant management (usual care) in pregnant women presenting with pre-eclampsia between 34+0 and 36+6 weeks of gestation, without severe features of the disease (which would necessitate immediate delivery), aged 18 years or older, with a singleton or dichorionic diamniotic twin pregnancy and at least one viable fetus. Women with any other comorbidity, a previous caesarean section or any fetal position were eligible. The only exclusion criterion for participation was the clinician's decision to initiate delivery within the subsequent 48 h. After providing written informed consent, women were randomly assigned to planned delivery or expectant management via a secure web-based randomisation program provided by MedSciNet. A (non-deterministic) minimisation algorithm, including study centre, singleton or twin pregnancy, severity of hypertension in the 48 h before enrolment, parity, previous caesarean section and gestational age at randomisation, was used to ensure balance between the groups. The intervention could not be hidden from women, clinicians or data collectors because of the nature of the intervention.
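As an illustration of how a non-deterministic minimisation algorithm can work in general, the sketch below assigns each new participant to the arm that would minimise imbalance across her factor levels, but only with a fixed probability, so that allocation is never fully predictable. It is not the MedSciNet implementation, which is not described here: the probability `p_best = 0.8`, the equal weighting of factors, the imbalance measure and the example factor levels are all assumptions made for illustration.

```python
import random
from collections import Counter

ARMS = ("planned_delivery", "expectant_management")


def imbalance_if_assigned(counts, factors, arm):
    """Total imbalance across the new participant's factor levels if she joined `arm`."""
    total = 0
    for factor, level in factors.items():
        n = {a: counts[(factor, level, a)] for a in ARMS}
        n[arm] += 1  # hypothetically place the new participant in this arm
        total += abs(n[ARMS[0]] - n[ARMS[1]])
    return total


def allocate(counts, factors, rng, p_best=0.8):
    """Assign one participant and update the running counts in place."""
    scores = {arm: imbalance_if_assigned(counts, factors, arm) for arm in ARMS}
    if scores[ARMS[0]] == scores[ARMS[1]]:
        arm = rng.choice(ARMS)  # tie: fall back to simple randomisation
    else:
        best = min(ARMS, key=scores.get)
        other = ARMS[1] if best == ARMS[0] else ARMS[0]
        # Random element: the balancing arm is favoured, not guaranteed.
        # p_best is an assumed value, not taken from the trial.
        arm = best if rng.random() < p_best else other
    for factor, level in factors.items():
        counts[(factor, level, arm)] += 1
    return arm


counts = Counter()
rng = random.Random(2014)  # arbitrary seed so the demo is reproducible
# Hypothetical participant; the trial's six minimisation factors are listed in the text.
woman = {"centre": "A", "pregnancy": "singleton", "parity": "nulliparous"}
print(allocate(counts, woman, rng))
```

Setting `p_best` to 1.0 would give deterministic minimisation; values below 1.0 retain a random element at the cost of slightly weaker balance.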

| Interventions
Planned early delivery consisted of the initiation of delivery within 48 h of randomisation, to allow for the administration of antenatal corticosteroids if deemed necessary by clinicians. Induction of labour was commenced according to local protocol, with caesarean section undertaken only if an additional obstetric indication was present. Expectant management consisted of usual care, with close monitoring of the maternal and fetal condition, until either 37 completed weeks of pregnancy or the development of severe features necessitating delivery.

| Data collection
Baseline and short-term clinical outcome data were collected up until maternal and infant discharge from hospital and recorded on the web-based trial database. Long-term outcomes were assessed at 6 months post-delivery and again when the infant was 2 years of age. Questionnaires were posted to all women at these time points (or a link was sent electronically) and participants completed a paper copy or an online version captured by the MedSciNet study database. Health resource use and quality-of-life outcomes, including the EQ-5D-5L questionnaire, were also collected and are reported separately.

| Infant outcomes
The primary long-term infant outcome was neurodevelopmental assessment at 2 years of age, using the Parent Report of Children's Abilities - Revised (PARCA-R) composite score. 9 Secondary long-term infant outcomes were the non-verbal and language PARCA-R subscale scores. The PARCA-R is a questionnaire completed by a parent (or caregiver), taking 15 min to complete, that assesses non-verbal and language development. It is recommended by the National Institute for Health and Care Excellence (NICE) as a practical and cost-effective method of identifying cognitive and language delay at 24 months in children born preterm. 10 Raw scores from the non-verbal subscale (range 0-34) and language subscale (range 0-124) are summed to produce an overall composite score. Non-verbal PARCA-R scores were prorated if up to four subscale questions were missing. During the trial, the methodology to convert the overall composite score to an age- and sex-adjusted standard score and percentile ranking, relative to the norm, was published, 11 requiring the questionnaire to have been completed at 2 years corrected age (between 23 months and 16 days and 27 months and 15 days). A standardised score of between 85 and 114 indicates development in the normal range, with scores between 70 and 84 indicating mild delay, scores between 55 and 69 indicating moderate delay and scores of 54 or less indicating severe delay.
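The scoring rules described above can be expressed as a short sketch. The function names are hypothetical, and the handling of non-integer scores (for example, treating 84.5 as below 85) and of scores above 114 is an assumption, since the text defines bands only for whole-number scores up to 114. The conversion from raw to standardised scores relies on published norm tables and is not reproduced here.

```python
def composite_raw_score(non_verbal, language):
    """Sum the two raw subscale scores (non-verbal 0-34, language 0-124)."""
    if not 0 <= non_verbal <= 34:
        raise ValueError("non-verbal raw score must be between 0 and 34")
    if not 0 <= language <= 124:
        raise ValueError("language raw score must be between 0 and 124")
    return non_verbal + language


def in_standardisation_window(months, days):
    """True if corrected age lies between 23 months 16 days and 27 months 15 days."""
    return (23, 16) <= (months, days) <= (27, 15)


def classify_standard_score(score):
    """Map a standardised PARCA-R score to the bands described in the text."""
    if score > 114:
        return "above 114 (band not described in the text)"
    if score >= 85:
        return "normal range"
    if score >= 70:
        return "mild delay"
    if score >= 55:
        return "moderate delay"
    return "severe delay"


print(classify_standard_score(89.5))
print(in_standardisation_window(24, 0))
```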

| Maternal outcomes
Secondary long-term maternal outcomes included quality of maternal physical and mental health scored using the validated SF-12v2 Health Survey, a short-form generic measure of health status with eight health-related domains. 12 Scores from each of the eight health concepts can be used to generate a physical component summary scale score (PCS-12) and a mental component summary scale score (MCS-12), both with a mean of 50 and a standard deviation of 10, and with a higher score indicating better health. It has been validated in diverse populations, including women who are postpartum. 13-16 For participants who completed the long-term follow-up, we have additionally reported the co-primary short-term outcome (a composite of maternal morbidity using fullPIERS outcomes and recorded systolic blood pressure of at least 160 mmHg post-randomisation) and the co-primary short-term perinatal outcome (a composite of neonatal deaths within 7 days of delivery and perinatal deaths or neonatal unit admissions). 17 Outcomes were selected before the development of a core outcome set for pre-eclampsia, which does not currently include any long-term outcomes. 18

| Sample size
The original sample size calculation assumed a loss to follow-up rate of 20%, such that long-term outcomes would be available for approximately 690 infants. 8 This calculation was revised before follow-up was completed and analysis was undertaken, to take into account the higher than expected loss to follow-up rate of 40%. Based on this, it was anticipated that long-term outcomes would be available for approximately 568 infants in total (284 per group, assuming no difference in loss to follow-up between groups). With a one-sided significance level of 2.5%, under a non-inferiority hypothesis, a sample size of 284 in each group achieves 88% power to demonstrate non-inferiority with a margin of four points in the mean PARCA-R score (one-quarter of a standard deviation). A higher response rate would have enabled narrower confidence intervals and more certainty in our conclusions.
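The power figure can be approximated with a standard normal-approximation formula for a two-group non-inferiority comparison of means. This is a sketch only: the trial's own calculation method and software are not stated, and the standard deviation is an assumption. An SD of 16 matches "one-quarter of a standard deviation" exactly and gives power of about 0.85, whereas an SD of 15 (suggested by the 85/70/55 band cut-offs for standardised scores) gives about 0.89, closer to the 88% reported.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_power(n_per_group, margin, sd, alpha_one_sided=0.025, true_diff=0.0):
    """Approximate power to show that the true difference exceeds -margin.

    Assumes equal group sizes, a common SD and a normally distributed estimate.
    """
    std_normal = NormalDist()
    se = sd * sqrt(2 / n_per_group)  # standard error of a difference in means
    z_alpha = std_normal.inv_cdf(1 - alpha_one_sided)
    return std_normal.cdf((true_diff + margin) / se - z_alpha)


# SD values are assumptions; n, margin and alpha are taken from the text.
for assumed_sd in (15, 16):
    power = noninferiority_power(n_per_group=284, margin=4, sd=assumed_sd)
    print(f"SD {assumed_sd}: power = {power:.3f}")
```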

| Statistical analysis
Demographics and clinical characteristics at baseline and short-term infant and maternal outcomes are reported using descriptive statistics. The primary inferences for the 2-year infant outcomes were based on a non-inferiority hypothesis testing framework in both the intention-to-treat (ITT) and the per-protocol (PP) analysis populations. The primary inferences for the 6-month and 2-year maternal outcomes were based on a superiority hypothesis testing framework in the intention-to-treat analysis population. All analyses used the expectant management group as the reference group. There were no interim analyses planned.

| Infant outcomes
With the statistical analysis plan based on standardised scores, but with infant questionnaires being sent out at a chronological age of 2 years, a lower proportion than anticipated of PARCA-R questionnaires were completed within the time window allocated for standardising; the remainder were completed at <23.5 or >27.5 months of age, corrected for prematurity. To correct for this, multiple imputation by chained equations was used to impute the PARCA-R standardised scores for those infants (approximately 74% of responders). Imputation models included the raw PARCA-R scores, age corrected for prematurity, sex, minimisation factors and any auxiliary variables associated with the outcome or the missingness of the outcome. Imputation models were developed separately for each outcome and each population. Pooled estimates were obtained from linear regression models, adjusted for minimisation factors as fixed effects and for the correlation between infants of multifetal pregnancies. Centre was not fitted as a random effect as planned, because of model non-convergence. Pooled adjusted means, adjusted mean differences and 95% confidence intervals are reported. The p-values are reported for the composite score alone, and are for one-sided 2.5% significance non-inferiority tests based on a margin of four standardised score points.
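Estimates from multiply imputed data sets are conventionally combined using Rubin's rules, which add the between-imputation variance to the average within-imputation variance. The sketch below illustrates that pooling step under stated assumptions: the text does not specify the software or pooling details used in the trial, the input numbers are invented for illustration, and the interval uses a normal approximation rather than the t-distribution reference usually applied.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, variances, level=0.95):
    """Pool per-imputation estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    se = sqrt(total)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # normal approximation (assumption)
    return pooled, se, (pooled - z * se, pooled + z * se)


# Invented mean differences and variances from five hypothetical imputed data sets.
diffs = [-2.1, -2.6, -2.4, -2.3, -2.7]
vars_ = [2.2, 2.3, 2.1, 2.2, 2.4]
estimate, se, ci = pool_rubin(diffs, vars_)
print(f"pooled difference {estimate:.2f}, SE {se:.2f}, CI {ci[0]:.2f} to {ci[1]:.2f}")
```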

| Maternal outcomes
Mixed-effect linear regression models adjusted for minimisation factors were fitted for the maternal outcomes (PCS-12 and MCS-12), with centre fitted as a random effect. The adjusted mean values, the adjusted mean differences, the 95% confidence intervals and the corresponding p-values are reported. The means and standard deviations for subdomains are unadjusted.

| Subgroup analyses
Pre-specified subgroup analyses for the 2-year infant outcomes were performed on the multiply imputed data sets for the composite PARCA-R score. Pooled estimates were obtained from the same linear regression models used for the primary analysis, containing an interaction term between the subgroup and the study arm. Pooled adjusted means and 95% confidence intervals are reported.

| Sensitivity analyses
Sensitivity analyses were performed on the 2-year infant outcome, excluding infants outside of the time window for standardisation. Mixed-effect linear regression models were fitted, adjusting for correlation between twins, minimisation factors as fixed effects and centre as a random effect. The adjusted mean values, the adjusted mean differences and the 95% confidence intervals are reported.

| Role of the funding source
The study was funded by the UK's National Institute for Health and Care Research (NIHR) Health Technology Assessment Programme (12/25/03) following external peer review, and with involvement of public representative panel members. The funder of the study had no role in the study design, data collection, analysis, interpretation or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication. The trial was prospectively registered with the ISRCTN registry (ISRCTN01879376).

| Patient and public involvement
We worked with representatives (including those with lived experience of pre-eclampsia) from Action on Pre-eclampsia (the patient support group) and Tommy's (a national baby charity) to ensure that the voices of pregnant women (and their wider families) informed and influenced every stage of the research process. Full details on the methodology and outcomes of this are reported in Table S8 (GRIPP2-SF checklist) of the supporting information.

| RESULTS
Between 29 September 2014 and 10 December 2018, 901 women were enrolled in the trial, with 450 women allocated to planned delivery and 451 women allocated to expectant management (Figure 1). The ITT analysis population for short-term maternal and perinatal outcomes included 448 women (471 infants) allocated to planned delivery (as two of the allocated women withdrew consent) and 451 women (475 infants) allocated to expectant management. Follow-up for the 2-year assessment continued until 31 December 2020. At the 2-year follow-up, the long-term ITT analysis population included 290 infants (62%) and 276 women allocated to planned delivery and 256 infants (54%) and 251 women allocated to expectant management. There were no serious adverse events reported at long-term follow-up.
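The reported follow-up percentages can be checked with simple arithmetic, assuming the denominators are the infant counts in the short-term ITT population (471 and 475); the text does not state the denominators explicitly, so this is an inference.

```python
def response_rate(responders, eligible):
    """Percentage of eligible infants with a returned 2-year questionnaire."""
    return 100 * responders / eligible


# Numerators and assumed denominators taken from the counts given in the text.
planned = response_rate(290, 471)
expectant = response_rate(256, 475)
print(f"planned delivery: {planned:.1f}%, expectant management: {expectant:.1f}%")
```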

| Characteristics of women responding to follow-up
Baseline maternal and pregnancy characteristics of women responding at 2 years were broadly similar across the two randomised groups (Table 1). The median gestational age at randomisation in both groups was 36 weeks, and the prevalence of suspected growth restriction was similar (19.8% in the planned delivery group and 23.1% in the expectant management group). The study centre at randomisation of the women responding at 2 years is shown in Table S1.
In women who completed the 2-year assessment, a higher proportion of infants in the planned delivery group had been delivered at 34 weeks of gestation (17.2% vs. 11.7%), as expected with the trial intervention (Table S2), and had been admitted to the neonatal unit (40.3% vs. 35.5%), driven by admissions where the primary indication was listed as prematurity. However, a higher proportion of infants in the expectant management group were born small-for-gestational age (21.5% vs. 14.1% <10th centile; 5.1% vs. 2.8% <3rd centile), compared with those in the planned delivery group. Maternal mortality and morbidity were lower for responding women allocated to planned delivery, compared with those allocated to expectant management (65.2% vs. 75.5%) (Table S3).

| Primary infant outcomes
Of the 546 infant questionnaires returned, and using imputed standardised scores for those who had a raw PARCA-R score outside of the age window for standardisation, the adjusted mean difference comparing planned delivery with expectant management for the composite PARCA-R score at the 2-year follow-up was −2.4 (89.5 vs. 91.9, 95% CI −5.4 to 0.5, non-inferiority p = 0.1) in the ITT population (Figure 2). The lower limit of the confidence interval extended beyond the four-point non-inferiority margin and so we could not conclude non-inferiority. Similar results were seen in the PP population: −1.9 (90.2 vs. 92.1, 95% CI −5.2 to 1.4, non-inferiority p = 0.1) (Figure 2). The adjusted means for both groups and populations were within the range of 85-114 (indicating normal neurodevelopment), as were the adjusted means for the subscale scores (Figure 2).
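The non-inferiority decision can be illustrated from the reported summary statistics alone. In this sketch, non-inferiority is concluded only if the lower limit of the two-sided 95% confidence interval lies above −4 points. The one-sided p-value is a rough reconstruction under a normal approximation, with the standard error backed out from the rounded interval, so it will not exactly match the trial's model-based value.

```python
from statistics import NormalDist


def noninferiority_check(diff, ci_low, ci_high, margin=4.0, level=0.95):
    """Assess non-inferiority from a mean difference and its two-sided CI."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    se = (ci_high - ci_low) / (2 * z_crit)  # approximate SE recovered from the CI width
    z = (diff + margin) / se                # test of H0: true difference <= -margin
    return {
        "non_inferior": ci_low > -margin,
        "approx_one_sided_p": 1 - std_normal.cdf(z),
    }


# Reported estimates (planned delivery minus expectant management).
print("ITT:", noninferiority_check(-2.4, -5.4, 0.5))
print("PP: ", noninferiority_check(-1.9, -5.2, 1.4))
```

Both approximate p-values round to 0.1, in line with the reported results, and in both populations the lower confidence limit falls below −4.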

| Maternal outcomes
For maternal outcomes, there were no significant differences in physical component summary scale score (PCS-12) and mental component summary scale score (MCS-12) between women allocated to the planned delivery and expectant management arms at 2 years (PCS-12 mean difference 0.29, 95% CI −1.29 to 1.87; MCS-12 mean difference 1.27, 95% CI −0.86 to 3.40) (Figure 3). Similar summary scores and subdomain scores were seen at 6 months and 2 years, indicating no evidence of a change of health status during follow-up.

| Sensitivity analyses (infant outcomes)
Sensitivity analyses including only infants assessed within a corrected age range of 23.5-27.5 months did not alter the findings (Tables S4 and S5).

| Subgroup analyses (infant outcomes)
Pre-specified subgroup analyses for the PARCA-R composite score did not suggest clinically important differences between subgroups in either the ITT or the PP population (Figure S1).

| Women responding to follow-up
The baseline characteristics of responders and non-responders at the 2-year assessment are described in Tables S6 and S7. Maternal responders at the 2-year follow-up were more likely to be white and to have a low deprivation index score, and were less likely to be smoking at the time of the initial antenatal visit, compared with those who did not respond. Short-term infant outcomes were similar between responders and non-responders at the 2-year follow-up with regard to neonatal unit admission, birth of a small-for-gestational age (<10th centile) infant and short-term morbidity (Table S2).

| Main findings
The mean standardised PARCA-R scores at 2 years for infants of mothers with late preterm pre-eclampsia randomised to planned early delivery or expectant management indicate that, on average, their neurodevelopment is within the normal range for both trial groups. 9 This provides reassuring data on the long-term outcomes of infants born late preterm, even when the additional complication of pre-eclampsia is present. Subgroup analysis by gestational age at randomisation showed that mean standardised scores remained within the normal range, even at earlier gestations (34+0 to 34+6 weeks of gestation), where the severity of disease may also be worse. The confidence interval for the adjusted mean difference of −2.4 points in the planned delivery arm compared with the expectant management arm crossed the pre-specified non-inferiority margin, so the non-inferiority of planned delivery could not be definitively confirmed. However, a mean difference of two points is unlikely to be clinically important at 2 years of age. No evidence of a difference was found in quality of maternal mental or physical health at 6 months and at 2 years between the two groups. Mean SF-12v2 scores were consistent with those previously reported in similar populations. 15,16

| Strengths and limitations
This is the largest trial to date evaluating planned early delivery in late preterm pre-eclampsia and provides important information for clinicians and women faced with this clinical scenario. Long-term follow-up was identified as an important component of the research question and every possible strategy was employed to maximise the number of respondents. Similar trials attempting long-term follow-up of women and their infants report response rates varying from 14% to 61%, 19-21 demonstrating the challenge associated with this objective, particularly when the population of interest is generally healthy and not under routine clinical follow-up (in contrast to infants born very preterm). Thus, the inclusion of long-term outcome data is a strength of this study and is likely to be of interest to women with pre-eclampsia and their clinicians. The trial was limited by a higher loss to follow-up rate than expected, meaning that the extent and direction of bias in outcomes (between responders and non-responders) is uncertain. This was compounded by PARCA-R questionnaires being sent out at chronological rather than corrected age, meaning that imputation was needed to convert some raw scores into standardised scores. With a smaller sample size than expected for the long-term primary outcome, and the consequently reduced precision of our estimates, our ability to draw firm conclusions is limited. A longer follow-up period (e.g. up to 5 years) would have enabled us to provide further evidence on long-term infant outcomes, using measures such as intelligence quotient (IQ), and to identify whether any of the differences observed between the two groups resulted in any clinically meaningful differences at school age, but this runs the risk of greater attrition and increased expense.

| Interpretation
Infants born late preterm have been found to be at increased risk of neurodevelopmental delay and poor school performance in the long term, 22-26 but this is typically compared with healthy infants born at term. 27 Pre-eclampsia is a disease state associated with fetal growth restriction, 28 which itself is demonstrated to adversely impact childhood development. 29,30 In this scenario, it is possible that earlier delivery might improve long-term neonatal outcomes, compared with expectant management which is associated with increased risk of growth restriction. 7,20,31 In support of this, previous trials have shown that although infants of women with hypertensive disorders of pregnancy who underwent planned early delivery between 34+0 and 36+6 weeks of gestation had a small difference in neurodevelopmental outcomes at 2 years of age, 20 these differences did not persist at the 5-year follow-up. 21 At 5 years of age, other factors such as maternal education and birthweight appear to be more important predictors of long-term infant development than near-term gestational age at delivery. 21,26 This trial provides strong evidence that planned early delivery reduces immediate adverse maternal outcomes with no evidence of differences in self-reported quality of maternal physical and mental health at 6 months and at 2 years between the intervention groups. However, the impact upon the infant remains unclear. Planned early delivery may increase the need for neonatal unit admission in the short term, primarily for an indication of prematurity (i.e. a routine admission without objective morbidity), but there is no evidence that it increases short-term neonatal morbidity.
At 2 years, the mean PARCA-R scores for infants across both groups were within the normal range, which suggests no clinically important long-term harm to the infant, but as the confidence interval for the mean difference between the groups crosses the pre-specified non-inferiority margin, uncertainty remains. Pre-eclampsia is an independent risk factor for adverse infant neurodevelopmental outcomes, 26,32-34 and the mean PARCA-R scores in this trial were at the lower end of the normal range, consistent with previous studies. Infants in the planned early delivery group had lower PARCA-R scores compared with those in the expectant management group, but the mean difference of −2.4 points is unlikely to be clinically meaningful or to influence longer-term outcomes, such as school performance, particularly once other important predictors such as socio-economic status are taken into account. 26 In addition, the risks for an infant associated with late preterm birth must be balanced against those associated with continuing fetal growth restriction.
FIGURE 3 Maternal secondary long-term outcomes: SF-12 Health Survey Summary Scale at follow-up at 6 months and at 2 years. The solid line shows the line of no difference. CI, confidence interval; MCS-12, Mental Component Summary Scale Score; PCS-12, Physical Component Summary Scale Score; SD, standard deviation.
Future research must focus on how best to communicate these findings to women and translate them into clinical practice. The choice of clinically meaningful neonatal outcomes, particularly for infants born to mothers with pre-eclampsia, remains a challenge and an area where further work and consensus building are needed. 18 Furthermore, an intervention such as planned early delivery is likely to have a considerably different impact in different contexts where resources and disease burden are different. Most maternal and perinatal deaths associated with pre-eclampsia occur in low- and middle-income countries, 35 which have markedly higher stillbirth rates than those reported in high-income healthcare settings. 36 A multicentre randomised controlled trial evaluating the effect of planned delivery on adverse maternal outcomes and perinatal morbidity and mortality is currently underway. 37

| CONCLUSION
Our results show that in women with late preterm pre-eclampsia, the average neurodevelopmental assessment of infants at 2 years lies within the normal range, regardless of the timing of delivery. The small between-group difference in PARCA-R scores is unlikely to be clinically important, but because of the lower than anticipated follow-up rate there was limited power to demonstrate that these scores did not differ. This follow-up provides further information for clinicians about the balance of risks and benefits of planned early delivery between 34+0 and 36+6 weeks of gestation to facilitate shared decision making.