Age, sex, race and ethnicity representativeness of randomised controlled trials in peri‐operative medicine

The applicability of the results of any clinical trial will depend to a large extent on whether the study population is representative of the population seen in clinical practice. The growing older surgical population presents challenges for peri‐operative researchers to ensure there is adequate representation of patients in terms of their age, sex, race and ethnicity in clinical trials. A review of purposively sampled published randomised controlled trials was performed to establish the age, sex, race and ethnicity of study participants. These data were compared with national registry data for the relevant surgical populations. We included 224 peri‐operative trials that were cited in 469 retrieved meta‐analyses. Of these, 50 (22.3%) had an upper age limit to recruitment. The median (IQR [range]) difference in study population age from the registry population age was: −2.4 (−6.2 to 1.0 [−34.7 to 14.5]) years for all randomised controlled trials; −6.2 (−9.4 to −2.8 [−18.6 to 4.6]) years for randomised controlled trials of patients undergoing hip arthroplasty; and −3.4 (−9.6 to −1.1 [−34.7 to 2.9]) years for randomised controlled trials of patients undergoing hip fracture surgery. In 92 (41.1%) randomised controlled trials, the proportion of each sex in the study population was more than 25% different from the proportion in the registry population. Only 5 (2.2%) trials published data on the race or ethnicity of participants. We conclude that peri‐operative randomised controlled trials are unlikely to be representative of the age and sex of clinically treated surgical populations. Researchers must endeavour to ensure representative study populations are recruited to future clinical trials.


Introduction
The applicability of the results of any clinical trial depends to a large extent on whether the study population is representative of the population of patients treated in clinical practice. Ideally, this would at least include representation of: age; sex; race; and ethnicity. The global population is ageing, and it is predicted that by 2050, the number of people aged > 60 y will more than double to 2.1 billion [1]. Fowler et al. have shown that the population of patients having surgery in England is ageing faster than the population as a whole, and estimate that by 2030, one-fifth of those aged > 75 y will undergo surgery each year [2].
Under-representation of older patients in clinical trials [3-5] has resulted in numerous agencies producing guidance on the inclusion of older patients to improve adequate representation and hence the applicability of any findings [6,7]. Historically, there has been an under-recruitment of females to clinical trials [8]. This has been in part attributed to the discouragement of females of child-bearing age from participation, particularly following the thalidomide scandal of the 1960s [9]. Reporting on this issue together with publication of guidance on clinical trial sex-representation appears to have resulted in improved representation [9,10]. Despite this, female under-representation in clinical research continues to be reported [11,12].
The expanding and increasingly older surgical population presents challenges for peri-operative trialists to ensure there is adequate patient representation in clinical trials. We aimed to assess the adequacy of representation of: age; sex; race; and ethnicity of peri-operative research trials compared with contemporaneous national surgical populations.

Methods
We conducted a review of randomised controlled trials (RCTs) in peri-operative medicine. A purposive sampling approach was used, and we aimed to extract data on clinical trials likely to have been of greatest impact. Studies were selected that: aimed to answer a clinical question relevant to a high-volume major surgical population; examined core peri-operative domains; and were included in recent meta-analyses.
We used English surgical registry data to identify major surgical procedures frequently performed on older people [13]. Seven different surgical subgroups were chosen: colorectal resection; elective hip arthroplasty; elective knee arthroplasty; operative management of fractured neck of femur; transurethral resection of prostate and transurethral resection of bladder tumour; nephrectomy; and lung resection. Selection criteria for surgical subgroups can be found in Supplementary Appendix S1. In England, these seven operative categories account for 125,000 operations per year in patients aged > 75 y and 228,000 operations in all patients (see Supplementary Appendix S2 for frequency of each operation stratified by age). The most common procedure in older adults is cataract surgery (221,000 procedures per year in patients aged > 75 y). Trials in these patients were excluded, as it is a minor procedure and peri-operative research in this area is sparse.
Based on investigator discussions and following an initial scoping review of the literature, five domains of peri-operative interventions were chosen. These were: enhanced recovery after surgery; goal directed therapy; patient blood management; pain management; and pre-operative optimisation. Selection criteria for peri-operative domains can be found in Supplementary Appendix S1. We employed a purposive sampling strategy to identify peri-operative RCTs that have been included in meta-analyses, and are therefore likely to impact clinical practice. An initial search was performed to identify meta-analyses of the different peri-operative interventions in the specified surgical subgroups. Randomised controlled trials were then selected from these meta-analyses to be used in our analysis. Literature searches were conducted using the electronic databases Ovid MEDLINE, Embase and the Cochrane Central Register of Controlled Trials.
We performed 35 searches across seven surgical categories and five research domains. Results were restricted to systematic reviews published during 2009 or later. A full list of the search terms used can be found in Supplementary Appendix S3. The search was conducted on 28 December 2018. The titles and abstracts of the search results were screened independently by two investigators. Any disagreement was resolved by a third investigator. Meta-analyses investigating the specified peri-operative intervention in the specified surgical subgroup were selected. Studies looking specifically at paediatric populations were excluded.
Full texts of meta-analyses identified by our literature search were reviewed to ensure they met the inclusion criteria. Randomised controlled trials cited in the retrieved meta-analyses were selected. If there were 20 or more RCTs identified for any of the 35 search categories, the 10 most recent and then the 10 most cited of the remaining RCTs were used in our analysis. The full text of each of the identified RCTs was reviewed independently by two investigators. Any disagreement was resolved by a third investigator. The following data were extracted: the year of start and end of study recruitment; the country of treatment of the primary study population; the number of participants; the proportion of females; the central value (mean or median) and spread (standard deviation or interquartile range) of participant age; whether an age subgroup analysis was performed; whether participant race or ethnicity was reported (as defined by the primary study authors); and the presence of any study population inclusion or exclusion criteria which the reviewer felt would exclude or bias against older people or a particular sex. The midpoints of the trial recruitment were used to represent the study year when making comparisons with registry data.
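The selection rule for search categories with 20 or more RCTs can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the `Trial` record, its field names, the use of publication year as the measure of recency and the tie-breaking by identifier are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical record for one RCT cited in the retrieved meta-analyses."""
    trial_id: str
    year: int        # publication year (assumed measure of "most recent")
    citations: int   # citation count (assumed measure of "most cited")


def select_trials(trials, threshold=20, n_recent=10, n_cited=10):
    """Return the trials to analyse for a single search category.

    With fewer than `threshold` trials, all are kept. Otherwise the
    `n_recent` most recent trials are taken first, followed by the
    `n_cited` most cited of the trials that remain.
    """
    if len(trials) < threshold:
        return list(trials)
    # Newest first; ties broken by trial_id (an assumption) for determinism.
    by_recency = sorted(trials, key=lambda t: (-t.year, t.trial_id))
    recent = by_recency[:n_recent]
    remaining = by_recency[n_recent:]
    # Most cited first among the trials not already selected for recency.
    by_citations = sorted(remaining, key=lambda t: (-t.citations, t.trial_id))
    return recent + by_citations[:n_cited]
```

Taking the most recent trials before ranking the remainder by citations means no trial is counted twice, so a category with 20 or more RCTs always contributes exactly 20.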
We extracted information regarding surgical subgroup population age and sex from the published data of three national registries: Hospital Episodes Statistics (England) [13], Federal Statistics Office of Germany [14] and Australian Institute of Health and Welfare [15]. Age data from the National Hip Fracture Database [16] were used in the analysis of RCTs involving patients undergoing surgical management of a fractured neck of femur. The procedure codes used to identify each of the surgical subpopulations in each of the registries are included in Supplementary Appendix S4. Where not available, total mean age and standard deviations were calculated from the age group data provided by the registries. Where registry data were reported for financial years, the start year was used in the analysis. Where registry data were not available for certain years, data from the nearest year with recorded data were used in the analysis.
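One way to derive a mean and standard deviation from age-group counts, and to match a trial to the nearest registry year, is sketched below. The text does not state how the grouped calculation was done, so the midpoint approximation, the population (not sample) variance, the tie-break towards the earlier year and all function names are assumptions made for illustration.

```python
import math


def grouped_mean_sd(bands):
    """Estimate mean age and SD from a list of (lower, upper, count) age bands.

    Assumes every patient in a band sits at the band midpoint. An open-ended
    band (for example "90 and over") needs an assumed upper bound.
    """
    total = sum(count for _, _, count in bands)
    if total == 0:
        raise ValueError("no patients in any age band")
    mean = sum((lower + upper) / 2 * count for lower, upper, count in bands) / total
    variance = sum(
        count * ((lower + upper) / 2 - mean) ** 2 for lower, upper, count in bands
    ) / total
    return mean, math.sqrt(variance)


def recruitment_midpoint(start_year, end_year):
    """Study year taken as the midpoint of the recruitment period."""
    return (start_year + end_year) / 2


def nearest_registry_year(study_year, available_years):
    """Pick the registry year closest to the study year.

    Registry years are start years when data are reported by financial year.
    Ties go to the earlier year, which is an assumption.
    """
    return min(available_years, key=lambda year: (abs(year - study_year), year))
```

For example, a trial recruiting from 2009 to 2012 has a midpoint of 2010.5 and, with registry data available only for 2008 and 2012, would be compared with the 2012 data.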
To report age representation, the mean registry age for the nearest year with available data was subtracted from the central value of the study age (mean or median). Values less than zero represent studies with average ages less than the corresponding registry population. Due to the asymmetrical sex balance in some populations (notably hip fracture), the sex representation was reported as the ratio of the proportion of male participants in the study to the proportion in the registry population. Values less than one represent under-representation of males, and greater than one, over-representation. Due to the lack of reported data on race and ethnicity, these data are reported narratively.
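The two representation measures described above can be written as a short sketch. The function names and the example figures are hypothetical and are not taken from any included trial or registry.

```python
def age_difference(study_central_age, registry_mean_age):
    """Study central age (mean or median) minus the registry mean age, in years.

    Negative values mean the study population was younger than the registry
    population.
    """
    return study_central_age - registry_mean_age


def male_ratio(study_male_proportion, registry_male_proportion):
    """Proportion of males in the study divided by the proportion in the registry.

    Values below one indicate under-representation of males, and values above
    one indicate over-representation.
    """
    if not 0 < registry_male_proportion <= 1:
        raise ValueError("registry proportion must be in (0, 1]")
    if not 0 <= study_male_proportion <= 1:
        raise ValueError("study proportion must be in [0, 1]")
    return study_male_proportion / registry_male_proportion


# Hypothetical trial: mean age 76.0 y against a registry mean of 82.5 y,
# with 40% male participants against 50% in the registry.
print(age_difference(76.0, 82.5))   # -6.5
print(male_ratio(0.40, 0.50))       # 0.8
```

Expressing sex representation as a ratio rather than an absolute difference keeps the measure comparable across populations whose registry sex balance is far from even, such as hip fracture.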
Data analysis was performed with Stata version 15 (Stata Corp, College Station, TX, USA). Data are presented descriptively. There is no inferential statistical analysis as this was felt to be inappropriate given the nature of our non-random sample.

Results
We identified 469 meta-analyses from which we screened 1158 RCTs and included 366 (31.6%). Of these, 224 (61.2%) met the eligibility criteria and were included in the final analysis (Fig. 1). A definitive list of included studies is provided in Supplementary Appendix S5. Of the 224 RCTs: 63 (28.1%) studied patients undergoing colorectal resection; 46 (20.5%) studied patients undergoing elective knee arthroplasty; 39 (17.4%) studied patients undergoing elective hip arthroplasty; 41 (18.3%) studied patients undergoing surgical management of a fractured neck of femur; 4 (1.8%) investigated patients undergoing transurethral resection of prostate or transurethral resection of bladder tumour; 1 (0.4%) studied patients undergoing nephrectomy; and 34 (15.2%) investigated patients undergoing lung resection. Given the limited number of RCTs identified studying nephrectomy, transurethral resection of prostate and transurethral resection of bladder tumour populations, we did not perform individual analyses of these data.
There was a total of 22,404 participants in all included RCTs. The median (IQR) number of participants was 66 (45-110). Of the 224 analysed RCTs, 114 (50.9%) were conducted in Europe, 34 (15.2%) in North America, 11 (4.9%) in Australasia, 64 (28.6%) in Asia, and 1 (0.4%) in South America. Registry data were qualitatively similar between all the included national registries. Data are therefore presented using the Hospital Episodes Statistics (England) database unless otherwise specified. For data on the age of patients undergoing operative management of fractured neck of femur, we used the National Hip Fracture Database. Additional analyses using German and Australian data are included in Supplementary Appendix S6.
We judged that 77 (34.4%) RCTs had exclusion criteria which might have unnecessarily excluded or biased against the inclusion of older people. Of these, 50 (22.3%) RCTs had an upper age limit to recruitment and 18 (7.4%) excluded patients with specified comorbidities, even though these participants were appropriate candidates for the study intervention. Eleven (4.5%) RCTs excluded patients based on ASA grade, despite this not being a clinical contra-indication to the study intervention. Thirteen (5.3%) RCTs excluded patients on the basis of cognitive impairment rather than a formal assessment of capacity.
Regarding the 41 RCTs on patients undergoing surgery for fractured neck of femur: 7 (17.1%) had exclusion criteria felt to unnecessarily exclude older patients; 5 (12.2%) had upper age limits to recruitment; 15 (36.6%) had a lower age limit to recruitment; and 2 (4.9%) published an age subgroup analysis.
In 92 (41.1%) RCTs, the proportion of each sex in the study population was more than 25% different from the proportion in the registry population. In 32 (14.3%) RCTs, the proportion of each sex in the study population was more than 50% different from the proportion in the registry population (Fig. 3).
Five (2.2%) RCTs published data on the race or ethnicity of participants.

Discussion
We found that our sample of peri-operative RCTs was insufficiently representative of the age of surgical populations treated in clinical practice. While there was often misrepresentation of sex in individual RCTs, we did not find evidence of a systematic misrepresentation of one sex over another. Race and ethnicity of RCT participants are rarely reported.
We found that 25% of the RCTs we identified had a population that was younger than the surgical population of England by 6.2 years or more, with individual RCTs under-representing the national surgical age by up to 33 years. Under-representation of older people was most consistently seen in RCTs investigating patients undergoing hip arthroplasty or surgical management of a fractured neck of femur. The proportion of males and females in 41% of the RCTs in our sample was at least 25% higher or lower than in the overall surgical population of England. While historically there has been under-recruitment of females to clinical trials [8], we found no convincing evidence of a consistent sex recruitment bias in our sample.
Lack of representation is likely multi-factorial. Individual RCTs are, in theory, a random sample from the whole population, so some variation is to be expected, more so with smaller studies. Indeed, if such variation was not present, questions might be raised about data integrity [17]. Our study reports data from a non-random, purposively sampled range of studies. It is possible, though we suggest unlikely, that our selection of studies was biased. We nevertheless found clear evidence that study inclusion and exclusion criteria systematically biased against the involvement of older people. We did not observe any evidence of criteria that bias for or against males or females.
Older people are often more likely to trigger exclusion criteria regardless of the appropriateness of the intervention under investigation. We found evidence of this with 34% of the sample RCTs having selection criteria felt to unnecessarily exclude older people and 22% having an upper age limit to recruitment. Several other barriers to the recruitment of older people, which could result in a selection bias, have been identified [18-20]. Older people may be less likely to consent due to: sensory and cognitive limitations; frailty; a reluctance to take on something onerous; or reluctance of a family member or caregiver to support an older person in a research project [18]. Investigators may be less likely to recruit older people due to apprehension about the impact of enrolling patients with multiple comorbidities on dropout rates and adverse events, and logistical concerns regarding the inclusion of frailer, older patients [19].
Fewer RCTs involving patients undergoing surgical management of a fractured neck of femur had recruitment age limits or other potentially age discriminatory exclusion criteria, and many had a lower age limit to recruitment. This likely represents a deliberate effort by researchers to increase recruitment of older people. Despite this, the mean or median age of the population in over 75% of these RCTs remains below the registry population mean age, highlighting the difficulty in achieving age representative recruitment. The extent to which sampling error or selection bias contribute to a difference in age between an RCT population and the treated surgical population may not matter in terms of the generalisability of any findings. Whether a study is non-representative of the treated surgical population by chance, or by design, it still creates a problem with interpretation.
Race or ethnicity of RCT participants was only reported in five RCTs. The relationship between race, ethnicity and health outcomes is complex. It has been argued that race and ethnicity serve as proxies for a mix of genetic, disease, social, behavioural, or clinical characteristics, that should be considered individually rather than under the umbrella of a racial or ethnic group [21]. While it may not be appropriate to stratify research by race or ethnicity, awareness of the race and ethnicities of research participants is clearly important if researchers are going to consider their potential implications and ensure equal opportunities are provided for participation in clinical research.
This analysis has a number of strengths. The purposive sampling strategy resulted in the identification of RCTs that were likely to have impacted on clinical practice, and it is therefore of importance that their study populations were representative. We sampled 224 RCTs for analysis. Randomised controlled trials were only selected if they investigated specific surgical populations for which registry data were available, allowing reliable comparisons to be made between the study and registry populations. The availability of annualised English registry data from 1998 to 2017 for all the surgical categories, except the surgical management of a fractured neck of femur, allowed these RCT populations to be matched by year for comparisons with registry data.
There are several limitations to our analysis. Limited registry data were available for patients undergoing surgery for fractured neck of femur. Data on age from the National Hip Fracture Database from 2017 and English registry data on sex from 2012 to 2017 were used in the analysis, and these RCT populations were, therefore, not matched by year to registry data. Due to a lack of openly available international surgical registry data, we were unable to compare RCT population age and sex to geographically matched registry populations. Age and sex differences between RCT and registry populations will influence the generalisability of RCT findings to the surgical population of the country of the registry, but will not necessarily reflect age and sex differences in other countries due to geographical variation in the surgical population. We were unable to include RCTs studying the effect of a peri-operative intervention on more than one surgical group, as we would have been unable to compare the study population age to specific registry data. This resulted in a number of potentially influential peri-operative RCTs being excluded from our analysis.
A key issue facing clinical research is an understanding of the impact of ageing, multi-morbidity and frailty on outcomes following medical and surgical interventions. This is amplified in the 'old-old', however this is defined. Neither the RCTs nor the registries provide consistent data on this population (e.g. aged > 80 y or identified as being frail) which leaves a clear gap in our understanding of this high-risk population. Advances in peri-operative care and a changing national and global demographic mean that surgery is being offered to an increasingly diverse population. The evidence of non-representative RCT populations in this analysis highlights the importance of endeavouring to recruit representative research populations if findings are to be reliably generalised to the overall population. Although there is some variation, sex representativeness seems to be adequate. A better understanding and awareness of barriers to recruitment of older people when planning clinical trials may improve representative recruitment and help ensure research is following patient need. Unnecessary age discriminatory exclusion criteria including age limits should clearly be avoided.

((transfus* or red cell* or red blood cell* or RBC* or PRBC*) near (trigger* or threshold* or target* or restrict* or liberal* or aggressive* or conservative* or prophylactic* or limit* or protocol* or policy or policies or practic* or indicat* or strateg* or regimen* or criteri* or standard* or management or program*))
7. ((h*emoglobin or h*ematocrit or HB or HCT) near (polic* or practic* or protocol* or trigger* or threshold* or maintain* or indicator* or strateg* or criteri* or standard*))
8. (transfu* or posttransfus* or retransfus* or hypertansfus* or h*emotranfus* or red cell* or red blood cell* or RBC* or erythrocyte*)
9. ((allogen*ic near blood) or (blood near exposure) or (unit* near blood) or (blood near management) or (blood near product*) or (blood near component*) or (donor* near blood) or (donat* near blood))
10. (blood sparing or cell salvage or cell saver* or (blood near salvag*) or blood support or (blood near requir*) or (blood near replac*) or autotransfus*)
11. ((prior or before or post or after) near (surg* or operat*))
18. #1 or #2 or #3 or #4 or # or #5 or #6 or #7 or #8 or #9 or #10 or #11 or #12
19. #13 or #14 or #15 or #16 or #17
20. #18

1. Exercise Test/ or (exercise test or exercise capacity or cardiopulmonary exercise or cardiopulmonary reserve).mp.
2. Anaerobic Threshold/ or (anaerobic threshold, aerobic capacity or aerobic fitness).mp.
3. Oxygen Consumption/ or (oxygen consumption or peak oxygen).mp.
4. (CPET or CPEX or VO2).mp.
5. PREOPERATIVE PERIOD/ or PREOPERATIVE CARE/
6. (preoperative or pre-operative or pre-surgery).mp.
7. 1 or 2 or 3 or 4 or 5
8. 5 or 6
9. 7 and 8
10. (prehab* or (preoperative and (rehabilitation or exercise or training or physiotherapy or physical therapy or exercise training))).mp.
11. EXERCISE/ and 8
12. 9 or 10 or 11