Is there an association between study size and reporting of study quality in dermatological clinical trials? A meta‐epidemiological review


DOI: 10.1111/bjd.14931
DEAR EDITOR, Clinicians engaged in evidence-based medicine often have to base treatment decisions on randomized clinical trials (RCTs) with small sample sizes [1]. The choice of sample size is influenced by a number of factors, including power, the magnitude of the anticipated difference, alpha, study design and whether the objective of the trial is equivalence or superiority. Based on our experience of reviewing hundreds of dermatological clinical trials, we hypothesized that sample size could be a crude surrogate of study quality, in that larger studies are generally of better quality than smaller ones. Given the substantial evidence that some specific trial characteristics, such as allocation concealment, may influence treatment effect [1-3], knowledge of study quality before implementing trial results into clinical practice is important. Therefore, we aimed to examine whether there is an association between sample size and risk-of-bias assessment in RCTs included in published Cochrane skin reviews. The Cochrane risk-of-bias tool focuses on whether key items, known to be empirically associated with distortion of treatment effect, are reported or not [3,4]. We therefore use 'reporting of study quality' as a proxy for formal study quality.
To perform the assessment, we identified all Cochrane Skin Group reviews and updates published between 2010 and 2014. For each of the included trials we extracted the following: sample size (total number randomized), study design (parallel/nonparallel), type of study (full/pilot/feasibility) and year of publication. Using the risk-of-bias tables included in the individual Cochrane reviews, two authors independently extracted the Cochrane risk-of-bias assessment (low, high or unclear) for the following five quality criteria: (i) methods for sequence generation; (ii) allocation concealment; (iii) blinding of participants and personnel; (iv) blinding of outcome assessment; and (v) incomplete outcome data.
We used the Kruskal-Wallis test to determine if there was a statistically significant association between sample size and risk-of-bias category, for each quality criterion. The Mann-Whitney U-test was performed to assess each of the three pairwise comparisons. Given that within-patient and crossover trials require fewer patients than parallel-design trials, we stratified the results by study design and repeated the analysis. We also performed a post hoc analysis by stratifying the data by decade of trial publication (1966-75, 1976-85, 1986-95, 1996-2005, 2006-14), to assess whether the nature of the association changed over time. Finally, we determined if the proportion of trials with unclear risk of bias for each quality criterion changed over time using the χ²-test. We adjusted for multiple testing using a Bonferroni correction.
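The three-group comparison described above can be sketched in a few lines of Python. This is a minimal, dependency-free illustration and not the authors' code: the function names and the sample sizes in the demo are hypothetical, and the P-value relies on the asymptotic chi-squared approximation, which for exactly three groups (2 degrees of freedom) reduces to exp(-H/2).

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(low, high, unclear):
    """Tie-corrected Kruskal-Wallis H and asymptotic P-value for three groups."""
    groups = [low, high, unclear]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:  # every observation identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction

    # Chi-squared survival function with 2 degrees of freedom is exp(-h / 2).
    return h, math.exp(-h / 2)


# Hypothetical sample sizes per risk-of-bias category (not data from the study).
low_risk = [120, 85, 240, 60]
high_risk = [30, 45, 72]
unclear_risk = [40, 72, 151, 55, 90]
h_stat, p_value = kruskal_wallis_three_groups(low_risk, high_risk, unclear_risk)
print(f"H = {h_stat:.3f}, P = {p_value:.3f}")
```

With small groups such as these, the chi-squared approximation is rough; an established statistics package would be the usual choice for a real analysis.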
In total, we identified 24 new reviews and 10 updates, and 1127 included trials within these. The median sample size was 72 (interquartile range 40-151). The majority of trials were of parallel design [n = 928 (82.3%)]. Of the 1127, four were feasibility trials and 18 were pilot trials. For random sequence generation and allocation concealment the majority of trials were assigned unclear risk of bias (64.4% and 79.0%, respectively; Table 1). In contrast, approximately half of the trials had a low risk of bias for blinding of participants/personnel (43.5%), blinding of outcome assessment (44.7%) and incomplete outcome data (56.0%).
Overall, average sample size did not differ significantly between the three risk-of-bias categories for any of the quality criteria (Table 1). For all quality criteria, apart from incomplete outcome data, trials with low risk of bias had a larger sample size than those at high risk of bias. When the median sample size of trials at low risk of bias was compared with that of trials at unclear risk, there were also no statistically significant differences for any of the quality criteria. Similar findings applied to both parallel and nonparallel studies.
During the first (1966-75) and last (2006-14) decades, there was no statistically significant association between sample size and risk-of-bias category for any of the quality criteria. In the intervening decades there were some significant associations; however, after accounting for multiple testing these were no longer statistically significant.
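To illustrate how nominally significant results can lose significance after correction, the sketch below applies a Bonferroni adjustment to a set of P-values. The P-values, the function name and the choice of five tests (one per quality criterion) are assumptions made for illustration; the letter does not report the number of tests in each correction family.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted P-value, significant?) pairs under Bonferroni correction.

    Each P-value is multiplied by the number of tests and capped at 1.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical unadjusted P-values, one per quality criterion (not study data).
raw = [0.02, 0.20, 0.50, 0.04, 0.60]
for p_raw, (p_adj, significant) in zip(raw, bonferroni(raw)):
    print(f"raw P = {p_raw:.2f}, adjusted P = {p_adj:.2f}, significant: {significant}")
```

In this made-up example, two P-values fall below 0.05 before adjustment and none does afterwards, which mirrors the pattern described for the intermediate decades.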
Our study failed to show any statistically significant association between study size and quality, although the direction was as we expected for most quality criteria. The high proportion of trials with unclear risk of bias for some items made it very difficult to come to any firm conclusions. Our study underlines the critical importance of improving the quality of reporting of dermatological trials by applying reporting guidelines such as CONSORT 2010. Large and statistically significant associations between sample size and risk of bias might still have emerged had all key items been reported properly [1,5]. It is reassuring that the proportion of criteria with unclear risk of bias has declined over the last decade, yet the overall quality of reporting is still poor. It is possible that other key (but as yet unknown) factors are critical for determining study quality as opposed to reflecting quality of reporting. We conclude that sample size cannot be used as a shortcut for determining reporting of study quality as assessed by current risk-of-bias methods.