Beyond current research practice: Methodological considerations in MS rehabilitation research (is designing the perfect rehabilitation trial the Holy Grail or a Gordian knot?)

Rehabilitation is an essential aspect of symptomatic and supportive treatment for people with multiple sclerosis (MS). The number of randomised controlled trials (RCTs) for rehabilitation interventions in MS has increased over the last two decades. The design, conduct and reporting quality of some of these trials could be improved. There are, however, some specific challenges that researchers face in conducting RCTs of rehabilitation interventions, which are often ‘complex interventions’. This paper explores some of the challenges of undertaking robust clinical trials in rehabilitation. We focus on issues related to (1) participant selection and sample size, (2) interventions – the ‘dose’, content, active ingredients, targeting, fidelity of delivery and treatment adherence, (3) control groups and (4) outcomes – choosing the right type, number, timing of outcomes, and the importance of defining a primary outcome and clinically important difference between groups. We believe that by following internationally accepted RCT guidelines, by developing a critical mass of MS rehabilitation ‘trialists’ through international collaboration and by continuing to critique, challenge, and develop RCT designs, we can exploit the potential of RCTs to answer important questions related to the effectiveness of rehabilitation interventions.

Effectiveness (or pragmatic) trials aim to answer whether an intervention produces benefits for patients under usual clinical (real-world) conditions, to determine whether the intervention translates to clinical practice, and how and with whom the intervention could be used 6 . An example of an effectiveness rehabilitation trial is Boesen et al. 7 . Despite this distinction, efficacy and effectiveness trials should be seen as two ends of a continuum 8 . Both serve different functions 9 , can be equally robust, and are required. The key, at the design stage, is determining what kind of trial is needed, with clarity and transparency about the research questions being posed. Based on this distinction, trials vary in relation to Participants, Intervention, Control, and Outcomes (PICO).

(i) Selection
Several factors influence participant recruitment, including whether the study is an efficacy trial (requiring a homogeneous sample meeting specific entry criteria) or an effectiveness trial (requiring a more heterogeneous sample to enable generalisability). Trials are sometimes criticised for not recruiting participants akin to patients seen in clinical practice (with multiple comorbidities and illness profiles, from different backgrounds), or for drawing participants from specific services or geographical locations with regional idiosyncrasies, which calls the generalisability of the findings into question 10 . This criticism can be addressed if trialists specify the trial type. Also important is distinguishing between the proportion of participants eligible for the intervention and those unable to participate in the research study (exclusions such as reduced capacity for consent, enrolment on another trial), to provide a clearer indication of how many people would benefit from the intervention if it were part of routine practice.
Participants should be recruited based on the goal of the rehabilitation intervention. Some MS rehabilitation trials have not done this. For instance, a Cochrane review of exercise interventions for fatigue in MS concluded that most trials did not explicitly include people who experienced fatigue 11 . This could have affected the overall 'effectiveness' of the intervention.
Often neglected in terms of participant selection is a participant's readiness for the intervention or behavioural change. This is particularly important in rehabilitation trials where participants need to actively engage in the treatment to derive benefit. Using strategies such as motivational interviewing has demonstrated improvement in participant engagement, therapy adherence, outcomes, and patient-provider relationships 12 . Consideration of this variable as an inclusion criterion or moderator may enhance understanding about differential treatment effects.
As per CONSORT guidelines for non-pharmacological trials 13 , the rehabilitation context should also be documented.

(ii) Sample size
In RCTs a sample with the specific attribute of interest is selected and studied, and the findings "extrapolated" 15 . Having a small study risks making a Type II error. Indeed, most trials with 'negative results' are with small samples, under-powered to detect significant differences between groups 16 . Having 'too many' people also poses ethical and resource issues 15 . Therefore, an a priori sample size calculation is required to identify the numbers needed. However, we still find MS rehabilitation trials where sample size calculations are not reported.
Publication bias (i.e., only trials with positive findings being reported 17 ) in meta-analyses is more likely to affect small studies 18 . Given that most MS rehabilitation trials are 'small', it is concerning that many of the systematic reviews (including several Cochrane Reviews) have not formally assessed publication bias, despite evidence that publication bias exists in some MS rehabilitation trials 11,19 . Assessing this will enable a fuller understanding of the state of the science.
There are, however, complexities in determining sample size in trials of 'complex interventions', such as rehabilitation, because determining a clinically meaningful difference on outcome measures can be difficult (see below).
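As a minimal sketch of an a priori calculation for a two-arm parallel trial with a continuous primary outcome, the standard normal-approximation formula can be coded as below. The function names, the assumed between-group difference of 5 points, the standard deviation of 10 points and the 15% attrition allowance are illustrative assumptions, not values from any MS trial; a trial statistician would normally refine this (e.g., by using the t-distribution or by accounting for clustering by therapist).

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants needed per arm to detect a mean difference `delta`.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta)^2
    for a two-sided test at level `alpha` with the requested power.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile corresponding to the power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def inflate_for_attrition(n, dropout_rate):
    """Increase the per-arm sample so that n completers remain after dropout."""
    return math.ceil(n / (1 - dropout_rate))


# Illustrative (assumed) inputs: difference of 5 points, SD of 10 points.
n = n_per_group(delta=5, sd=10)
n_recruit = inflate_for_attrition(n, dropout_rate=0.15)
print(n, n_recruit)
```

With these assumed inputs the approximation gives 63 participants per arm, rising to 75 once the attrition allowance is applied. Halving the target difference roughly quadruples the requirement, which is why the choice of a clinically meaningful difference has such a large bearing on trial size.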

Intervention
Rehabilitation is a "process of assessment, treatment and management by which the individual (and their family/carers) are supported to achieve their maximum potential for physical, cognitive, social and psychological function, participation in society and quality of living" (p2, 20 ). It is often provided by different healthcare professionals, based on specific needs (such as neuropsychological rehabilitation), or specific symptoms (such as mobility problems), and in a holistic fashion by a coordinated inter-disciplinary team 21 , with access to several specialist health professions 22 .

(i) Dose of treatment
Unlike pharmacological interventions, within much of MS rehabilitation research the optimal 'dose' of the intervention, and when to deliver it (e.g., time post-relapse), is yet to be established. The reasons for this are manifold. In some areas (e.g., cognitive rehabilitation), few large, well-conducted trials are available, so dose-response analyses have not been performed. Where rehabilitation packages containing several 'modules' are evaluated, oftentimes, it is the whole package that is evaluated, so the relative effect of each module is unknown. Also, the complex nature of the intervention means that the effect of a combination of modules may be greater than the sum of the parts. Finally, there is a recognition, in other areas of neurology, that higher doses of rehabilitation training do not always provide better results 23 . This may also be the case in MS, for instance when impairments (such as fatigue or weakness) and disabilities are very severe. Rule-based dose-finding trial designs are one way of tackling this issue. More commonly used in pharmaceutical research, they have recently been undertaken in stroke rehabilitation to determine the optimal dose for a specific intervention prior to undertaking an efficacy study 24 .
These studies utilise an adaptive design approach to enable dose escalation/de-escalation according to pre-set rules and a mathematical sequence, thereby allowing exploration of the dose-response relationship, whilst minimising sample size and maintaining participant safety.
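To illustrate what 'pre-set rules' can look like, the sketch below codes a simplified variant of the classic '3+3' escalation rule from pharmacological dose-finding. It is offered only as an example of a rule-based design: the dose levels (minutes of therapy per session), the notion of a 'dose-limiting event' (e.g., intolerable fatigue) and the function name are hypothetical, and the rehabilitation studies cited above may have used different rules and dose sequences.

```python
# Hypothetical dose levels: minutes of therapy per session (an assumption).
DOSE_LEVELS = [15, 30, 45, 60, 75]


def next_step(level, events, n_treated):
    """Apply a simplified 3+3 rule to the dose level just tested.

    level:     index into DOSE_LEVELS of the dose just tested
    events:    participants with a dose-limiting event at this level
    n_treated: participants treated at this level so far (3 or 6)
    Returns (next_level, action).
    """
    if n_treated == 3:
        if events == 0:
            action = "escalate"
        elif events == 1:
            # Ambiguous signal: treat three more participants at the same dose.
            return level, "expand"
        else:
            action = "de-escalate"
    elif n_treated == 6:
        action = "escalate" if events <= 1 else "de-escalate"
    else:
        raise ValueError("3+3 cohorts contain 3 or 6 participants")

    if action == "escalate":
        if level + 1 >= len(DOSE_LEVELS):
            return level, "stop: highest dose reached"
        return level + 1, action
    # De-escalation: step back to the previous level, if there is one.
    if level == 0:
        return level, "stop: lowest dose not tolerated"
    return level - 1, action


print(next_step(level=1, events=0, n_treated=3))
```

Because every decision depends only on the current cohort, the rule can be fixed in the protocol before recruitment starts, which is what keeps the sample small while bounding the number of participants exposed to a poorly tolerated dose.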

(ii) Description of the intervention
Describing and characterising the "black box" of rehabilitation is important to enable replication of studies and implementation of interventions based on what was trialled.
Checklists and guidelines like the Template for Intervention Description and Replication (TIDieR) 25 can help researchers report interventions more thoroughly; however, this is not yet commonplace in MS rehabilitation research. For instance, in a systematic review of trials evaluating cognitive rehabilitation (k=52), the reporting of key aspects of the intervention was judged to be poor, particularly in relation to content of the intervention, delivery mode, and proposed mechanism of action 26 .

(iii) Active ingredients
What constitutes the active ingredients of rehabilitation, which typically uses a multicomponent biopsychosocial model 27,28 , is difficult to determine. For instance, many interventions incorporate several settings, healthcare providers and treatment techniques based on the premise that a successful outcome is a result of this cross-fertilisation of inputs.
Further, the core principle of many rehabilitation interventions is to actively engage individuals in change processes such as learning, practicing, and developing skills to enhance coping and adaptation across diverse outcomes.
Psychotherapy research demonstrates that where different therapies produce similar outcomes, this is because of a set of 'common factors', with 'therapeutic alliance' between the patient and therapist being most relevant 29 . MS rehabilitation research has rarely evaluated such common factors; however, there is some consensus regarding what the 'key' (if not 'active') ingredients and mode of delivery are in some rehabilitation interventions (e.g., falls 30,31 , fatigue 32-34 ).
Understanding the mechanism of action, its target, and desired effect is also important in defining the active ingredients. Advances in neuroscience, such as functional imaging techniques that enable investigation of neuroplastic adaptations 35 , provide important insights into this. But equally, easy-to-collect patient-reported outcomes can improve our understanding of underlying mechanisms 36 . While these may not be a trial's main outcomes, embedding this work within clinical trials provides opportunities to advance knowledge of how treatments work, and to develop new recovery-oriented strategies in MS. For instance, video recordings of therapists delivering the trial intervention can enable us to determine the degree to which therapist competence is associated with patient outcomes 37 , or mixed methods designs collecting patient and therapist perspectives can improve our understanding of the perceived effects and experiences of interventions 38 .

(iv) Fidelity of intervention delivery
Given the complex nature of most rehabilitation interventions, in the context of a trial, it is vital for reliability and validity of the findings that the intervention is delivered as intended, consistently across participants and sites, and any deviations from the planned intervention recorded. While some MS trials describe intervention fidelity 39,40 , this is not yet the norm, despite guidelines available on treatment fidelity in health behaviour change studies 41 , which can be adapted for MS rehabilitation trials. This said, assessing fidelity in complex interventions is challenging, because it not only requires an examination of what was delivered (content), but also how the intervention was delivered (process). Much of the literature around fidelity assessments in trials tends to focus on 'implementation fidelity' (i.e., how the intervention was delivered, dose received, etc.) and 'theoretical fidelity' (i.e., how the intervention delivered matched the theoretical underpinning of the intervention) 42 .
However, these still neglect the quality of the intervention delivery and the quality of the therapeutic relationship, which are rarely assessed in MS rehabilitation trials.

(v) Targeting of interventions
Related to the notion of 'person-centredness', a criticism levelled against RCTs in rehabilitation is that oftentimes the intervention is not sufficiently individualised. This is particularly challenging with group-based interventions, although some attempts have been made to personalise interventions. For instance, in a memory rehabilitation RCT, baseline memory assessment scores were used to highlight which memory strategies might be more useful than others, and homework exercises enabled participants to test memory strategies within their home settings to determine what adaptations needed to be made to the strategies 43 . Similar approaches have been used in MS RCTs of fatigue 36 and falls 44 interventions.

(vi) Treatment adherence
The success of a treatment is partly dependent on patients' adherence. Where patients are required to actively participate over a long period, there is often considerable variation in adherence 45 . While there are some good examples of MS rehabilitation trials that report adherence 46 , this appears less well described in some domains (such as memory/cognition 47 ).
It would be beneficial for trials to report the minimum number of sessions (or other marker of dose) required to demonstrate effect, and the actual number of sessions that participants engage in. The risk of not specifying the actual 'amount' of intervention participants received is that where trials show 'no intervention effect', it is difficult to disentangle whether the treatment was ineffective or whether people did not engage in the required dose of the intervention.
Despite some challenges in their use 48 , wearable technologies (sensors and/or software applications on watches, smartphones, shoes, etc.) can be embedded within RCTs, and can provide an opportunity to objectively capture adherence or behavioural changes outside clinical settings, within the 'real world'.

Control groups
There are several approaches available when deciding which control condition is most appropriate 49 . Whilst one option is to withhold active treatment, ethical issues need to be considered, such as the nature of the target problem, the state of science with regard to the effectiveness of existing treatments, and lack of acceptability to participants, which might affect recruitment, retention and data integrity. An alternative is the wait-list control design, which has been used in several RCTs of multi-disciplinary packages of rehabilitation 50 and single component interventions such as tele-rehabilitation 51,52 . This approach can control for the effects of time, guarantees eventual intervention, reflects the realities of clinical waiting lists, and is more acceptable to potential participants. However, there is the risk that simply knowing that they will eventually receive the intervention may change the nature of the controls, so that they may not represent the experience or response of those who do not receive an intervention. Controls that involve comparison to usual care are more commonly used in rehabilitation RCTs, and whilst useful for examining the added value of an intervention, they are complicated by the fact that usual care is often highly variable and may require large samples to achieve adequate statistical power.
Active controls, wherein active treatments are compared to the test intervention (e.g., equivalence trials), are challenging in terms of defining the active ingredients, and being clear about what elements are most essential to control for (e.g., dose, attention, etc.) in behavioural-based 'complex' interventions, with multiple 'active ingredients' that may be difficult to characterise. Nevertheless, this has been achieved in some rehabilitation trials of exercise 53, 54 and cognition 55 , although these trials have investigated single interventions that typically have not incorporated a behavioural approach. Finally, some trials have used attention placebo controls, wherein control participants receive an intervention that is plausible, mimicking the amount of time and attention received by the treatment group, but is thought not to have a specific effect on the outcomes of interest. Arguably, this can be ethically defensible when no effective treatment has proven to exist, which is commonplace for MS rehabilitation, and when participants know what to expect in each treatment arm. This is a challenge for behavioural-based interventions since it requires double blinding, which often cannot be accomplished in the MS rehabilitation field. Studies in MS rehabilitation have achieved this, but have involved "passive" interventions not requiring active engagement by the participants to affect target outcomes 56,57 . Others have attempted to include attention placebo groups, where the facilitators do not present the presumed active ingredients in the sessions, but do not (and cannot) prevent participants from discussing the issues they struggle with (e.g., memory 55 ).

(i) Choice of outcomes
The International Classification of Functioning, Disability and Health (ICF) is the World Health Organisation's "framework for measuring health and disability at both individual and population levels" 58 and is consistent with the aims of rehabilitation. Outcomes can be at an impairment, activity, or participation level, but we cannot assume that improvements at impairment level will generalise to reductions in activity limitations and participation restrictions. There are, however, examples where this is the case; for example, strengthening of lower limbs has been shown to improve mobility in people with MS 59 . Whether or not translation from impairment to activity limitations/participation is seen should be explicitly and systematically reported so that clear conclusions can be drawn.
It is important to consider health economic outcomes, particularly in pragmatic trials where policy decisions need to be based on both clinical and cost-effectiveness data. MS rehabilitation trials have begun addressing this 43,44,60-63 . Quality of life (QoL) is sometimes considered a desirable 'general' outcome of rehabilitation trials. While this is laudable, there are challenges in using such an outcome. As a primary outcome for specific types of rehabilitation (e.g., cognitive rehabilitation), improvements may not be observed on QoL measures as a consequence of the intervention because QoL is often a composite construct (e.g., measures may include items related to fatigue, pain, social and physical function), whereas the intervention may only be able to change one aspect (e.g., cognition or social function). Baumstarck et al. 64 provide a good summary of some of the issues in assessing QoL in people with MS. Similar issues may arise when using generic activities of daily living scales or participation scales wherein a total score is derived from multiple items. In such instances, subset or domain-specific scores may be more appropriate as a primary outcome.
Capturing adverse events in complex intervention trials is challenging, because the chance of detecting them may differ between treatment arms. Furthermore, in some rehabilitation trials, the primary outcome may also be considered as an adverse event (e.g., falls frequency in falls prevention interventions 65,66 ). Despite these challenges, and although many rehabilitation trials are considered 'low risk', it is important to consider the potential adverse events and how these will be assessed, monitored, reported, and addressed 60, 61, 67 .

(ii) Number of outcomes
It is tempting to use a large number of outcomes to capture any potential effect in rehabilitation trials. This has the disadvantage of increasing the risk of false positive findings.
Nonetheless, in a complex intervention trial, it may be appropriate to have different types of outcomes that relate to the key clinical or economic variables. In addition, mechanistic measures may be useful to unpack why an intervention worked (or not).
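The inflation of false positives can be made concrete with a short calculation. Assuming, for simplicity, that the outcomes are independent and each is tested at the same significance level (an assumption that rarely holds exactly, since rehabilitation outcomes tend to be correlated), the chance of at least one false positive across k outcomes is 1 - (1 - alpha)^k. The function names below are illustrative.

```python
def familywise_error_rate(k, alpha=0.05):
    """Probability of at least one false positive across k independent tests."""
    return 1 - (1 - alpha) ** k


def bonferroni_threshold(k, alpha=0.05):
    """Per-outcome significance level that keeps the overall rate near alpha."""
    return alpha / k


for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(k), 3), bonferroni_threshold(k))
```

For ten outcomes each tested at 0.05, this gives a probability of about 0.40 of at least one spurious 'significant' result; a Bonferroni correction would instead test each outcome at 0.005, at the cost of reduced power for every individual comparison.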

(iii) Defining a primary outcome
Scientific rigor in RCTs rightly demands that the primary and secondary outcomes are defined beforehand, because the trial is powered based on finding a clinically relevant effect of the intervention on the primary outcome. All outcomes need to be described as primary or secondary (in the protocol or trial registry), and should be analysed and reported, in line with CONSORT guidelines 13 . Unfortunately, reviews find that not all MS rehabilitation trials do this 11,19 . Selecting and reporting on a limited number of pre-specified (primary) outcomes, with a clear rationale for selecting them, limits the risk of bias and is an important consideration to building a solid evidence base for rehabilitation interventions. Core Outcome Sets for rehabilitation trials in MS have been recommended for various types of rehabilitation (e.g., exercise interventions 68

(iv) Timing of outcomes
Because it is likely that the best outcome is seen immediately after the intervention, outcomes assessed at this time-point can be useful to determine effectiveness of the intervention. However, this leaves uncertainty as to the long-term benefits. Treatment effects can gradually wear off 34 or, particularly with regard to impact on activities and participation, can take longer to establish 70 . The selection of one time point (either long- or short-term) limits insight into what happens during the rehabilitation process, and restricts the use of powerful longitudinal analysis techniques that can help to understand the course of treatment effects. The time point of primary concern, however, should be specified in advance of the analysis.

(v) Defining a clinically important/meaningful difference
Labelling the result of a trial as "positive" or "negative" is often inappropriately based on whether or not statistical significance is achieved for the primary outcome 71-73 . It is more important to show that a pre-specified difference in scores between groups is clinically relevant 74 . Nevertheless, sound evidence on minimal clinically important differences (MCID) of outcomes is often unavailable, barring some good exceptions 43,75-78 .

(vi) Statistical analysis
Several MS rehabilitation trials conduct significance testing comparing baseline with follow-up outcome scores (within-group changes) rather than comparing randomised groups directly (between-group differences), which can be highly misleading 79 . Analysis plans should be decided before the data are locked and deviations from the plan should be specified in the publication of trial results, with some arguing that pre-specified statistical analysis plans should be published prospectively 80 .
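The contrast between the two analyses can be sketched as follows. The change scores are made-up numbers chosen purely for illustration, the function name is hypothetical, and the normal-approximation confidence interval is a simplification; in practice a model such as analysis of covariance adjusting for baseline would usually be pre-specified.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Made-up change scores (follow-up minus baseline), for illustration only.
change_intervention = [4, 6, 5, 7, 3, 5, 6, 4]
change_control = [4, 5, 5, 6, 3, 5, 5, 4]


def between_group_difference(change_tx, change_ctl, level=0.95):
    """Difference in mean change between arms with an approximate CI.

    Returns (difference, lower, upper) using unpooled standard errors and a
    normal approximation, which is crude for samples this small.
    """
    diff = mean(change_tx) - mean(change_ctl)
    se = sqrt(stdev(change_tx) ** 2 / len(change_tx)
              + stdev(change_ctl) ** 2 / len(change_ctl))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Within-group view: both arms appear to improve from baseline.
print(mean(change_intervention), mean(change_control))

# Between-group view: the comparison that randomisation actually supports.
diff, lower, upper = between_group_difference(change_intervention, change_control)
print(round(diff, 3), round(lower, 2), round(upper, 2))
```

In this toy example both arms improve by about 5 points from baseline, so a within-group test in the intervention arm alone could look 'positive', yet the between-group difference is under half a point with an interval that spans zero.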
A summary of these issues and suggestions is provided in Table 1.

Conclusions and future directions
We have made much progress in undertaking, conducting, and reporting RCTs in MS rehabilitation research, but there is some way to go yet.
We believe we can improve the design, conduct, and reporting of trials by following internationally-accepted RCT guidelines (see www.equator-network.org), and by developing a critical mass of MS rehabilitation researchers, perhaps by extending and intensifying cooperation within organisations such as Rehabilitation in MS (www.eurims.org), to undertake high quality rehabilitation trials. We may need to extend our collaborations with "critical friends" outside of the area of MS to facilitate inter- and cross-disciplinary ways of addressing shared methodological dilemmas. One such group might be the newly formed Cochrane Rehabilitation Field (www.rehabilitation.cochrane.org), which may be a vehicle to improve research methods (for trials and syntheses), publicise results, and drive forward evidence-based clinical care.
The ICF is an excellent framework to map rehabilitation trial outcomes. More, however, can be made of the 'personal' and 'environmental' aspects of the ICF, perhaps to understand prognostic factors that relate to outcome, or in relation to clinical implementation of positive trials.
We also believe work is needed to ensure that journal editors and reviewers judge rehabilitation trials in the light of the specific challenges posed in designing these trials, and view the merits of the trials based on internationally-recognised RCT guidelines for complex intervention trials, rather than compare them with pharmacological trials. We need research that covers the full cycle of the Medical Research Council's framework for the development and evaluation of complex interventions 81 , from developing and modelling interventions, to feasibility and pilot testing, to pragmatic trials and implementation studies. Most MS rehabilitation trials have focussed on the earlier parts of the cycle, and there is a need for more pragmatic trials evaluating clinical and cost-effectiveness, for translation of findings from research to clinical practice. 'Null' or 'negative' results should be published. Some journals, including MSJ, accept "short reports on null or negative results". This is important for the advancement of science. We would, however, suggest that such trials should not be limited to "short" reports, and indeed, may be worthy of longer reports given that such results need to be expanded further and the pathways for future research in the field should be clearly stated. Such papers have spurred great developments in the field 82 .
We need to continue to critique, challenge and develop RCT designs. Because most rehabilitation interventions are complex interventions, outcomes-focussed trials only address whether an intervention is effective or not and do not offer answers to more nuanced questions: "When and for what kind of patient is this intervention effective?" It is clear that undertaking scientifically rigorous research that is clinically meaningful is a complex problem, but we believe it is not unsolvable. Just as Alexander the Great disentangled the Gordian knot, so too can clinicians, researchers and people with MS work together to systematically conquer the methodological conundrums that currently challenge us.