External memory aids for memory problems in people with multiple sclerosis: A systematic review

ABSTRACT Approximately 40–60% of people with multiple sclerosis (MS) have memory problems, which adversely impact on their everyday functioning. Evidence supports the use of external memory aids in people with stroke and brain injury, and suggests they may reduce everyday memory problems in people with MS. Previous reviews of people with MS have only evaluated randomised trials; therefore this review included other methodologies. The aim was to assess the efficacy of external memory aids for people with MS for improving memory functioning, mood, quality of life, and coping strategies. Seven databases were systematically searched. Intervention studies that involved training in the use of external memory aids, e.g., personal digital assistants, with at least 75% of people with MS, were included. Based on study design, quality was rated with the SCED or PEDro scale. Nine studies involving 540 participants were included. One single case experimental design (mean of 8 on SCED scale) and eight group studies (mean of 5 on PEDro scale) were included. One study reported a significant treatment effect on subjective memory functioning, two on mood, and two on coping strategies. There is insufficient evidence to support or refute the effectiveness of external memory aids for improving memory function in people with MS.

Impairments in cognitive functioning are related to low mood (Gilchrist & Creed, 1994), and have the potential to affect independence in activities of daily living (Langdon & Thompson, 1996). Severe cognitive impairment presents a major barrier to rehabilitation, because individuals may be unable to retain advice or have difficulty acquiring new skills (Thomas, Thomas, Hillier, Galvin, & Baker, 2006). The safety of people with memory deficits can also be compromised, making them vulnerable in the home (e.g., forgetting to turn the oven off) and at work (e.g., forgetting deadlines). Impairments in memory can have a detrimental effect on the psychological well-being of people and others around them (Skeel & Edwards, 2001), and have significant long-term effects on a person's work and social life (Amato, Zipoli, & Portaccio, 2008).
Cognitive rehabilitation is a process whereby people with neurological trauma and clinicians work together as a team to remediate or alleviate the resulting cognitive deficits (Wilson & Watson, 1996). Cognitive rehabilitation literature is divided on what strategies work best for people with cognitive impairment (das Nair, Ferguson, Stark, & Lincoln, 2012). Restoration focuses on improving a specific cognitive function, potentially through regeneration, and typically involves retraining exercises. Compensation focuses on teaching people to adapt to the presence of a cognitive impairment, and is achieved through teaching people to use internal or external strategies. These include applying/using internal aids, such as mental imagery, mnemonics and rehearsal; or external memory aids, such as diaries, lists and notice boards. Technology has enabled the use of paging systems (Wilson, Emslie, Quirk, & Evans, 2001), mobile phones, and palmtop devices to reduce prospective memory problems. Cicerone et al. (2011) recommended the use of external compensatory devices for people with memory problems following traumatic brain injury (TBI) or stroke; while another review (de Joode, van Heugten, Verhey, & van Boxtel, 2010) found assistive technology, such as personal digital assistants (PDA), reduced prospective memory problems after acquired brain injury (ABI). A recent meta-analysis (Jamieson, Cullen, McGee-Lennon, Brewster, & Evans, 2014) including seven group studies concluded that there was strong evidence for the efficacy of prospective memory-prompting devices for people with ABI or degenerative diseases. However, Jamieson et al. (2014) only reviewed one study that included people with MS and concluded that there was a specific need for the investigation of technology for people with degenerative diseases.
Recommendations for the provision of cognitive rehabilitation for people with MS have largely been based on single case experimental designs (SCED) and controlled clinical trials (CCT). A systematic review (O'Brien, Chiaravalloti, Goverover, & DeLuca, 2008) concluded that there was insufficient evidence to support or refute the effectiveness of memory rehabilitation for people with MS, due to small sample sizes, inadequate randomisation and blinding procedures, and impairment-level outcome measures. Rosti-Otajarvi and Hamalainen (2014) found that cognitive training improved memory span and working memory, and when combined with other neuropsychological methods also improved delayed memory. However, the authors concluded that the overall quality of included studies was relatively poor. Other Cochrane reviews, such as das Nair et al. (2012) and Thomas et al. (2006), also concluded that there is no evidence to support or refute the effectiveness of memory rehabilitation. Furthermore, it is unclear which elements of memory rehabilitation are most effective, for example, training, mnemonics or external memory aids. das Nair et al. (2012) concluded that more research is required to determine whether memory rehabilitation for people with MS is effective in reducing memory problems.
An expert panel underscored the need for cognitive rehabilitation interventions for people with MS and recommended the use of compensatory devices (Multiple Sclerosis Society, 2006). There is some suggestion that external memory aids may be effective in reducing everyday memory problems in people with MS (das Nair & Lincoln, 2012). A recent Cochrane review (das Nair et al., 2012) investigated the evidence base for memory rehabilitation for people with MS. However, the majority of the literature on memory aids did not employ randomised controlled trial (RCT) designs, and so was not included in the Cochrane reviews. A systematic review of external memory aids for cognitive problems in a mixed sample (Gillespie, Best, & O'Neill, 2012) highlighted that most studies have been qualitative or single subject designs. Therefore the present systematic review evaluated research that employed other quantitative methodologies, such as quasi-experimental designs and SCEDs, in addition to RCTs. This review supplements Jamieson et al.'s (2014) findings in three ways: it evaluates the effectiveness of all types of external memory aids, not just technological ones; it focuses specifically on people with MS; and it includes comprehensive cognitive rehabilitation programmes, provided the use of memory aids was included.
The aim of the review was to determine whether people with MS who received training in the use of external memory aids showed better outcomes in their memory functions, mood, and quality of life, than those given other types of interventions, usual care, or a placebo control.

Type of studies
Studies evaluating the effectiveness of interventions were considered for review. Therefore, RCTs, CCTs, before-and-after designs, and SCEDs were included. A study was deemed to be an RCT on the basis that the individuals followed in the trial were definitely or probably assigned prospectively to one of two (or more) alternative forms of healthcare using random allocation (Higgins & Green, 2011). SCED studies were distinguished from descriptive case reports by the inclusion of a control condition, either through multiple baseline measures or a separate control measure, that allowed the causal impact of the treatment to be inferred, as in reversal/withdrawal (ABA) designs (Tate et al., 2008). AB (where A = baseline and B = intervention) design SCEDs were also considered. Studies were included with any type of control group (i.e., usual care, standard care, placebo, waiting list, other rehabilitation, or intervention).

Type of participants
Studies were limited to people with MS, regardless of clinical course or length of time since diagnosis. Studies with mixed diagnosis samples were included if the sample consisted of 75% or more MS participants, or a subgroup of MS participants could be identified for which separate data were available. Memory impairments were not defined in advance and it was assumed that people receiving training on the use of external memory aids had memory impairments. Studies were included if participants were 18 years or over, or if separate data were available for those over 18 years.
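The participant criteria above can be read as a simple decision rule. The following is a minimal sketch of that rule, not the review's actual screening tool; the class name, field names, and the two "separate data" flags are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

MS_THRESHOLD = 0.75  # minimum proportion of MS participants, as stated in the review
MIN_AGE = 18         # minimum participant age in years, as stated in the review


@dataclass
class StudySample:
    """Hypothetical summary of a study's sample (illustrative field names)."""
    n_total: int
    n_ms: int
    has_separate_ms_data: bool     # MS subgroup data reported separately
    min_age: int
    has_separate_adult_data: bool  # data for participants aged 18+ reported separately


def meets_participant_criteria(sample: StudySample) -> bool:
    """Return True if the sample satisfies both the MS and the age criterion."""
    ms_ok = (
        sample.n_total > 0 and sample.n_ms / sample.n_total >= MS_THRESHOLD
    ) or sample.has_separate_ms_data
    age_ok = sample.min_age >= MIN_AGE or sample.has_separate_adult_data
    return ms_ok and age_ok
```

Under this reading, a mixed-diagnosis sample below the 75% threshold is still eligible when MS subgroup data can be separated out, which is why the review's authors contacted study authors for such data.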

Types of interventions
Interventions included in this review involved the use of, or training in the use of, external memory aids, defined as any external means of compensating for a memory deficit, e.g., diaries, PDAs, electronic calendars. Studies involving general cognitive rehabilitation programmes covering other aspects of cognition, such as executive function or visual perception, or other forms of memory rehabilitation, such as training on internal strategies, were included provided they explicitly provided training on the use of external memory aids. Studies were considered to involve an intervention if the training took place over more than a single session. Pharmacological interventions were not included. Where studies had active control groups, it was checked that these groups contained no memory content, to allow pure comparison with the treatment group.

Types of outcomes
The primary outcome was a measure that directly assessed the degree of subjective memory problems in everyday life. If more than one outcome measure was used to measure this construct, the following hierarchy was used: (1) Everyday Memory Questionnaire (EMQ; Sunderland, Harris, & Baddeley, 1984); (2) Cognitive Failures Questionnaire (Broadbent, Cooper, FitzGerald, & Parkes, 1982); (3) Subjective Memory Questionnaire (Davis, Cockburn, Wade, & Smith, 1995); (4) Memory Functioning Questionnaire (Gilewski, Zelinski, & Schaie, 1990). This was based on the degree to which the measures focused on memory and their psychometric properties.
Secondary outcomes were measures of objective memory, mood, quality of life, and coping strategies for memory problems. If more than one outcome measure was used to measure each construct, the following hierarchies were used:
(1) Performance on memory tests, such as the Wechsler Memory Scale (Wechsler, 1997 or newer versions of this test), the Cambridge Prospective Memory Test (Wilson et al., 2005), the Doors and People Memory Test (Baddeley, Emslie, & Nimmo-Smith, 1994), the Rivermead Behavioural Memory Test (RBMT; Wilson, Cockburn, & Baddeley, 1985 or newer versions of this test).
(2) Mood, such as the General Health Questionnaire (GHQ; Goldberg, 1992), the Hospital Anxiety and Depression Scale (Zigmond & Snaith, 1983), the Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961), the State-Trait Anxiety Inventory (Spielberger, Gorsuch, Lushene, Vagg, & Jacobs, 1983).
(3) Quality of life, such as the MS Quality of Life Inventory (LaRocca et al., 1996), the MS Impact Scale (Hobart, Lamping, Fitzpatrick, Riazi, & Thompson, 2001), the Short Form-36 (SF-36; Ware & Kosinski, 2001), the Euro-QoL (Brooks, 1996).
Hierarchies were established by considering the relevance of each measure to each construct and their psychometric properties. MS-specific measures were placed above generic measures. General measures of constructs were placed above domain-specific measures (e.g., visual, verbal, etc.). If psychometric properties were not available, the hierarchy was decided through discussion between authors.
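The hierarchy rule amounts to taking, for each construct, the highest-ranked measure that a study actually reported. A minimal sketch of that selection, using the primary-outcome hierarchy as the example, is given below; the function name and the use of full measure names as keys are illustrative assumptions rather than part of the review's method.

```python
from typing import Iterable, Optional, Sequence

# Primary-outcome hierarchy in the order given in the review (highest priority first).
PRIMARY_HIERARCHY = [
    "Everyday Memory Questionnaire",
    "Cognitive Failures Questionnaire",
    "Subjective Memory Questionnaire",
    "Memory Functioning Questionnaire",
]


def select_measure(reported: Iterable[str], hierarchy: Sequence[str]) -> Optional[str]:
    """Return the highest-ranked measure in `hierarchy` that the study reported.

    Returns None when the study reported no measure from the hierarchy.
    """
    reported_set = set(reported)
    for measure in hierarchy:
        if measure in reported_set:
            return measure
    return None
```

The same function would apply unchanged to the secondary-outcome hierarchies by passing a different ordered list.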

Search methods for identification of studies
The following electronic databases were searched and all potential studies were identified by one author (RAG).

Electronic searches
(1) Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library, latest issue). The search strategy used and modified for all databases can be found in Appendix A.

Searching other resources
Citation-tracking of primary study articles was employed and the reference lists of identified papers were searched for further relevant studies. Journals covering relevant topics were identified and the contents of new volumes were hand searched. Grey literature was accessed by searching GreyNet (http://www.greynet.org/) and Mednar (http://mednar.com/). The first four pages of results on Google Scholar (http://scholar.google.co.uk/) were searched, with the date restricted from 2010 to present, along with websites relevant to the topic area, such as the MS Society (http://mssociety.org.uk/) and the MS Trust (http://MSTrust.org.uk/). These websites were searched using combinations of the following search terms: memory (memory, cognition, remember, remembering, recall, plan, planning); multiple sclerosis (multiple sclerosis, MS); external aids (memory aids, external aids, reminder systems, assistive technology, paging).

Selection of studies
The review's primary author (RAG) developed the search strategy, following consultation with a subject librarian and using guidance from relevant past reviews. She reviewed abstracts of studies identified by this strategy to identify those appearing pertinent, and systematically excluded studies that did not fit the inclusion criteria using the following hierarchy: (1) Fewer than 75% of participants had MS. (2) Study design did not evaluate the effectiveness of an intervention. (3) Intervention did not use external memory aids. After this initial search, duplicate papers were filtered out using EndNote software (http://endnote.com).
The studies that met the criteria were then subject to a full text review to select studies. Authors were contacted if clarification was needed in order to reach the decision or if it was unclear whether training on external memory aids was provided. Authors were also contacted to retrieve data for participants with MS in mixed diagnosis samples.
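Because the exclusion criteria were applied as a hierarchy, each excluded study is assigned only the first criterion it fails. The sketch below illustrates that ordering; the function and parameter names are hypothetical, and the MS criterion is simplified to the 75% threshold (it ignores the separate-subgroup-data route described under "Type of participants").

```python
from typing import Optional


def first_exclusion_reason(
    ms_proportion: float,
    evaluates_intervention: bool,
    uses_external_aids: bool,
) -> Optional[str]:
    """Return the first exclusion reason in the review's hierarchy, or None if retained."""
    if ms_proportion < 0.75:
        return "Fewer than 75% of participants had MS"
    if not evaluates_intervention:
        return "Did not evaluate the effectiveness of an intervention"
    if not uses_external_aids:
        return "Did not use external memory aids"
    return None  # passes all three checks; proceeds to full-text review
```

One consequence of this ordering is that the counts per exclusion reason are mutually exclusive, so they can be summed without double counting.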

Data extraction, management, and assessment of risk of bias
The methodological quality of each of the selected studies was assessed using the PEDro (Maher, Sherrington, Herbert, Moseley, & Elkins, 2003) or SCED scales (Tate et al., 2008). The PEDro scale was used to rate the group studies and the SCED scale was used for SCED studies. Previous research has established that there is good interrater reliability for both scales (Maher et al., 2003; Tate et al., 2008). Both are 10-point scales, with higher scores indicating better methodological quality. For RCTs, the main measures of quality were whether random allocation was concealed and whether outcomes were assessed blind to group allocation (Maher et al., 2003). The inclusion of non-RCTs in this review meant that some studies did not have randomisation and blinding procedures, and these were considered as lower quality.
Data for the review were extracted using a pre-prepared data extraction form that included items listed in Table 1. These characteristics were judged on the basis of information provided in the reports of the studies. Risk of bias was assessed as low, high, or unclear (when the information available was insufficient to make a judgement), on the basis of the following criteria: random sequence generation, allocation concealment, blinding, and incomplete outcome data.
Broadening the inclusion criteria to non-RCT designs meant that studies without control groups were included for evaluation in the review. Therefore it was decided that performing a meta-analysis on the data would be inappropriate and inconsistent with the aims of the study.

Results of the search
The search strategy identified 1171 results for review. Figure 1 provides a flowchart demonstrating the search process.

Excluded studies
In total 1110 studies were excluded on the basis of the exclusion criteria. During title and abstract screening, 1093 papers were excluded. Of these, 792 studies were excluded because they did not evaluate the effectiveness of an intervention, 57 because the sample did not contain MS patients, and 244 because they did not instruct participants in the use of external memory aids. Fifty-two duplicates were removed, leaving 26 studies for full-text review.
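The counts reported above can be checked for internal consistency. The following is a short arithmetic sketch; it assumes that the total of 1110 exclusions does not include the 52 duplicates and that all exclusions not made at title and abstract screening were made at full-text review. The full-text figure is derived here, not reported directly in the text.

```python
identified = 1171  # records identified by the search

# Title and abstract exclusions, by reason, as reported.
screening_exclusions = {
    "did not evaluate an intervention": 792,
    "sample did not contain MS patients": 57,
    "did not instruct in external memory aids": 244,
}
excluded_at_screening = sum(screening_exclusions.values())

duplicates = 52
total_excluded = 1110  # assumed to exclude duplicates

remaining_for_full_text = identified - excluded_at_screening - duplicates
excluded_at_full_text = total_excluded - excluded_at_screening  # derived, not reported
included = remaining_for_full_text - excluded_at_full_text

print(excluded_at_screening, remaining_for_full_text, excluded_at_full_text, included)
```

Under these assumptions the derived number of full-text exclusions matches the number of studies listed with a reason for exclusion, and the remainder matches the nine included studies.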
Of the nine included studies, six were from Europe (UK, Denmark, Austria) and three from the USA. Eight studies were conducted in community settings and one was conducted in a rehabilitation hospital.

Study | Reason for exclusion
Allen et al. (1995) | Did not use external memory aids
Allen et al. (1998) | Did not use external memory aids
Beer and Kesselring (2009) | Did not evaluate the effectiveness of an intervention
Ben Ari et al. (2012) | Conference abstract; full article not yet published; data not available from author
Brissart et al. (2010) | Did not use external memory aids
Brissart et al. (2011) | Did not use external memory aids
Brissart et al. (2013) | Did not use external memory aids
das Nair et al. (2012) | Did not evaluate the effectiveness of an intervention
Gich et al. (2011) | Did not use external memory aids
Johnson et al. (2009) | Did not evaluate the effectiveness of an intervention
Kardiasmenos et al. (2008) | Did not use external memory aids
Kesselring (2004) | Did not evaluate the effectiveness of an intervention
Mantynen et al. (2014) | Did not use external memory aids
Ramio et al. (2010) | Did not use external memory aids
Rosti-Otajarvi and Hamalainen (2011) | Did not evaluate the effectiveness of an intervention
Solari et al. (2004) | Did not use external memory aids
Topcular et al. (2010) | Did not use external memory aids

Types of design. Six studies were RCTs (Carr et al., 2014; das Nair & Lincoln, 2012; Jønsson et al., 1993; Lincoln et al., 2002; Stuifbergen et al., 2012; Tesar et al., 2005). Two studies employed before-and-after group designs (Gentry, 2008; Shevil & Finlayson, 2010), and one study was a SCED (Lincoln et al., 2003). Within the RCTs, the method of generating the randomisation schedule was mentioned in all but one study (Tesar et al., 2005). Independent randomisation was reported in three studies (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002) and two studies used a closed envelope system (Jønsson et al., 1993; Stuifbergen et al., 2012). Outcomes were assessed by an individual blind to group allocation in five RCTs, but not in Tesar et al. (2005). The SCED (Lincoln et al., 2003) employed an AB design for 29 participants within the treatment group of the RCT (Lincoln et al., 2002).
Types of participants. The diagnosis of participants was based on the Poser criteria (Poser et al., 1983) in four studies (das Nair & Lincoln, 2012; Lincoln et al., 2002, 2003; Tesar et al., 2005); the Jønsson et al. (1993) study used the Schumacher criteria (Schumacher et al., 1965), and four studies relied on self-reported diagnoses (Carr et al., 2014; Gentry, 2008; Shevil & Finlayson, 2010; Stuifbergen et al., 2012). Seven studies had mixed types of MS: relapsing remitting MS (RRMS), primary progressive MS (PPMS) and secondary progressive MS (SPMS) (Carr et al., 2014; Gentry, 2008; Lincoln et al., 2002, 2003; Jønsson et al., 1993), or RRMS and SPMS (das Nair & Lincoln, 2012; Tesar et al., 2005). The subtypes of MS were not described by Shevil and Finlayson (2010) or Stuifbergen et al. (2012). The sample size in the studies varied from 19 (Tesar et al., 2005) to 240 (Lincoln et al., 2002); the number of participants receiving active treatment similarly varied from 10 (Tesar et al., 2005) to 82 (Lincoln et al., 2002). The majority of participants were in their mid to late 40s, with mean ages ranging from 42.1 years (Lincoln et al., 2002) to 54.3 years (Carr et al., 2014). Eight studies reported a higher percentage of women than men in their samples; across all studies the percentage of women ranged from 47% (Jønsson et al., 1993) to 88% (Stuifbergen et al., 2012). Time since diagnosis ranged from a mean of 9 years (Tesar et al., 2005) to 15 years (Jønsson et al., 1993); and education varied from a mean of 11.5 years (Jønsson et al., 1993) to the majority having a BSc or postgraduate education (Stuifbergen et al., 2012).
In the six studies comparing performance between groups, four studies had groups comparable at baseline on all variables (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002; Tesar et al., 2005); the remaining two studies were comparable on all baseline variables except visuo-spatial and visual perception (Jønsson et al., 1993) and executive function (Stuifbergen et al., 2012).

Types of interventions.
Four studies employed individual treatment (Gentry, 2008; Jønsson et al., 1993; Lincoln et al., 2002, 2003) and five studies used group interventions (Carr et al., 2014; das Nair & Lincoln, 2012; Shevil & Finlayson, 2010; Stuifbergen et al., 2012; Tesar et al., 2005). Two studies ran three-group comparisons (das Nair & Lincoln, 2012; Lincoln et al., 2002) and three studies employed two-group comparisons (treatment vs. control) (Carr et al., 2014; Stuifbergen et al., 2012; Tesar et al., 2005). Two studies evaluated the performance of a single group (Gentry, 2008; Shevil & Finlayson, 2010). Gentry (2008) compared performance before treatment and immediately after treatment. Shevil and Finlayson (2010) compared performance before treatment, after a post-training period, and at follow-up. One study evaluated performance at multiple time points at baseline and at multiple time points during intervention (Lincoln et al., 2003). Most programmes were between 3 weeks (Gentry, 2008) and 10 weeks (Carr et al., 2014; das Nair & Lincoln, 2012) long. Two individual treatment studies specified that the time period was a maximum of 6 months post-assessment (Lincoln et al., 2002, 2003). Sessions were between one hour (Tesar et al., 2005) and two hours (Shevil & Finlayson, 2010; Stuifbergen et al., 2012), and participants met one to three times a week in all studies, except two where it depended on the needs of the participant (Lincoln et al., 2002, 2003). Eight studies employed comprehensive cognitive or memory rehabilitation programmes, which all included teaching participants how to use external memory aids, as well as internal memory strategies. Of these eight studies, five (Carr et al., 2014; das Nair & Lincoln, 2012; Jønsson et al., 1993; Stuifbergen et al., 2012; Tesar et al., 2005) ran programmes that also included cognitive training, such as computerised functional training and attention retraining. One study also included a "neuropsychotherapy" component (Jønsson et al., 1993), and one study provided psycho-education (Shevil & Finlayson, 2010).
Six studies (Jønsson et al., 1993; Lincoln et al., 2002, 2003; Shevil & Finlayson, 2010; Stuifbergen et al., 2012; Tesar et al., 2005) were cognitive rehabilitation programmes, which were not specific to memory rehabilitation; therefore the amount of time dedicated to memory rehabilitation, let alone external memory aids, is unknown. Gentry (2008) was the only study with the content solely restricted to teaching participants how to use external memory aids. This study involved the installation of PDA software and demonstration of how to use calendar and alarm functions on a PDA, followed by a post-training period where administrative support was available if needed.

Risk of bias in included studies
The risk of bias in the nine included studies was mixed, with high risk of detection bias associated with the lack of blinding in one group study (Tesar et al., 2005), and two group studies at high risk of selection, detection and performance bias associated with the lack of a control group, and therefore absence of randomisation, allocation and blinding procedures (Gentry, 2008;Shevil & Finlayson, 2010). The risk of bias in the SCED (Lincoln et al., 2003) was considered to be generally low, but with some risk of observer bias and bias in determining the treatment efficacy due to the AB design. The risk of bias was deemed to be unclear in five studies, due to lack of information when reporting the methods used for random sequence generation (Jønsson et al., 1993;Stuifbergen et al., 2012;Tesar et al., 2005), blinding (Jønsson et al., 1993), and how incomplete data were handled (Gentry, 2008;Shevil & Finlayson, 2010;Tesar et al., 2005).
The methodological quality of the group studies, rated using the PEDro scale (Maher et al., 2003), is summarised in Table 4, and that of the single case experimental design, rated using the SCED scale (Tate et al., 2008), in Table 5. Group studies received a mean score of 5 (SD = 2.51; range = 2–8) out of a possible 10. The SCED scored 8 out of a possible 10.
Random sequence generation in group studies. Three group studies were judged to have a low risk of selection bias on the basis of having adequate random sequence generation, using a computerised random number generator by an independent agency or researcher (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002). Three studies were unclear in their explanation of random sequence generation, and thus the risk of bias was unclear (Jønsson et al., 1993; Stuifbergen et al., 2012; Tesar et al., 2005). Two studies had no control group and therefore there was a high risk of selection bias (Gentry, 2008; Shevil & Finlayson, 2010).
Allocation in group studies. Six group studies were judged as having a low risk of selection bias on the basis of adequate group allocation concealment using a computerised random number generator by an independent unit (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002), or having a separate member of staff, not involved with the study, complete allocation (Tesar et al., 2005), or using a closed envelope system (Jønsson et al., 1993; Stuifbergen et al., 2012). Two studies had no control group and therefore there was a high risk of selection bias (Gentry, 2008; Shevil & Finlayson, 2010).
TABLE 4. Risk of bias table for group studies, using the PEDro scale (Maher et al., 2003).
Blinding in group studies. Four group studies were single blind (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002; Stuifbergen et al., 2012), with a blinded outcome assessor, indicating a low risk of detection bias. Three studies had a high risk of bias (Gentry, 2008; Shevil & Finlayson, 2010; Tesar et al., 2005) as they had no blinding procedures, and Jønsson et al. (1993) provided an unclear description of the blinding procedures employed.
Incomplete outcome data in group studies. Four group studies addressed incomplete data, indicating a low risk of attrition bias. In one study (das Nair & Lincoln, 2012), listwise deletion was utilised and baseline data were imputed for missing follow-up data. In another study (Lincoln et al., 2002), analysis covered just those who completed outcomes, however it also included those who did not receive the intervention as planned in an intention-to-treat analysis. One study (Stuifbergen et al., 2012) replaced missing values with the last observation value carried forward if the participant did not complete later measurements, or imputed if an intermediate value was missing. In another study (Carr et al., 2014), if missed items occurred for less than 10% of questions in a questionnaire, the missing item was replaced with the mean for the questionnaire. The four remaining studies did not address incomplete data: two studies did not use intention-to-treat analysis after reporting dropouts (Gentry, 2008;Shevil & Finlayson, 2010) and two studies provided no explanation of how dropout data were dealt with (Jønsson et al., 1993;Tesar et al., 2005), thus the risk of bias was unclear.
Risk of bias in SCED studies. The risk of bias in the one SCED study was generally low (Lincoln et al., 2003). The measure of target behaviours was specified and the variability in behaviour was established through sufficient sampling during the baseline and treatment phase. Verification of treatment efficacy was demonstrated using statistical analysis and generalisation was assured through replication across subjects and transfer to beyond target behaviours. However, there was a high risk of bias in determining treatment efficacy since an AB design was used. There was also a high risk of observer bias, as inter-rater reliability was not established for measures.

Effects of interventions
Parametric and nonparametric statistical analyses were used to compare groups. Significance testing was reported in all studies; however, the appropriate measures of variability were not.

Outcome 1: Subjective memory measures
Four studies (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002, 2003) used subjective measures of participants' memory functioning. Three studies (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002) used the EMQ (Sunderland et al., 1984), and one study (Lincoln et al., 2003) used diaries to record specific instances of memory difficulties that interfered with daily life. One study (Lincoln et al., 2003) found a significant effect of treatment on subjective memory functioning, demonstrated by a significant reduction (p < .01) in the frequency of reported memory problems per week from baseline to intervention. Subgroup analysis of MS participants from das Nair and Lincoln (2012) detected no significant effect of treatment; Carr et al. (2014) and Lincoln et al. (2002) also found no significant treatment effect.

Outcome 2: Objective memory measures
Two studies (das Nair & Lincoln, 2012; Gentry, 2008) included objective measures of memory; both used the RBMT-E (Wilson et al., 1985). Subgroup analysis of MS participants from das Nair and Lincoln (2012) showed no significant effect of treatment; Gentry (2008) also found no significant long-term effect.

Outcome 3: Mood
Five studies (Carr et al., 2014; das Nair & Lincoln, 2012; Jønsson et al., 1993; Lincoln et al., 2002; Tesar et al., 2005) included measures of participants' mood. All measured mood both immediately after treatment and at long-term follow-up. Three of these studies (Carr et al., 2014; das Nair & Lincoln, 2012; Lincoln et al., 2002) used the GHQ (Goldberg, 1992) and two (Jønsson et al., 1993; Tesar et al., 2005) used the BDI (Beck et al., 1961). A significant effect of intervention was found at long-term follow-up in one study (Carr et al., 2014). Jønsson et al. (1993) found a significant effect of treatment on mood; however, it was due to the control group worsening in mood over time.

Outcome 4: Quality of life
Two studies included a measure of quality of life (Carr et al., 2014;Lincoln et al., 2002). Carr et al. (2014) used the MS Impact Scale (Hobart et al., 2001), and Lincoln et al. (2002) used the SF-36 (Ware & Kosinski, 2001). No effect of treatment on quality of life was found either immediately or at long-term follow-up.

Outcome 5: Coping strategies for memory problems
Four studies (das Nair & Lincoln, 2012; Lincoln et al., 2002; Shevil & Finlayson, 2010; Stuifbergen et al., 2012) used measures of coping strategies for memory problems. One study (Shevil & Finlayson, 2010) found a significant treatment effect (p < .05) on the effectiveness of strategies used on the CSQ (Shevil & Finlayson, 2010), but no significant effect on the number of strategies used. One study (Stuifbergen et al., 2012) detected a significant treatment effect (p < .01) on the use of compensatory strategies on the MMQ-Strategy (Troyer & Rick, 2002). No significant treatment effect on coping strategies was reported in two studies (das Nair & Lincoln, 2012; Lincoln et al., 2002) on the Internal and External Memory Aids Questionnaire (das Nair & Lincoln, 2012) and the MAQ (Lincoln et al., 2002; Wilson & Moffat, 1984).

Summary of main results
Despite evidence demonstrating the existence of memory problems in people with MS and the associated everyday problems, literature examining the effectiveness of external memory aids on alleviating memory problems in people with MS remains weak. This review included a variety of study designs, in an attempt to collate all available evidence. However, few additional studies were identified that were not included in previous reviews confined to RCTs. Nine studies were included in this review: six were RCTs, two employed before-and-after group designs, and one was a SCED. One study specifically evaluated an external memory aid; the others were either memory rehabilitation or cognitive rehabilitation studies that included a component attending to external memory aids. These studies were published between 1993 and 2014 and the majority were of poor quality, lacking detailed description of the randomisation procedures, blinding, and handling of incomplete outcome data. Although the one SCED (Lincoln et al., 2003) scored 8 out of 10 for methodological quality, the mean for group studies was only 5 out of 10. Only five of the included studies evaluated participant outcomes using ecologically valid memory measures (Carr et al., 2014; das Nair & Lincoln, 2012; Gentry, 2008; Lincoln et al., 2002, 2003); five studies included measures of participants' mood, and only two assessed quality of life. The evidence for the effectiveness of teaching people with MS to use external memory aids to improve everyday memory functioning was limited, with only one study reporting an improvement on a subjective memory measure (Lincoln et al., 2003), and none demonstrating benefits on objective measures of memory. However, it should be noted that only four studies employed subjective memory measures, and only two used objective memory measures.
Five studies assessed participants' mood, although only two studies reported positive results following intervention (Carr et al., 2014; Jønsson et al., 1993), and no studies reported an effect on quality of life. Five studies assessed the use of coping strategies for memory problems, with two reporting beneficial effects of treatment (Shevil & Finlayson, 2010; Stuifbergen et al., 2012).
There are several limitations of this review that need to be considered. Despite systematically searching seven electronic databases it is possible that not all relevant studies were identified. Studies, particularly SCEDs, may have been published in journals that were not covered by the databases, or may not have been identified with the search strategy used. Due to the nature of memory problems affecting many areas of life, it is possible that some relevant articles did not use the words applied in the search strategy. For example, if an article was named "problems at work", it would not have been included. Selection was also performed by only one author, which reduces the likelihood that errors are detected, compared with employing a review team. Another issue that should be considered is the change in the way SCEDs are classified. This review classifies SCEDs using the SCED scale (Tate et al., 2008), which includes AB designs, such as the Lincoln et al. (2003) study. A revised classification system has since been developed, the RoBiNT Scale (Tate et al., 2013), which states that AB designs should not be classified as SCEDs due to the inability to determine cause and effect, with the absence of ABA reversal or multiple baseline designs. Therefore it should be noted that the included SCED (Lincoln et al., 2003) does not provide staggered baselines.
This review evaluated the evidence for the use of external memory aids for people with MS; however, only one study provided data on participants who had solely received dedicated training in the use of external memory aids. The majority of studies involved comprehensive cognitive rehabilitation programmes, and thus were aimed at tackling a range of cognitive deficits. Therefore, it is difficult to deduce how much time was spent on memory rehabilitation in general, let alone on external memory aids specifically. Consequently, the results of this review suggest that there is no evidence to support or refute the effectiveness of external memory aids on subjective or objective reports of memory function.

Quality of the evidence
Evidence for the effectiveness of external memory aids for people with MS is poor. Only six RCTs were identified; of these, five were single blind, although it should be noted that the Jønsson et al. (1993) paper stated that blinding was problematic as patients could easily unmask their allocation in conversation. Eight of the nine studies were published after the publication of the CONSORT statement (Moher, Schultz, & Altman, 2003); however, these guidelines were not followed in the majority of included studies. The randomisation protocol was unclear in three studies (Jønsson et al., 1993; Stuifbergen et al., 2012; Tesar et al., 2005). All RCTs appeared to have adequate allocation concealment. Inclusion and exclusion criteria were generally well defined and all studies described the flow of participants through the study. The descriptions of interventions and control conditions were inadequate in the majority of studies, and the choice of outcome measures was extremely poor, with only four studies employing ecologically valid memory measures. Feedback from participants was only obtained in two studies (Tesar et al., 2005); responses to this questionnaire were positive.

Agreement and disagreement with other studies or reviews
This review adds to the recent Cochrane review on memory rehabilitation for people with MS (das Nair et al., 2012) by including five extra studies: two studies published since the review and three non-RCT designs. This review also evaluated only the treatment compared to control conditions in two previously included studies (das Nair & Lincoln, 2012; Lincoln et al., 2002). When conducting this review there was the assumption that broadening the criteria to include non-RCTs would yield results that had been excluded from previous Cochrane reviews. However, only two CCTs and one SCED were identified. Our findings complement the das Nair et al. (2012) and Rosti-Otajarvi and Hamalainen (2014) Cochrane reviews, which both concluded that there was no evidence to suggest that memory rehabilitation was more effective than a control. This review supports the opinion from both previous reviews that studies evaluating the effectiveness of memory rehabilitation are of poor quality. This review also complements the findings of the Thomas et al. (2006) Cochrane review on psychological interventions for MS, which found that the evidence for interventions designed to help people with cognitive impairments was inconclusive. Although this review does not support the conclusions of recent reviews of neuropsychological rehabilitation (Cicerone et al., 2011; de Joode et al., 2010) in their recommendations of the use of compensatory aids for people experiencing memory problems, it does support a recent review (Jamieson et al., 2014) that concluded there is still a specific need for investigations of technology for people with degenerative diseases.

CONCLUSIONS
This review found no evidence to support or refute that external memory aids improved everyday memory function, mood or quality of life for people with MS. Therefore, clinicians are encouraged to use SCED methodology to evaluate the effectiveness of any interventions they are providing in clinical practice for memory problems in people with MS.
It is also suggested that more RCTs are necessary to provide conclusive evidence as to whether or not external memory aids are effective at reducing memory problems for people with MS.