The effects of repetitive transcranial magnetic stimulation on empathy: a systematic review and meta-analysis

Empathy is a multi-dimensional concept with affective and cognitive components, the latter often referred to as Theory of Mind (ToM). Impaired empathy is prevalent in people with neuropsychiatric disorders, such as personality disorder, psychopathy, and schizophrenia, highlighting the need to develop therapeutic interventions to address this. Repetitive transcranial magnetic stimulation (rTMS), a non-invasive therapeutic technique that has been effective in treating various neuropsychiatric conditions, could potentially be used to modulate empathy. To our knowledge, no systematic reviews or meta-analyses in this field have been conducted. The aim of the current study was to review the literature on the use of rTMS to modulate empathy in adults. Seven electronic databases (AMED, Cochrane library, EMBASE, Medline, Pubmed, PsycInfo, and Web of Science) were searched using appropriate search terms. Twenty-two studies were identified, all bar one of which involved interventions in healthy rather than clinical populations; 18 of them, providing results for 24 trials, were included in the meta-analyses. Results showed an overall small, but statistically significant, effect in favour of active rTMS in healthy individuals. Differential effects across cognitive and affective ToM were evident. Subgroup analyses for cognitive ToM revealed significant effect sizes on excitatory rTMS, offline paradigms, and non-randomised design trials. Subgroup analyses for affective ToM revealed significant effect sizes on excitatory rTMS, offline paradigms, and non-randomised design trials. Meta-regression revealed no significant sources of heterogeneity. In conclusion, rTMS may have discernible effects on different components of empathy. Further research is required to examine the effects of rTMS on empathy in clinical and non-clinical populations, using appropriate empathy tasks and rTMS protocols.

Successful human socialisation is heavily influenced by the abilities to detect and understand cognitive and emotional processes in others. These abilities are referred to as the Theory of Mind (ToM) and empathy (Gallese, 2003; Young et al. 2010; Keuken et al. 2011; Krall et al. 2016). Clinicians and researchers use these terms interchangeably, but there is no universal consensus on their definitions and constructs. For example, some authors regard empathy as a two-component construct with affective and cognitive components (Reniers et al. 2011), while others (Blair, 2005) have proposed a three-component construct by adding a motor component to reflect the act of mirroring the motor responses of the observed person (motor empathy). Some commentators view cognitive empathy as synonymous with ToM, which is the ability to attribute mental states, such as desires, intentions, and beliefs, to others (Frith & Frith, 1999). Some authors have favoured a ToM model with two distinct components, namely affective and cognitive (Kalbe et al. 2010). Others have suggested that empathy and ToM encompass similar underlying abilities that are discernible at the neural level (Reniers et al. 2014). More recently, Dvash & Shamay-Tsoory (2014) argued in favour of a two-component construct of empathy, namely emotional and cognitive empathy (also referred to as ToM), with distinct neuroanatomical underpinnings (Fig. 1). According to this model, cognitive empathy (ToM) has two distinct subcomponents, namely affective ToM and cognitive ToM. Several brain regions have been implicated in cognitive ToM, including medial prefrontal cortex (mPFC), dorsolateral prefrontal cortex (DLPFC), temporoparietal junction (TPJ), and temporal poles (Frith & Frith, 1999; Völlm et al. 2006; Carrington & Bailey, 2009; Reniers et al. 2014). Brain areas implicated in the regulation of affective ToM include mPFC, particularly the ventral portion (Shamay-Tsoory & Aharon-Peretz, 2007; Shamay-Tsoory et al. 2009; Sebastian et al. 2012), inferior frontal gyrus (IFG), anterior cingulate cortex, and amygdala (Shamay-Tsoory et al. 2009; Gonzalez-Liencres et al. 2013; Gentili et al. 2015).
Self-report inventories commonly used to measure empathy include the Hogan Empathy Scale (Hogan, 1969), the Interpersonal Reactivity Index (IRI; Egger et al. 1997), the Balanced Emotional Empathy Scale (BEES; Mehrabian, 2000), the Empathy Quotient (EQ; Behan et al. 2015), and the Questionnaire of Cognitive and Affective Empathy (QCAE; Reniers et al. 2011). Behavioural measures of cognitive empathy (ToM) are primarily performance-based and include such tasks as first-order (Baron-Cohen et al. 1985) and second-order false-belief (Baron-Cohen, 1989) tasks for assessing cognitive ToM, the Reading the Mind in the Eyes Test (RMET) for evaluating affective ToM (Baron-Cohen et al. 2001), and the Faux Pas Recognition (FPR) test (Stone et al. 1998) and the Yoni task (Shamay-Tsoory & Aharon-Peretz, 2007) for assessing both affective and cognitive ToM. Impairment of social functioning consequent upon impaired empathy has been reported in a range of neuropsychiatric conditions, including psychopathy, antisocial personality disorder (Dolan & Fullam, 2004), schizophrenia (Bragado-Jimenez & Taylor, 2012), major depressive disorder (MDD; Schreiter et al. 2013), autistic spectrum disorder (ASD; Shimoni et al. 2012), temporal lobe epilepsy (Li et al. 2013), Alzheimer's disease (Laisney et al. 2013), Parkinson's disease (Yu et al. 2012), and other neurodegenerative diseases (Poletti et al. 2012). Empathy is highly correlated with violence (Jolliffe & Farrington, 2004) and plays a pivotal role in the violence inhibition system. Thus, enhancement of empathy has been regarded as a major treatment goal in criminogenic programmes (Day et al. 2010; Reidy et al. 2013). However, conventional psychological interventions for empathy enhancement have proved less effective in certain offender groups, particularly those with psychopathy (Reidy et al. 2013), highlighting the need to develop alternative therapeutic interventions to enhance empathy, of which transcranial magnetic stimulation (TMS), especially its repetitive format (rTMS), is an example (Glenn & Raine, 2008; Glannon, 2014).
TMS is a non-invasive technique used to deliver brief, high-intensity magnetic pulses to the brain, inducing localised neuronal depolarisation to regulate the cortical excitability that underlies the modulation of cortical networks (Luber & Lisanby, 2014). In general, high-frequency (≥5 Hz) rTMS and its newer version, intermittent θ burst stimulation, facilitate cortical excitability, whereas low-frequency (about 1 Hz) rTMS and continuous θ burst stimulation have the opposite effect (Pascual-Leone et al. 2000; Huang et al. 2005; Wassermann & Zimmermann, 2012). rTMS has been used to treat a variety of neurological and psychiatric diseases (see Wassermann & Zimmermann, 2012) and to enhance cognitive functions in healthy volunteers (see Hsu et al. 2015) and in people with MDD (Serafini et al. 2015). Online Supplementary Table S1 provides more information about the effects of TMS in clinical populations. Additionally, rTMS has been used to modulate empathy with some promising effects (see Hetu et al. 2012; Schuwerk et al. 2014a). However, findings are inconsistent, probably owing to differences in the tasks used to measure empathy, experimental designs, targeted brain regions, and rTMS parameters, including the paradigms used (i.e. online or offline), stimulus intensity [measured as a percentage of resting motor threshold (rMT) or of maximum stimulator output (MSO)], frequency, and number of pulses.
We therefore aimed to conduct a systematic review and meta-analysis of the literature on the effects of rTMS on empathy in healthy and clinical populations to integrate the evidence base and to determine whether certain TMS parameters or targeted brain regions are associated with stronger effects on specific domains of empathy. While effective interventions involving healthy individuals could potentially be extended to clinical populations, as we shall describe later in this review, all the studies included in this review, bar one, involved interventions in healthy groups. Because the concepts of empathy and ToM overlap, in this review we have conceptualised empathy in accordance with the model proposed by Dvash & Shamay-Tsoory (2014) as outlined above. We followed PRISMA-P guidelines (Shamseer et al. 2015) in the reporting of this review where applicable.

Data sources
Using the terms 'transcranial magnetic stimulation' or 'TMS' combined with 'theory of mind', 'ToM', 'empath$', 'mentali$', 'role taking', or 'perspective taking', a systematic search of the literature on the effects of TMS on empathy was conducted on 25 May 2016 of seven electronic databases (AMED, Cochrane library, EMBASE, Medline, PsycInfo, Pubmed, Web of Science). The International Clinical Trials Registry Platform (World Health Organisation), Dissertation Abstracts, Google, and the library catalogues of the University of Nottingham were also searched to identify grey literature in the field. No filters were added regarding the age of study participants, publication time or language of publication (see online Supplementary Table S2 for search syntax). References of eligible articles were searched manually for potentially eligible studies missed by the electronic searches.

Study selection
Empirical studies were included in the review if they: (1) involved adult participants without dementia or other major neurological conditions; (2) used rTMS as an active intervention; (3) had a comparison group or control condition; and (4) used behavioural tasks to assess empathy. Of the 508 papers originally identified, 22 met the inclusion criteria (see online Supplementary Fig. S1 and Table S3) and were quality assessed using the quality assessment tool for quantitative studies (National Collaborating Centre for Methods & Tools, 2008) on the domains of selection bias, study design, confounders, blinding, data collection method, withdrawals and dropouts, intervention integrity, and statistical analyses.
Of the 22 studies included in the review, four (Uddin et al. 2006; Balconi et al. 2010; Hoekert et al. 2010; Lev-Ran et al. 2012) were excluded from the meta-analyses because they lacked sufficient data for effect size calculation, and only after attempts to obtain this information from the authors had been exhausted.

Data extraction and analyses
A standardised form was used to extract information concerning authors, study objectives, sample characteristics, inclusion/exclusion criteria, study design, experimental processes, rTMS protocols, outcome variables, and analytic strategy.
We originally intended to conduct separate meta-analyses of studies involving clinical populations and healthy individuals using the random-effects model and, where applicable, in accordance with the model proposed by Dvash & Shamay-Tsoory (2014) with its components: cognitive empathy (i.e. ToM, including cognitive ToM and affective ToM) and affective empathy. However, this was not possible because only one study involved a clinical population (Enticott et al. 2014). Therefore, the meta-analyses presented in this review include only studies involving healthy subjects. Measures of cognitive ToM included the cognitive component of the Yoni task, moral judgement, false-belief tasks, and action-understanding tools. Measures of affective ToM included the RMET, tasks of facial expression recognition, the affective component of the Yoni task, affective go/no-go tasks, the faux pas test, and emotional egocentricity. While it can be argued that facial expression recognition is not a test of empathic abilities, the model proposed by Dvash & Shamay-Tsoory (2014) regards emotional recognition as a component of affective ToM. This view has been supported by other commentators (Poletti et al. 2012). Therefore, tasks measuring emotional recognition, such as facial expression recognition tasks, were included in the review.
Effect size was regarded as positive if the active rTMS effect was in the predicted direction and negative if it was in the opposite direction. Moreover, when a study entailed multiple stimulation sites, each trial of the different stimulation sites was used as the unit of analysis for the purpose of meta-analysis. A pooled effect size was used if a study provided multiple outcomes (e.g. accuracy and reaction time, score of each subscale, or short-term and long-term performance). Only the comparison between the experimental and sham group (condition) was selected when a trial consisted of more than one control group or condition (e.g. one group receiving rTMS at a control site and another receiving sham stimulation). Effect sizes, expressed as Hedges' g with 95% confidence intervals (CI), were calculated as the difference between experimental (real stimulation) and control (sham stimulation) conditions in post-stimulation evaluations or 'online' performance, divided by the pooled standard deviation.
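As an illustration of the effect size calculation described above, the following minimal sketch computes Hedges' g and an approximate 95% CI. It assumes two independent groups with known means, standard deviations, and sample sizes; the function name and the example numbers are hypothetical and are not taken from any included study. Cross-over trials, which make up most of the included designs, would additionally require the within-subject correlation, which this sketch does not model.

```python
import math


def hedges_g(mean_active, mean_sham, sd_active, sd_sham, n_active, n_sham):
    """Return (g, ci_low, ci_high) for two independent groups.

    Assumption: independent-groups formula with the usual small-sample
    correction; not a reconstruction of the review's Stata workflow.
    """
    df = n_active + n_sham - 2
    # Pooled standard deviation across the active and sham groups
    pooled_sd = math.sqrt(
        ((n_active - 1) * sd_active ** 2 + (n_sham - 1) * sd_sham ** 2) / df
    )
    d = (mean_active - mean_sham) / pooled_sd
    # Small-sample correction factor that turns Cohen's d into Hedges' g
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Approximate sampling variance of d, then standard error of g
    var_d = (n_active + n_sham) / (n_active * n_sham) + d ** 2 / (
        2 * (n_active + n_sham)
    )
    se_g = j * math.sqrt(var_d)
    return g, g - 1.96 * se_g, g + 1.96 * se_g


# Hypothetical example: task accuracy after active v. sham stimulation
g, low, high = hedges_g(10.0, 8.0, 2.0, 2.0, 10, 10)
print(f"g = {g:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

The sign convention in the text (positive when the effect is in the predicted direction) would be applied by the analyst when ordering the two conditions; the function itself simply returns active minus sham.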
The Q and I² statistics (Higgins & Thompson, 2002; Higgins et al. 2003) were used to assess consistency between studies. The Q statistic tests for the presence of heterogeneity, while the I² index specifies the proportion of total variation attributable to between-study variance. A p value <0.05 and an I² value of >40% were deemed indicative of moderate heterogeneity. Funnel plots (Egger & Smith, 1995), the Egger test (Egger et al. 1997), and Begg and Mazumdar rank correlation tests (Begg & Mazumdar, 1994) were used to test for the presence of a potential publication bias. In cases where publication bias was evident, the Trim and Fill procedure (Egger & Smith, 1995) was applied to correct it.
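The heterogeneity statistics and random-effects pooling described above can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator of between-study variance; the review states only that a random-effects model was fitted in Stata, so the choice of estimator, the function name, and the example inputs are all assumptions.

```python
import math


def random_effects_meta(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Returns a dict with the pooled effect, its standard error, Cochran's Q,
    the degrees of freedom, tau^2, and I^2 (as a percentage).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance) estimate, needed to compute Q
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # Method-of-moments estimate of between-study variance, floored at 0
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    # I^2: share of total variation attributed to between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {"pooled": pooled, "se": se, "q": q, "df": df, "tau2": tau2, "i2": i2}


# Hypothetical two-trial example (not data from the review)
result = random_effects_meta([0.0, 1.0], [0.1, 0.1])
print(result)
```

With more heterogeneous trials, tau² grows and the random-effects weights become more equal across trials, which widens the confidence interval of the pooled estimate relative to a fixed-effect analysis.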
In order to identify variables that could contribute to alteration of empathy, pre-specified subgroup analyses were performed with the trial as the unit of analysis by merging the data according to the rTMS parameters, including effect ('excitatory' v. 'inhibitory'), stimulation paradigm ('online' v. 'offline'), study design ('randomised' v. 'non-randomised'), stimulation site, and task of outcome measurement.
Meta-regression was employed to examine the impact of between-study variation on study effect sizes. The effect size from each trial was set as the dependent variable while age, gender, intensity of stimulation, total pulses per condition, and weighted number of pulses (i.e. total number of rTMS pulses multiplied by intensity) were selected as predictor variables. All the quantitative analyses were performed using Stata 13.1 (StataCorp, 2013).

Study characteristics
Table 1 summarises study characteristics. In summary, 22 studies involving 466 participants (82% males; mean age 24.45 years; range 18-59 years) were included in the review. Only one study (Enticott et al. 2014) recruited participants from a clinical population, namely patients with ASD. Sixteen of the included studies were conducted in Europe, three in North America (Uddin et al. 2006; Young et al. 2010; Keuken et al. 2011), two in Australia (Krause et al. 2012; Enticott et al. 2014), and one in Israel (Lev-Ran et al. 2012). The most common study design employed was non-randomised cross-over (n = 15), with the sequence of intervention conditions allocated by counterbalancing (n = 10) or by an unspecified method (n = 5). Of the six studies randomly allocating participants, two (Keuken et al. 2011; Enticott et al. 2014) were parallel randomised controlled trials and the other four (Costa et al. 2008; Kalbe et al. 2010; Giardina et al. 2011; Lev-Ran et al. 2012) were randomised cross-over trials. The remaining between-subject study (Silani et al. 2013) did not mention the method of participant allocation.
Various tasks were used to assess empathy, including facial expression recognition tasks with materials derived from Ekman & Friesen (1976), the RMET or its modified version, the Yoni task, scenarios using video clips assessing individuals' capability of social judgement or action understanding, the false belief task and the faux pas task. With regard to published self-report instruments, only one study (Enticott et al. 2014) selected a self-report measure, the IRI, as the empathy measure. The number of pulses within each experimental session ranged from 120 to 3000. The majority of the reviewed studies (n = 15) set the intensity of the pulses to 100% or more of rMT, while four other studies used subthreshold intensity (Costa et al. 2008; Hoekert et al. 2010; Giardina et al. 2011; Michael et al. 2014). The remaining three studies (Young et al. 2010; Keuken et al. 2011; Krall et al. 2016) selected MSO as the index of intensity. The DLPFC, mPFC (ventral or dorsal portion), TPJ, and IFG were targeted as the main sites for stimulation. The most common control condition was vertex stimulation (n = 11). Five studies did not report the details of their sham protocol.

Quality assessment
Of the twenty-two studies included, only one study (Enticott et al. 2014) attracted a rating of 'strong', 19 studies were rated as 'moderate', and two studies as 'weak' (online Supplementary Table S4). Poor rating on selection bias was the most common reason for not reaching the 'strong' quality threshold. The two weak ratings were due to vulnerability to confounders (Silani et al. 2013) and poor description of the reliability and validity of the outcome measures used (Michael et al. 2014). For rTMS reproducibility, most of the reviewed studies (n = 16) provided all necessary parameters, but two studies (Balconi et al. 2010; Silani et al. 2013) failed to provide information in relation to the type of coil utilised, and four studies (Pobric & Hamilton, 2006; Costa et al. 2008; Balconi et al. 2011; Balconi & Bortolotti, 2012) lacked comprehensive information about the duration of the intervention. Only three studies described adverse effects relating to the administration of rTMS, with one study indicating no adverse effects observed (Young et al. 2010) and the other two studies reporting minor post-rTMS side effects (Enticott et al. 2014) and one syncope event (Kalbe et al. 2010).

Effects of rTMS on empathy in clinical populations
Since there was only one trial (Enticott et al. 2014) involving participants with a mental disorder, it was not possible to conduct a meta-analysis to examine the rTMS effect on empathy in clinical populations. This study (Enticott et al. 2014) showed that deep high-frequency rTMS applied bilaterally to the dorsal mPFC in patients with ASD did not have a statistically significant facilitatory effect on empathy (g = −0.22; 95% CI −1.55 to −0.01, p = 0.016), cognitive empathy

Effects of rTMS on empathy in healthy volunteers
Twenty-four trials extracted from reports of 17 studies were included in the meta-analysis of the effects of rTMS on empathy. This revealed a significant small overall effect size (g = 0.29; 95% CI 0.10 to 0.48, p = 0.003) as plotted in Fig. 2a. A moderate level of heterogeneity was observed across the studies (Q(23) = 39.22, p = 0.019; I² = 41.4%). Separate meta-analyses were conducted for trials involving cognitive empathy with its two components: cognitive and affective ToM. However, it was not possible to conduct a meta-analysis on the effects of rTMS on affective empathy due to lack of studies in the field.
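The reported summary statistics can be cross-checked against one another with a short calculation. The sketch below is illustrative only: it assumes that I² is derived from Q and its degrees of freedom in the standard way and that the CI is a symmetric normal-approximation interval. The function names are hypothetical; the input values are those reported above.

```python
import math


def i_squared(q, df):
    """I^2 as a percentage, from Cochran's Q and its degrees of freedom."""
    return max(0.0, (q - df) / q) * 100.0


def two_sided_p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Two-sided p value implied by a symmetric normal-approximation CI."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal
    return math.erfc(abs(z) / math.sqrt(2))


i2 = i_squared(39.22, 23)
p = two_sided_p_from_ci(0.29, 0.10, 0.48)
print(f"I^2 = {i2:.1f}%")  # I^2 = 41.4%
print(f"p = {p:.3f}")      # p = 0.003
```

Both derived values round to the figures reported in the text, which suggests that the overall effect size, its interval, and the heterogeneity statistics are internally consistent.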

Discussion
This study aimed to examine the literature on the effects of rTMS on empathy and, where relevant, to determine which intervention parameters were associated with stronger effects. Our findings show that rTMS has a significant but small overall effect on empathy in healthy participants and that this effect varies according to the empathy domain examined (cognitive or affective ToM). It has not been possible to draw valid conclusions regarding the effect of rTMS on empathy in clinical populations as only one such study has been conducted in the field.
The meta-analysis of rTMS studies relating to cognitive ToM revealed a non-significant effect size, indicating that rTMS may not be effective in modulating cognitive ToM. Moreover, the results suggested that there might be five unpublished trials investigating this issue with negative findings. In contrast, a significant effect size was found in the meta-analysis of rTMS studies for affective ToM, though the magnitude of the effect was small. These findings of dissimilar effects of rTMS support the idea of examining subcomponents of empathy separately as they are associated with distinct brain regions (Dvash & Shamay-Tsoory, 2014).
Our subgroup analyses further identified parameters associated with a positive effect of rTMS, including excitatory v. inhibitory rTMS and online v. offline paradigms. However, these findings should be interpreted with caution due to the relatively small number of trials, particularly for excitatory rTMS. Although previous studies (Robertson et al. 2003) suggest that the duration of the rTMS after-effect only persists for half of the stimulation time, physiological evidence indicates that the rTMS after-effect decays gradually with time (Eisenegger et al. 2008). Nevertheless, given that completion of conventional tasks measuring empathy is time-consuming, experiments with an offline paradigm are less likely to detect a significant rTMS effect on empathy.
Surprisingly, the subgroup analysis by stimulation site did not reveal statistically significant mean effects across different brain regions pertaining to specific empathetic components. The literature suggests differential roles of specific brain regions: the dorsal part of mPFC and TPJ (particularly the right side) for cognitive ToM (Denny et al. 2012) and the ventral part of mPFC and IFG for affective ToM (Sebastian et al. 2012; Dal Monte et al. 2014). Significant effects would thus be expected if rTMS is administered to these regions, but not to other regions. However, we found no significant effect of applying rTMS to the TPJ for cognitive ToM or to the IFG for affective ToM. Because only one included trial (Keuken et al. 2011) explored affective ToM by targeting these crucial regions (e.g. IFG), a firm conclusion cannot be drawn at this stage. It is worth noting here that the issue of spatial resolution is an inherent limitation of TMS research. The issue may be further compromised when non-imaging-guided techniques are utilised to localise the stimulation sites. With this in mind, and since a considerable number of studies included in this review (Balconi et al. 2010; Balconi & Bortolotti, 2012; Krause et al. 2012; Schuwerk et al. 2014b) did not utilise imaging-guided techniques, we have categorised the studies according to the effects of TMS on relatively large regions of the brain rather than smaller ones while performing subgroup analyses. Nevertheless, the results need to be interpreted with caution.
Meta-regression revealed no differential effects in relation to participant characteristics (age, gender) or stimulation parameters (intensity, number of pulses, weighted number of pulses). This may be due to the low heterogeneity detected in relation to participants' age and gender ratio. Contrary to the findings of other meta-analytic studies (Chou et al. 2015), rTMS parameters did not contribute significantly to effect sizes. A number of explanations exist as to why these findings were not replicated in this review. First, the number of studies included in this review was only slightly higher than 10, the minimum number required to attain sufficient statistical power (Borenstein et al. 2009). Second, the impact of the rTMS parameters may only be evident when rTMS is applied to the brain region corresponding to the task measured. Third, empathy is a multi-faceted construct involving a network of brain regions, and since the effects of TMS are dose-dependent, a larger number of sessions and pulses per session may be required to modulate empathy.
Future research should examine a number of pertinent issues. For example, some of the included studies (Balconi & Bortolotti, 2013; Balconi & Canavesio, 2016) suggested that baseline level of empathy can moderate the inhibitory effect of low-frequency rTMS on facial emotional recognition. Interestingly, they found people with higher levels of empathy performed better under control conditions than those with lower levels of empathy when the activity of the dorsal mPFC was inhibited. However, the role of baseline empathy level in the effect of facilitatory rTMS on empathetic ability has not yet been investigated, which is a crucial issue for the clinical application of rTMS. In addition, as speculated in a number of included studies, the behavioural tasks selected might not be appropriate as outcome measures due to their low sensitivity to rTMS-induced effects (Keuken et al. 2011; Krause et al. 2012; Lev-Ran et al. 2012; Enticott et al. 2014; Schuwerk et al. 2014b). Finally, it might be too simplistic to expect that increased excitability contributes to behavioural improvement and decreased excitability to deterioration, as others have also suggested (Sandrini et al. 2011).

Strengths and limitations
A major strength of this study is that some of the studies included were relatively well designed with low dropout rates and high reproducibility of rTMS protocols. However, the study suffered a number of limitations in relation to selection bias, reflected by participants' restricted age range, recruitment resources, and the reporting of adverse effects, which is essential in TMS studies (Rossi et al. 2009). Further, the subgroup analysis of study design showed that more significant effects were found in non-randomised than randomised trials. This raises the question of whether the results of the current study may be vulnerable to some methodological limitations. However, since a majority of included studies were rated as equivalently moderate in quality assessment, the heterogeneity is less likely to stem from allocation bias and its source needs further investigation. While the research on the application of rTMS to alter empathy is still in its infancy, this systematic review with meta-analysis applied a broad range of search terms to enrol eligible studies with varied outcome measures and different rTMS protocols. We included both randomised and non-randomised trials as a considerable number of studies in this field used non-randomised designs. Multiple databases were thoroughly searched to minimise potential publication bias. However, a number of studies could not be included in the meta-analysis due to not reporting effect sizes, outcome measures not matching our inclusion criteria, and the presence of possible publication bias. The majority of included studies applied empathy tasks providing multiple outcomes, such as accuracy and reaction time. We dealt with these multiple outcomes by averaging the effect sizes, though this may have underestimated the size of effect. The number of studies included in the meta-analysis is relatively small, and this in conjunction with considerable levels of heterogeneity across the studies may have affected the power of the study.
Finally, only one study involving interventions in a clinical population was included in the review and no meta-analytic data could therefore be provided for clinical samples. This highlights the urgent need to conduct clinical trials in the field.

Conclusion
The present review with meta-analysis demonstrated that rTMS makes a discernible contribution to altering different components of empathy, although the effect sizes may not be as favourable as expected. The most encouraging finding for clinical implications is the effect of excitatory rTMS on enhancing affective ToM. Therefore, this review may help researchers interested in the effects of rTMS on empathy to tailor their rTMS protocols to maximise its effect. Future studies in the field can potentially examine the effects of excitatory rTMS in clinical populations with impaired empathetic capabilities, such as those with ASD, psychopathy, and schizophrenia. However, we do not currently know whether the same effects will be observed in these populations. rTMS parameters may have to be refined further to maximise the effects on crucial brain regions, and there is a need to develop ecologically validated and sensitive empathy tasks for rTMS experiments.