Avatar Therapy for people with schizophrenia or related disorders (Review)

low-certainty evidence). Around 20% of each group left the study early (risk ratio (RR) 1.06, 95% CI 0.59 to 1.89; studies = 1, participants = 150; moderate-certainty evidence). Analysis of quality of life scores (Manchester Short Assessment of Quality of Life (MANSA)) showed no clear difference between groups (MD 2.69, 95% CI –1.48 to 6.86; studies = 1, participants = 120; low-certainty evidence). No data were available for rehospitalisation rates, adverse events or functioning.


Library
Trusted evidence. Informed decisions. Better health.
Cochrane Database of Systematic Reviews

When Avatar Therapy was compared with treatment as usual, average endpoint Positive and Negative Syndrome Scale - Positive (PANSS-P) scores were not different between treatment groups (MD -1.93, 95% CI -5.10 to 1.24; studies = 1, participants = 19; very low-certainty evidence). A measure of insight (Revised Beliefs about Voices Questionnaire; BAVQ-R) showed an effect in favour of Avatar Therapy (MD -8.39, 95% CI -14.31 to -2.47; studies = 1, participants = 19; very low-certainty evidence). No one was rehospitalised in either group in the short term (risk difference (RD) 0.00, 95% CI -0.20 to 0.20; studies = 1, participants = 19; low-certainty evidence). Numbers leaving the study early from each group were not clearly different, although more did leave from the Avatar Therapy group (6/14 versus 0/12; RR 11.27, 95% CI 0.70 to 181.41; studies = 1, participants = 26; low-certainty evidence). There was no clear difference in anxiety between treatment groups (RR 5.54, 95% CI 0.34 to 89.80; studies = 1, participants = 19; low-certainty evidence). For quality of life, average Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (QLESQ-SF) scores favoured Avatar Therapy (MD 9.99, 95% CI 3.89 to 16.09; studies = 1, participants = 19; very low-certainty evidence). No study reported data for functioning.

Authors' conclusions
Our analyses of available data show few, if any, consistent effects of Avatar Therapy for people living with schizophrenia who experience auditory hallucinations. Where there are effects, or suggestions of effects, we are uncertain because of their risk of bias and their unclear clinical meaning. The theory behind Avatar Therapy is compelling but the practice needs testing in large, long, well-designed, well-reported randomised trials undertaken with help from, but not under the direction of, Avatar Therapy pioneers.

Review question
Is Avatar Therapy an effective add-on treatment for people with schizophrenia and schizoaffective disorder?

Background
Auditory hallucination is perceiving voices when there is no external stimulus. Around 70% of people with schizophrenia experience these. Medication may help these to decrease or disappear. However, some people do not want to take medication and, in a proportion of people, it has little meaningful effect. Avatar Therapy is an experimental technology that uses a visualised avatar face, voice and other sensory input to create an interactive computerised environment. We investigated the effects of Avatar Therapy for improving auditory hallucinations for people with schizophrenia.

Searches
Cochrane Schizophrenia's Information Specialist searched through the Cochrane Schizophrenia Group's database for randomised trials (clinical studies where people are randomly put into one of two or more treatment groups) of people with schizophrenia receiving either Avatar Therapy or treatment as usual. We found 14 reports from four studies.
The evidence is current to April 2020.

Trials
Four studies met our inclusion criteria, three of which provided usable data and one of which is still underway. The reliability of the evidence for the three included studies was very low to low. All data were limited to six weeks' treatment and one week's follow-up and, in some cases, data were not usable (we contacted the authors). All the reported data are just for the short-term period of trials and it is apparent that we need more studies with medium- to long-term periods to be able to estimate the rate of Avatar Therapy's effectiveness.

Conclusions
Three short-term studies with 195 participants were included in this review. Avatar Therapy was compared with treatment as usual and supportive counselling. Evidence from the trials was not high quality. Although there were some suggestions of positive effects, because of the unclear meaning for front-line care, and the considerable risk of bias of each of the results, we cannot be certain of these effects. There is just as great a risk of Avatar Therapy causing problems to people. More trials are needed, and those undertaking them should collaborate,

Copyright © 2020 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

SUMMARY OF FINDINGS

Summary of findings 1. Avatar Therapy compared to treatment as usual (all short-term) for schizophrenia or related disorders

Patient or population: schizophrenia or related disorders

This was the nearest proxy measure for our prestated binary outcome, clinically important change in mental state.

Mental state: specific - insight - average attitude to voices score (BAVQ-R, high = poor) - short term

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

BAVQ-R: Revised Beliefs about Voices Questionnaire; CI: confidence interval; MD: mean difference; PANSS-P: Positive and Negative Syndrome Scale - Positive; QLESQ-SF: Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form; RCT: randomised controlled trial; RD: risk difference; RR: risk ratio.

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.
a Risk of bias: 'serious' (downgraded one level): poor reporting of randomisation, no substantive discussion of any blinding or consideration of blinding. Authors contacted but limited information available.
b Indirectness: 'serious' (downgraded one level): proxy scale for binary outcome prestated in protocol.
c Imprecision: 'serious' (downgraded one level): wide confidence intervals generated by very small study.

This was the nearest proxy measure for our prestated binary outcome, clinically important change in positive symptoms. Avatar Therapy may reduce mental state scores slightly but it is unclear how these ratings relate to everyday life and functioning.
Mental state: specific - insight - average attitude to voices score (BAVQ-R, high = poor) - short term: MD 8.39 lower (14.31 lower to 2.47 lower).

Avatar Therapy likely results in a little difference in a measure of insight (average attitude to voices score (BAVQ-R, high score = poor)). This was the nearest proxy measure for our prestated binary outcome, clinically important change in insight. We are unclear of the clinical meaning of this measure.

Description of the condition
Schizophrenia is a persistent, relapsing and disabling mental disorder that has a global prevalence between 0.4% and 0.7% (Saha 2005). This serious mental illness is characterised by 1. positive symptoms, such as perceptions with no cause (hallucinations) or false beliefs (delusions), or both; and 2. negative symptoms which include catatonic signs, disorganised thoughts and behaviour, apathy and lack of motivation (Carpenter 1994). Hearing voices (auditory hallucinations) is a common symptom of schizophrenia, which is often treatment-resistant. Around 70% of people with schizophrenia hear one or more voices (Comer 2017) and, among those who do, 30% will have auditory hallucinations that are not alleviated by taking medication (Kane 1996).

Description of the intervention
Antipsychotic medications are the main line of treatment for schizophrenia; however, some of the symptoms of schizophrenia, such as auditory hallucinations, are treatment-resistant (Kane 1996). Non-pharmaceutical psychotherapeutic interventions such as cognitive behavioural therapy (CBT) are often suggested as additional treatments. However, CBT's effect size for treating the positive symptoms is small, and its effect on relapse and the negative symptoms of schizophrenia has not been proved (McKenna 2014).
Virtual reality has also been used for people with schizophrenia for social skills training, to improve thinking and understanding processes, and for treatment support (Kip 2019); however, Avatar Therapy is a relatively recently suggested treatment which could be used as an adjunct to the usual antipsychotic treatment, particularly for treatment-resistant auditory hallucinations (Le 2013).
Avatar Therapy has been suggested for people who experience distressing voices, hypothesising that it could reduce the severity and rate of auditory hallucinations (Huckvale 2013; ISRCTN65314790 2013). During the therapy, a psychiatrist assists the patient to create an avatar, an audio-visual entity, using a computer program. This avatar is modified so that its characteristics match the voices that bother the patient (Le 2013). The therapist then uses this avatar to talk back to the patient in therapy sessions, so that the patient can practise the methods they have been taught for coping with the voices. While the person with schizophrenia is encouraged to establish a dialogue with the avatar and to resist the hallucination, the therapist manages the avatar so that it gradually comes under the patient's control and its mode changes from persecutory to supportive during the therapy sessions (Le 2013). Audio recordings of all sessions are made so that the patient can listen to them at home using an MP3 player (Rus-Calafell 2014). The studies have cited Le 2013 as the first trial; however, it is unclear if the other studies used the same Avatar Therapy intervention design or made their own intervention.

How the intervention might work
One of the main aspects people with schizophrenia describe about their auditory hallucinations is that they often feel helpless in coping with the voices (Le 2013). Interacting with the hallucination may help people with schizophrenia. It has been reported that people with schizophrenia who are able to talk back to the voices could have more control over the voices (Nayani 1996), and that their suffering is gradually reduced (Le 2013). However, interacting with a hallucination involves interacting with an invisible character and can be difficult because of the lack of mutual interaction and body/face language. Many people with schizophrenia imagine a voice-associated face when hearing a hallucination, but cannot see the face to interact with it. An avatar is both an audio (voice) and a visual (face) entity, and can be used as an audio-visual interface between the hallucination and the patient to facilitate the establishment of a conversation between the patient and the voice (Le 2013). There might also be other advantages in talking to an avatar: first, Avatar Therapy sessions can be recorded and taken home with the patient to use when needed, whereas a psychiatrist is unlikely to be available at all times and places; second, an avatar's voice can be made to be very similar to the hallucination the patient hears, so an avatar's voice might help people with schizophrenia to practise how to take control of voices.

Why it is important to do this review
There is currently no systematic review of Avatar Therapy for treatment-resistant auditory hallucinations among people with schizophrenia. We aimed to examine whether Avatar Therapy could have a therapeutic effect on auditory hallucinations in people with schizophrenia. In addition, since there are primary studies to be systematically reviewed to meet our aim and Avatar Therapy is a new treatment, this review found and synthesised the relevant studies on the effects of Avatar Therapy for people with schizophrenia who experience treatment-resistant auditory hallucinations so as to provide clinically useful evidence for informed decision-making among clinicians, people with schizophrenia and health policy makers.

OBJECTIVES
To examine the effects of Avatar Therapy for people with schizophrenia or related disorders.

Types of studies
All relevant randomised controlled trials; if a trial was described as 'double blind' but implied randomisation, we included such trials in a sensitivity analysis (see Sensitivity analysis). If their inclusion did not result in a substantive difference, they remained in the analyses. If their inclusion resulted in important clinically significant but not necessarily statistically significant differences, we did not add the data from these lower-quality studies to the results of the better trials, but presented such data within a subcategory. We found no quasi-randomised studies (such as those allocating by alternate days of the week) to exclude. Where people were given additional treatments within Avatar Therapy, we only included data if the adjunct treatment was evenly distributed between groups and it was only the Avatar Therapy that was randomised.

Types of participants
Patients, however defined, with schizophrenia or related disorders, including schizophreniform disorder, schizoaffective disorder and delusional disorder, again, by any means of diagnosis, of any age, gender or ethnicity.
We were interested in ensuring that information was as relevant to the current care of people with schizophrenia as possible, so aimed to highlight clearly the current clinical state (acute, early postacute, partial remission, remission) as well as the stage (prodromal, first episode, early illness, persistent), and whether the studies primarily focused on people with particular problems (e.g. negative symptoms, treatment-resistant illnesses).

Avatar Therapy
An avatar is a computer-generated audio-visual character utilised to establish a conversation between the person with schizophrenia and the voices that person hears, to cope with the voices.
We assessed all the randomised controlled trials in which Avatar Therapy was one of the study arms either alone or in combination with other interventions. Avatar Therapy is an adjunctive treatment for 'treatment as usual'.

Control interventions
Any other add-on (to the participants' usual care) intervention or treatment as usual.

Supportive counselling
Any face-to-face supportive counselling approach.

Types of outcome measures
We divided all outcomes into short term (less than six months), medium term (six to 12 months) and long term (over 12 months).
We sought to report binary outcomes recording clear and clinically meaningful degrees of change (e.g. global impression of much improved, or more than 50% improvement on a rating scale defined within the trials). Thereafter, we listed other binary outcomes and then those that were continuous. See Differences between protocol and review.

'Summary of findings' table
We used the GRADE approach to interpret findings (Schünemann 2011); and used GRADE profiler (GRADEpro; tech.cochrane.org/revman/gradepro) to import data from Review Manager 5 (tech.cochrane.org/revman) to create 'Summary of findings' tables. These tables provide outcome-specific information concerning the overall certainty of the evidence from each included study in the comparison, the magnitude of effect of the interventions examined, and the sum of available data on all primary outcomes and on selected secondary outcomes. This summary guided our conclusions and recommendations. We selected main (short-term) outcomes for inclusion in the 'Summary of findings' tables. If data were not available for these prespecified outcomes but were available for ones that were similar, we presented the closest outcome available, but considered this when grading the finding.


Search methods for identification of studies
Electronic searches

Cochrane Schizophrenia Group's Study-Based Register of Trials
On 8 December 2016, 9 November 2018 and 14 April 2020, the information specialist searched the register using the following search strategy:
(*Avatar Therapy*) in Intervention Field of STUDY
In such a study-based register, searching the major concept retrieves all the synonyms and relevant studies because all the studies have already been organised based on their interventions and linked to the relevant topics (Shokraneh 2017; Shokraneh 2019). This register is compiled by systematic searches of major resources (AMED, BIOSIS, CENTRAL, CINAHL, ClinicalTrials.Gov, Embase, MEDLINE, PsycINFO, PubMed, WHO ICTRP) and their monthly updates; ProQuest Dissertations and Theses A&I and its quarterly update; Chinese databases (CBM, CNKI, and Wanfang) and their annual updates; handsearches; grey literature and conference proceedings (see Group's website; schizophrenia.cochrane.org/register-trials). There are no language, date, document type, or publication status limitations for inclusion of records into the register.

Reference searching
We inspected references of all included studies for further relevant studies (Figure 1).

Personal contact
We contacted the first author of a study for additional information regarding non-reported data (NCT03148639). We noted the outcome of this contact in the Characteristics of included studies table.

Selection of studies
Review authors (GA, TK and FS) independently inspected citations from the searches and identified relevant abstracts; GA independently re-inspected a random 20% sample of these abstracts to ensure reliability of selection. Where disputes arose, we acquired the full report for a more detailed scrutiny. Review authors (GA, TK and FS) then obtained and inspected full reports of the abstracts or reports meeting the review criteria. GA re-inspected a random 20% of these full reports to ensure reliability of selection.
Where it was not possible to resolve disagreement by discussion, we contacted the authors of the study concerned for clarification (NCT03148639).

Extraction
Review authors (GA, TK and FS) independently extracted data from all included studies. In addition, to ensure reliability, GA independently extracted data from a random sample of these studies, comprising 10% of the total. We attempted to extract data presented only in graphs and figures whenever possible, but only included them if two review authors independently obtained the same result. We contacted authors through an open-ended request in order to obtain missing information or for clarification whenever necessary. If studies were multicentre, then, where possible, we extracted data relevant to each component centre separately.

Forms
We extracted data onto standard, pre-designed, simple forms.

Scale-derived data
We included continuous data from rating scales only if: 1. the psychometric properties of the measuring instrument had been described in a peer-reviewed journal (Marshall 2000); 2. the measuring instrument had not been written or modified by one of the trialists for that particular trial; and 3. the instrument was a global assessment of an area of functioning and not sub-scores which are not, in themselves, validated or shown to be reliable. However, there are exceptions; we included sub-scores from mental state scales measuring positive and negative symptoms of schizophrenia.
Ideally, the measuring instrument should either be a self-report or completed by an independent rater or relative (not the therapist). We realise that this is not often reported clearly: in Description of studies, we noted if this was the case or not.

Endpoint versus change data
There are advantages of both endpoint and change data. Change data can remove a component of between-person variability from the analysis. On the other hand, calculation of change needs two assessments (baseline and endpoint), which can be difficult in unstable and difficult-to-measure conditions such as schizophrenia. We primarily used endpoint data, and would only use change data if the former were not available. Endpoint and change data would be combined in the analysis, as we prefer using mean differences (MD) rather than standardised mean differences throughout (Higgins 2011).

Skewed data
Continuous data on clinical and social outcomes are often not normally distributed. To avoid the pitfall of applying parametric tests to non-parametric data, we applied the following standards to all data before inclusion.
For endpoint data:
• When a scale started from the finite number zero, we subtracted the lowest possible value from the mean, and divided this by the standard deviation (SD). If this value was lower than one, it strongly suggested that the data were skewed and we excluded these data. If this ratio was higher than one but less than two, there was a suggestion that the data were skewed: we entered these data and tested whether their inclusion or exclusion would change the results substantially. Finally, if the ratio was larger than two, we included these data, because it was less likely that they were skewed (Altman 1996; Higgins 2011);
• If a scale started from a positive value (such as the Positive and Negative Syndrome Scale (PANSS), which can have values from 30 to 210 (Kay 1986)), we modified the calculation described above to take the scale starting point into account. In these cases, skewed data are present if $2\,\mathrm{SD} > (S - S_{\min})$, where $S$ is the mean score and $S_{\min}$ is the minimum score.
Please note: we would enter data from studies of at least 200 participants, for example, in the analysis irrespective of the above rules, because skewed data pose less of a problem in large studies. We also would have entered all relevant change data because, when continuous data are presented on a scale that includes a possibility of negative values (such as change data), it is difficult to determine whether data are skewed or not. We would have presented and entered change data into statistical analyses.
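The rule of thumb above can be sketched in a few lines of Python. This is an illustration only, not the review authors' procedure: the function names and return labels are hypothetical, and the treatment of a ratio of exactly one or exactly two is an assumption, because the text does not say which side those boundary values fall on.

```python
def skew_ratio(mean, sd, scale_min=0.0):
    """Return (mean - lowest possible score) / SD for an endpoint scale."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean - scale_min) / sd


def classify_skew(mean, sd, scale_min=0.0):
    """Map the ratio onto the three actions described in the text.

    Assumption: a ratio of exactly 1 is treated as 'possibly skewed' and a
    ratio of exactly 2 as 'include'; the source does not specify this.
    """
    ratio = skew_ratio(mean, sd, scale_min)
    if ratio < 1:
        return "exclude"  # strongly suggests skew
    if ratio < 2:
        return "include_with_sensitivity_check"  # some suggestion of skew
    return "include"  # skew less likely


# Hypothetical values: a scale starting at zero, then a PANSS-like scale
# whose minimum score is 30.
print(classify_skew(mean=10, sd=12))
print(classify_skew(mean=45, sd=10, scale_min=30))
print(classify_skew(mean=90, sd=20, scale_min=30))
```

Passing `scale_min=30` corresponds to the modified calculation for scales that start from a positive value, since a ratio below two is the same condition as 2 SD > (S - S_min).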

Common measurement
To facilitate comparison between trials, we would have converted variables that were reported in different metrics, such as days in hospital (mean days per year, per week or per month) to a common metric (e.g. mean days per month).

Conversion of continuous to binary
Where possible, we would have converted outcome measures to dichotomous data. This can be done by identifying cut-off points on rating scales and dividing participants accordingly into 'clinically improved' or 'not clinically improved'. It is generally assumed that if there is a 50% reduction in a scale-derived score such as the Brief Psychiatric Rating Scale (BPRS, Overall 1962) or the PANSS (Kay 1986), this could be considered as a clinically significant response (Leucht 2005a; Leucht 2005b). If data based on these thresholds were not available, we would have used the primary cut-off presented by the original authors.
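As a rough illustration of the 50% threshold, the sketch below labels one participant as 'clinically improved' or not from baseline and endpoint scores. The function name is hypothetical, and the `scale_min` parameter is an assumption that is not stated in the text: it lets the reduction be computed relative to the lowest possible score for scales that do not start at zero.

```python
def is_clinically_improved(baseline, endpoint, scale_min=0.0, threshold=0.5):
    """Return True if the score fell by at least `threshold` of its baseline.

    Assumption: the reduction is expressed relative to (baseline - scale_min).
    With scale_min=0 this is a plain percentage reduction from baseline.
    """
    span = baseline - scale_min
    if span <= 0:
        raise ValueError("baseline must be above the scale minimum")
    return (baseline - endpoint) / span >= threshold


# Hypothetical scores on a scale that starts at zero.
print(is_clinically_improved(baseline=40, endpoint=20))  # True: 50% reduction
print(is_clinically_improved(baseline=40, endpoint=25))  # False: 37.5% reduction
```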

Direction of graphs
Where possible, we entered data in such a way that the area to the left of the line of no effect indicated a favourable outcome for Avatar Therapy. Where keeping to this made it impossible to avoid outcome titles with clumsy double-negatives (e.g. 'not unimproved'), we reported data where the left of the line indicated an unfavourable outcome and noted this in the relevant graphs.

Assessment of risk of bias in included studies
Review authors (GA and FS) independently assessed risk of bias using criteria described in the Cochrane Handbook for Systematic Reviews of Interventions to assess trial quality (Higgins 2011).
These criteria are based on evidence of associations between overestimate of effect and high risk of bias of the article such as sequence generation, allocation concealment, blinding, incomplete outcome data and selective reporting, or the way in which these 'domains' are reported.
If the raters disagreed, the final rating was made by consensus, with the involvement of another member of the review group. Where inadequate details of randomisation and other characteristics of trials were provided, we attempted to contact authors of the studies to obtain further information. We reported non-concurrence in quality assessment, but if disputes arose regarding the category to which a trial was to be allocated, we resolved this by discussion.
We noted the level of risk of bias in the text of the review and in the 'Risk of bias' summary.

[Figure: 'Risk of bias' graph. Domains shown: random sequence generation (selection bias); allocation concealment (selection bias); blinding of participants and personnel (performance bias), all outcomes; blinding of outcome assessment (detection bias), all outcomes; incomplete outcome data (attrition bias), all outcomes; selective reporting (reporting bias); other bias. Horizontal axis: 0% to 100% of included studies. Categories: low, unclear and high risk of bias.]

Binary data
For binary outcomes, we calculated a standard estimation of the risk ratio (RR) and its 95% confidence interval (CI), as it has been shown that RR is more intuitive than odds ratios (Boissel 1999); and that odds ratios tend to be interpreted as RR by clinicians (Deeks 2000). Although the number needed to treat for an additional beneficial outcome (NNTB) and the number needed to treat for an additional harmful outcome (NNTH), with their CIs, are intuitively attractive to clinicians, they are problematic to calculate and interpret in meta-analyses (Hutton 2009). For binary data presented in the 'Summary of findings' tables, we, where possible, would have calculated illustrative comparative risks.
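A minimal sketch of a risk ratio with a 95% CI on the log scale is given below. The function name is hypothetical, and the handling of zero cells (adding 0.5 to every cell of the two-by-two table) is an assumption, since the text does not describe it. Applied to the leaving-early counts quoted in the abstract (6/14 versus 0/12), the sketch gives values close to the reported RR 11.27 (95% CI 0.70 to 181.41), which suggests, but does not confirm, that a correction of this kind was used.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c, z=1.959964):
    """Risk ratio (treatment / control) with a Wald CI on the log scale.

    Assumption: if any cell of the 2x2 table is zero, 0.5 is added to all
    four cells before calculation.
    """
    a, b = events_t, n_t - events_t  # treatment: events, non-events
    c, d = events_c, n_c - events_c  # control: events, non-events
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    n1, n2 = a + b, c + d
    rr = (a / n1) / (c / n2)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Leaving the study early: 6/14 (Avatar Therapy) versus 0/12 (treatment as usual).
rr, lower, upper = risk_ratio(6, 14, 0, 12)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```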

Continuous data
For continuous outcomes, we estimated mean difference (MD) between groups, with 95% CIs. We preferred not to calculate effect size measures (standardised mean difference (SMD)). However, if scales of very considerable similarity were used, we presumed there was a small difference in measurement, and we calculated effect size and transformed the effect back to the units of one or more of the specific instruments.
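The mean difference and its 95% CI can be sketched as follows for a single two-arm trial. This is a generic large-sample calculation using a normal approximation, not necessarily the exact routine in Review Manager; the function name and the numbers in the example are hypothetical.

```python
import math


def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c, z=1.959964):
    """Mean difference (treatment - control) with a normal-approximation CI."""
    md = mean_t - mean_c
    # Standard error of a difference between two independent means.
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return md, md - z * se, md + z * se


# Hypothetical endpoint scores for two arms of a small trial.
md, lower, upper = mean_difference(10, 4, 16, 12, 3, 9)
print(f"MD {md:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

A CI that includes zero, as in this made-up example, corresponds to what the review describes as no clear difference between groups.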

Cluster trials
Studies increasingly employ 'cluster randomisation' (such as randomisation by clinician or practice) but analysis and pooling of clustered data poses problems. First, authors often fail to account for intraclass correlation in clustered studies, leading to a unit-of-analysis error whereby P values are spuriously low, CIs unduly narrow and statistical significance overestimated (Divine 1992). This causes type I errors (Bland 1997; Gulliford 1999).
Where clustering was not accounted for in primary studies, we presented data in a table, with a (*) symbol to indicate the presence of a probable unit of analysis error. We sought to contact first authors of studies to obtain intraclass correlation coefficients (ICC) for their clustered data and to adjust for this by using accepted methods (Gulliford 1999). Where clustering had been incorporated into the analysis of primary studies, we presented these data as if from a non-cluster randomised study, but adjusted for the clustering effect.
We sought statistical advice and have been advised that the binary data from cluster trials presented in a report should be divided by a 'design effect'. This is calculated using the mean number of participants per cluster (m) and the ICC: thus, design effect = 1 + (m - 1) × ICC (Donner 2002). If the ICC was not reported, we assumed it to be 0.1 (Ukoumunne 1999).
If cluster studies were appropriately analysed taking into account ICCs and relevant data documented in the report, synthesis with other studies would have been possible using the generic inverse variance technique.
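The design-effect adjustment described above can be illustrated with a short sketch. The function names and example numbers are hypothetical; the default ICC of 0.1 follows the assumption stated in the text for trials that do not report one.

```python
def design_effect(mean_cluster_size, icc=0.1):
    """Design effect = 1 + (m - 1) x ICC, with m the mean cluster size."""
    return 1 + (mean_cluster_size - 1) * icc


def adjust_binary_for_clustering(events, total, mean_cluster_size, icc=0.1):
    """Divide events and sample size by the design effect.

    The results are 'effective' counts and need not be whole numbers.
    """
    deff = design_effect(mean_cluster_size, icc)
    return events / deff, total / deff


# Hypothetical cluster trial arm: 20 events among 100 participants,
# mean cluster size 11, ICC not reported (so 0.1 is assumed).
print(design_effect(11))
print(adjust_binary_for_clustering(20, 100, 11))
```

With a mean cluster size of 11 and an ICC of 0.1 the design effect is 2, so the arm contributes the information of roughly 50 independently randomised participants rather than 100.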

Cross-over trials
A major concern of cross-over trials is the carry-over effect. This occurs if an effect (e.g. pharmacological, physiological or psychological) of the treatment in the first phase is carried over to the second phase. As a consequence, on entry to the second phase the participants can differ systematically from their initial state despite a washout phase. For the same reason, cross-over trials are not appropriate if the condition of interest is unstable (Elbourne 2002). As both effects are very likely in severe mental illness, we would only have used data of the first phase of cross-over studies.

Studies with multiple treatment groups
Where a study involved more than two treatment arms, if relevant, the additional treatment arms would have been presented in comparisons. If data were binary, these would simply have been added and combined within the two-by-two table. If data were continuous, we would have combined data following the formula in Section 7.7.3.8 (Combining groups) of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). Where the additional treatment arms were not relevant, we would not have reproduced these data.

Overall loss of credibility
At some degree of loss to follow-up, data must lose credibility (Xia 2009). We chose that, for any particular outcome, should more than 50% of data be unaccounted for, we would not reproduce these data or use them within analyses. If, however, more than 50% of those in one arm of a study were lost, but the total loss was less than 50%, we would have addressed this within the 'Summary of findings' tables by downgrading the certainty of the evidence.

Finally, we also downgraded the certainty of the evidence within the 'Summary of findings' tables should loss have been 25% to 50% in total.
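The attrition rules in this section can be summarised as a small decision function. This is a sketch under stated assumptions: the function name and return labels are hypothetical, and the handling of losses of exactly 25% or exactly 50% is one reading of the text rather than something the review specifies.

```python
def attrition_decision(lost_by_arm, randomised_by_arm):
    """Classify an outcome by loss to follow-up.

    Assumptions: total loss above 50% means the data are not used; total loss
    of 25% to 50%, or loss above 50% in any single arm, means the data are
    used but the certainty of the evidence is downgraded.
    """
    overall = sum(lost_by_arm) / sum(randomised_by_arm)
    if overall > 0.5:
        return "exclude"
    arm_over_half = any(
        lost / n > 0.5 for lost, n in zip(lost_by_arm, randomised_by_arm)
    )
    if arm_over_half or overall >= 0.25:
        return "use_and_downgrade"
    return "use"


# Leaving-early counts quoted in the abstract: 6/14 versus 0/12.
print(attrition_decision([6, 0], [14, 12]))
# Hypothetical trials with heavier losses.
print(attrition_decision([8, 1], [14, 12]))
print(attrition_decision([10, 10], [14, 12]))
```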

Binary
In the case where attrition for a binary outcome was between 0% and 50% and where these data were not clearly described, we presented data on a 'once-randomised-always-analyse' basis (an intention-to-treat (ITT) analysis). Those leaving the study early were all assumed to have the same rates of negative outcome as those who completed, with the exception of the outcome of death and adverse effects. For these outcomes, the rate of those who stayed in the study, in that particular arm of the trial, would have been used for those who did not. We undertook a sensitivity analysis testing how prone the primary outcomes were to change when data only from people who completed the study to that point were compared to the ITT analysis using the above assumptions.

Attrition
We used data where attrition for a continuous outcome was between 0% and 50%, and data only from people who completed the study to that point were reported.

Standard deviations
If SDs were not reported, we would have tried to obtain the missing values from the authors. If these were not available, where there were missing measures of variance for continuous data, but an exact standard error (SE) and CIs available for group means, and either P value or t value available for di erences in mean, we could have calculated SDs according to the rules described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). When only the SE was reported, SDs were calculated by the formula SD = SE × √(n). The Cochrane Handbook for Systematic Reviews of Interventions presents detailed formulae for estimating SDs from P, t or F values; CIs; ranges or other statistics (Higgins 2011). If these formulae did not apply, we would have calculated the SDs according to a validated imputation method, which is based on the SDs of the other included studies (Furukawa 2006). Although some of these imputation strategies can introduce error, the alternative would have been to exclude a given study's outcome and thus to lose information. Nevertheless, we examined the validity of the imputations in a sensitivity analysis that excluded imputed values.
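Two of the conversions mentioned above can be sketched briefly. The first is the formula given in the text, SD = SE × √n. The second, recovering an SD from a 95% CI around a single group mean, is a standard large-sample conversion that is included here as an assumption about which formula would apply; for small samples a t-based multiplier would be more appropriate. Function names are hypothetical.

```python
import math


def sd_from_se(se, n):
    """SD of a group from the standard error of its mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.959964):
    """SD of a group from a 95% CI around its mean (normal approximation).

    Assumption: the CI was built as mean +/- z * SE, so the CI width
    equals 2 * z * SE.
    """
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)


# Hypothetical group of 16 participants with SE 0.5.
print(sd_from_se(0.5, 16))
```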

Assumptions about participants who left trials early or were lost to follow-up
Various methods are available to account for participants who left the trials early or were lost to follow-up. Some trials just present the results of study completers; others use the method of last observation carried forward (LOCF); while more recently, methods such as multiple imputation or mixed-effects models for repeated measurements (MMRM) have become more of a standard. While the latter methods seem to be somewhat better than LOCF (Leon 2006), we feel that the high percentage of participants leaving the studies early and differences between groups in their reasons for doing so is often the core problem in randomised schizophrenia trials. Therefore, we did not exclude studies based on the statistical approach used. However, by preference we used the more sophisticated approaches, that is, we preferred to use MMRM or multiple-imputation to LOCF, and we only presented completer analyses if some type of ITT data were not available at all. Moreover, we addressed this issue in the item 'Incomplete outcome data' of the 'Risk of bias' tool.

Clinical heterogeneity
We considered all included studies initially, without seeing comparison data, to judge clinical heterogeneity. We simply inspected all studies for participants or situations that were clearly outliers or that we had not predicted would arise and, where found, discussed such situations or participant groups.

Methodological heterogeneity
We considered all included studies initially, without seeing comparison data, to judge methodological heterogeneity. We simply inspected all studies for clearly outlying methods that we had not predicted would arise and discussed any such methodological outliers.

Visual inspection
We inspected graphs visually to investigate the possibility of statistical heterogeneity.

Employing the I² statistic
We investigated heterogeneity between studies by considering the I² statistic alongside the Chi² P value. The I² statistic provides an estimate of the percentage of inconsistency thought to be due to heterogeneity rather than chance (Higgins 2003). The importance of the observed value of the I² statistic depends on 1. magnitude and direction of effects and 2. strength of evidence for heterogeneity (e.g. P value from Chi² test, or a CI for the I² statistic). An I² estimate of about 50% or greater accompanied by a statistically significant Chi² statistic would be interpreted as evidence of substantial levels of heterogeneity (Section 9.5.2; Higgins 2011). When we found substantial levels of heterogeneity in the primary outcome, we explored reasons for heterogeneity (Subgroup analysis and investigation of heterogeneity).
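For readers unfamiliar with the calculation, the sketch below shows how the heterogeneity statistic (Cochran's Q, which is referred to a Chi² distribution) and I² are conventionally derived from study effect estimates and their standard errors, using I² = 100% × (Q − df)/Q truncated at zero (Higgins 2003). The function name and the three studies are hypothetical; the P value for Q is omitted because it needs a Chi² distribution function from a statistics library.

```python
def heterogeneity(effects, ses):
    """Return (Q, df, I2 in %) for inverse-variance weighted study effects."""
    weights = [1 / se**2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Three hypothetical studies with equal precision but different effects.
q, df, i2 = heterogeneity([0.1, 0.5, 0.9], [0.2, 0.2, 0.2])
print(f"Q = {q:.2f}, df = {df}, I2 = {i2:.0f}%")
```

With only one study per outcome, as in most analyses here, df is zero and I² cannot be meaningfully estimated.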

Protocol versus full study
Reporting biases arise when the dissemination of research findings is influenced by the nature and direction of results. These are described in Section 10.1 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We searched for protocols of included randomised trials. If the protocol was available, we compared outcomes in the protocol and in the published report. If the protocol was not available, we compared outcomes listed in the Methods section of the trial report with actually reported results.

Funnel plot
Reporting biases arise when the dissemination of research findings is influenced by the nature and direction of results (Egger 1997). These are described in Section 10 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We are aware that funnel plots may be useful in investigating reporting biases but are of limited power to detect small-study effects. We did not use funnel plots for outcomes where there were 10 or fewer studies, or where all studies were of similar sizes. In other cases, where funnel plots were possible, we sought statistical advice in their interpretation.

Data synthesis
We understand that there is no closed argument for preference for use of fixed-effect or random-effects models. The random-effects method incorporates an assumption that the different studies are estimating different, yet related, intervention effects. This often seems to be true to us and the random-effects model takes into account differences between studies even if there is no statistically significant heterogeneity. However, there is a disadvantage to the random-effects model as it puts added weight onto small studies, which often are the most biased ones. Depending on the direction of effect, these studies can either inflate or deflate the effect size.
We chose the random-effects model for all analyses. The reader is, however, able to choose to inspect the data using the fixed-effect model.
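The point about added weight on small studies can be seen in a short sketch. It uses inverse-variance pooling with the DerSimonian-Laird estimate of between-study variance, which is one common random-effects method; the review does not state which estimator its software applied, so this choice is an assumption, as are the function name and the three hypothetical studies.

```python
import math


def pool(effects, ses, model="random"):
    """Inverse-variance pooling; returns (estimate, SE, relative weights)."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    tau2 = 0.0  # between-study variance; stays 0 for the fixed-effect model
    if model == "random":
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
        df = len(effects) - 1
        c = sum(w) - sum(wi**2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1 / (se**2 + tau2) for se in ses]
    total = sum(w_star)
    estimate = sum(wi * yi for wi, yi in zip(w_star, effects)) / total
    return estimate, math.sqrt(1 / total), [wi / total for wi in w_star]


# Hypothetical studies: the third is the smallest (largest SE).
effects, ses = [0.1, 0.5, 0.9], [0.1, 0.2, 0.4]
for model in ("fixed", "random"):
    est, se, weights = pool(effects, ses, model)
    print(model, round(est, 3), round(se, 3), [round(x, 3) for x in weights])
```

In this example the smallest study carries a larger share of the weight under the random-effects model than under the fixed-effect model, and the pooled estimate moves towards it.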

Primary outcomes
We did not conduct a subgroup analysis as we did not anticipate sufficient power to carry it out.

Clinical state, stage or problem
We proposed to undertake this review and provide an overview of the effects of Avatar Therapy for people with schizophrenia in general. In addition, we reported data on subgroups of people in the same clinical state, stage and with similar problems.

Investigation of heterogeneity
We reported if inconsistency was high. First, we investigated whether data had been entered correctly. Second, if data were correct, we inspected the graph visually and removed outlying studies successively to see if homogeneity was restored. For this review, we decided that should this occur with data contributing to the summary finding of no more than around 10% of the total weighting, we presented data. If not, we would not pool these data and would discuss any issues. We knew of no supporting research for this 10% cut-off but investigated use of prediction intervals as an alternative to this unsatisfactory state.
When unanticipated clinical or methodological heterogeneity were obvious we would have simply stated hypotheses regarding these for future reviews or versions of this review. We did not anticipate undertaking analyses relating to these.

Implication of randomisation
We aimed to include trials in a sensitivity analysis if they were described in some way that implied randomisation. For the primary outcomes, we included these studies and, if there was no substantive difference when the implied randomised studies were added to those with a better description of randomisation, then we employed all data from these studies.

Assumptions for lost binary data
Where assumptions had to be made regarding people lost to follow-up (see Dealing with missing data), we would have compared the findings of the primary outcomes when we used our assumption compared with completer data only. If there was a substantial difference, we would have reported results and discussed them, but continued to employ our assumption.
Where assumptions had to be made regarding missing SD data (see Dealing with missing data), we would have compared the findings of primary outcomes when we used our assumption compared with complete data only. We would have undertaken a sensitivity analysis to test how prone results were to change when completer data only were compared to the imputed data using the above assumption. If there was a substantial difference, we would have reported results and discussed them, but continued to employ our assumption.

Risk of bias
We analysed the effects of excluding trials that were judged at high risk of bias across one or more of the domains of randomisation (implied as randomised with no further details available), allocation concealment, blinding and outcome reporting for the meta-analysis of the primary outcome (see Assessment of risk of bias in included studies). If the exclusion of trials at high risk of bias did not substantially alter the direction of effect or the precision of the effect estimates, then data from these trials were included in the analysis (Figure 2).

Imputed values
We undertook a sensitivity analysis to assess the effects of including data from trials where we used imputed values for ICC in calculating the design effect in cluster-randomised trials.
If there were substantial differences in the direction or precision of effect estimates in any of the sensitivity analyses, we did not pool data from the excluded trials with the other trials contributing to the outcome, but presented them separately.
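To make concrete why an imputed ICC warrants a sensitivity analysis, the sketch below uses the standard design effect formula, 1 + (m − 1) × ICC, where m is the average cluster size, and shows how the effective sample size shifts as the assumed ICC changes. The function names, cluster size, sample size and ICC values are all hypothetical and do not come from any included trial.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster-randomised trial: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n: int, mean_cluster_size: float, icc: float) -> float:
    """Sample size after dividing by the design effect."""
    return n / design_effect(mean_cluster_size, icc)


# Hypothetical trial: 190 participants in clusters of 10 on average.
for icc in (0.01, 0.05, 0.1):  # range of imputed ICC values to test
    print(icc, round(effective_sample_size(190, 10, icc), 1))
```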

Fixed and random effects
We synthesised data using a fixed-effect model; however, we also synthesised data for the primary outcomes using a random-effects model to evaluate whether this altered the significance of the results.

Description of studies
The search identified three studies as eligible for inclusion in this review (ISRCTN65314790; Le 2013; NCT03148639), and one ongoing study (NCT03585127). For more details, see Characteristics of included studies, Characteristics of excluded studies, and Characteristics of ongoing studies tables.

Results of the search
The search strategy was run in February 2013, in December 2016 and on 9 November 2018, and found 14 possibly relevant references. We retrieved all of these for detailed evaluation. Following inspection of the abstracts and, when necessary, full papers, the 14 reports referred to four different studies. We did not exclude any study because all four of them met our inclusion criteria. However, one study was identified as an ongoing study (NCT03585127). See also Figure 1.

Included studies
The three included trials randomised 195 participants. We extracted data from the full-text publications of these trials, which were published on 21 February 2013 (Le 2013), 23 November 2017 (ISRCTN65314790), and 18 February 2018 (NCT03148639). The characteristics of these trials are also summarised in the Characteristics of included studies table.

Design and duration
All studies were stated to be randomised. Study duration across the three included trials was identical; short-term and for seven weeks. Study design varied: two were partial crossover (Le 2013; NCT03148639), and one was cross-sectional (ISRCTN65314790). Blinding for two of the trials was single blind (ISRCTN65314790; Le 2013), and one had no blinding (NCT03148639). Researchers in the UK carried out two of the trials (ISRCTN65314790; Le 2013), and one took place in Canada (NCT03148639).

Participants
All participants were randomly assigned to the treatment groups.
In the ISRCTN65314790 study, participants had a primary diagnosis of schizophrenia spectrum (International Classification of Diseases 10th Edition: Schizophrenia, schizotypal, delusional, and other non-mood psychotic disorders; ICD-10 F20-29) or affective disorder (F30-39) with psychotic symptoms, paranoid schizophrenia, schizoaffective disorder, bipolar disorder, unspecified nonorganic psychosis, schizophrenia unspecified and depression with psychotic symptoms. These people also had a history of enduring auditory verbal hallucinations during the previous 12 months despite continued treatment. The participants in Le 2013 had been hearing persecutory voices for at least six months, which had not responded adequately to antipsychotic medication, with a history of hearing voices for more than one year. In NCT03148639 participants were diagnosed with schizophrenia or schizoaffective disorder with a history of treatment-resistant schizophrenia.
The age of participants ranged from 14 to 75 years, with parental consent obtained for those under 18 years. All studies included males and females, with two studies clearly reporting the male to female ratio (ISRCTN65314790; Le 2013).

Size
The number of participants included in each study ranged between 19 and 150 but only one study included over 100 participants (ISRCTN65314790).

Setting
Le 2013 stated that they were focused on inpatients in hospital at community mental health teams in Camden and Islington Mental Health Trust, UK. ISRCTN65314790 took place in south London and Maudsley NHS Trust, UK at sites remote from the clinic, while NCT03148639 took place in the Institut Universitaire en Santé Mentale de Montréal, Canada, and in the community.

Avatar Therapy
ISRCTN65314790 reported that during the Avatar Therapy, participants created a computerised representation of the entity that they believed was the source of their main voice and then the team set up the avatar in an introductory session, which included a comprehensive assessment of the voice(s) and included verbatim content. About 10 to 15 minutes of each session involved face-to-face work with the avatar, wherein the therapist facilitated a direct dialogue between the participant and the avatar. Participants sat in one room facing their avatar on a computer monitor. The therapist sat in a second room with a control panel that allowed them to speak in his or her own voice, or as the avatar.
Le 2013 and ISRCTN65314790 reported that the participants created a representation of the entity they imagined as the source of their voice using computerised face animation software. The third study (NCT03148639) reported that the participants created an avatar best resembling the most distressing person or entity believed to be the source of the malevolent voice, which was designed to closely match both the face and the voice of the 'persecutor'. Over the course of the Avatar Therapy, the avatar's interaction with the participant became gradually less abusive and more supportive.

Treatment as usual/supportive counselling
Two trials compared Avatar Therapy with treatment as usual (Le 2013; NCT03148639). In Le 2013, treatment as usual was the patient's ongoing antipsychotic medication prescribed and supervised by their referring psychiatrist, and in NCT03148639, they offered antipsychotic treatment and usual meetings with their treating clinicians. ISRCTN65314790 used supportive counselling as a control group. In this study, the intervention comprised a manual-based, face-to-face supportive counselling approach adapted with permission from that employed by the SoCRATES (Acute Stroke or Transient Ischemic Attack Treated With Aspirin or Ticagrelor and Patient Outcomes) Trial Group delivered by graduate assistant psychologists who were recruited based on extensive experience of working therapeutically in a psychosis context.

General
The trials reported mental state, insight, global state, leaving the study early, adverse events and quality of life. None of the included studies reported death, relapse or direct economic evaluation of Avatar Therapy. Most reported outcomes were continuous and many were skewed.
DASS-21 is a set of three self-report scales designed to measure the emotional states of depression, anxiety and stress. Each of the three DASS-21 scales contains seven items, divided into subscales with similar content.
• Scale for the Assessment of Positive Symptoms (SAPS) (Andreasen 1984) SAPS is a 34-item scale in which clinicians rate each item between 0 and 5 to measure positive symptoms in schizophrenia across four positive symptom domains: delusions, positive formal thought disorder or disconnected thinking, bizarre behaviour and hallucinations. A higher score indicates more severe positive symptoms.
• Positive and Negative Syndrome Scale (PANSS) (Kay 1986) PANSS consists of 30 items and is rated using a seven-point severity scale from absent to severe. It measures positive symptoms, negative symptoms and an overall total score of general psychopathology. A higher score indicates more severe symptoms. Mean total score from this scale were reported by five studies, three studies collected separate data for both positive and negative symptoms (Ahmed 2015; Bryce 2018; Holzer 2014).
• Psychotic Symptom Rating Scales (PSYRATS) (Haddock 1999) PSYRATS are semi-structured interviews designed to assess the subjective characteristics of hallucinations and delusions. They are used to capture the severity of several dimensions of auditory hallucinations and delusions across two subscales, an auditory hallucination subscale and a delusions subscale. The auditory hallucination subscale includes 11 items while the delusions subscale includes six items, with both subscales using a 5-point ordinal scale to rate symptoms. Scores for the auditory hallucination subscale can range from 0 to 44 and for the delusions subscale from 0 to 24, with higher scores indicating more severe symptoms. All items are scored 0 to 4, according to general criteria: 0 = no problem, 1 = minimal or occasional, 2 = minor to moderate, 3 = major and 4 = maximum severity.
• Scale for the Assessment of Negative Symptoms (SANS) (Andreasen 1989) SANS was the first instrument developed to provide comprehensive assessment of negative symptoms in schizophrenia. It consists of five scales that evaluate five different aspects of negative symptoms: alogia, affective blunting, avolition-apathy, anhedonia-asociality and attentional impairment. Each of these negative symptoms can be rated globally, but, in addition, detailed observations are made in order to achieve the global rating. It is complemented by SAPS, which permits detailed evaluation and global ratings of hallucinations, delusions, positive formal thought disorder and bizarre behaviour. Together, the two scales provide a comprehensive set of rating scales to measure the symptoms of schizophrenia and to assess their change over time.

• Perceived Stress Reactivity Scale (PSRS) (Scholtz 2011)
PSRS is a 23-item questionnaire with five subscales and one overall scale, based on an existing German-language instrument. Perceived stress reactivity and related constructs were assessed in 2040 participants from the UK, the US and Germany. The five-factor structure of the PSRS was found to be similar in the three countries. In the US sample, the questionnaire was applied using two modes of administration (paper-pencil and computerised), and measures were repeated after four weeks. Measurement invariance analyses demonstrated full invariance across mode of administration and partial invariance across gender and countries. Scale scores differed between countries and genders, with women scoring higher on most scales.

Mental state: insight
• Revised Beliefs about Voices Questionnaire (BAVQ-R) (Chadwick 2000) The BAVQ-R is used to measure the beliefs people hold about auditory hallucinations, and both their emotional and behavioural reactions to them, together with two styles of responding (engagement and resistance). The BAVQ-R is widely used in clinical and research settings, yet it has not received validation of its constructs and factor structure. The scale has 35 items across five subscales. The subscales relating to beliefs are a six-item malevolence subscale, a six-item benevolence subscale and a six-item omnipotence subscale. The two subscales relating to emotional and behavioural reactions are a five-item resistance subscale and a four-item engagement subscale. All responses are rated on a 4-point scale from 0 to 3 and a high score indicates extreme distress.

Quality of life
• Manchester Short Assessment of Quality of Life (MANSA) (Priebe 1999) The MANSA consists of three sections:
  • personal details that are supposed to be consistent over time (date of birth, gender, ethnic origin, and diagnosis);
  • personal details that may potentially vary over time and have to be re-documented if change has occurred (education; employment status including type of occupation and working hours per week; monthly income; state benefits; living situation including number of children, people the patient lives with, and type of residence);
  • 16 questions that are asked every time the instrument is applied. Four of these questions are termed objective and are answered with yes or no. Twelve questions are strictly subjective.
The subjective questions obtain satisfaction with life as a whole, job (or sheltered employment, or training/education, or unemployment/retirement), financial situation, number and quality of friendships, leisure activities, accommodation, personal safety, people the patient lives with (or living alone), sex life, relationship with family, physical health and mental health. As in the Lancashire Quality of Life Profile, satisfaction is rated on 7-point rating scales (1 = negative extreme, 7 = positive extreme).
• Rosenberg Self-Esteem Scale (RSES) (Rosenberg 1965) The RSES is widely used in social-science research. It uses a scale of 0 to 40 where a score less than 15 may indicate a problematic low self-esteem.
The RSES is designed similarly to social-survey questionnaires. It is a 10-item Likert-type scale with items answered on a 4-point scale, from strongly agree to strongly disagree. Five of the items have positively worded statements and five have negatively worded statements. The scale measures state self-esteem by asking the respondents to reflect on their current feelings. The RSES is considered a reliable and valid quantitative tool for self-esteem assessment.
Implicit measures of self-esteem began to be used in the 1980s. These rely on indirect measures of cognitive processing thought to be linked to implicit self-esteem, including the Name Letter Task. Such indirect measures are designed to reduce awareness of the process of assessment. When assessing implicit self-esteem, psychologists present self-relevant stimuli to the participant and then measure how quickly the person identifies positive or negative stimuli. For example, if a woman were given the self-relevant stimuli of female and mother, psychologists would measure how quickly she identified the negative word, evil, or the positive word, kind.

Missing outcomes
Data were reported on the majority of primary outcomes. However, none of the trials addressed economic outcomes. All three studies reported their data as mean endpoint rather than change.
Several outcomes were in effect missing because of problematic reporting. ISRCTN65314790 did not separate data for each group for the Reactions to Research Participation Questionnaire (RRPQ), rendering the data useless. NCT03148639 reported anxiety and fear data only within a simple graph from which numbers could not be extracted.
Finally, all three studies reported skewed data, which are difficult to analyse within systematic reviews.

Funding
One study was funded by the National Institute of Health

Conflicts of Interest
Two of the authors of ISRCTN65314790 declared patents pending on the avatar system and in Le 2013 there also could be conflicts of interest given that the authors involved in the study were the same team that designed Avatar Therapy.

Excluded studies
There were no excluded studies.

Studies awaiting classification
There are no studies awaiting classification.

Ongoing studies
There is currently one ongoing study in this review (NCT03585127). This small study comparing Avatar Therapy with a more cognitive behavioural approach should be complete (due to finish December 2019) and we will include the results in the review update.

Risk of bias in included studies
Overall risk of bias in individual studies is also illustrated in Figure 2, Figure 3 and the Characteristics of included studies table.

Allocation
We assessed two studies at low risk for random sequence generation. Le 2013 randomised using a computer-generated series with 12 blocks by an independent statistician and in ISRCTN65314790, the participants were randomly assigned (1:1) to receive Avatar Therapy or supportive counselling with randomised permuted blocks. NCT03148639 was at high risk as they described participants randomly allocated (1:1 ratio) to either Avatar Therapy or treatment as usual but this ratio does not fit with the actual number of participants in the two groups.
For allocation concealment, ISRCTN65314790 was at low risk as an independent web-based service was provided to maintain allocation concealment. Le 2013 was at unclear risk: although participants were randomised into one of two groups by an independent statistician using a computer-generated series with blocks of 12, it is not clear if all of the team were informed of the allocated group or whether care co-ordinators/case managers, blinded to other allocation decisions, informed the study's participants. We assessed NCT03148639 at unclear risk as they did not provide any clear and detailed information about allocation concealment.

Blinding
We ranked ISRCTN65314790 at unclear risk for both performance and detection bias as the study was blind only for assessors. Le 2013 had an unclear risk of performance and detection bias as it was not clear whether there was any blinding. NCT03148639 was also at unclear risk as they did not report any information on blinding.

Incomplete outcome data
For incomplete outcome reporting we had to rank ISRCTN65314790 at high risk, as there were no clear data about the adverse events and unusable information about satisfaction (RRPQ). We ranked Le 2013 at high risk, as there was attrition from the intervention group; however, no further analysis was undertaken to account for these missing data. Also, there was a difference between the protocol and the full publication of the trial about the duration of the trial, changing from 11 months to six weeks. We ranked NCT03148639 at high risk, as they reported post-crossover data, amalgamating the Avatar Therapy plus treatment as usual rather than reporting the outcomes for Avatar Therapy and treatment as usual separately. We contacted the authors via email for more information. They supplied data on loss to follow-up, rehospitalisation and the mean outcome data for the Avatar Therapy group before crossover.

Selective reporting
None of the studies were free from selective reporting. ISRCTN65314790 was at unclear risk, as they frequently used "significantly greater improvement" but did not describe how this was measured; they also reported confusing information in their CONSORT flow diagram. Le 2013 was at high risk for reporting bias as some of the important outcomes such as adverse events were not clearly reported and a protocol was not published prior to the study results being reported. NCT03148639 was at high risk, as there was no clear information about allocation concealment, blinding, adverse events and number of participants.

Other potential sources of bias
None of the studies were free from other biases. We had concerns with ISRCTN65314790, as two of the authors declared patents pending on the avatar system used in the study. We also assessed Le 2013 at high risk of other bias, as the study was funded by the NIHR and Bridging Funding from Camden & Islington NHS Foundation Trust but the authors involved in the study were the same team that had designed Avatar Therapy. In addition, information in the published trial was different to the submitted abstract. NCT03148639 was at unclear risk as two of the study authors were holders of a grant from Otsuka Pharmaceuticals.

Effects of interventions
See: Summary of findings 1 Avatar Therapy compared to treatment as usual (all short-term) for schizophrenia or related disorders; Summary of findings 2 Avatar Therapy compared to supportive counselling (all short-term) for schizophrenia or related disorders We identified three randomised trials from which it was possible to extract numerical data. We categorised studies into two comparisons: Avatar Therapy versus treatment as usual and Avatar Therapy versus supportive counselling.

Mental state: 1c. Specific -insight -average endpoint attitude to voice score (BAVQ-R attitude to voices, high = poor, skewed data) -short term
Data reported for this outcome were skewed and we have presented them as 'other data' (Analysis 1.3).

Mental state: 1d. Specific -depression -average endpoint score (various scales, high = poor, skewed data) -short term
Data reported for this outcome were skewed and we have presented them as 'other data' (Analysis 1.4).

Mental state: 2c. General -total -average endpoint score (PSRS, high = poor, skewed data) -short term
Data reported for this outcome were skewed and we have presented them as 'other data' (Analysis 1.8).

Global state: 1. Any change -needing counselling and support -short term
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 1.85, 95% CI 0.09 to 40.05; studies = 1; participants = 19; Analysis 1.9).

Leaving study early: 1. For any reason -short term
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 11.27, 95% CI 0.70 to 181.41; studies = 1; participants = 26; low-certainty evidence; Analysis 1.11).
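The very wide interval here follows from the zero count in the control arm (6/14 versus 0/12 left early, as given in the abstract). The sketch below computes a risk ratio with a 0.5 continuity correction added to every cell when an arm has no events. Assuming that correction, which is common in meta-analysis software but is not stated in the review, the calculation reproduces the reported figures to within rounding. The function name is hypothetical.

```python
import math


def risk_ratio(e1: float, n1: float, e2: float, n2: float, z: float = 1.959964):
    """Risk ratio with a Wald CI on the log scale.

    Assumption: when either arm has zero events, 0.5 is added to each cell
    of the 2x2 table (so events + 0.5 and totals + 1 in both arms).
    """
    if e1 == 0 or e2 == 0:
        e1, e2 = e1 + 0.5, e2 + 0.5
        n1, n2 = n1 + 1, n2 + 1
    rr = (e1 / n1) / (e2 / n2)
    se_log = math.sqrt(1 / e1 - 1 / n1 + 1 / e2 - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Leaving the study early: 6/14 (Avatar Therapy) versus 0/12 (treatment as usual).
rr, lower, upper = risk_ratio(6, 14, 0, 12)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Because the estimate depends heavily on the correction applied to the empty cell, it is best read as showing imprecision rather than the size of any effect.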

Leaving study early: 2. For specific reason -short term
One trial reported specific reasons for attrition (Analysis 1.12).

Change in medication
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 2.60, 95% CI 0.12 to 58.48; studies = 1; participants = 26).

Fear of voice
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 9.53, 95% CI 0.58 to 156.49; studies = 1; participants = 26).

Adverse events: anxiety -short term
One trial reported incidence of anxiety (Analysis 1.13).

After first session
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 5.54, 95% CI 0.34 to 89.80; studies = 1; participants = 19; low-certainty evidence).

Self-reported
No participants self-reported anxiety.

Quality of life: average endpoint score (QLESQ-SF, high = good) -short term
Analyses of reported quality of life data showed participants in the Avatar Therapy group had clearly higher average QLESQ-SF endpoint scores (MD 9.99, 95% CI 3.89 to 16.09; studies = 1; participants = 19; very low-certainty evidence; Analysis 1.14).

Mental state: 1a. Specific -positive -average endpoint score (SAPS, high = poor, skewed data)
Data reported for this outcome were skewed and we have presented them as 'other data' (Analysis 2.1).

Mental state: 1b. Specific -insight -average endpoint score (various scales, high = poor) -short term
Various measures of insight were reported, using various scales (Analysis 2.2).

Acceptance of voice (Voices Acceptance and Action Scale; VAAS)
Analysis of insight data showed participants in the Avatar Therapy group had clearly higher average VAAS 'acceptance of voice' endpoint scores (MD 4.73, 95% CI 1.40 to 8.06; studies = 1; participants = 124).

Power of voice (Voice Power Differential Scale; VPDS)
Analysis of reported data showed no clear difference between treatment groups for this outcome (MD -0.36, 95% CI -0.89 to 0.17; studies = 1; participants = 115).

Mental state: 1c. Specific -depression and anxiety -average endpoint score (high = poor, skewed data) -short term
Data reported for this outcome were skewed and we have presented them as 'other data' (Analysis 2.3).

Mental state: 1d. Specific -negative -average endpoint score (SANS, high = poor, skewed data) -short term
Data reported for this outcome were skewed and we have presented them as 'other data' (Analysis 2.4).

Leaving study early: 1a. For any reason
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 1.06, 95% CI 0.59 to 1.89; studies = 1; participants = 150; moderate-certainty evidence; Analysis 2.6).

Leaving study early: 1b. For specific reason -short term
One trial reported specific reasons for leaving the study early (Analysis 2.7).

Not contactable
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 3.00, 95% CI 0.63 to 14.39; studies = 1; participants = 150).

Refused
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 0.83, 95% CI 0.38 to 1.81; studies = 1; participants = 150).

Unwell
Analysis of reported data showed no clear difference between treatment groups for this outcome (RR 0.67, 95% CI 0.11 to 3.88; studies = 1; participants = 150).

Attended no sessions
Analysis of reported data gave a confidence interval that excludes 1 for this outcome, suggesting a lower risk in the Avatar Therapy group (RR 0.33, 95% CI 0.13 to 0.87; studies = 1; participants = 150).
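
As a reading aid for the four reason-specific results above: a risk ratio whose 95% CI spans 1 is compatible with no difference between groups. The minimal sketch below uses only the figures quoted in the text; the helper name `spans_null` is ours and is not part of the review's methods.

```python
# (label, RR, lower 95% limit, upper 95% limit) as quoted in the text for Analysis 2.7
reported = [
    ("Not contactable", 3.00, 0.63, 14.39),
    ("Refused", 0.83, 0.38, 1.81),
    ("Unwell", 0.67, 0.11, 3.88),
    ("Attended no sessions", 0.33, 0.13, 0.87),
]


def spans_null(lower, upper, null=1.0):
    """Return True when the interval includes the null value (1 for a ratio)."""
    return lower <= null <= upper


# Collect the outcomes whose interval does not include the null value.
excluding_null = [label for label, _, lower, upper in reported
                  if not spans_null(lower, upper)]
print(excluding_null)
```

This check concerns statistical compatibility only; it says nothing about clinical importance or the certainty of the evidence.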

Quality of life: average endpoint score (various scales, high = good) - short term
Quality of life data were reported using various scales (Analysis 2.8).

Avatar Therapy compared to treatment as usual (all short-term) for schizophrenia or related disorders
Two trials compared Avatar Therapy to treatment as usual (Le 2013; NCT03148639).

Mental state
It is possible that, in our protocol, we set expectations too high for measures of mental state. We had hoped to find some binary outcomes such as 'improved to an important extent', but the small trial (participants = 19) reported only continuous measures. It is understandable that a pilot study uses fine-grained measures as outcomes to see whether there is a suggestion of a shift on these, hopefully sensitive, measures. There was, however, no such suggestion, even on these measures, and we found no data for binary outcomes.
The study did not make data available so that binary analyses could be undertaken, even though such analyses would have been grossly underpowered. We would like more data sets to be made available so that findings can be combined as new trials emerge. Currently our certainty about any mental state evidence is very low, but there are very few data on which to base any degree of certainty.

Global state
The same applied to global state measures. There were very few data from one small short-term trial (participants = 19). However, relapse was reported, albeit in the negative. This 'more concrete' outcome is useful when it is impossible to blind outcome measurement.

Leaving the study early
Six of 14 people left the Avatar Therapy group, and none of the 12 people assigned to the treatment as usual group left the trial. This could be a function of the trial design (allowing exit from the avatar group with more ease than from the treatment as usual group), and this is probably the explanation. However, there is a worry that this could indicate that the Avatar Therapy approach was, somehow, unacceptable to people with schizophrenia. The evidence is of low certainty (Analysis 1.11), but this is one finding that should be closely monitored in future studies.
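
The width of the interval for this comparison follows from the zero count in the treatment as usual group. The sketch below is an illustration under stated assumptions, not a description of the review's software. It assumes the common convention of adding 0.5 to every cell of the 2×2 table when any cell is zero, and a Wald interval on the log scale. The function name `risk_ratio_ci` is ours.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(events_a, total_a, events_b, total_b, level=0.95):
    """Risk ratio (group A versus group B) with a Wald CI on the log scale.

    Assumption: if any cell of the 2x2 table is zero, 0.5 is added to
    every cell before calculation (a common continuity correction).
    """
    a, b = events_a, total_a - events_a      # group A: events, non-events
    c, d = events_b, total_b - events_b      # group B: events, non-events
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    n_a, n_b = a + b, c + d
    rr = (a / n_a) / (c / n_b)
    se_log_rr = math.sqrt(1 / a - 1 / n_a + 1 / c - 1 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# 6 of 14 left the Avatar Therapy group; 0 of 12 left treatment as usual.
rr, lower, upper = risk_ratio_ci(6, 14, 0, 12)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the calculation should land close to the estimate reported for this comparison (RR 11.27, 95% CI 0.70 to 181.41). With no events in one arm, the estimate depends heavily on the correction used, which is one reason to read it cautiously.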

Anxiety - after first session
There was no clear difference between Avatar Therapy and treatment as usual for number of participants with anxiety after their first session of treatment, but we have low certainty for this finding (Analysis 1.13), and it does seem to concur with the 'leaving the study early' finding reported and discussed above. The therapy could increase anxiety. This, in itself, may not be problematic and could have a positive effect, but this is speculation at this point. We have less evidence of a positive effect than we have of a suggestion of the Avatar Therapy being unacceptable and causing anxiety.

Average endpoint score (QLESQ-SF, high score = good)
Average scores were not what we prestated in our protocol as a preferred outcome. We had high levels of uncertainty about any finding related to quality of life, but the one small trial that reported on this found a possible small positive effect for Avatar Therapy (participants = 19; Analysis 1.14). This could be a random finding, a spurious or rogue result. It is impossible to know. Should it be a real effect, it will be possible to replicate it in future, larger trials.

Functioning
Simple recording of functioning, on day-to-day tasks, is not difficult to add to trial design. We look forward to this being included in future studies.

Avatar Therapy compared to supportive counselling (all short-term) for schizophrenia or related disorders
One trial compared Avatar Therapy to supportive counselling (ISRCTN65314790).

Mental state
The trial (participants = 124) did not report binary data, or data in a form that could be converted to binary. The nearest proxy measure for a general mental state outcome did, however, show benefit for those allocated to Avatar Therapy (low-certainty evidence; Analysis 2.5). It could be that the supportive counselling was detrimental and that the Avatar Therapy approach is better only in contrast to this. However, if this is a real effect, it should be replicated and explained. We remain unclear what an average drop of around 5 points on the PSYRATS score would mean for a person with schizophrenia.
When it came to a more specific measure (insight) the finding was not dissimilar (low-certainty evidence; Analysis 2.2). Again, it does seem positive that there is a drop of around 8 points on the BAVQ-R, but we are unclear if this would really make any noticeable difference clinically.

Global state
The trial did not report this relatively simple outcome. This is a missed opportunity as such a finding would be of interest and of value to future updates as more data become available.

Leaving the study early
The study had a small proportion of participants leaving early (about 20%), over a long period of time, which was balanced between groups (moderate-certainty evidence; Analysis 2.6).

Adverse effects: at least one important adverse effect
The trial did not address adverse effects, which is concerning as, at the very least, studies should report them. It is known that supportive therapy can have adverse consequences (Buckley 2015), so the lack of a clear mechanism for recording these is a weakness of the trial design. A drug trial not attempting to record adverse effects would not be allowed to proceed. It is difficult to argue that therapy trials should have different and lesser standards.


Quality of life: any change in quality of life
We found very low-certainty evidence for quality of life (Analysis 2.8), and there was no clear difference between groups. These are important outcomes that should be replicated in future trials.

Functioning
The trial did not address functioning, which greatly devalued this potentially important trial.

Overall completeness and applicability of evidence

Completeness
This systematic review included three studies (participants = 195). Two studies compared Avatar Therapy with treatment as usual (Le 2013; NCT03148639), and one study compared Avatar Therapy with supportive counselling (ISRCTN65314790). Pooling of data from these studies was not possible. All three studies reported short-term outcomes (seven sessions in seven weeks). Therefore, we have no information about the medium- or long-term effects of the interventions used in these trials, and the numerical data we do have are uncertain.
None of the studies reported information related to general functioning, cognitive functioning or economic outcomes (direct, indirect and cost-effectiveness). The data that are reported are not fully consistent across the small studies, making synthesis problematic. Much of the data reported were skewed. This is not a criticism of these data, as it is likely that many of the so-called continuous measures do produce data with considerable skew. The criticism is that these data are not publicly available. Reporting was incomplete.

Applicability
The three studies had a wide group of participants and included both men and women aged between 14 and 75 years. All studies, however, were conducted in high-income countries and in centres of excellence, so the findings would currently be difficult to apply outside that environment. Much more widespread, generalisable, real-world work would have to be done to support wide applicability.

Quality of the evidence
In these decades after CONSORT (www.consort-statement.org), no trial should be unclear about how randomisation was undertaken and then concealed. The three included studies were at risk of bias at this most fundamental of stages. With thoughtful design and good conduct and reporting, all future Avatar Therapy studies could have risk of bias minimised. However, at present, the certainty of the evidence is not good and all results have to be treated with considerable caution.

Potential biases in the review process
To avoid introducing our own bias to this review we strictly followed Cochrane methods for conducting reviews and have reported all available processes, methods and data transparently so that they can be checked if needed. We would welcome any comments or additional data to improve this review.

Agreements and disagreements with other studies or reviews
To the best of our knowledge, this is the only systematic review of Avatar Therapy.

Authors' conclusions
Implications for practice

For people with schizophrenia
Currently, there is no clear evidence for, or against, using Avatar Therapy as a treatment for people with serious mental illness. The intervention is still an experimental treatment. People with schizophrenia could help generate much more evidence by taking part in good evaluative studies, and could also make their participation contingent on all data from those trials being made publicly available.

For clinicians
This review is unable, at the moment, to provide adequate evidence to inform clinicians about the value of Avatar Therapy for people with schizophrenia. If this therapy is being considered, the person with schizophrenia should be informed of the experimental nature of the treatment and the details should be explained to them in a transparent way so that they could make an informed decision.

For policy makers
Currently, there is no clear evidence for encouraging Avatar Therapy as a treatment for people with schizophrenia. Furthermore, there are no data in terms of economic outcomes for using Avatar Therapy in a clinical setting. We believe that more high-quality and long-term studies should be undertaken in this area to explore the effects, safety and costs of this novel intervention.

Implications for research

1. General
Given that there is insufficient research to determine if Avatar Therapy is effective, there is a need for further research to establish its value, or lack of it, in this population. In this fast-moving area, it is likely that additional studies are being planned. All of them should meet the reporting standards required by CONSORT (www.consort-statement.org). In addition, studies completed by teams without a 'stake' in Avatar Therapy are absolutely necessary, as is the public release of all data relevant to such studies.

Trials
We know that the design of trials of any treatment is a painstaking process that takes time and great effort. However, we also have given this topic some thought and suggested an outline design (Table 1). We recognise that there is one ongoing study (NCT03585127), but this, too, is small (participants = 50) and is unlikely to definitively answer clinically meaningful questions. The current available data were extracted from three small short-term studies with fairly high risks of bias. Due to the nature of Avatar Therapy, blinding participants is problematic but the use of suitable outcomes could partly compensate for this. There is a real difficulty that researchers with enthusiasm for a therapy are the ones to undertake the trials on their own invention, and are likely to continue to be. Every effort should be made, and be seen to be made, to ensure distance of the pioneering inventors of Avatar Therapy from its evaluation without the loss of the necessary expertise.

Acknowledgements
Thanks to Clive Adams and Claire Irving at Cochrane Schizophrenia Group's editorial team in the Nottingham University for their support in completing this review. Thanks also to Cochrane Schizophrenia Group Editorial Base in Nottingham that produces and maintains standard text for use in the Methods section of their reviews.
We would like to thank Mahdi Moazzen for preparing and completing the protocol.
We would like to thank Alice Waugh for peer reviewing the protocol and Cochrane Copy Edit Support for copy editing.
We would like to thank Laura Dellazizzo who kindly supplied data on lost to follow-up, rehospitalisation and Avatar Therapy group before crossover (NCT03148639).
Parts of this review were generated using RevMan HAL v 4.2. More information about RevMan HAL is available at schizophrenia.cochrane.org/revman-hal-v4.

Study characteristics

Interventions
1. Avatar Therapy: participants created a computerised representation of the entity that they believed was the source of their main voice and then the team set up the avatar in an introductory session, which included a comprehensive assessment of the voice(s) and included verbatim content. 10-15 minutes of each session involved face-to-face work with the avatar, wherein the therapist facilitated a direct dialogue between the participant and the avatar. Participants sat in 1 room facing their avatar on a computer monitor. The therapist sat in a second room with a control panel that allowed them to speak in his or her own voice, or as the avatar. N = 75.
2. Supportive counselling: a manual-based, face-to-face supportive counselling approach adapted with permission from that employed by the SoCRATES Trial Group, delivered by graduate assistant psychologists who were recruited on the basis of extensive experience of working therapeutically in a psychosis context. N = 75.

Bias Authors' judgement Support for judgement
Random sequence generation (selection bias) Low risk Quote: "Participants were randomly assigned (1:1) to receive AVATAR therapy or supportive counselling with randomised permuted blocks (block size randomly varying between two and six".

Blinding of outcome assessment (detection bias) All outcomes Low risk Quote: "All assessments were done by research assessors who were masked to therapy allocation. To avoid unmasking, we ensured that assessors did not have access to clinical records after the baseline (pre-randomisation) assessment or access to the therapy database at any stage, that all assessments were done at sites remote from the clinic, and that participants were reminded before each assessment not to disclose their allocation."

Comment: low risk
Incomplete outcome data (attrition bias) All outcomes High risk Quote: no clear information about the adverse events, RRPQ and Air.

Comment: high risk
Selective reporting (reporting bias) Unclear risk All important outcomes have been reported but a few of the outcomes in registered protocol have not been reported in final report of the trials.

Comment: unclear
Other bias High risk Quote: "one of authors is part funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London. Two of the authors declared patents pending on the avatar system. The funding source had no input in the preparation of this manuscript".
Quote: "significantly greater improvement" used frequently and confusing information in the CONSORT flow diagram.
Comment: high risk

Study characteristics
Methods
Allocation: randomised
Blindness: single
Duration: 11 months of treatment stated in protocol, but the final study stated 6 sessions of 30 minutes + 1 week's follow-up, because there was just 1 therapist, which limited the sessions (a)

History: 22 participants had complete medical compliance, 2 partial and 2 none. 15 had ≥ 10 years of hearing voices, 3 had 6-10 years, and 8 had 1-5 years.
Inclusion criteria: experienced persecutory auditory hallucinations for ≥ 6 months and these hallucinations had not responded adequately to antipsychotic medications.
Excluded: organic brain disease and substance misuse

Interventions
1. Avatar Therapy: participants created a representation of the entity they imagined as the source of their auditory hallucinations using computerised face animation software. N = 14.
2. Treatment as usual (delayed therapy group): participant's ongoing antipsychotic medication prescribed and supervised by their referring psychiatrist. N = 12.

Notes
It is unclear how the participants were approached for recruitment. Participants were recruited from the community mental health team; however, it is unclear if all participants in the team were approached or care co-ordinators/case managers selectively informed participants who they thought were appropriate to enter the study.
a There was a difference between the protocol and the final study report in the duration of the study, which changed from 11 months to 6 weeks + 1 week's follow-up.
b The immediate therapy group did not crossover into a no-therapy block as we expected the therapy to have carry-over effects.
c Parental consent was obtained for participants aged under 18 years.
The study was approved by the West Kent Research Ethics Committee.

Bias Authors' judgement Support for judgement
Random sequence generation (selection bias) Low risk Quote: "after baseline assessment, patients were randomised into one of two groups using a computer-generated series with blocks of 12, generated by an independent statistician".

Comment: low risk
Le 2013 (Continued)
Allocation concealment (selection bias) Unclear risk Quote: "patients were randomised into one of two groups using a computer-generated series with blocks of 12, generated by an independent statistician".
Comment: it was unclear if all participants in the team were approached or care co-ordinators/case managers selectively informed participants who they thought were appropriate to enter the study.

Comment: unclear risk
Blinding of participants and personnel (performance bias) All outcomes Unclear risk Quote: "both groups of patients were assessed at each time point by a user-researcher who knew neither the group to which each patient was assigned, nor the design of the trial." Comment: participants were not blind to the intervention. The authors did not describe this. We rated unclear only because it appeared that Avatar Therapy could not be blinded; however, it may be possible to provide sham therapy.

Comment: unclear
Blinding of outcome assessment (detection bias) All outcomes Unclear risk Comment: blinding of assessor at follow-up time points was the only information available.

Comment: unclear
Incomplete outcome data (attrition bias) All outcomes High risk Comment: no useful and clear information about attrition.
Comment: some participants left early from the intervention group; however, no further analysis was undertaken to account for their missing data.
Comment: there was a difference between protocol and article about the duration of the study from 11 months (protocol) to 6 weeks (actual length of study).

Comment: high risk
Selective reporting (reporting bias) High risk Comment: no useful information on adverse events and attrition.
Comment: some of the important outcomes such as adverse events were not clearly reported. Furthermore, protocol was not published prior to the study results being reported.

Mental state: self-reported anxiety, fear - unable to extract data from graph

Notes
Wrote to authors 24 November 2018 and Dr Laura Dellazizzo, on behalf of Dr Alexandre Dumais, kindly replied supplying data on loss to follow-up, rehospitalisation and Avatar Therapy group before crossover.
a The total number of weeks was 7, comprising 1 avatar creation session and 6 therapeutic sessions.
b There is no clear and detailed information about it from the baseline.

Bias Authors' judgement Support for judgement
Random sequence generation (selection bias) High risk Quote: "patients fulfilling inclusion criteria were randomly allocated (1:1 ratio) to either VR-assisted therapy (VRT) or treatment-as-usual (TAU)".
Comment: concern as the numbers of participants in the 2 randomised groups are different, which is not possible in a 1:1 ratio.

High risk Comment: there was no information in the article because they had reported the crossover data (therapy + treatment as usual) as the outcome report instead of reporting the therapy outcome separately. We contacted them via email for more information, and they kindly replied supplying data on loss to follow-up, rehospitalisation and the mean outcome data for the Avatar Therapy group before crossover, but unfortunately, they did not share the change data for this outcome with us.

Characteristics of ongoing studies [ordered by study ID]
Study name Avatar Therapy in comparison to cognitive behavioral therapy for treatment-resistant schizophrenia

Additional tables

Methods
Allocation: randomised (clearly described)
Blinding: single blind (outcomes assessor)
They will receive Avatar Therapy sessions after phase one is finished.

Outcomes
Mental state (binary outcomes)

Relapses (binary outcomes)
Quality of life (binary outcomes)
Costs: cost of services, cost of care
Adverse events related to yoga (number and type of injuries)
Service outcomes: days in hospital, time attending outpatient psychiatric clinic

Notes
Adherence should be logged with participants expected to adhere to 70-75% of scheduled sessions.

Table 1. Design of a future study (Continued)
DSM-IV: Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition; n: number of participants.

History
Protocol first published: Issue 9, 2015
Review first published: Issue 5, 2020

Contributions of authors
GA: data extraction, data entry, review writing and data analyses.
TK: data extraction, data entry.
FS: protocol development, search, screening and final check before submission.

Declarations of interest
GA: none.

Internal sources
• No source of support, Other

External sources
• No sources of support supplied

General
Since protocol publication (Moazzen 2015), we have updated the methods section of the review using the latest template provided by Cochrane Schizophrenia Group. These amendments slightly affected the review criteria and objectives.

Objectives
We have altered the wording of the objectives to more accurately reflect the review title.

Types of outcome measures
We have made changes to the layout and format of Types of outcome measures in line with latest Cochrane Schizophrenia Group presentation of outcomes.
For example, 'significant response' is now 'clinically important change'.
There were too many mental state outcomes listed as primary outcomes in the protocol. We have specified two (positive symptoms and insight) and added at least one adverse effect, in line with MECIR requirements to include an adverse effect outcome in primary outcomes.