Background sounds and hearing-aid users: A scoping review.

OBJECTIVES
A scoping review focused on background sounds and adult hearing-aid users, including aspects of aversiveness and interference. The aim was to establish the current body of knowledge, identify knowledge gaps, and to suggest possible future directions for research.


DESIGN
Data were gathered using a systematic search strategy, consistent with scoping review methodology.


STUDY SAMPLE
Searches of public databases between 1988 and 2014 returned 1182 published records. After exclusions for duplicates and out-of-scope works, 75 records remained for further analysis. Content analysis was used to group the records into five separate themes.


RESULTS
Content analysis indicated numerous themes relating to background sounds. Five broad emergent themes addressed the development and validation of outcome instruments, satisfaction surveys, assessments of hearing-aid technology and signal processing, acclimatization to the device post-fitting, and non-auditory influences on benefit and satisfaction.


CONCLUSIONS
A large proportion of hearing-aid users still find particular hearing-aid features and attributes dissatisfying when listening in background sounds. Many conclusions are limited by methodological drawbacks in study design and too many different outcome instruments. Future research needs to address these issues, while controlling for hearing-aid fitting.

outcomes; satisfaction; acclimatization; non-auditory influences

Hearing loss is a major public health issue affecting over 164 million people over the age of 65 worldwide, i.e. 33% of the world's population above 65 years, according to the most recent World Health Organization estimates (Stevens et al, 2013). The most common form of treatment for hearing loss in adults is the provision of a hearing aid. However, hearing-aid adoption has remained stubbornly low, despite improvements in technology and fitting. In the United States, of an estimated 26.7 million persons with hearing loss >25 dB, only 3.6 million use a hearing aid (Chien & Lin, 2012). This amounts to more than 22 million adults with unaided hearing loss in the United States alone.
If a person has unsuccessful or negative hearing-aid experiences then he/she will be less likely to use the device. Difficulty with background sounds is consistently listed as one of the major problems adult listeners have with hearing aids (e.g. Brooks, 1985; Hickson et al, 2010; Palmer et al, 2006). This article reports a review of what has been written in the public domain about background sounds and adult hearing-aid users, especially from the perspective of aversiveness, interference, annoyance, and complaint (or satisfaction). Background sounds include any sort of sound that is not the targeted focus of listening. We used a scoping review, which is a rigorous technique to summarize relevant literature in a field of interest (Levac et al, 2010). The review sought to identify where knowledge has been established, where findings are suggestive but not definitive, where there are gaps in the existing body of knowledge, and where new research might be directed (cf. Arksey & O'Malley, 2005).
In an earlier scoping review about non-usage of hearing aids in adults, McCormack and Fortnum (2013) identified 10 published articles that systematically examined the principal reasons for non-usage. Five of those 10 articles mentioned 'noisy situations/background noise' (literal wording given by the authors) as a motivating reason, with such responses ranging from 22 to 52% (see Table 2 in McCormack & Fortnum, 2013). Since this topic is somewhat broad, here we briefly summarize the descriptions given by those five reports, in chronological order. Kochkin (2000) reported the results of the MarkeTrak V survey series, in which 2720 hearing-aid consumers were contacted and asked to respond, in narrative form, on their hearing-aid experiences. The theme of background sounds included reports that hearing aids did not work in difficult listening situations or amplified loud noises, sometimes painfully, or that background noise was annoying, distracting, or unacceptable. Tomita et al (2001) used the Consumer Assessment Strategy test battery with 59 hearing-aid users. Of those, 22% listed 'picks up background noise' (Table 6, p. 287) as their reason for non-use, but no further elaboration was given. Vuorialho et al (2006) conducted structured interviews of 76 hearing-aid recipients about their experiences. Interviewees were asked about reasons for non-use, and 'background noise amplified by hearing aid' (Table 3, p. 357) was mentioned by 56% as their primary reason for non-use. However, the methodology for arriving at this theme from the qualitative data was not specified. Bertoli et al (2009) conducted a survey of hearing-aid users. Respondents who had indicated that they used their aids only occasionally (n = 990) or never (n = 96) were asked to select the underlying reasons from pre-defined options on a questionnaire. The option 'noisy situations are disturbing' (Table 5, p. 187) was the most frequently selected response (52% of respondents).
Hartley et al (2010) administered a questionnaire to 322 elderly hearing-aid owners. Among the 78 (24%) who reported never using their hearing aid, 22 people stated their principal reason was that sounds were 'too noisy' (p. 646). The participants were not asked to explain their response, but the authors speculated that some misinterpreted the question. For example, a participant may have reported that his or her aid was 'too noisy' due to the maximum power output being set too high or when used in an environment of high-level background noise.
Of these five articles identified by McCormack and Fortnum (2013), none of them provided a detailed explanation about the exact nature of the problems that adult hearing-aid users experience with background sounds. From the literature reviewed thus far, such a problem is not well-defined.
The present scoping review summarized what is known about background sounds and adult hearing-aid users, including aspects of aversiveness and interference. The review specifically examined literature controlled by commercial publishers (e.g. peer-reviewed journals) and also grey research literature, which refers to more informally published academic material (such as technical reports, conference abstracts, consumer surveys, working papers from research groups or committees, and student theses).

Methods
The methods for this scoping review were largely based on the following steps outlined in Arksey & O'Malley (2005): (1) identifying potentially relevant records; (2) selecting relevant records; (3) extracting data items; and (4) collating, summarizing, and reporting the results. This final step included a thematic analysis to group the records according to their main findings relevant to the goal of the present review. We chose not to undertake a consultation of consumers and stakeholders as a final (optional) step (cf. Arksey & O'Malley, 2005).

Identifying potentially relevant records
Five search engines were employed: PubMed, Web of Science, PsycINFO, CINAHL, and Google Scholar. Google Scholar was used to identify grey literature records, in addition to peer-reviewed articles. For inclusion as a grey literature record, the full text had to be accessible (such as in a conference proceeding, web page, or direct from the author).
The search was limited to records produced between January 1, 1988 and January 31, 2014. This start date was chosen to reflect the first complete calendar year following the introduction of digital hearing aids and the time period of the review encompasses the evolution of digital signal processing including compression and noise reduction, as well as directional microphone technology.
Overall, we used four independent search strategies applied to each of the search engines in turn. The precise terms defined within these four search strategies are reported in Figure 1. One strategy identified records relating to hearing aids and sound or noise, one identified records relating to hearing aids and annoyance, aversiveness, or interference, one centred on complaints from hearing-aid users, and one focused on satisfaction (or dissatisfaction) with hearing aids. Interference could include, but was not restricted to, energetic and informational masking, and the term 'masking' was not used as an explicit search term. The sets of search terms were applied to the titles and abstracts only, and the terms 'tinnitus', 'cochlear implant', and 'bone conduction' were used as exclusions, following McCormack and Fortnum (2013). Wherever the search engine made it possible (PubMed and PsycINFO), search results also excluded research conducted with animals and children.
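As an illustration of the exclusion step described above, the following minimal sketch screens titles and abstracts against the three exclusion terms named in the text. The matching rule (case-insensitive substring matching), the function name, and the example records are assumptions made for illustration; the actual searches relied on each search engine's own query syntax, with the inclusion terms given in Figure 1.

```python
# Exclusion terms named in the text. The matching rule (case-insensitive
# substring search) is an assumption, not the search engines' actual behaviour.
EXCLUSION_TERMS = ("tinnitus", "cochlear implant", "bone conduction")


def is_excluded(title: str, abstract: str) -> bool:
    """Return True if any exclusion term appears in the title or abstract."""
    text = f"{title} {abstract}".lower()
    return any(term in text for term in EXCLUSION_TERMS)


# Hypothetical records, invented for illustration only.
records = [
    {"title": "Hearing aids and background noise",
     "abstract": "Adult users rated annoyance."},
    {"title": "Tinnitus masking with hearing aids",
     "abstract": "A pilot study."},
]

# Keep only the records that contain none of the exclusion terms.
kept = [r for r in records if not is_excluded(r["title"], r["abstract"])]
print(len(kept))
```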
These search strategies returned many thousands of records using Google Scholar and only the first 30 records (corresponding to the first three pages) were examined for relevant titles. We acknowledge that the completeness of our search was limited in this respect, but since Google Scholar orders results by relevance, we are confident that the most cited records were considered. These search strategies returned a total of 1182 records, which were then pooled together for further scrutiny.
A supplementary search stage occurred later in the process, because it was informed by the title selection of the initial 1182 records. This stage is shown in the lower right hand side of Figure 1.

Selecting relevant records
Duplicate records (n = 377) were excluded. The next selection step considered titles only and the criterion for exclusion was based solely on whether the title indicated that the content of the record was within the scope of the research question, with no bias according to the number and type of records retained. A total of 560 records were excluded mostly because they involved children, animals, cochlear implants, bone-anchored hearing aids, drug trials, or where the emphasis was on another sensory modality, such as vision or haptic perception, and hearing was a secondary focus. Scoping reviews ideally avoid bias by sharing tasks across multiple co-authors (Levac et al, 2010). Hence, the first author conducted the initial selection process, and this was subsequently checked by the second author for agreement with the 'out of scope' decisions. One record was reinstated after discussion. Eight records were excluded because the full text beyond the title was not available, and two records were not available in English. This stage of the selection process retained 234 records.
The process of identifying potentially relevant records was iterative. Three further search strategies identified 12 additional records bringing the total to 246. One search on other work that referenced those records from the list of 234 identified three new records. Three further records were identified by searching on the names of seven key researchers who appeared frequently in the selected list of records indicating that they were particularly active and influential within the scope of the topic (Bentler, Cox, Freyaldenhoven, Gatehouse, Humes, Keidser, and Nabelek).
A final search consulted known literature reviews on the topic identified within the list of 234 and six additional records were identified by this strategy.
Both authors independently conducted the second selection step, which considered the record abstract (or page one of technical reports, etc.). Again the criterion for exclusion was based solely on whether the content of the record was within the scope of the research question, with no considerations as to the number and type of records retained. Five further records were excluded because they were judged to be out of scope. A large number of records (n = 166) were excluded because they were judged not to provide sufficient information to extract meaningful data as described in the data extraction procedure. These records focused on other hearing-related issues, such as hearing status, or need for recovery after work. Overall therefore, 75 records were passed onto the stage of data extraction. Full references to all these records are listed in the Supplementary Materials, available in the online version of the journal.
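The record flow from the title-screening stage onwards can be checked with simple arithmetic. The sketch below uses only counts reported in the text; the variable names and dictionary keys are illustrative labels, not terms used by the authors.

```python
# Counts as reported in the text; names are illustrative labels only.
retained_after_title_screen = 234

# Three supplementary search strategies and the records each one added.
supplementary = {
    "works_citing_retained_records": 3,
    "key_researcher_name_search": 3,
    "known_literature_reviews": 6,
}

pooled = retained_after_title_screen + sum(supplementary.values())

# Exclusions at the abstract-screening step.
excluded_out_of_scope = 5
excluded_insufficient_data = 166

final_records = pooled - excluded_out_of_scope - excluded_insufficient_data
print(pooled, final_records)
```

Under these counts, the pooled total and the number of records passed to data extraction agree with the figures of 246 and 75 given in the text.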

Extracting data items
A template for data extraction was agreed upon by both authors who then independently extracted information on the main findings of the record and the findings that were directly relevant to our scoping review question. Other data items were considered: year, country of origin, participant population, hearing status of participants, sample size, research setting, type of intervention, research design, interval between assessments, and outcome measures. These data items provide key information about the scope and details of each record, enabling the authors to look for common themes and to identify possible gaps in the literature. Data extraction was conducted independently by the two authors. A meeting was convened to resolve discrepancies on data extraction and agree on a final data set.

Collating themes, summarizing and reporting the results
Seventy-five records represent a large amount of information. To provide a structure for the subsequent content analysis and narrative synthesis, the records were first organized thematically. To do this, authors independently noted the main theme for each record and then met to discuss possible thematic structures, using the criteria that themes should be broad and should adequately represent all of the records, but with no single theme containing fewer than five records. Authors then independently reclassified all 75 records according to these themes and met to agree on a classification. While we note that the content of individual records does not necessarily fall exclusively in one theme or another, our classification focused on the main findings of the record as they relate to the present research question.

Results
Five broad themes were defined: (1) Outcome instruments. This theme was focused on development and/or validation of specific tools for measuring hearing-aid benefit and those tools included items on background sounds. This includes questionnaires and tests of listening performance. (2) General satisfaction. This theme gathered all records that reported ratings of general satisfaction with hearing aids. Sometimes this information has been gathered by questionnaire, but we have considered all those records relating to overall satisfaction as a theme in its own right. (3) Hearing-aid technology. This theme included all records which primarily reported the effects of new technological features on listening performance in background sounds. (4) Acclimatization. This theme encompassed records that focused on how hearing-aid users adapt to their devices with respect to background sounds. (5) Non-auditory influences. The final theme included all records whose primary aim was to investigate how aided listening was affected by various non-auditory factors.
The remaining Results section is organized in two parts. The first part provides an overview of the thematic analysis, in particular describing the scope and the main findings of the records grouped according to the five themes. Where appropriate, this part also reports some of the extracted details of the research design, type of intervention and interval between assessments. The second part provides an overview of the remaining data extraction across the five themes. By pooling together details of the data items across themes, we describe some of the general trends to emerge from the literature. These are reported under the following subheadings: Evolution over time (i.e. year and country of origin), Internal validity (i.e. the participant population, sample size, and hearing status of participants), Core measures (outcome measures), and ecological validity (i.e. research setting).

Thematic analysis

OUTCOME INSTRUMENT (N = 14)
These records report development or validation of an outcome instrument, either self-report questionnaires or tests of listening performance that involve background sounds. Questionnaires are the Profile of Hearing Aid Performance (PHAP, Cox & Gilmore, 1990), Abbreviated Profile of Hearing Aid Benefit (APHAB, Cox & Alexander, 1995), Satisfaction with Amplification in Daily Life (SADL), Performance Inventory for Profound and Severe Loss (PIPSL, Owens & Raggio, 1988), and Profile of Aided Loudness (PAL, Palmer et al, 1999). Many of these instruments have been motivated by their clinical application, such as predicting likely success with amplification or troubleshooting an unsuccessful fitting (APHAB, Cox & Alexander, 1995) or using the response profiles as a basis for individual or group exercise and discussion (Owens & Raggio, 1988). These questionnaires contain items asking about personal experience with background sounds. For example, the PHAP purposefully includes questions on communication in adverse listening conditions and annoyance of environmental sounds. The SADL includes a question about sense of frustration when the hearing aid picks up sounds that negatively affect hearing. We note that these questionnaire items map onto two of our literature search strategies: interference and aversiveness. One short questionnaire rated loudness with four different environmental sounds (Munro & Patel, 1998). Questionnaire data may be limited in value if the respondent's retrospective recall is inaccurate, and so ecological momentary assessment may be a useful alternative. One record reported such a method using a personal digital assistant with daily alerts which prompted participants to answer a short series of outcome questions describing their experiences with challenging listening situations (Galvez et al, 2012). Comparison with a conventional pre- and post-outcome questionnaire confirmed that this new method did not exacerbate participants' self-perceived hearing handicap and so it is a feasible method worthy of further consideration.
A prediction made by Nabelek and colleagues (1991) was that a person's willingness to listen to speech in background noise is more indicative of hearing-aid use than a performance score for speech perception in noise. This led to the development of the acceptable noise level (ANL) test. Several records reported convergent validity (high correlations with similar questionnaires) and/or discriminant validity (low correlations with different questionnaires) for the ANL (Freyaldenhoven et al, 2008), and similarly for PHAP (Purdy & Jerram, 1998), PAL (Mackersie, 2007) and uncomfortable loudness levels (Munro & Patel, 1998). For example, the ANL and APHAB have a low correlation (i.e. high discriminant validity) indicating that they possibly capture different aspects of aided listening (Freyaldenhoven et al, 2008). One record identified from the grey literature was a conference presentation describing a novel test in which participants rate sound exemplars presented at different levels: the Sound Acceptability Test (SAT, Johnson et al, 2012).

GENERAL SATISFACTION (N = 7)
Seven records report overall satisfaction ratings with hearing aids. Most evidence comes from consumer surveys. For example, MarkeTrak surveys in the US reveal that approximately one third of respondents are dissatisfied with the performance of hearing aids in noisy situations (Kochkin, 2000, 2002). Hearing-aid attributes and listening situations both contribute to general satisfaction. In Australia, the EARtrak survey of hearing-aid users (Hickson et al, 2010) reported that some of the strongest predictors of hearing-aid outcome were comfort with loud sounds, and conversations outdoors or in large groups. Again, we note that these variables are associated with interference from and aversiveness of background sounds.
A number of smaller scale hearing-aid user surveys have also been conducted. In a survey of 175 experienced users, speech in noise was again rated as one of the most important attributes of hearing aids (27%), but also the most frequent source of dissatisfaction (30%) (Meister et al, 2002). A structured telephone interview with 177 users found 92% satisfaction (Kaplan-Neeman et al, 2012). Satisfaction and hours of hearing-aid use per day were closely associated, a relationship that the authors attribute to the acclimatization process. One of the main reasons for dissatisfaction was excessive amplification in background noise.

HEARING-AID TECHNOLOGY (N = 35)
There was one literature review in this theme, but it was published almost 20 years ago (Keidser et al, 1996). The remaining records reported experimental studies assessing hearing-aid participants, typically exploring the effects of prototype or available technological innovations on listening in background noise. Ten records assessed the benefit of hearing-aid noise reduction technology for speech communication, consistent with our search strategy of background sounds and interference (Kuk & Tyler, 1990; Mueller et al, 2006; Palmer et al, 2006; Chalupper & Powers, 2007; Keidser et al, 2007; Bentler et al, 2008; Wang et al, 2009; Zakis et al, 2009; Lowery & Plyler, 2010; Liu et al, 2012). Typically, repeated-measures assessments were conducted in small samples of hearing-aid users (n = 10-31), within a single test session. Outcomes were tests of speech-in-noise performance, but the choice of test varied widely across studies. This observation underpins our general conclusion that there is little consensus on the best way to assess technological features of hearing aids.
Six records assessed the effect of compression using a range of methods, from questionnaire surveys (Johnson et al, 2010) to repeated-measures designs using speech-in-noise performance (Dolan & Wonderlick, 2000; Gatehouse et al, 2006a), satisfaction or quality ratings (Shi et al, 2007), and loudness and satisfaction ratings from the PAL (Blamey & Martin, 2009). Five records considered microphone settings comparing omnidirectional with directional (Blamey et al, 2006; Gnewikow et al, 2009; Ricketts et al, 2003; Surr et al, 2002; Walden et al, 2000). All five used a repeated-measures design and mixed outcome instruments (e.g. four used the Connected Speech Test, CST, and the Profile of Hearing Aid Benefit, PHAB). Only two experimental studies directly compared analogue and digital hearing aids, both using a repeated-measures design (Bille et al, 1999; Wood & Lutman, 2004). Three studies directly compared unilateral and bilateral hearing-aid fitting (Cox et al, 2011; Köbler et al, 2001; Marrone et al, 2008). It is interesting to contrast the different conclusions drawn. While questionnaire data demonstrate superior speech-in-noise listening with two hearing aids (Köbler et al, 2001), 46% of patients actually prefer wearing just one (Cox et al, 2011).
Just over one third of the experimental studies reported (13/33) used a combination of performance and self-report measures. The primary performance-based outcome was a speech-in-noise threshold (n = 10 studies), while the PHAB or its abbreviated form was also commonly administered (n = 12 studies). Eight of the latter studies specifically focused on the impact of hearing aids on the aversiveness of background sounds, while two others measured a related concept, annoyance, using a self-rating scale. Again these findings are consistent with our search strategy of background sounds and aversiveness.
In summary, the main finding is that new technological innovations usually improve listening performance in noise, particularly on tests conducted in a controlled environment. Exceptions are evident. For example, Palmer et al (2006) reported that amplification with digital noise reduction increased problems on the aversiveness subscale of the APHAB at three weeks post-fitting. The impact on real-world listening performance is likely to be complex, as Gatehouse et al (2006b) noted that real-world benefits of technological features may differ between individuals according to their social lifestyle (i.e. everyday listening situations). Eighteen studies had 30 or fewer participants and so it is unclear how reliable and generalizable the reported results are.
A general observation is that a benefit on one measure does not necessarily predict a benefit on the other (e.g. Abrams et al, 2012; Arlinger et al, 2007; Keidser, 1995; Walden et al, 2000; Zakis et al, 2009). For example, Walden et al (2000) concluded that while directional microphones improved scores on the CST compared with omnidirectional microphones, they did little to alleviate self-reported aversiveness of background sounds (using the Profile of Hearing Aid Benefit, PHAB). Moreover, the participants did not notice a difference in everyday listening. Keidser (1995) found that reducing annoyance and maximizing speech comprehension required different hearing-aid settings: a more sloping linear response (authors' term) benefitted understanding speech in low-frequency noise, whereas low-frequency compression minimized the annoyance of low-frequency noise. Arlinger et al (2007) concluded that digital hearing aids reduced interference from background sounds for speech perception, but did not affect self-reported aversiveness. These findings suggest that interference on speech by background noise and the experience of the background noise itself seem to be two somewhat independent factors affecting hearing-aid success.

ACCLIMATIZATION (N = 8)
In the context of the scoping review, acclimatization refers to the process of getting used to hearing aids with respect to background sounds. All eight records reported experimental studies, assessing new and experienced hearing-aid users in either repeated-measures or parallel-group designs. One of the largest studies was a follow-up of 164 participants who were tested six years after their initial assessment and hearing-aid fitting (Takahashi et al, 2007). Only the PHAB questionnaire was administered at both time points. Most subscales of the PHAB, including ease of communication and background noise, revealed a long-term benefit of about 25-35 points (benefit is calculated as the unaided minus the aided score). However, scores on the aversiveness subscale of PHAB (which asks about listening to potentially aversive background sounds) remained around a negative 10 points across the six-year period, indicating ongoing problems. The same pattern of results has been reported by Haskell et al (2002) for 360 participants over a three-month period. Ongoing problems in adapting to background noise and group conversations are common complaints (Stephens & Meredith, 1991), with difficulties remaining even 12 months after hearing-aid fitting (Bentler et al, 1993).
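The benefit calculation mentioned above (unaided minus aided score) can be written out as a short sketch. The subscale scores below are hypothetical values chosen only to reproduce the sign and rough size of the pattern described; they are not data from any of the reviewed records, and the function name is an illustrative label.

```python
def phab_benefit(unaided: float, aided: float) -> float:
    """Benefit = unaided score minus aided score.

    Positive values mean fewer reported problems with the hearing aid;
    negative values mean more reported problems with the hearing aid.
    """
    return unaided - aided


# Hypothetical (unaided, aided) subscale scores, invented for illustration.
scores = {
    "ease_of_communication": (60.0, 30.0),
    "aversiveness": (20.0, 30.0),
}

benefits = {name: phab_benefit(u, a) for name, (u, a) in scores.items()}
print(benefits)
```

With these assumed scores, ease of communication shows a positive benefit of 30 points, whereas aversiveness shows a negative benefit of 10 points, i.e. more problems with aversive sounds when aided.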
A different perspective is afforded by studies reporting listening performance. For example, Ahlstrom et al (2009) assessed 21 hearing-aid users' willingness to tolerate background sounds in a spatial version of the acceptable noise level (ANL) procedure. After a 3-6 month acclimatization period, they found that people tolerated less favourable SNRs with hearing aids than without. However, the effects were on the order of 2 dB, which may not translate into a noticeable improvement in everyday listening. Munro and Lutman (2004) have also cautioned on the applicability of ANL measures to hearing-aid use in the real world.
In summary, despite the small number of experimental studies, there is some agreement that hearing-aid users do not adapt to potentially aversive background sounds over time.

NON-AUDITORY INFLUENCES (N = 9)
One record was a literature review that considered how hearing-aid satisfaction is related to intrinsic (experience, expectation, personality, and attitude) and extrinsic (usage, type of hearing aids, sound quality, listening situations, and problems in hearing-aid use) influences (Wong et al, 2003). The remaining records were experimental studies investigating how aided listening is affected by various non-auditory factors. One record considered whether hearing-aid fitting and verification influenced ratings of aversiveness (using the APHAB), but demonstrated this not to be so (Abrams et al, 2012). The remaining records considered a range of influences, namely cognitive ageing (Helfer & Freyman, 2008), working memory capacity (Ng et al, 2013), verbal processing speed (Picou et al, 2013), personality factors (Cox et al, 2007), and social lifestyle (Gatehouse et al, 2006b; Wu & Bentler, 2012). These influencing factors have been assessed in samples of fewer than 30 participants, with the exception of one study on listening effort (n = 50, Gatehouse et al, 2006) and the two studies on personality (n = 83 and n = 205, Cox et al, 2007). There were no outcome measures in common across records. Hence, it is not possible to make any reliable or generalizable conclusions from the present literature.

OTHER (N = 2)
Two remaining records did not easily fit into any of the above themes and so are reported here as 'other' (Kochkin, 2000; Davies et al, 2001). The first was a report from the MarkeTrak survey which gathered reasons for hearing-aid non-use through personal narratives from almost one million respondents. The second described a qualitative social survey determining the extent to which acoustic problems in the built environment affect the elderly.

General trends
A number of different variables were charted to spot any general trends: year, country of origin, participant population, hearing status of participants, sample size, research setting, research design, type of intervention, interval between assessments, and outcome measures.

EVOLUTION OVER TIME
Figure 2 plots the count of records over time. From the mid-1990s there was a step change in the number of records, indicating growing awareness and interest from the research community in the issues of background sounds for hearing-aid users. The majority of records (n = 49) emanate from the USA, followed by Europe (n = 17) and Australia (n = 9). Within Europe, 47% of records were led by UK authors. This distribution largely reflects the influence of a few major laboratories with a high number of outputs.
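The geographical distribution reported above can be summarised with a short calculation. The regional counts are taken from the text; the number of UK-led records is not stated directly and is inferred here from the reported 47% of the 17 European records, so it should be read as an assumption.

```python
# Record counts by region, as reported in the text.
records_by_region = {"USA": 49, "Europe": 17, "Australia": 9}

total = sum(records_by_region.values())

# Percentage share of all records for each region, rounded to whole numbers.
shares = {
    region: round(100 * count / total)
    for region, count in records_by_region.items()
}

# Inferred (not reported) number of UK-led records: 47% of the European records.
uk_led_estimate = round(0.47 * records_by_region["Europe"])

print(total, shares, uk_led_estimate)
```

The regional counts sum to the 75 records in the review, and the 47% figure is consistent with roughly 8 of the 17 European records being UK-led.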

INTERNAL VALIDITY
Internal validity refers to the study design and conduct. Ideally, intervention studies should have a high internal validity, so that any observed changes can be attributed to the intervention, not to other possible causes.
Twenty-eight out of the 34 records evaluating hearing-aid technology used a repeated-measures design. Under some circumstances, the effect being measured may change because of the number of times the participant is tested. Repeated-measures designs are most likely to be affected, as scores are susceptible to regression to the mean and, for performance measures, practice effects can be another confounding factor. While this design might be preferred given the heterogeneity of the test population, six records did include a control group, which is an effective way to rule out such threats to internal validity.

EXTERNAL VALIDITY
External validity refers to the generalizability of the findings. Adequate sample size is a common marker of external validity, and this requires a priori justification for the size of the expected effects given the variance of the measurement scores. Across the 75 records in this scoping review, the sample sizes ranged from 8 to over 3000, with a median of 43 (Figure 3). However, we note that justification of sample size was given in only one record (Cox et al, 2011).

CORE MEASURES
Those records that measure outcome using the same instrument (i.e. a 'core' measure) lend themselves to meta-analysis: a powerful way to draw reliable conclusions, especially when individual experimental studies may have certain methodological limitations such as small sample size. The review highlights a range of different outcome instruments in use. Of those questionnaire instruments with specific relevance to assessing the effects of background sounds on hearing-aid users, across all 75 records there were the following uses: PHAB (n = 8), APHAB (n = 12), APHAB aversiveness subscale (n = 3), the Glasgow Hearing Aid Benefit Profile (GHABP) (n = 8), SADL (n = 6), PIPSL (n = 1), PAL (n = 4) and the Munro and Patel loudness scale (n = 2). Use of SADL was most frequent in those records assessing acclimatization or non-auditory factors. Of the performance tests with specific relevance to assessing the effects of background sounds on hearing-aid users, across all 75 records there were the following uses: ANL (n = 6), SAT (n = 1), a speech-in-noise reception threshold measure (n = 12), the hearing-in-noise test (HINT, n = 4), CST (n = 4), and the speech-in-noise (SPIN) test (n = 2). Use of the CST was limited to those records assessing hearing-aid technology.
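To make the spread of instruments concrete, the usage counts listed above can be tabulated and filtered. This is a minimal sketch using only the counts reported in the text; the function name and the threshold of eight uses are arbitrary illustrative choices, and because a single record may use several instruments the counts are not expected to sum to 75.

```python
# Usage counts across all 75 records, as reported in the text.
questionnaires = {
    "PHAB": 8,
    "APHAB": 12,
    "APHAB aversiveness subscale": 3,
    "GHABP": 8,
    "SADL": 6,
    "PIPSL": 1,
    "PAL": 4,
    "Munro and Patel loudness scale": 2,
}
performance_tests = {
    "ANL": 6,
    "SAT": 1,
    "speech-in-noise threshold": 12,
    "HINT": 4,
    "CST": 4,
    "SPIN": 2,
}


def widely_used(counts: dict[str, int], threshold: int) -> list[str]:
    """Return the instruments used in at least `threshold` records, sorted by name."""
    return sorted(name for name, n in counts.items() if n >= threshold)


all_instruments = {**questionnaires, **performance_tests}

# The threshold of 8 is an arbitrary cut-off chosen for illustration.
candidates = widely_used(all_instruments, threshold=8)
print(len(all_instruments), candidates)
```

Under this arbitrary cut-off, only four of the 14 listed instruments appear in eight or more records, which illustrates why pooling results across studies is difficult.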

ECOLOGICAL VALIDITY
Ideally, experimental studies should have a high ecological validity, so that results are relevant to the everyday listening situations that hearing-aid users encounter. While the records assessing general satisfaction all involved data collection relating to personal experiences in real-world settings, the records evaluating different effects of hearing-aid technology were typically conducted in the laboratory under artificially controlled and constricted listening environments. For example, some focused on a direct comparison between alternative technological innovations or with respect to a 'standard' hearing aid (e.g. Bille et al, 1999; Dolan & Wonderlick, 2000; Marrone et al, 2008), while others recruited patients only if they met certain eligibility criteria based on degree and/or etiology of hearing loss (e.g. Lowery & Plyler, 2007; Bentler et al, 2008; Moore & Füllgrabe, 2010). Many of the listening performance tests use different artificial masker noises. The SPIN uses multi-talker babble and the HINT uses speech-spectrum noise. The CST was originally developed with six-talker babble as noise, but the four records reported here used speech-spectrum noise. The ANL has been implemented using a variety of different background noises, such as multi-talker babble, cafeteria noise, speech-spectrum noise, or traffic noise. One common aspect in all these tests is that nonspeech environmental sounds are greatly under-represented in these maskers. In fact, Freyaldenhoven et al (2006) found that listeners' preference for different types of background sound, as measured by the ANL, was not related to their acceptance of background noise.

Discussion
This scoping review explored issues relating to the effects of background noise on hearing-aid users in order to identify where knowledge has been established, where findings are suggestive but not definitive, where there are gaps in the existing body of knowledge and where innovative approaches may lie. The discussion gathers these findings together in summary form and makes a number of comments about topics that warrant further research.

Conclusions based on established knowledge
A large proportion of hearing-aid users (about one third) still find particular features and attributes of their device dissatisfying in the presence of background sounds. The most common causes for dissatisfaction relate to listening in noisy environments and conversations in large groups, as well as the undesirable amplification of unwanted background sounds that are not the focus of attention. We identified at least two separate recurring concepts underlying the effects of background sounds: (1) interference of background sounds with speech communication, and (2) aversiveness of the background sounds. This is evident in the research questions posed by the records shown here, the outcome instruments used, and in some of the findings relating to general satisfaction and hearing-aid technology. We acknowledge that the search strategy introduces a potential circular bias. However, while we would expect to find issues relating to interference and aversiveness given the choice of search terms, it was not expected that these would be the only themes to recur throughout the process of collating and summarizing the results.
A wide range of outcome instruments are available for assessing the impact of background noise on aided listening. Development of patient-reported measures tends to have been motivated by clinical application for assessment and rehabilitation, while performance tests focus on laboratory-based measurement of speech perception and comprehension ability in noise under controlled conditions. The PHAB/APHAB are in wide usage across many domains of audiological research (Perez & Edmonds, 2012; Granberg et al, 2014), and the topic under review here is no different. In particular, the subscale assessing aversiveness to background sounds has been informative in longitudinal studies as it indicates that hearing-aid users do not adapt well to this aspect of aided listening despite years of device usage.

Suggestive findings
Substantial research effort has been directed towards the evaluation of hearing-aid technologies. Findings indicate that technological innovations usually benefit speech listening performance. However, as previously highlighted by Granberg et al (2014), most are small-scale proof-of-concept studies. This is acceptable for experimental studies as long as the study design and participant population are carefully considered. If not, then findings may be unreliable. Our impression from the 75 records reviewed is that study methodology did not always reach such quality standards (see sections on Internal and External validity). We are certainly not the first to note this limitation. It is interesting that over 10 years ago, Wong et al (2003) similarly concluded, 'Inconsistent findings across studies and difficulties in evaluating the underlying relationships are probably caused by problems with the tools (e.g. lack of validity) and the methods used to evaluate relationships (e.g. correlation analyses evaluate association and not causal effect)' (p. 117).
While self-ratings of speech communication seem to improve over time as individuals acclimatize to aided listening, the perceived aversiveness of background noise and listening in challenging noisy situations do not. What exactly determines the likelihood of successful acclimatization to aided listening is unclear. The greatest evidence concerns personality factors, but a number of other factors apply such as hearing-aid fitting and verification, cognitive ability, and social lifestyle. It could be informative to explore this multi-factorial space in order to better understand the time course of acclimatization and to identify which factors might particularly exacerbate or minimize dissatisfaction with background sounds. These could be potential targets for personalized rehabilitation.

Knowledge gaps
Very little research has ascertained exactly what sort of background sounds are perceived as annoying or aversive by hearing-aid users in real-life listening situations. None of the records identified has systematically quantified what background sounds are deemed annoying by hearing-aid users. However, there is at least one record outside the date range of our search relevant to this issue. Skagerstrand et al (2014) recently analysed daily diary recordings made by 60 new and experienced hearing-aid users. Findings indicate two types of sound sources causing common problems. First, verbal human sounds (55%) are annoying either where the verbal sounds masked wanted sounds, or simply as acoustical annoyance (e.g. pitch, level). Second, TV or radio sounds (42%) are annoying when there is a fluctuating sound level between speech and music or program and commercials. Age, degree of hearing loss, gender, and hearing-aid experience appear to have no substantial influence, but those with 'simple' signal processing devices found verbal human sounds and vehicles more annoying than those who used 'advanced' signal processing. The authors highlight a need for more thorough investigation of why some sounds are considered annoying and what the determining factors are, adding that knowing which background sounds hearing-aid users find annoying, and why, could help to target improvements in hearing-aid signal processing. However, this line of research first requires verification of hearing-aid fitting, especially in terms of the compression characteristics and loudness limiting so that the alternative explanation of incorrect fitting can be eliminated (cf. Abrams et al, 2012).
Our data extraction identified a broad range of outcome instruments for assessing issues related to background noise (benefit of aided listening in background noise and aversiveness). This makes comparisons across records difficult to undertake. This conclusion is in agreement with the more general systematic review of outcome measures used in research on adults with hearing loss, conducted by Granberg et al (2014). While consensus around choice of outcome instruments is warranted, one of the main challenges may be in overcoming the potential for researcher bias. The frequent use of the most widely used outcome instruments is partly explained, not by their wide adoption in the field, but by their use by a small number of groups with a high number of outputs. Notably, those same groups are also responsible for the development and validation of each tool. The APHAB/PHAP, GHABP and ANL are good examples.

Future research
The first knowledge gap discussed in the preceding section highlights the need for further research to identify the quantitative characteristics of background sounds that interfere with speech communication and/or are perceived as aversive (annoying), and to conduct such studies while controlling for hearing-aid fitting (Abrams et al, 2012). If there are quantitative characteristics of annoying sounds that can be differentiated from desirable speech, then this new knowledge could help to target improvements in hearing-aid signal processing, although this is likely to be challenging. Some of that challenge is encapsulated in the disappointing findings of one recent study exploring the predictive value of various acoustic (e.g. frequency of the spectral peak and spectral energy 3-16 kHz) and psychoacoustic (e.g. loudness, sharpness, and roughness) dimensions to ratings of the pleasantness of different environmental soundscapes (Hall et al, 2013). Predictor variables accounted for only 5% of the variance, leaving most of the variance unexplained. Hence further research is needed to define the characteristics of individual listeners who are more or less annoyed by the same sounds (see also Kidd et al, 2007).
The second knowledge gap highlights the need to agree standards for assessing hearing-aid benefit for listening in background noise and the subjective perception of aversiveness or annoyance. The wide variety of objective performance-based tests and subjective self-report instruments found in this review highlights the lack of agreement about what instrument to use for assessing hearing-aid benefit. Moreover, there are few instruments in use to assess aversiveness and annoyance per se. The International Outcome Inventory for Hearing Aids (IOI-HA) is an attempt to cover a core set of assessment domains (Cox & Alexander, 2002) but none of the seven items in the IOI-HA directly measures the effects of background sounds and aided listening.