Introduction

“Use of Outcome Measures in Clinical Practice: Part 1” discussed the rationale for routine use of outcome measures in clinical practice. It described the development of the Hearing Aid Follow-Up Survey (HAFUS) and the issues surrounding its implementation. Part 1 also includes the details for the group of 303 patients (Group 1) described in the analysis of the HAFUS, and the final HAFUS form appears in Appendix A of Part 1. Part 2 of this article discusses some of the factors that may have influenced outcomes of hearing aid fittings. It also discusses results we have accumulated over the subsequent three and a half years and some of the ways the HAFUS has been used in our practice. This project has been approved by the Human Subjects Institutional Review Board of Western Michigan University.

Comparison to Independent Variables

There are many variables that may contribute to the success or failure of a hearing aid fitting. We were interested in determining if any of the more commonly discussed independent variables might have affected the outcomes of our group of 303 patients. Specifically, we considered the patient variables of age, gender, degree of hearing loss, previous experience with hearing aids, and financial contribution from third-party payers to the purchase of their hearing aids. We also considered hearing aid variables including level of technology, use of one or two hearing aids, and whether the ear canal was partially or fully occluded versus open fit. 

FIGURE 1. Mean HAFUS score on the noise/reverberation subscale (items 10, 12, 15, 16) for patients separated by age decade. Error bars = 1 standard deviation. Decade 5 includes all patients age 59 years or younger. Decade 6 = ages 60–69 years. Decade 7 = ages 70–79 years. Decade 8 = ages 80–89 years. Decade 9 = all patients age 90 years or older.

In most cases, we chose to test for significance using only the noise/reverberation subscale, since the average scores were closer to the middle of the range than for other items. Also, our patients commonly tell us they would like to hear better in these more difficult situations. In some cases, we also tested for significance of other items that seemed relevant for the specific variable (e.g., bothered by own voice for occluded/open fit). Data were analyzed using Excel 2010 and IBM SPSS Statistics 24. Analysis used parametric methods, as in other recent studies (Cox et al, 2016; Smith et al, 2013).
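As an illustration of how a subscale score of this kind can be computed, the following minimal Python sketch averages the numeric responses for the items named in the figure captions (items 10, 12, 15, 16 for noise/reverberation; items 9, 11 for quiet). The function name, the dictionary layout, and the choice to return no score when an item is unanswered are assumptions made for the example; the article does not state how missing items were handled.

```python
from statistics import mean

# Item numbers as given in the figure captions.
NOISE_ITEMS = (10, 12, 15, 16)
QUIET_ITEMS = (9, 11)


def subscale_score(responses, items):
    """Average the numeric responses for the given item numbers.

    responses: dict mapping item number -> numeric score (lower = better).
    Returns None if any item in the subscale is unanswered. This missing-data
    rule is an assumption for illustration, not the authors' stated method.
    """
    values = [responses.get(item) for item in items]
    if any(v is None for v in values):
        return None
    return mean(values)


# Hypothetical patient: scores on the four noise/reverberation items.
example = {9: 1, 11: 2, 10: 2, 12: 3, 15: 3, 16: 4}
noise = subscale_score(example, NOISE_ITEMS)
quiet = subscale_score(example, QUIET_ITEMS)
```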

Age comparisons were made with patients sorted into decades. Because few patients were under 60 years of age, they were combined into the 50s decade. Similarly, all those 90 years of age or older were combined. This resulted in five groups. FIGURE 1 shows a gradual progression in difficulty hearing in noise/reverberation as age increases. Lower scores on the HAFUS indicate better outcomes. Analysis of variance (ANOVA) demonstrated a significant age effect (F (4, 290)=2.85, p=.02). However, post hoc testing (Tukey) revealed a significant difference (p=.021) only between patients aged 59 or younger and those 90 or older. Cohen’s d of .74 indicated this to be a medium effect. There was about an 11 dB difference in hearing loss between the youngest and oldest groups (44 dB HL vs. 55 dB HL), which may at least partially explain the difference.
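The age grouping described above can be sketched in a few lines of Python. The group labels follow the FIGURE 1 caption (5 = 59 years or younger, 9 = 90 years or older); the function name is hypothetical and the sketch is only one way to express the rule.

```python
def age_decade_group(age_years):
    """Map an age in years to the decade label used in FIGURE 1.

    Ages 59 and younger are pooled into group 5 and ages 90 and older
    into group 9, which leaves five groups in total.
    """
    if age_years < 60:
        return 5
    if age_years >= 90:
        return 9
    return int(age_years // 10)


# Hypothetical ages, one per resulting group.
groups = [age_decade_group(a) for a in (45, 63, 74, 88, 93)]
```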

There were 144 (47.5 percent) males and 159 (52.5 percent) females in the total group of 303 patients. The male mean age was 74 years and the mean pure tone four frequency average (PTA4=.5, 1, 2, 4 kHz) was 49 dB HL. For the females, the mean age was 77 years and the mean PTA4 was 52 dB HL. The mean noise subscale score was 2.65 (sd 1.06) for the males and 2.77 (sd 1.28) for the females. The difference was not significant (t (293)=-.837, p=.403). 

FIGURE 2. Mean HAFUS score on the noise/reverberation subscale (items 10, 12, 15, 16) for patients separated by 4 frequency average (.5, 1, 2, 4 kHz) degree of hearing loss. Error bars = 1 standard deviation. Decade 3 = PTA4 39 dB HL or less. Decade 4 = PTA4 40–49 dB HL. Decade 5 = PTA4 50–59 dB HL. Decade 6 = PTA4 60–69 dB HL. Group 7 = all patients with PTA4 70 dB HL or greater.

Hearing loss comparisons were made with patients separated into 10 dB decades, based on the PTA4 for the better ear in bilateral fittings or the aided ear in unilateral fittings. Due to the small numbers of patients with hearing loss less than 30 dB HL or greater than 79 dB HL, these were combined into groups 39 dB HL or less and 70 dB HL or greater. The mean data (see FIGURE 2) showed a significant increase in difficulty hearing in noise with increased thresholds (F (4, 290)=3.875, p=.004). However, Tukey post hoc testing showed the difference occurred only between those with 39 dB or less loss and those of 70 dB or greater loss (p=.009). Cohen’s d of .66 indicated this to be a medium effect. The difference could not be explained by age, as those in groups with the least and most hearing loss were of the same average age (69 years), while those in the middle groups were actually older (average age ranged from 76–81 years).
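The hearing loss grouping can be illustrated in the same way. This sketch computes PTA4 as the mean threshold at .5, 1, 2, and 4 kHz, takes the better ear for bilateral fittings or the aided ear for unilateral fittings, and assigns the group labels from the FIGURE 2 caption. The function names, the dictionary layout, and the treatment of fractional averages (anything below 40 dB HL falls in the lowest group) are assumptions for the example.

```python
from statistics import mean

PTA4_FREQS_KHZ = (0.5, 1, 2, 4)


def pta4(thresholds_db_hl):
    """Four-frequency pure tone average from a dict of kHz -> dB HL."""
    return mean([thresholds_db_hl[f] for f in PTA4_FREQS_KHZ])


def reference_pta4(left, right, aided_ears):
    """PTA4 used for grouping; aided_ears is 'both', 'left', or 'right'."""
    if aided_ears == "both":
        # Better ear = lower (less impaired) average threshold.
        return min(pta4(left), pta4(right))
    return pta4(left if aided_ears == "left" else right)


def hearing_loss_group(pta4_db_hl):
    """Group label per FIGURE 2: 3 = 39 dB HL or less, 7 = 70 dB HL or more."""
    if pta4_db_hl < 40:
        return 3
    if pta4_db_hl >= 70:
        return 7
    return int(pta4_db_hl // 10)


# Hypothetical audiogram values in dB HL.
left_ear = {0.5: 30, 1: 40, 2: 50, 4: 60}    # PTA4 = 45
right_ear = {0.5: 50, 1: 60, 2: 70, 4: 80}   # PTA4 = 65
bilateral_group = hearing_loss_group(reference_pta4(left_ear, right_ear, "both"))
```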

Experienced hearing aid users (n=212) reported significantly greater (t (293)=-3.032, p=.003) difficulty hearing in noise than new hearing aid users (n=83) with mean noise/reverberation subscale scores of 2.8 and 2.4, respectively. Cohen’s d of .41 indicated this to be a small effect. The mean and standard deviation for the two groups are shown in FIGURE 3. While the two groups were both about 76 years old, the milder hearing loss of the new users (PTA4 of 42 dB HL for new users vs. 54 dB HL for experienced users) seemed a likely explanation for the difference. 
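For readers who want to compute an effect size from group summary statistics, the sketch below uses the common pooled-standard-deviation form of Cohen's d. The article does not state which variant of d was calculated, so this formula is an assumption, and the numbers in the example are invented rather than taken from the study.

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups using the pooled SD.

    Assumes the pooled-variance definition; other variants (e.g. using
    one group's SD only) would give slightly different values.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var)


# Invented example: two groups of 10 with equal SDs of 1.0.
d_example = cohens_d(2.5, 1.0, 10, 2.0, 1.0, 10)
```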

FIGURE 3. Mean HAFUS score on the noise/reverberation subscale (items 10, 12, 15, 16) for patients separated by level of hearing aid experience. Error bars = 1 standard deviation. Group 1 = first time hearing aid wearers. Group 2 = experienced hearing aid wearers.

Nearly half of our patients received some contribution to the cost of their hearing aids from insurance. Those with insurance (n=148) had an average age of 74 years and PTA4 of 52 dB HL. Those without insurance (n=155) had an average age of 78 years and PTA4 of 49 dB HL. The mean score on the noise/reverberation subscale was 2.73 and 2.68 for those with and without insurance, respectively. This difference was not significant (t (292)=.373, p=.71). 

Patients in our group were fit with diverse technology as evidenced by number of bands/channels in the hearing aids. There were instruments with 4, 6, 12, or 20 bands/channels fit to 36, 202, 45, and 20 patients, respectively. Mean PTA4 for the groups were similar, ranging from 50–57 dB HL. Means and standard deviations for the noise/reverberation subscale are shown in FIGURE 4. Analysis of variance did not reveal a significant difference among groups (F (3, 291)=.911, p=.436). The lack of difference across levels of technology has also been reported in studies by Cox et al (2016) and Johnson et al (2016) as well as earlier research by Humes et al (1996), Humes et al (2009), Humes et al (2009), among others. 

FIGURE 4. Mean HAFUS score on the noise/reverberation subscale (items 10, 12, 15, 16) for patients separated by level of technology in hearing aids. Error bars = 1 standard deviation. Group 4 = 4 bands/channels. Group 6 = 6 bands/channels. Group 12 = 12 bands/channels. Group 20 = 20 bands/channels.

Current clinical practices call for bilateral fitting except when there are contraindications. Our counseling during the hearing aid selection process helps patients understand the potential benefits of bilateral amplification and also makes them aware that many patients are successful with a unilateral fitting. Of 295 patients with usable surveys for this comparison, 83 (28 percent) wore one hearing aid and 212 (72 percent) wore two hearing aids. However, not all of the bilateral wearers purchased two new hearing aids. Some were previous unilateral wearers who elected to purchase a single aid for the opposite ear. Some were bilateral wearers who elected to replace only one aid, perhaps due to loss, damage, or a “unilateral only” insurance benefit. Preliminary analysis did not show a difference among these bilateral groups, so they were combined. We chose to compare unilateral versus bilateral fittings on wear time (Q1), loudness (Q4, Q5), natural sound quality of own or others’ voices (Q7, Q8), the hearing in quiet subscale (average Q9, Q11), and the hearing in noise/reverberation subscale (average Q10, Q12, Q15, Q16). Means and standard deviations are shown in FIGURE 5. The t-test results did not reach significance (p>.05) for any comparison. This finding is consistent with research by Cox et al (2011), who found that a significant minority of their subjects preferred unilateral fitting.

FIGURE 5. Mean HAFUS score for groups separated by unilateral (blue bars) or bilateral (red bars) hearing aid use. Error bars = 1 standard deviation. Q1 = hearing aid wear time. Q4 = hearing aids are loud enough for conversation. Q5 = hearing aids are not uncomfortably loud. Q7 = others’ voices have natural sound quality. Q8 = own voice has natural sound quality. Quiet = hearing in quiet items 9, 11. Noise = hearing in noise/reverberation items 10, 12, 15, 16.

There are many styles of hearing aids currently in use. One of the important differences among them is the degree of ear canal occlusion. In fact, a recent survey was developed for exploring style preferences (Smith et al, 2013). We chose to compare outcomes based on level of occlusion. Group 1 (n=220) consisted of those patients fit with a conventional ear mold coupled to a behind-the-ear aid or a custom in-the-ear type aid of any size. The ear piece may have been fully occluding or incorporated some level of venting. Group 2 (n=75) consisted of those patients fit with a receiver-in-canal or thin, tube-style behind-the-ear aid terminated with a non-occluding dome. There were a few cases that started with one style of hearing aid but were changed to another style due to difficulties that became apparent during follow up visits. Several comparisons were made of these styles as there might also be expected to be differences in comfort, sound quality, and feedback (FIGURE 6). 

The average noise/reverberation subscale score was 2.8 for the occluded group and 2.4 for the open fit group, indicating better performance for the latter. This difference was significant (t (293)=2.521, p=.012). A significant difference was found for wear time (t (291)=-3.013, p=.001), with the occluded group having greater time of use than the open fit group (Q1 score=1.12 vs. 1.28). Also, ease of handling yielded a significant difference between groups (t (296)=1.816, p=.007), with the open fit group reporting better scores than the occluded group (Q3 score=1.43 vs. 1.67, respectively). Cohen’s d indicated a small effect in each case (d=.26–.38). No significant difference (p>.05) was found for comfort of fit (Q2), loudness of average or strong sound (Q4, Q5), feedback (Q6), natural sound quality of own or others’ voices (Q7, Q8), or on the hearing in quiet subscale (average Q9, Q11).

In our practice, non-occluding products are most often used for younger patients (70.5 years) with less hearing loss (PTA4 37 dB HL), especially in the low frequencies. More occluding products are typically recommended for older patients (77.8 years) with less dexterity and with greater hearing loss (PTA4 55 dB HL), especially in the low frequencies. The significant difference for hearing in noise may be due more to these differences than to the degree of occlusion. The longer wear time for the occluded group is also likely explained by their greater hearing loss requiring more frequent use of amplification. In the Smith et al (2013) study, better outcomes were found with non-occluding fits than traditional fits for both a clinical group and an experimental group. Unlike our groups, their fittings used the two different styles on patients with the same degree/configuration of hearing loss. However, we believe the outcomes from Smith et al and the present study indicate that either style can be successful when fit appropriately.

FIGURE 6. Mean HAFUS score for groups separated by occluded/vented (blue bars) or open fit (red bars) hearing aid use. Error bars = 1 standard deviation. Q1 = hearing aid wear time. Q2 = comfort of fit. Q3 = ease of handling. Q4 = hearing aids are loud enough for conversation. Q5 = hearing aids are not uncomfortably loud. Q6 = squeal/feedback. Q7 = others’ voices have natural sound quality. Q8 = own voice has natural sound quality. Quiet = hearing in quiet items 9, 11. Noise = hearing in noise/reverberation items 10, 12, 15, 16.

The factors of age, degree of hearing loss, hearing-aid experience, and ear-canal occlusion, which were significant in the above univariate analysis, were included in multivariate testing with the score on the noise/reverberation subscale as the dependent variable. There were no significant interaction effects among any of the factors (p>.05). When these factors were included in a regression analysis, they explained about nine percent of the total variance.

Group 2

As noted previously, our use of the HAFUS has been ongoing since 2010. We began accumulating results from a second group in December 2013. There were two minor modifications to the HAFUS following the initial group. First, the instructions added the statement, “you will not hurt our feelings” [with negative responses] in an attempt to reduce the “halo effect.” Second, item 6 was modified because it seemed patients may have been responding to the brief period of initial feedback that often occurs until the system stabilizes. Patients were asked to report on feedback that occurred after the first few seconds the aids were turned on. 

We began analysis with the next 356 consecutive surveys. We disqualified 34 surveys because they were not completed independently, did not respond to Q9 (hearing one other person in quiet), or provided the same response to all items. This left 322 surveys available for analysis. The average age of Group 1 and Group 2 was the same: 76 years. We did not gather all of the patient hearing loss and hearing aid data that were included for Group 1. The overall pattern of results for Group 2 was essentially the same as for Group 1. Means and standard deviations are shown in FIGURE 7. Results of ANOVA and post hoc testing will be described together with the next group.
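The three disqualification rules lend themselves to a simple screening function. The sketch below is illustrative only: the field names (`completed_independently`, `responses`) and the letter-coded answers are hypothetical and do not come from the authors' records.

```python
def is_usable(survey):
    """Apply the three disqualification rules described in the text.

    survey: dict with a boolean 'completed_independently' flag and a
    'responses' dict mapping item number -> answer letter (or None if blank).
    Both field names are hypothetical.
    """
    responses = survey["responses"]
    if not survey.get("completed_independently", True):
        return False  # rule 1: completed with help from someone else
    if responses.get(9) is None:
        return False  # rule 2: no answer to Q9
    answered = [v for v in responses.values() if v is not None]
    if len(answered) > 1 and len(set(answered)) == 1:
        return False  # rule 3: same response to every item
    return True


# Hypothetical surveys (only a few items shown for brevity).
surveys = [
    {"completed_independently": True, "responses": {9: "A", 10: "C", 11: "B"}},
    {"completed_independently": False, "responses": {9: "A", 10: "C", 11: "B"}},
    {"completed_independently": True, "responses": {9: None, 10: "C", 11: "B"}},
    {"completed_independently": True, "responses": {9: "B", 10: "B", 11: "B"}},
]
usable = [s for s in surveys if is_usable(s)]
```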

Group 3

There is an opportunity to assess hearing-aid outcomes any time that a patient returns to the office. We have used the HAFUS to assess outcomes during semiannual hearing aid clinics that we offer to our patients. Patients were notified of clinic dates in our newsletter and encouraged to schedule brief visits for routine hearing aid care. Those who requested an appointment due to significant issues were scheduled on another day to allow sufficient time to address their problems. Therefore, most of those attending the clinic were generally satisfied with their hearing aids. Patients completed the HAFUS upon arrival to the office. As we met with them, we were able to quickly scan the survey and determine if it appeared further assistance was warranted. 

FIGURE 7. Mean HAFUS scores for 3 groups separated by time after fitting that the scale was completed. Group 1 (blue bars), the initial data set of 303 patients, and Group 2 (red bars), subsequent group of 330 patients, completed the scale 2–3 months post hearing aid fitting. Group 3 (black bars), 162 patients, completed the scale an average of 3–4 years post fitting. Error bars = 1 standard deviation. Q1 = hearing aid wear time. Q2 = comfort of fit. Q3 = ease of handling. Q4 = hearing aids are loud enough for conversation. Q5 = hearing aids are not uncomfortably loud. Q6 = squeal/feedback. Q7 = others’ voices have natural sound quality. Q8 = own voice has natural sound quality. Quiet = hearing in quiet items 9, 11. Noise = hearing in noise/reverberation items 10, 12, 15, 16.

We compiled the results from the HAFUS at two of these occasions. The first event produced usable surveys from 84 patients. Their average age was 79 years (SD 9.90) and their hearing aids had been obtained an average of 3.02 years (SD 1.91) previously. The second event produced usable surveys from 78 patients. Their average age was 80 years (SD 8.52) and their hearing aids had been obtained an average of 4.0 years (SD 3.52) previously. We combined the results from these two events for comparison to our normative group (Group 1) and the ongoing follow-up group (Group 2). Means and standard deviations are shown in FIGURE 7.

The general trends were quite similar. All three groups reported wearing their hearing aids most of the time. They all heard better in quiet situations than noisy situations. They all reported improved quality-of-life. Despite having hearing aids averaging three to four years older than the normative group, the average decrease in score across all 18 HAFUS items was only .15 scale units. 

Using ANOVA with Tukey post hoc testing, we compared several HAFUS items that we felt might be affected by hearing aid age or increased experience. Item Q3, “hearing aids are easy to handle,” was better for Group 3 with older hearing aids than Group 2 but not for other comparisons (F (2, 748)=3.846, p=.022). Item Q4, “hearing aids are loud enough for most conversations,” was poorer for Group 3 than Groups 1 or 2 (F (2, 744)=6.617, p=.001). For item Q6, more feedback was noted for Group 1 than Group 2 but not for other comparisons (F (2, 753)=8.753, p=.001). It is possible that this difference is due to the wording change noted above that instructed patients to attend to feedback only after the first few seconds the aids were turned on. Significant differences were not found for the quiet items or the quiet subscale. However, all of the individual noise items and the noise/reverberation subscale were significantly poorer for Group 3 than either Group 1 or 2 (depending on the comparison, statistical values ranged from F=4.360–8.166, df=2, 685–734, p=.013–.001). The mean difference for noise items approximated .5 scale units. Cohen’s d was determined for each item with significant differences, and in each case the effect was small (d=.24–.38).

Further review of the Group 3 data from the first event referenced above revealed that 50 of the 84 patients had also completed the HAFUS at the normal interval of two to three months post fitting. Their average age was 79 years. Their hearing aids had been obtained an average of 2.17 years previously, making these devices nearly a year newer than for the group as a whole. The mean difference for each of the 18 HAFUS items did not exceed .25 scale units except for item 6, where it reached .29 scale units, and item 14, where it reached .5 scale units. It appears that the differences noted for the three groups had not yet emerged during the first two years after the hearing aids were purchased. Consequently, it appears the differences are due to those members of Group 3 with older hearing aids.

It is possible that some of the changes observed in HAFUS scores are the result of greater experience in more diverse listening conditions (Group 3) than had occurred after the first two months of use (Groups 1 and 2). However, this is unlikely as the Group 3 subgroup that completed the HAFUS at both two to three months post fitting and two years post fitting showed little change. It seems more likely that the differences were due to changes in hearing or electroacoustic performance of the hearing aids, neither being assessed during the brief visits when the surveys were completed. Either of these circumstances would result in less audibility for Group 3 which would be more evident in difficult listening conditions. Our standard procedure for periodic hearing re-examinations includes both coupler and probe microphone measures with modification to hearing aid programming as indicated. We are encouraged that the HAFUS seems sensitive to differences over time. A decline in HAFUS scores may indicate a need for reevaluation of hearing or hearing aid performance. Dillon (2012) provided a review of long-term changes from several studies with different outcome measures. Major changes were not reported, consistent with the present analysis.

Use as a Pre-fitting Measure

We began using the HAFUS as a prefitting measure near the end of 2014. The instructions and survey items were modified by removing reference to the hearing aid in rating difficulty in eight communication situations and the effect of hearing loss on quality-of-life (Appendix A). The scale was then incorporated into our case history form. This gives us the opportunity to discuss situations which patients feel cause them difficulty and to help assess the patients’ perceived need for help. Comparing pre- and post-fitting HAFUS provides a measure of benefit, as with the APHAB (Cox and Alexander, 1995). To obtain a difference score, the aided response is subtracted from the unaided response after converting from alpha to numeric (A=1, B=2, etc.). However, determining a difference score for Item 18, quality of life, requires a transformation such that unaided “strongly agree,” a negative response, is assigned a score of 7 while aided condition “strongly agree,” a positive response, remains a score of 1, as with all other positive responses. 
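The scoring rule just described can be sketched as follows. The letter-to-number conversion and the Item 18 transformation follow the text; the seven-point scale length is inferred from the score of 7 assigned to unaided “strongly agree,” and extending the reversal linearly to the other unaided Item 18 responses is an assumption. Function names are hypothetical.

```python
def letter_to_score(letter):
    """Convert an alpha response to its numeric score (A=1, B=2, etc.)."""
    return ord(letter.upper()) - ord("A") + 1


def difference_score(item, unaided_letter, aided_letter, scale_max=7):
    """Benefit for one item: unaided score minus aided score.

    For Item 18 the unaided response is reverse scored so that unaided
    'strongly agree' (A) becomes 7, while the aided response is left as is.
    scale_max=7 and the linear reversal are assumptions for illustration.
    """
    unaided = letter_to_score(unaided_letter)
    aided = letter_to_score(aided_letter)
    if item == 18:
        unaided = scale_max + 1 - unaided
    return unaided - aided


# Hypothetical responses: item 10 moves from E (unaided) to B (aided).
benefit_item_10 = difference_score(10, "E", "B")
# Item 18: "strongly agree" (A) in both the unaided and aided surveys.
benefit_item_18 = difference_score(18, "A", "A")
```

A positive difference score indicates benefit, since lower HAFUS scores represent better outcomes.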

FIGURE 8. Mean scores for HAFUS items 9–18 administered to 58 consecutive patients as both a prefitting (Group 2 = red bars) and a postfitting (Group 1 = blue bars) measure. Descriptions of Q9–Q18 are provided in Appendix A.

We examined 58 consecutive cases for new patients who completed both pre- and post-fitting HAFUS surveys during the period of November 2014 to May 2016. The results are shown in FIGURE 8. The mean benefit ranged from 1.85–3.06 scale units across the nine items. Paired samples t-tests were significant beyond the .001 level (depending on the item, statistical values ranged from t=-10.286–5.380, df=50–57) for all comparisons. Cohen’s d ranged from 1.07–2.01, medium to large effects, on the nine items. We interpret this finding as an indication of the success of our fitting and follow-up care. Furthermore, we believe the improvement observed is another indication of the validity of the HAFUS. For our patients, the provision of effective amplification appears more important than additional benefit that may be attributed to other factors (e.g., unilateral versus bilateral; level of technology).

Conclusion

Incorporating outcome measures into routine use in our practice has been gratifying. We feel we are doing a better job of considering the patient perspective in the care we provide. It has given us confidence that the recommendations we make are justified. Our patients appreciate the concern we show by sending them the survey and particularly by contacting them when problems are indicated. 

The HAFUS is a quick and easy-to-use outcome measure. It is reliable and sensitive to communication changes associated with hearing aid use. Others may feel free to incorporate the HAFUS into their practices. However, there are many outcome measures available. Audiologists should select an outcome measure that they can consistently apply to help meet their goals. 

We believe that the outcomes we observed with our patients are consistent with research that has controlled variables such as unilateral/bilateral fitting or level of technology. As manufacturers introduce new products and technologies, independent research is not usually available to help practicing audiologists make evidence-based decisions. Sometimes it is many years before such evidence exists. Audiologists must make decisions based on information from the manufacturer, clinical experience, and anecdotal evidence from patients who have either succeeded or failed with a particular fitting. Consistent use of outcome measures could give audiologists a growing body of evidence to support their clinical decisions while they wait for results from controlled studies.