Verified by Psychology Today

The Role of Visual Perception in Misophonia

New research shows how visual perception can modulate the misophonic response.

Key points

  • In a newly published paper, researchers show how visual stimuli can reduce misophonic responses.
  • The researchers show that a visual stimulus that suggests an alternative source of an aversive sound can affect people's reactions to the sound.
  • The researchers released a sound-swapped video database to be used by misophonia researchers to develop new interventions to treat the disorder.

Misophonia is a condition in which people experience intense, negative physical and emotional reactions to certain trigger sounds. Trigger sounds often include everyday orofacial sounds, like chewing, slurping, or sniffing, but can also include repetitive human-produced sounds like finger tapping or pen clicking. Researchers estimate that misophonia affects upwards of 10 percent of the population, with as many as 4 percent of people experiencing clinically significant symptoms that interfere with daily functioning.

Although there is no known treatment for misophonia, the past decade has seen a wealth of new research on the condition. Studies have revealed comorbidities between misophonia and other disorders like obsessive-compulsive personality disorder (OCPD), attention-deficit hyperactivity disorder (ADHD), and autism spectrum disorder (ASD; see Jager et al., 2020). In addition, neuroscience research has shown distinct neural circuits in misophonia, including a stronger link between the auditory cortex and orofacial motor areas and stronger activation of the orofacial motor area in response to hearing trigger sounds.

However, new research suggests that it is not just the acoustical properties of sounds that lead to aversive reactions in people with misophonia; other sensory modalities also play a role.

The Role of Visual Perception

In a new paper published this month in Frontiers in Psychology, my colleagues and I examined whether visual stimuli could modulate the misophonic response. Using a methodology we introduced earlier (see Samermit, Saal, and Davidenko, 2019), we created a database of sound-swapped videos. Each pair of videos consisted of a trigger sound presented with either the original video source (OVS) or a non-trigger alternative: a positive attributable video source (PAVS). For example, the sound of someone chewing a crunchy chip could be paired with the original video (OVS) or with a video of someone tearing a piece of paper (PAVS). Critically, the PAVS video needed to be precisely synchronized with the trigger sound so that it seemed like a plausible source of the sound (see example below).

An original video source of someone chewing (left) and a positive attributable video source (someone tearing paper; right).
Source: Nicolas Davidenko

Please watch the three videos below (and note that they share the same sound!):

A Sound-Swapped Video Database

In our paper, we release a sound-swapped video database consisting of 18 OVS and 18 PAVS videos. The full set of videos, along with aggregate ratings of their pleasantness and detailed instructions for how to construct them, can be accessed here.

To validate our stimuli, a group of 102 naive participants from the University of California, Santa Cruz, evaluated the videos and completed the Misophonia Questionnaire (MQ; Wu et al., 2014). Specifically, participants observed each 12-second video and rated how pleasant or unpleasant the sound was. After rating all of the sounds twice (once accompanied by the OVS video and once by the PAVS video, in a randomized order), participants completed the MQ.

The results indicated a robust effect of visual stimuli. Sounds were rated as significantly more pleasant when accompanied by the PAVS video than by the OVS video. This effect was nearly universal, manifesting in 99 of the 102 participants. The figure below shows the difference in pleasantness ratings between PAVS- and OVS-paired sounds for each participant on the y-axis. The x-axis shows each participant's score on the misophonia severity scale.

Pleasantness difference between PAVS- and OVS-paired sounds as a function of individuals' MQ score.
Source: Nicolas Davidenko

A correlation analysis revealed a moderate positive correlation between misophonia severity and the PAVS-OVS difference: Individuals with more misophonia symptoms tended to show a larger benefit of PAVS- over OVS-paired sounds.
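
For readers who want to run this kind of analysis on their own ratings, here is a minimal Python sketch of the two steps described above: computing each participant's PAVS-minus-OVS pleasantness difference and correlating it with their MQ score. The data, the participant labels, the 1-to-5 rating scale, and the function names are all made up for illustration; this is not the analysis code or the data from our paper.

```python
from math import sqrt

# Hypothetical toy data (NOT from the study): each participant has an MQ
# score and pleasantness ratings for the same sounds under both videos.
participants = {
    "p1": {"mq": 4,  "ovs": [3, 3], "pavs": [3, 4]},
    "p2": {"mq": 10, "ovs": [2, 3], "pavs": [3, 4]},
    "p3": {"mq": 20, "ovs": [1, 2], "pavs": [3, 3]},
    "p4": {"mq": 7,  "ovs": [3, 4], "pavs": [3, 4]},
}

def mean(values):
    return sum(values) / len(values)

def pavs_ovs_difference(record):
    """Mean PAVS rating minus mean OVS rating for one participant."""
    return mean(record["pavs"]) - mean(record["ovs"])

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

mq_scores = [rec["mq"] for rec in participants.values()]
differences = [pavs_ovs_difference(rec) for rec in participants.values()]

# How many participants rated PAVS-paired sounds as more pleasant overall
n_benefit = sum(d > 0 for d in differences)

r = pearson_r(mq_scores, differences)
print(f"{n_benefit} of {len(differences)} participants show a PAVS benefit")
print(f"correlation between MQ score and PAVS-OVS difference: r = {r:.2f}")
```

A positive r in a sketch like this corresponds to the pattern in the figure: higher MQ scores going along with a larger PAVS benefit.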

We also examined whether the order of presentation made a difference; that is, does it matter whether an individual observes an OVS-paired sound before or after they observe the PAVS-paired sound? As it turns out, the order of presentation did matter. The average attenuation effect (i.e., the difference in pleasantness ratings) was 0.603 when the PAVS video was presented first, compared to 0.442 when the OVS video was presented first.
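
The order comparison can be sketched the same way. The snippet below groups paired ratings by which video was seen first and averages the attenuation effect (PAVS rating minus OVS rating) within each group. The records, field names, and rating values are hypothetical, and the sketch skips any significance testing; it only illustrates the grouping logic, not the exact procedure in our paper.

```python
from collections import defaultdict

# Hypothetical toy records (NOT from the study): one sound rated by one
# participant under both videos, plus which video they saw first.
trials = [
    {"first": "PAVS", "ovs": 2, "pavs": 3},
    {"first": "PAVS", "ovs": 1, "pavs": 2},
    {"first": "OVS",  "ovs": 2, "pavs": 2},
    {"first": "OVS",  "ovs": 2, "pavs": 3},
]

def mean_attenuation_by_order(trials):
    """Average PAVS-minus-OVS difference, grouped by first-seen video."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for trial in trials:
        totals[trial["first"]] += trial["pavs"] - trial["ovs"]
        counts[trial["first"]] += 1
    return {order: totals[order] / counts[order] for order in totals}

by_order = mean_attenuation_by_order(trials)
for order, effect in by_order.items():
    print(f"{order} video first: mean attenuation = {effect:.3f}")
```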

This order effect suggests that an individual's belief about the source of a sound matters. When an individual is exposed to the original video first, it reduces the effectiveness of seeing the PAVS video later. This is an important detail since it may constrain the long-term effectiveness of this manipulation.

Our research is in line with other recent findings that suggest beliefs and context matter in misophonia. For example, a study by Marie-Anick Savard and colleagues (published in the same issue of Frontiers) shows that responses to misophonic triggers depend on whether the listener correctly identifies the sound source.

Correctly identified trigger sounds led to more negative ratings. Intriguingly, there was no difference in identification abilities between those with high and low misophonia symptoms, suggesting that misophonia is not likely to be driven by a bottom-up auditory process but rather relies on downstream neural and cognitive processes.

Developing an Intervention for Misophonia

In ongoing research, we are examining whether the visual attenuation of misophonic responses can be developed into a type of intervention. If an observer's belief about the source of a sound can modulate their physical and emotional reaction to that sound, it may be possible for individuals to regulate their own reactions by imagining alternative sources.

Returning to our original example, suppose an individual is triggered by the sound of chewing. If they are able to imagine the chewing sound as being produced by a non-trigger source (e.g., someone tearing a piece of paper), this imagery exercise may lead to a reduced emotional and physical response to the sound, which in turn can allow the individual to more easily tolerate the situation.

We hope that by releasing our database of sound-swapped videos, other researchers will use the stimuli to study whether exposure to alternative (non-triggering) sound sources can result in long-term benefits for those suffering from misophonia.


References

Samermit, P., Young, M., Allen, A.K., Trillo, H., Shankar, S., Klein, A., Kay, C., Mahzouni, G., Reddy, V., Hamilton, V., & Davidenko, N. (2022). Development and Evaluation of a Sound-Swapped Video (SSV) Database for Misophonia. Frontiers in Psychology, 13:890829.

Samermit, P., Saal, J., & Davidenko, N. (2019). Cross-sensory stimuli modulate reactions to aversive sounds. Multisensory Research, 32(3), 197-213.

Jager, I., de Koning, P., Bost, T., Denys, D., & Vulink, N. (2020). Misophonia: Phenomenology, comorbidity and demographics in a large sample. PLoS ONE, 15(4), e0231390.

Wu, M. S., Lewin, A. B., Murphy, T. K., & Storch, E. A. (2014). Misophonia: Incidence, phenomenology, and clinical correlates in an undergraduate student sample. Journal of Clinical Psychology, 70, 994–1007. doi: 10.1002/jclp.22098

Savard, M. A., Sares, A. G., Coffey, E. B., & Deroche, M. L. (2022). Specificity of Affective Responses in Misophonia Depends on Trigger Identification. Frontiers in Neuroscience, 722.
