Binaural Unmasking

Author: Jim Johnston
Co-Author: Jürgen Herre

Background information

Binaural Unmasking, or Binaural Masking Level Difference (BMLD) [1], describes the phenomenon that masking can be reduced through binaural hearing. Sounds that would be masked in monaural listening can still be detected when additional binaural cues are available. In a real-world scenario, this typically occurs when sound sources are at different spatial positions.

The phenomenon of BMLD is most prominently observed at low frequencies when a masker (i.e. the signal doing the masking) and a probe (i.e. the signal that is being masked) have specific interaural phase relationships.

The classic experiment is illustrated in Figure 1 and can be described as follows:

Using headphones, a narrowband noise masker of one critical bandwidth is presented together with a sinusoidal probe signal. The masker is presented identically at both ears, whereas the probe's phase relation between the ears is alternated, i.e. presented in phase, and then out of phase.
In the case where both masker and probe are the same in both ears, a masking threshold very much like the single-ear masking threshold is observed, i.e. for the example given, the threshold of masking for the tone probe lies approximately 5.5 dB below the masker level, as shown in the literature.
In the case where the probe is out of phase (but the masker still in phase), a difference between the signal with and without the probe is easily audible at this level. In fact, the threshold of masking is reduced by up to 15 dB.
For the full effect to be noticed, this must be done at or below 500 Hz center frequency, although some effect has been noted up to between 2 and 3 kHz.

Figure 1: Illustration of Binaural Masking Level Difference experiment using noise as masker (M) and a sine tone as probe (P)
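
The two conditions of the experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the original experiment: the function names, the 8 kHz sample rate, the 100 Hz noise bandwidth (a rough stand-in for one critical bandwidth around 400 Hz), the sum-of-sinusoids noise generator and the probe amplitude of 0.3 are all chosen here for brevity. "Out of phase" is modelled as a polarity inversion of the probe in one ear.

```python
import math
import random


def narrowband_noise(center_hz, bandwidth_hz, n_samples, rate_hz, rng, n_components=40):
    """Approximate band-limited noise by summing random-phase sinusoids
    whose frequencies are drawn uniformly from the band (an assumption
    made for this sketch, not the original signal generator)."""
    low = center_hz - bandwidth_hz / 2.0
    parts = [(low + bandwidth_hz * rng.random(), 2.0 * math.pi * rng.random())
             for _ in range(n_components)]
    return [sum(math.sin(2.0 * math.pi * f * i / rate_hz + ph) for f, ph in parts)
            / math.sqrt(n_components)
            for i in range(n_samples)]


def dichotic_pair(masker, probe, probe_out_of_phase):
    """Return (left, right). The masker is identical in both ears; the probe
    is either identical (in phase) or polarity-inverted in the right ear."""
    sign = -1.0 if probe_out_of_phase else 1.0
    left = [m + p for m, p in zip(masker, probe)]
    right = [m + sign * p for m, p in zip(masker, probe)]
    return left, right


rate_hz = 8000
n_samples = 2000  # 0.25 s at 8 kHz
rng = random.Random(0)
masker = narrowband_noise(400.0, 100.0, n_samples, rate_hz, rng)
probe = [0.3 * math.sin(2.0 * math.pi * 400.0 * i / rate_hz) for i in range(n_samples)]

in_l, in_r = dichotic_pair(masker, probe, probe_out_of_phase=False)
out_l, out_r = dichotic_pair(masker, probe, probe_out_of_phase=True)

# Interaural difference (left minus right): zero when everything is in phase,
# twice the probe when the probe is inverted in one ear.
diff_in = [l - r for l, r in zip(in_l, in_r)]
diff_out = [l - r for l, r in zip(out_l, out_r)]
```

In the in-phase condition the two ear signals are identical, so there is no interaural difference at all. In the out-of-phase condition the difference between the ears contains only the probe and the sum contains only the masker, which is one way to see why the probe becomes available to binaural processing even though it stays masked in each ear alone.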

Conversely, the phase of the masker can be alternated while the phase of the probe stays the same. In both cases, the interaural phase relation of masker vs. probe is radically changed. The sensitivity of the auditory system to the two changes is not symmetric, however. It is also possible to repeat the experiment with a sinusoidal masker and narrowband noise as the probe; the effect, while present, is then substantially reduced.

Sound examples

The two signal pairs presented here are the classic signals, generated directly in MATLAB. Test signals A1 and A2 use in-phase probe tones of 400 Hz frequency with probe levels of 9 dB and 6 dB below the level of the masker, respectively. Test signals B1 and B2 use out-of-phase probe tones with the same frequency and the same probe levels of 9 dB and 6 dB below the level of the masker, respectively. The masker noise is presented in-phase in all cases, despite the sometimes startling spatial effects arising from masking release.
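
The probe levels translate into amplitudes as sketched below. This is a minimal illustration and not the original MATLAB code: it assumes that "level" means RMS level, that the masker has been normalised to unit RMS, and the names `probe_amplitude` and `TEST_SIGNALS` are hypothetical.

```python
import math


def probe_amplitude(masker_rms, relative_level_db):
    """Peak amplitude of a sine whose RMS level sits relative_level_db
    (negative means below) relative to the masker RMS."""
    probe_rms = masker_rms * 10.0 ** (relative_level_db / 20.0)
    return math.sqrt(2.0) * probe_rms  # for a sine, peak = sqrt(2) * RMS


# Hypothetical table of the four test signals:
# name -> (probe level relative to masker in dB, probe inverted in one ear)
TEST_SIGNALS = {
    "A1": (-9.0, False),
    "A2": (-6.0, False),
    "B1": (-9.0, True),
    "B2": (-6.0, True),
}

masker_rms = 1.0  # assumed: masker normalised to unit RMS
for name, (level_db, inverted) in TEST_SIGNALS.items():
    amp = probe_amplitude(masker_rms, level_db)
    phase = "out-of-phase" if inverted else "in-phase"
    print(f"{name}: probe {phase}, peak amplitude {amp:.3f}")
```

Under these assumptions, a probe 9 dB below the masker has about 35% of the masker's RMS amplitude and a probe 6 dB below has about 50%. The A and B signals of equal level differ only in the sign of the probe in one ear.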

Play Test Signal A1: Probe in-phase, probe level 9 dB below masker
Play Test Signal A2: Probe in-phase, probe level 6 dB below masker
Play Test Signal B1: Probe out-of-phase, probe level 9 dB below masker
Play Test Signal B2: Probe out-of-phase, probe level 6 dB below masker

When listening to signal pair A1/A2, increasing the probe level does not lead to an audible difference in the perceived sound, since the probe is masked in both cases. This is in line with classic monaural psychoacoustic observations, which assume a masking threshold of ca. 5.5-6 dB below the masker level for this case. On the other hand, when listening to test signal B1 and comparing it to test signal A1 (i.e. switching from an in-phase to an out-of-phase probe), an audible difference in sound impression is perceived due to the reduction of the masking threshold by the BMLD effect. This effect is even more pronounced when listening to test signal B2, i.e. with a probe level of 6 dB below the masker.

While these signals do not directly mimic most coding artifacts, they do provide a good example of the kind of artifacts to listen for, and one that has proven good at sensitizing listeners to more subtle coder-induced artifacts involving BMLD.

For natural signals, the effect of BMLD can be demonstrated with a mono speech signal (i.e. in-phase) to which simulated coding noise is added in-phase or out-of-phase, respectively. The signal-to-noise ratio (SNR) is the same in both cases. In the first example (16 dB SNR), the in-phase noise is above the masking threshold, but it is only perceived as a slight distortion of the speech signal. In contrast, the out-of-phase noise is perceived as spatially separated from the original signal, leading to much more noticeable artifacts than for in-phase noise. The artifact evoked by out-of-phase noise might be described as independent "scratching" located at the left and right ear positions. The same effect has been observed in certain well-known critical test items, such as Suzanne Vega / Tom's Diner, in mono vs. stereo in early tests of coded stereo material. Additionally, the second example (10 dB SNR) uses exaggerated artifacts to further demonstrate the differences in spatial perception between clearly audible in-phase and out-of-phase noise.
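
The construction of such a pair can be sketched as follows. This is an illustration under stated assumptions rather than the way the examples on this page were produced: the speech recording is replaced by a synthetic decaying two-tone signal, the simulated coding noise by white Gaussian noise, and the function names `add_noise` and `channel_snr_db` are hypothetical.

```python
import math
import random


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def add_noise(mono, noise, snr_db, out_of_phase):
    """Mix noise into a mono signal at the given SNR and return (left, right).
    The noise is polarity-inverted in the right channel when out_of_phase is True."""
    gain = rms(mono) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    sign = -1.0 if out_of_phase else 1.0
    left = [s + gain * n for s, n in zip(mono, noise)]
    right = [s + sign * gain * n for s, n in zip(mono, noise)]
    return left, right


def channel_snr_db(channel, mono):
    """SNR of one channel, treating everything that is not the mono signal as noise."""
    residual = [c - s for c, s in zip(channel, mono)]
    return 20.0 * math.log10(rms(mono) / rms(residual))


rng = random.Random(0)
# Stand-in for the speech recording (assumption): a decaying two-tone signal.
mono = [math.exp(-i / 4000.0) * (math.sin(0.05 * i) + 0.5 * math.sin(0.13 * i))
        for i in range(8000)]
# Stand-in for simulated coding noise (assumption): white Gaussian noise.
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for snr_db in (16.0, 10.0):
    for out_of_phase in (False, True):
        left, right = add_noise(mono, noise, snr_db, out_of_phase)
        print(snr_db, out_of_phase,
              round(channel_snr_db(left, mono), 2),
              round(channel_snr_db(right, mono), 2))
```

The per-channel SNR is identical in the in-phase and out-of-phase versions, so any difference in audibility comes from the interaural relation of the noise and not from its level. In this sketch the out-of-phase noise also cancels in the L+R sum, so a mono downmix would hide it entirely.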

Play Original: Female Speech Mono
Play 16 dB SNR Noise Mono: Noise in-phase, speech slightly distorted
Play 16 dB SNR Noise BMLD: Noise out-of-phase, more noticeable artifacts, "scratching" at headphones
Play 10 dB SNR Noise Mono: Noise in-phase, speech audibly distorted
Play 10 dB SNR Noise BMLD: Noise out-of-phase, noise spatially separated from speech

To hear imaging artifacts in general, the listener must stop, at least temporarily, focusing on the usual range of artifacts, allow the stereo signal to construct a soundstage, and then listen for artifacts within it. More specifically, one should attend to the position of things in the soundstage, noting both omissions, i.e. elements of the reference that are missing, and commissions, i.e. new additions to the soundstage that are not present in the reference. Either kind of error can occur.

It is worthwhile to note that while the classic BMLD does not operate above about 2 kHz, there is a similar effect based on the signal envelope that can create audible problems, for instance in certain recordings of a harpsichord, where it can be noticed as a "dirty surround" at frequencies above 3 kHz. The occurrence of such problems is very signal-specific, but again related to the spatial perception of the signal and any new or missing parts in the spatial response.

One interesting case that has been observed is in fact a "missing" artifact: in a rock cut called "Dorita" by Lou Reed, the stereophonic presentation of a coded signal appeared to be missing high frequencies, while either channel of the signal presented monophonically (over either one or two speakers) did not. This effect was eventually correlated with the high-frequency envelope of the signal in the two channels. When the high-frequency envelope of the two channels was effectively randomized (it being highly correlated, but delayed a bit, in the original), the auditory system appeared to understand the high-frequency sounds as not part of the guitar "auditory object". When the high-frequency envelope was corrected, the high frequencies returned to the sound of the guitar. Bear in mind that throughout this experience, the high frequencies sounded normal in a monophonic presentation, be it of the left, right, or summed L+R channels.

References

[1] J. Blauert, "Spatial Hearing", MIT Press, 1983