AudioLabs - NBU: Neural Binaural Upmixing of Stereo Content

NBU: Neural Binaural Upmixing of Stereo Content

Philipp Grundhuber, Michael Lovedee-Turner, and Emanuël A. P. Habets

Published in the Proceedings of the International Conference on Digital Audio Effects (DAFx), 2024.

Abstract

While immersive music productions have become popular in recent years, music content produced during the last decades has been predominantly mixed for stereo. This paper presents a data-driven approach to automatic binaural upmixing of stereo music. The network architecture HDemucs, previously utilized for both source separation and binauralization, is leveraged for an end-to-end approach to binaural upmixing. We employ two distinct datasets, demonstrating that while custom-designed training data enhances the accuracy of spatial positioning, the use of professionally mixed music yields superior spatialization. The trained networks show a capacity to process multiple simultaneous sources individually and add valid binaural cues, effectively positioning sources with an average azimuthal error of less than 11.3 degrees. A listening test with binaural experts shows that the proposed approach outperforms digital signal processing-based approaches to binauralization of stereo content in terms of spaciousness while preserving audio quality.
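
To illustrate what an average azimuthal error measures, here is a minimal Python sketch. It is an assumption for illustration only: this page does not describe how the paper estimates source positions or computes the metric, and the function names `azimuth_error_deg` and `mean_azimuth_error_deg` are hypothetical. The sketch only shows that angular differences should be wrapped, so that 350 degrees and 10 degrees count as 20 degrees apart rather than 340.

```python
def azimuth_error_deg(estimated_deg, target_deg):
    """Absolute angular difference in degrees, wrapped to the range [0, 180].

    Hypothetical helper, not the evaluation code used in the paper.
    """
    # Shift by 180, wrap into [0, 360), shift back: result lies in [-180, 180).
    diff = (estimated_deg - target_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def mean_azimuth_error_deg(estimated, targets):
    """Mean wrapped azimuth error over paired estimated/target angles."""
    errors = [azimuth_error_deg(e, t) for e, t in zip(estimated, targets)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Made-up angles for illustration; these are not results from the paper.
    print(mean_azimuth_error_deg([350.0, 90.0], [10.0, 80.0]))
```

The wrap-around step matters for sources near the front or rear median plane, where a naive subtraction would greatly overstate the error.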

Audio Examples

Original Stereo: unprocessed stereo-channel input signal
Binaural Upmix NBU S: Binaural upmix using NBU S, trained on studio mixes
Binaural Upmix NBU C+: Binaural upmix using NBU C+, trained on the Cambridge MT database with added silence

Music1¹

Music2¹

Music3²

¹ Wonder Under by Glad Rags from https://freemusicarchive.org/music/Glad_Rags/Wonder_Under under CC-BY License

² Every Time by Katy Kirby from https://freemusicarchive.org/music/Katy_Kirby/Katy_Kirby/Katy_Kirby_-_3_-_01_Every_Time/ under CC-BY License

International Audio Laboratories Erlangen
