SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement

Martin Strauss, Nicola Pia, Nagashree K. S. Rao and Bernd Edler

presented at WASPAA, 2023

Listening test items

The test items were created from the WSJ-CHiME3 dataset described in [5]. The proposed methods are SE-Flow+condNet and SEFGAN. The compared methods are MetricGAN+ [1], Conv-TasNet [2] and SGMSE+ [3].

m_1.73

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

f_0.66

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

f_6.47

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

m_0.6

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

f_6.04

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

f_1.69

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

m_2.64

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

m_2.67

  • Noisy
  • Reference
  • LP35
  • MetricGAN+
  • Conv-TasNet
  • SGMSE+
  • SE-Flow+condNet
  • SEFGAN

References

[1] S.-W. Fu, C. Yu, T.-A. Hsieh, P. Plantinga, M. Ravanelli, X. Lu, Y. Tsao, "MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement," in Proceedings Interspeech Conference, 2021, pp. 201–205.

[2] Y. Luo and N. Mesgarani, "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 27, pp. 1256–1266, 2019.

[3] J. Richter, S. Welker, J.-M. Lemercier, B. Lay, T. Gerkmann, "Speech enhancement and dereverberation with diffusion-based generative models," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 31, pp. 2351–2364, 2023.