Source Separation and Restoration of Drum Sound Components in Music Recordings (SeReCo)

In the SeReCo project, we developed fundamental decomposition techniques for separating and restoring drum sound events in music recordings. The project was funded by the German Research Foundation. On this website, we summarize the project's main objectives and provide links to project-related resources (data, demonstrators, websites) and publications.

Project Description

Source Separation and Restoration of Drum Sound Components in Music Recordings

The general goal of music source separation is to decompose a music recording into its constituent signal components. One of the main problems is that the separated signals may suffer from severe audible artifacts. Focusing on the challenging scenario of percussive and non-harmonic sound sources, we developed techniques and tools in this project for separating and restoring drum sound events in a perceptually convincing way. We approached this general source separation problem systematically by pursuing a number of more specific objectives. A first goal was to develop cascaded techniques for decomposing a music mixture into (mid-level) harmonic, percussive, transient, and residual components. A second goal was to decompose drum tracks into individual drum sound events by exploiting specific properties of drum instruments. In particular, by adapting and extending Non-Negative Matrix Factor Deconvolution (NMFD), a central methodology of this project, we systematically studied how audio- and score-based side information can be generated, integrated, and exploited to guide the decomposition. A third goal was to research data-driven restoration approaches that reduce crosstalk and other undesired artifacts and thereby improve the perceptual quality of the separated drum events. Finally, we tested and evaluated our decomposition and restoration approaches in two application scenarios: an audio editing application (decomposition and remixing of breakbeats) and a music analysis problem (swing ratio analysis of jazz music). By exploring novel algorithmic approaches to sound source separation within concrete application scenarios, this project contributed to fundamental research of practical relevance.
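As a rough illustration of the swing ratio named in the music analysis scenario: a commonly used definition is the duration of the on-beat eighth note divided by the duration of the following off-beat eighth note, so that a ratio of 1 corresponds to straight eighths and a ratio of 2 to a triplet feel. The sketch below assumes that onset times are already available (for example, from separated drum events). The function name `swing_ratios` and its inputs are hypothetical and are not taken from the project's code.

```python
from typing import List, Sequence


def swing_ratios(beats: Sequence[float], offbeats: Sequence[float]) -> List[float]:
    """Return one swing ratio per beat interval.

    beats:    onset times (in seconds) of consecutive beats, ascending.
    offbeats: onset time of the off-beat eighth note inside each beat
              interval, so len(offbeats) must equal len(beats) - 1.
    """
    if len(offbeats) != len(beats) - 1:
        raise ValueError("expected exactly one off-beat onset per beat interval")

    ratios = []
    for start, off, end in zip(beats, offbeats, beats[1:]):
        if not start < off < end:
            raise ValueError("off-beat onset must lie strictly inside its beat interval")
        on_beat_eighth = off - start   # duration of the first (on-beat) eighth
        off_beat_eighth = end - off    # duration of the second (off-beat) eighth
        ratios.append(on_beat_eighth / off_beat_eighth)
    return ratios


if __name__ == "__main__":
    # First beat interval is swung 2:1, the second one is played straight.
    print(swing_ratios([0.0, 0.75, 1.5], [0.5, 1.125]))  # prints [2.0, 1.0]
```

Tracking such per-beat ratios over time is one simple way to look at micro-rhythmic variation; it is a simplification that presupposes reliable onset detection.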

Project Description (Translated from the German)

Source Separation and Restoration of Drum Sounds in Music Recordings

Sound source separation for music signals aims to decompose a digitized music recording into its underlying signal components. A main problem is that clearly audible artifacts may arise in the separated signal components. In this project, we developed techniques and algorithms suited to the perceptually high-quality separation and decomposition of drum-like sound sources. We approached this general task systematically by considering a number of related subproblems. As a first subproblem, we developed methods for the cascaded decomposition of music recordings into harmonic, percussive, transient, and further mid-level components. A second task was the decomposition of drum recordings into individual drum sound components, taking into account specific properties of the instruments involved. In particular, the technique known as "Non-Negative Matrix Factor Deconvolution" was adapted and extended as the central methodology of this project. Here, we systematically investigated how audio- and score-based side information can be generated, integrated, and exploited to guide the decomposition. As a further important task, we researched data-driven restoration methods for reducing crosstalk and other undesired artifacts. The different approaches to signal decomposition and reconstruction were tested and evaluated on two concrete tasks: an audio editing application (decomposition and "remixing" of breakbeats) and a music analysis problem (swing analysis in jazz music). By developing novel algorithmic approaches to sound source separation, this project contributed to fundamental research with concrete practical relevance.

Project-Related Resources and Demonstrators

The following list provides an overview of the most important publicly accessible resources created in the SeReCo project:

Project-Related Publications

The following publications reflect the main scientific contributions of the work carried out in the SeReCo project.

  1. Stefan Balke, Christian Dittmar, Jakob Abeßer, Klaus Frieler, Martin Pfleiderer, and Meinard Müller
    Bridging the Gap: Enriching YouTube Videos with Jazz Music Annotations
    Frontiers in Digital Humanities, 2018. PDF Details Demo DOI
    @article{BalkeDAFPM18_JazzYoutube_Frontiers,
    author = {Stefan Balke and Christian Dittmar and Jakob Abe{\ss}er and Klaus Frieler and Martin Pfleiderer and Meinard M{\"u}ller},
    title = {Bridging the Gap: {E}nriching {Y}ou{T}ube Videos with Jazz Music Annotations},
    journal = {Frontiers in Digital Humanities},
    doi = {10.3389/fdigh.2018.00001},
    year = {2018},
    url-details={https://www.frontiersin.org/articles/10.3389/fdigh.2018.00001/full},
    url-demo={http://mir.audiolabs.uni-erlangen.de/jazztube/},
    url-pdf={2018_BalkeDAFPM_JazzYouTube_Frontiers.pdf},
    }
  2. Christian Dittmar, Jonathan Driedger, Meinard Müller, and Jouni Paulus
    An Experimental Approach to Generalized Wiener Filtering in Music Source Separation
    In Proceedings of the European Signal Processing Conference (EUSIPCO): 1743–1747, 2016. PDF
    @inproceedings{DittmarDMP16_WienerFiltering_EUSIPCO,
    author    = {Christian Dittmar and Jonathan Driedger and Meinard M{\"u}ller and Jouni Paulus},
    title     = {An Experimental Approach to Generalized Wiener Filtering in Music Source Separation},
    booktitle = {Proceedings of the European Signal Processing Conference ({EUSIPCO})},
    address   = {Budapest, Hungary},
    year      = {2016},
    pages     = {1743--1747},
    url-pdf   = {2016_DittmarDMP_Wiener_EUSIPCO_ePrint.pdf}
    }
  3. Christian Dittmar, Patricio López-Serrano, and Meinard Müller
    Unifying Local and Global Methods for Harmonic-Percussive Source Separation
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018. Demo
    @inproceedings{DittmarLM18_HPSS_KAM_NMF_ICASSP,
    author    = {Christian Dittmar and Patricio L{\'o}pez-Serrano and Meinard M{\"u}ller},
    title     = {Unifying Local and Global Methods for Harmonic-Percussive Source Separation},
    booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address   = {Calgary, Canada},
    month     = {April},
    year      = {2018},
    url-demo  = {https://www.audiolabs-erlangen.de/resources/MIR/2018-ICASSP-HPSS_KAM_NMF},
    }
  4. Christian Dittmar, Martin Pfleiderer, Stefan Balke, and Meinard Müller
    A Swingogram Representation for Tracking Micro-Rhythmic Variation in Jazz Performances
    Journal of New Music Research, 47(2): 97–113, 2017. Demo DOI
    @article{DittmarPBM17_SwingRatio_JNMR,
    author = {Christian Dittmar and Martin Pfleiderer and Stefan Balke and Meinard M{\"u}ller},
    title = {A Swingogram Representation for Tracking Micro-Rhythmic Variation in Jazz Performances},
    journal = {Journal of New Music Research},
    volume = {47},
    number = {2},
    pages = {97--113},
    year = {2017},
    doi = {10.1080/09298215.2017.1367405},
    url-demo={https://www.audiolabs-erlangen.de/resources/MIR/2017-JNMR-SwingRatio},
    }
  5. Patricio López-Serrano, Matthew E. P. Davies, Jason Hockman, Christian Dittmar, and Meinard Müller
    Break-Informed Audio Decomposition For Interactive Redrumming
    In Late-Breaking and Demo Session of the International Society for Music Information Retrieval Conference (ISMIR), 2018. PDF Demo
    @inproceedings{LopezSerranoDHDM18_ReDrum_ISMIR-LBD,
    author = {Patricio L{\'o}pez-Serrano and Matthew E. P. Davies and Jason Hockman and Christian Dittmar and Meinard M{\"u}ller},
    title = {Break-Informed Audio Decomposition For Interactive Redrumming},
    booktitle = {Late-Breaking and Demo Session of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address = {Paris, France},
    year = {2018},
    url-pdf = {2018_LopezSerranoDHDM_Redrum_ISMIR-LBD.pdf},
    url-demo={https://www.audiolabs-erlangen.de/resources/MIR/2018-ISMIR-LBD-Redrum},
    }
  6. Patricio López-Serrano, Christian Dittmar, and Meinard Müller
    Finding Drum Breaks in Digital Music Recordings
    In Proceedings of the International Symposium on Computer Music Modeling and Retrieval (CMMR), 2017.
    @inproceedings{LopezDM17_DrumBreaks_CMMR,
    author    = {Patricio L{\'o}pez-Serrano and Christian Dittmar and Meinard M{\"u}ller},
    title     = {Finding Drum Breaks in Digital Music Recordings},
    booktitle = {Proceedings of the International Symposium on Computer Music Modeling and Retrieval ({CMMR})},
    address   = {Porto, Portugal},
    year      = {2017}
    }
  7. Patricio López-Serrano, Christian Dittmar, and Meinard Müller
    Mid-Level Audio Features Based on Cascaded Harmonic-Residual-Percussive Separation
    In Proceedings of the AES Conference on Semantic Audio, 2017. Details Demo
    @inproceedings{LopezDM17_SepHRP_AES,
    author      = {Patricio L{\'o}pez-Serrano and Christian Dittmar and Meinard M{\"u}ller},
    title       = {Mid-Level Audio Features Based on Cascaded Harmonic-Residual-Percussive Separation},
    booktitle   = {Proceedings of the {AES} Conference on Semantic Audio},
    address     = {Erlangen, Germany},
    year        = {2017},
    url-demo    = {https://www.audiolabs-erlangen.de/resources/MIR/2017-AES-CHRP},
    url-details = {http://www.aes.org/e-lib/browse.cfm?elib=18755},
    }
  8. Patricio López-Serrano, Christian Dittmar, Yigitcan Özer, and Meinard Müller
    NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization
    In Proceedings of the International Conference on Digital Audio Effects (DAFx), 2019. PDF Demo
    @inproceedings{LopezDOM19_ToolboxNMF_DAFx,
    author    = {Patricio L{\'o}pez-Serrano and Christian Dittmar and Yigitcan {\"O}zer and Meinard M{\"u}ller},
    title     = {{NMF} Toolbox: {M}usic Processing Applications of Nonnegative Matrix Factorization},
    booktitle = {Proceedings of the International Conference on Digital Audio Effects ({DAFx})},
    address   = {Birmingham, UK},
    year      = {2019},
    url-pdf   = {2019_LopezSerranoDOM_NMF_DAFx.pdf},
    url-demo  = {https://www.audiolabs-erlangen.de/resources/MIR/NMFtoolbox/}
    }
  9. Chih-Wei Wu, Christian Dittmar, Carl Southall, Richard Vogl, Gerhard Widmer, Jason Hockman, Meinard Müller, and Alexander Lerch
    A Review of Automatic Drum Transcription
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26(9): 1457–1483, 2018. PDF Demo DOI
    @article{WuDSVWHML18_DrumTranscription_IEEE-TASLP,
    author = {Chih-Wei Wu and Christian Dittmar and Carl Southall and Richard Vogl and Gerhard Widmer and Jason Hockman and Meinard M{\"u}ller and Alexander Lerch},
    title = {A Review of Automatic Drum Transcription},
    journal = {{IEEE}/{ACM} Transactions on Audio, Speech, and Language Processing},
    year={2018},
    volume={26},
    number={9},
    pages={1457--1483},
    doi = {10.1109/TASLP.2018.2830113},
    url-pdf = {https://ieeexplore.ieee.org/document/8350302/},
    url-demo={https://www.audiolabs-erlangen.de/resources/MIR/2017-DrumTranscription-Survey},
    month={September},
    }
  10. Christof Weiß, Stefan Balke, Jakob Abeßer, and Meinard Müller
    Computational Corpus Analysis: A Case Study on Jazz Solos
    In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR): 416–423, 2018. PDF DOI
    @inproceedings{WeissBAM18_JazzComplexity_ISMIR,
    author    = {Christof Wei{\ss} and Stefan Balke and Jakob Abe{\ss}er and Meinard M{\"u}ller},
    title     = {Computational Corpus Analysis: {A} Case Study on Jazz Solos},
    booktitle = {Proceedings of the 19th International Society for Music Information Retrieval Conference ({ISMIR})},
    pages     = {416--423},
    address   = {Paris, France},
    year      = {2018},
    doi       = {10.5281/zenodo.1492439},
    url-pdf   = {2018_WeissBAM_JazzComplexity_ISMIR_PrintedVersion.pdf}
    }

Project-Related Ph.D. Thesis

Christian Dittmar won the Promotionspreis 2019 of the Staedtler Stiftung for his outstanding dissertation.

  1. Christian Dittmar
    Source Separation and Restoration of Drum Sounds in Music Recordings
    PhD Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2018. PDF Details
    @phdthesis{Dittmar18_DrumSounds_PhD,
    author      = {Christian Dittmar},
    year        = {2018},
    title       = {Source Separation and Restoration of Drum Sounds in Music Recordings},
    school      = {Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg},
    url-details = {https://opus4.kobv.de/opus4-fau/frontdoor/index/index/year/2018/docId/9767},
    url-pdf = {https://opus4.kobv.de/opus4-fau/files/9767/ChristianDittmarDissertation.pdf}
    }