AudioLabs - Publications

Publications

The publication list below may be outdated. For the latest and most comprehensive list of my publications, please visit my personal website.

Judith Bauer, Frank Zalkow, Meinard Müller, and Christian Dittmar
Detection of Lombard Speech Using Different Model Architectures and Speech Features
In Proceedings of the Conference on Speech Prosody, 2026.

@inproceedings{BauerZMD_LombardDetection_SpeechProsody,
author      = {Judith Bauer and Frank Zalkow and Meinard Müller and Christian Dittmar},
title       = {Detection of {L}ombard Speech Using Different Model Architectures and Speech Features},
booktitle   = {Proceedings of the Conference on Speech Prosody},
address     = {Philadelphia, PA, USA},
year        = {2026},
pages       = {},
note        = {accepted},
}

Kishor Kayyar Lakshminarayana, Frank Zalkow, Christian Dittmar, Nicola Pia, and Emanuël A. P. Habets
Low-Resource Text-to-Speech Synthesis Using Noise-Augmented Training of ForwardTacotron
In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025. PDF DOI

@inproceedings{KayyarZDPH25_LowResourceForwardTacotron_ICASSP,
author      = {Kishor Kayyar Lakshminarayana and Frank Zalkow and Christian Dittmar and Nicola Pia and Emanu{\"e}l A.\ P.\ Habets},
title       = {Low-Resource Text-to-Speech Synthesis Using Noise-Augmented Training of {ForwardTacotron}},
booktitle   = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
address     = {Hyderabad, India},
year        = {2025},
pages       = {},
doi         = {10.1109/ICASSP49660.2025.10890686},
url-pdf     = {https://ieeexplore.ieee.org/document/10890686},
}

Frank Zalkow, Paolo Sani, Kishor Kayyar Lakshminarayana, Emanuël A. P. Habets, Nicola Pia, and Christian Dittmar
Bridging the Training–Inference Gap in TTS: Training Strategies for Robust Generative Postprocessing for Low-Resource Speakers
In Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH): 2470–2474, 2025. PDF Details DOI

@inproceedings{ZalkowSKHPD25_LowResourceGenerativePostprocessing_INTERSPEECH,
author      = {Frank Zalkow and Paolo Sani and Kishor Kayyar Lakshminarayana and Emanu{\"e}l A.\ P.\ Habets and Nicola Pia and Christian Dittmar},
title       = {Bridging the Training--Inference Gap in {TTS}: {T}raining Strategies for Robust Generative Postprocessing for Low-Resource Speakers},
booktitle   = {Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH)},
address     = {Rotterdam, The Netherlands},
year        = {2025},
pages       = {2470--2474},
doi         = {10.21437/Interspeech.2025-854},
url-pdf     = {https://www.isca-archive.org/interspeech_2025/zalkow25_interspeech.html},
url-details = {https://www.audiolabs-erlangen.de/resources/NLUI/2025-Interspeech-LowResGen},
}

Subhayu Ghosh, Frank Zalkow, and Nanda Dulal Jana
Enhanced Audio-Visual Speech Synthesis via Multi-Discriminative Learning
IEEE Transactions on Multimedia, 2025. PDF DOI

@article{GoshEtAl25_AudioVisualTTS_TMM,
author    = {Subhayu Ghosh and Frank Zalkow and Nanda Dulal Jana},
title     = {Enhanced Audio-Visual Speech Synthesis via Multi-Discriminative Learning},
journal   = {{IEEE} Transactions on Multimedia},
volume    = {},
number    = {},
year      = {2025},
pages     = {},
doi       = {10.1109/TMM.2025.3645648},
url-pdf   = {https://ieeexplore.ieee.org/document/11304174},
}

Zahra Kolagar, Frank Zalkow, and Alessandra Zarcone
Investigating Methods for Mapping Learning Objectives to Bloom's Revised Taxonomy in Course Descriptions for Higher Education
In Proceedings of the Workshop on Innovative Use of NLP for Building Educational Applications (BEA): 415–445, 2025. PDF DOI

@inproceedings{KolagarZZ25_BloomsTaxonomy_BEA,
author      = {Zahra Kolagar and Frank Zalkow and Alessandra Zarcone},
title       = {Investigating Methods for Mapping Learning Objectives to Bloom's Revised Taxonomy in Course Descriptions for Higher Education},
booktitle   = {Proceedings of the Workshop on Innovative Use of {NLP} for Building Educational Applications ({BEA})},
address     = {Vienna, Austria},
year        = {2025},
pages       = {415--445},
doi         = {10.18653/v1/2025.bea-1.32},
url-pdf     = {https://aclanthology.org/2025.bea-1.32/},
}

Judith Bauer, Frank Zalkow, Meinard Müller, and Christian Dittmar
Explicit Emphasis Control in Text-to-Speech Synthesis
In Proceedings of the ISCA Speech Synthesis Workshop (SSW): 21–27, 2025. PDF DOI

@inproceedings{BauerZMD25_EmphasisControl_SSW,
author      = {Judith Bauer and Frank Zalkow and Meinard Müller and Christian Dittmar},
title       = {Explicit Emphasis Control in Text-to-Speech Synthesis},
booktitle   = {Proceedings of the ISCA Speech Synthesis Workshop ({SSW})},
address     = {Leeuwarden, The Netherlands},
year        = {2025},
pages       = {21--27},
doi         = {10.21437/SSW.2025-4},
url-pdf     = {https://www.isca-archive.org/ssw_2025/bauer25_ssw.html},
}

Frank Zalkow, Benedikt Schäfer, Thomas Moissl, Jonas Bücherl, Kerstin Markl, Sebastian Bothe, Francois Duchateau, Julia Dollase, Patric Kabus, Daniel Steinigen, Oliver Schmitt, and Fabian Küch
Generating Search-Engine-Optimized Headlines for Sports News
In Proceedings of the Conference on Natural Language Processing (KONVENS): 59–65, 2025. PDF

@inproceedings{ZalkowEtAl_SportsNews_KONVENS,
author      = {Frank Zalkow and Benedikt Schäfer and Thomas Moissl and Jonas Bücherl and Kerstin Markl and Sebastian Bothe and Francois Duchateau and Julia Dollase and Patric Kabus and Daniel Steinigen and Oliver Schmitt and Fabian Küch},
title       = {Generating Search-Engine-Optimized Headlines for Sports News},
booktitle   = {Proceedings of the Conference on Natural Language Processing ({KONVENS})},
address     = {Hildesheim, Germany},
year        = {2025},
pages       = {59--65},
url-pdf     = {https://aclanthology.org/2025.konvens-1.6/},
}

Laurynas Zavistanavicius, Frank Zalkow, Christian Dittmar, and Robert L. Stevenson
Adapting the Fréchet Audio Distance as an Objective Metric for Text-to-Speech Quality Evaluation
In Proceedings of the ITG Conference on Speech Communication: 96–100, 2025. PDF

@inproceedings{ZavistanaviciusZDS25_FAD_ITG,
author      = {Laurynas Zavistanavicius and Frank Zalkow and Christian Dittmar and Robert L. Stevenson},
title       = {Adapting the {F}réchet Audio Distance as an Objective Metric for Text-to-Speech Quality Evaluation},
booktitle   = {Proceedings of the {ITG} Conference on Speech Communication},
address     = {Berlin, Germany},
year        = {2025},
pages       = {96--100},
url-pdf     = {https://ieeexplore.ieee.org/document/11264402},
}

Judith Bauer, Frank Zalkow, Meinard Müller, and Christian Dittmar
Evaluating the Impact of Prosody Feature Normalization on the Controllability of Pitch in Speech Synthesis
In Elektronische Sprachsignalverarbeitung (ESSV): 188–195, 2024. PDF DOI

@inproceedings{BauerEtAl2024_ProsodyNormalization_ESSV,
address   = {Regensburg, Germany},
author    = {Judith Bauer and Frank Zalkow and Meinard M\"{u}ller and Christian Dittmar},
booktitle = {Elektronische Sprachsignalverarbeitung ({ESSV})},
pages     = {188--195},
title     = {Evaluating the Impact of Prosody Feature Normalization on the Controllability of Pitch in Speech Synthesis},
year      = {2024},
doi       = {10.35096/othr/pub-7097},
url-pdf   = {https://nbn-resolving.org/urn:nbn:de:bvb:898-opus4-70976}
}

Subhayu Ghosh, Snehashis Sarkar, Sovan Ghosh, Frank Zalkow, and Nanda Dulal Jana
Audio-visual speech synthesis using vision transformer–enhanced autoencoders with ensemble of loss functions
Applied Intelligence, 54(6): 4507–4524, 2024. PDF Demo DOI

@article{GoshEtAl24_AudioVisualTTS_AppliedIntelligence,
author    = {Subhayu Ghosh and Snehashis Sarkar and Sovan Ghosh and Frank Zalkow and Nanda Dulal Jana},
title     = {Audio-visual speech synthesis using vision transformer--enhanced autoencoders with ensemble of loss functions},
journal   = {Applied Intelligence},
volume    = {54},
number    = {6},
year      = {2024},
pages     = {4507--4524},
doi       = {10.1007/s10489-024-05380-7},
url-demo  = {https://github.com/Subhayu-ghosh/ViTAE-AVSS},
url-pdf   = {https://link.springer.com/article/10.1007/s10489-024-05380-7}
}

Florian Lux, Sarina Meyer, Lyonel Behringer, Frank Zalkow, Phat Do, Matt Coler, Emanuël A. P. Habets, and Ngoc Thang Vu
Meta Learning Text-to-Speech Synthesis in over 7000 Languages
In Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH): 4958–4962, 2024. PDF Demo DOI

@inproceedings{LuxEtAl2024_TTS7000Lang_Interspeech,
address   = {Kos, Greece},
author    = {Florian Lux and Sarina Meyer and Lyonel Behringer and Frank Zalkow and Phat Do and Matt Coler and Emanu\"{e}l A. P. Habets and Ngoc Thang Vu},
booktitle = {Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH)},
pages     = {4958--4962},
title     = {Meta Learning Text-to-Speech Synthesis in over 7000 Languages},
year      = {2024},
doi       = {10.21437/Interspeech.2024-1335},
url-pdf   = {https://www.isca-archive.org/interspeech_2024/lux24_interspeech.html},
url-demo  = {https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS}
}

Arunava Kr. Kalita, Christian Dittmar, Paolo Sani, Frank Zalkow, Emanuël A. P. Habets, and Rusha Patra
PAD-VC: A Prosody-Aware Decoder for Any-to-Few Voice Conversion
In Proceedings of the International Workshop on Acoustic Signal Enhancement (IWAENC): 389–393, 2024. PDF Details DOI

@inproceedings{KalitaDSZHP24_PAD-VC_IWAENC,
author      = {Arunava Kr. Kalita and Christian Dittmar and Paolo Sani and Frank Zalkow and Emanu\"{e}l A. P. Habets and Rusha Patra},
title       = {{PAD-VC}: {A} Prosody-Aware Decoder for Any-to-Few Voice Conversion},
booktitle   = {Proceedings of the International Workshop on Acoustic Signal Enhancement ({IWAENC})},
address     = {Aalborg, Denmark},
year        = {2024},
pages       = {389--393},
doi         = {10.1109/IWAENC61483.2024.10694576},
url-pdf     = {https://ieeexplore.ieee.org/document/10694576},
url-details = {https://www.audiolabs-erlangen.de/resources/NLUI/2024-PAD-VC},
}

Frank Zalkow, Prachi Govalkar, Meinard Müller, Emanuël A. P. Habets, and Christian Dittmar
Evaluating Speech–Phoneme Alignment and Its Impact on Neural Text-To-Speech Synthesis
In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023. PDF Details DOI

@inproceedings{ZalkowGMHD23_EvalAlignmentTTS_ICASSP,
author      = {Frank Zalkow and Prachi Govalkar and Meinard M{\"u}ller and Emanu{\"e}l A.\ P.\ Habets and Christian Dittmar},
title       = {Evaluating Speech--Phoneme Alignment and Its Impact on Neural Text-To-Speech Synthesis},
booktitle   = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
address     = {Rhodes Island, Greece},
year        = {2023},
pages       = {},
doi         = {10.1109/ICASSP49357.2023.10097248},
url-pdf     = {https://ieeexplore.ieee.org/document/10097248},
url-details = {https://www.audiolabs-erlangen.de/resources/NLUI/2023-ICASSP-eval-alignment-tts},
}

Meinard Müller and Frank Zalkow
FMP Notebooks
In Peter Moormann and Nicolas Ruth (eds.): Musik und Internet: Aktuelle Phänomene populärer Kulturen, Springer VS: 237–247, 2023. DOI

@incollection{MuellerZalkow_FMP_BOOKCHAP,
title     = {{FMP} Notebooks},
author    = {Meinard M{\"u}ller and Frank Zalkow},
booktitle = {Musik und Internet: {A}ktuelle Phänomene popul{\"a}rer Kulturen},
editor    = {Peter Moormann and Nicolas Ruth},
publisher = {Springer VS},
address   = {Wiesbaden, Germany},
pages     = {237--247},
year      = {2023},
series    = {Musik und Medien},
doi       = {10.1007/978-3-658-39145-4}
}

Paolo Sani, Judith Bauer, Frank Zalkow, Emanuël A. P. Habets, and Christian Dittmar
Improving the Naturalness of Synthesized Spectrograms for TTS Using GAN-Based Post-Processing
In Proceedings of the ITG Conference on Speech Communication: 270–274, 2023. PDF Details DOI

@inproceedings{SaniBZHD23_Postprocessing_ITG,
author      = {Paolo Sani and Judith Bauer and Frank Zalkow and Emanu{\"e}l A.\ P.\ Habets and Christian Dittmar},
title       = {Improving the Naturalness of Synthesized Spectrograms for {TTS} Using {GAN}-Based Post-Processing},
booktitle   = {Proceedings of the {ITG} Conference on Speech Communication},
address     = {Aachen, Germany},
year        = {2023},
doi         = {10.30420/456164053},
pages       = {270--274},
url-pdf     = {https://ieeexplore.ieee.org/document/10363041},
url-details = {https://www.audiolabs-erlangen.de/resources/NLUI/2023-ITG-postprocessing},
}

Frank Zalkow, Paolo Sani, Michael Fast, Judith Bauer, Mohammad Joshaghani, Kishor Kayyar Lakshminarayana, Emanuël A. P. Habets, and Christian Dittmar
The AudioLabs System for the Blizzard Challenge 2023
In Proceedings of the Blizzard Challenge Workshop: 63–68, 2023. PDF DOI

@inproceedings{ZalkowEtAl23_AudioLabsBlizzard_Blizzard,
author      = {Frank Zalkow and Paolo Sani and Michael Fast and Judith Bauer and Mohammad Joshaghani and Kishor Kayyar Lakshminarayana and Emanu{\"e}l A.\ P.\ Habets and Christian Dittmar},
title       = {The {A}udio{L}abs System for the {B}lizzard {C}hallenge 2023},
booktitle   = {Proceedings of the Blizzard Challenge Workshop},
address     = {Grenoble, France},
year        = {2023},
doi         = {10.21437/Blizzard.2023-8},
pages       = {63--68},
url-pdf     = {https://www.isca-speech.org/archive/blizzard_2023/zalkow23_blizzard.html},
}

Christof Weiß, Vlora Arifi-Müller, Michael Krause, Frank Zalkow, Stephanie Klauk, Rainer Kleinertz, and Meinard Müller
Wagner Ring Dataset: A Complex Opera Scenario for Music Processing and Computational Musicology
Transactions of the International Society for Music Information Retrieval (TISMIR), 6(1): 135–149, 2023. PDF Demo DOI

@article{WeissEtAl23_WagnerRingDataset_TISMIR,
author    = {Christof Weiß and Vlora Arifi-M{\"u}ller and Michael Krause and Frank Zalkow and Stephanie Klauk and Rainer Kleinertz and Meinard M{\"u}ller},
title     = {{W}agner {R}ing {D}ataset: {A} Complex Opera Scenario for Music Processing and Computational Musicology},
journal   = {Transactions of the International Society for Music Information Retrieval ({TISMIR})},
volume    = {6},
number    = {1},
year      = {2023},
pages     = {135--149},
doi       = {10.5334/tismir.161},
url-demo  = {https://zenodo.org/records/7672157},
url-pdf   = {https://transactions.ismir.net/articles/10.5334/tismir.161}
}

Yi-Jen Shih, Shih-Lun Wu, Frank Zalkow, Meinard Müller, and Yi-Hsuan Yang
Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer
IEEE Transactions on Multimedia, 25: 3495–3508, 2022. PDF Details DOI

@article{ShihWZMY22_ThemeTransformer_TMM,
author      = {Yi{-}Jen Shih and Shih{-}Lun Wu and Frank Zalkow and Meinard M{\"u}ller and Yi{-}Hsuan Yang},
title       = {Theme Transformer: {S}ymbolic Music Generation with Theme-Conditioned Transformer},
journal     = {{IEEE} Transactions on Multimedia},
volume      = {25},
pages       = {3495--3508},
year        = {2022},
doi         = {10.1109/TMM.2022.3161851},
url-details = {https://atosystem.github.io/ThemeTransformer},
url-pdf     = {https://ieeexplore.ieee.org/document/9740506},
}

Christof Weiß, Frank Zalkow, Vlora Arifi-Müller, Meinard Müller, Hendrik Vincent Koops, Anja Volk, and Harald G. Grohganz
Schubert Winterreise Dataset: A Multimodal Scenario for Music Analysis
ACM Journal on Computing and Cultural Heritage (JOCCH), 14(2), 2021. PDF Details DOI

@article{WeissZAMMKVG21_SWD_JOCCH,
author    = {Christof Wei{\ss} and Frank Zalkow and Vlora Arifi-M{\"u}ller and Meinard M{\"u}ller and Hendrik Vincent Koops and Anja Volk and Harald G. Grohganz},
title     = {{S}chubert {W}interreise Dataset: {A} Multimodal Scenario for Music Analysis},
journal   = {{ACM} Journal on Computing and Cultural Heritage ({JOCCH})},
volume    = {14},
number    = {2},
year      = {2021},
doi       = {10.1145/3429743},
url-pdf   = {https://dl.acm.org/doi/10.1145/3429743},
url-details = {https://doi.org/10.5281/zenodo.4431535}
}

Frank Zalkow, Julian Brandner, and Meinard Müller
Efficient Retrieval of Music Recordings Using Graph-Based Index Structures
Signals, 2(2): 336–352, 2021. PDF Details DOI

@article{ZalkowBM21_Indexing_Signals,
author      = {Frank Zalkow and Julian Brandner and Meinard M{\"u}ller},
title       = {Efficient Retrieval of Music Recordings Using Graph-Based Index Structures},
journal     = {Signals},
volume      = {2},
number      = {2},
year        = {2021},
doi         = {10.3390/signals2020021},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2020_signals-indexing},
url-pdf     = {https://www.mdpi.com/2624-6120/2/2/21},
pages       = {336--352}
}

Frank Zalkow
Learning Audio Representations for Cross-Version Retrieval of Western Classical Music
PhD Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2021. Details

@phdthesis{Zalkow21_Thesis_PhD,
author      = {Frank Zalkow},
title       = {Learning Audio Representations for Cross-Version Retrieval of Western Classical Music},
type        = {PhD thesis},
pages       = {172},
school      = {Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg (FAU)},
address     = {Erlangen, Germany},
year        = {2021},
url-details = {https://nbn-resolving.org/urn:nbn:de:bvb:29-opus4-167774},
}

Meinard Müller and Frank Zalkow
libfmp: A Python Package for Fundamentals of Music Processing
Journal of Open Source Software (JOSS), 6(63), 2021. Details DOI

@article{MuellerZalkow21_libfmp_JOSS,
author      = {Meinard M{\"u}ller and Frank Zalkow},
title       = {{libfmp}: {A} {P}ython Package for Fundamentals of Music Processing},
journal     = {Journal of Open Source Software ({JOSS})},
volume      = {6},
number      = {63},
year        = {2021},
doi         = {10.21105/joss.03326},
url-details = {https://github.com/meinardmueller/libfmp},
}

Frank Zalkow and Meinard Müller
CTC-Based Learning of Chroma Features for Score–Audio Music Retrieval
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29: 2957–2971, 2021. PDF Details DOI

@article{ZalkowMueller21_ChromaCTC_TASLP,
author      = {Frank Zalkow and Meinard M{\"u}ller},
title       = {{CTC}-Based Learning of Chroma Features for Score--Audio Music Retrieval},
journal     = {{IEEE}/{ACM} Transactions on Audio, Speech, and Language Processing},
volume      = {29},
pages       = {2957--2971},
year        = {2021},
doi         = {10.1109/TASLP.2021.3110137},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2021_TASLP-ctc-chroma},
url-pdf     = {https://ieeexplore.ieee.org/document/9531521},
}

Frank Zalkow and Meinard Müller
Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music
Applied Sciences, 10(1), 2020. PDF Details DOI

@article{ZalkowMueller20_Shingles_AppliedSciences,
author      = {Frank Zalkow and Meinard M{\"u}ller},
title       = {Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music},
journal     = {Applied Sciences},
volume      = {10},
number      = {1},
year        = {2020},
doi         = {10.3390/app10010019},
url-details = {https://www.mdpi.com/2076-3417/10/1/19},
url-pdf     = {2020_ZalkowMueller_Shingles_AppliedSciences.pdf}
}

Frank Zalkow and Meinard Müller
Using Weakly Aligned Score–Audio Pairs to Train Deep Chroma Models for Cross-Modal Music Retrieval
In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 184–191, 2020. PDF Details

@inproceedings{ZalkowMueller20_WeaklyAlignedCTC_ISMIR,
author    = {Frank Zalkow and Meinard M{\"u}ller},
title     = {Using Weakly Aligned Score--Audio Pairs to Train Deep Chroma Models for Cross-Modal Music Retrieval},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
address   = {Montr{\'{e}}al, Canada},
pages     = {184--191},
year      = {2020},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2020-ISMIR-ctc-chroma},
url-pdf    = {2020_ZalkowM_CTC_ISMIR.pdf}
}

Hendrik Schreiber, Frank Zalkow, and Meinard Müller
Modeling and Estimating Local Tempo: A Case Study on Chopin's Mazurkas
In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 773–779, 2020.

@inproceedings{SchreiberZM20_LocalTempoChopin_ISMIR,
author    = {Hendrik Schreiber and Frank Zalkow and Meinard M{\"u}ller},
title     = {Modeling and Estimating Local Tempo: {A} Case Study on {C}hopin's Mazurkas},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
address   = {Montr{\'{e}}al, Canada},
pages     = {773--779},
year      = {2020},
}

Michael Krause, Frank Zalkow, Julia Zalkow, Christof Weiß, and Meinard Müller
Classifying Leitmotifs in Recordings of Operas by Richard Wagner
In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 473–480, 2020. Details

@inproceedings{KrauseZZWM20_LeitmotifClassification_ISMIR,
author    = {Michael Krause and Frank Zalkow and Julia Zalkow and Christof Wei{\ss} and Meinard M{\"u}ller},
title     = {Classifying Leitmotifs in Recordings of Operas by {R}ichard {W}agner},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
address   = {Montr{\'{e}}al, Canada},
pages     = {473--480},
year      = {2020},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2020-ISMIR-LeitmotifClassification},
}

Stephanie Klauk and Frank Zalkow
Methoden computergestützter melodischer Analyse am Beispiel italienischer Streichquartette
In Stephanie Klauk (ed.): Instrumentalmusik neben Haydn und Mozart. Analyse, Aufführungspraxis und Edition, Saarbrücker Studien zur Musikwissenschaft 20, Königshausen & Neumann: 151–168, 2020.

@incollection{KlaukZalkow20_MelodischeAnalyseStreichquartette_SSM,
author    = {Stephanie Klauk and Frank Zalkow},
title     = {{M}ethoden computergestützter melodischer {A}nalyse am {B}eispiel italienischer {S}treichquartette},
booktitle = {{I}nstrumentalmusik neben {H}aydn und {M}ozart. {A}nalyse, {A}uff{\"u}hrungspraxis und {E}dition},
pages     = {151--168},
year      = {2020},
editor    = {Stephanie Klauk},
publisher = {Saarbr{\"u}cker Studien zur Musikwissenschaft 20, K{\"o}nigshausen \& Neumann},
address   = {Saarbr{\"u}cken, Germany}
}

Frank Zalkow, Stefan Balke, Vlora Arifi-Müller, and Meinard Müller
MTD: A Multimodal Dataset of Musical Themes for MIR Research
Transactions of the International Society for Music Information Retrieval (TISMIR), 3(1): 180–192, 2020. PDF Details Demo DOI

@article{ZalkowBAM20_MTD_TISMIR,
title     = {{MTD}: {A} Multimodal Dataset of Musical Themes for {MIR} Research},
author    = {Frank Zalkow and Stefan Balke and Vlora Arifi-M{\"{u}}ller and Meinard M{\"{u}}ller},
journal   = {Transactions of the International Society for Music Information Retrieval ({TISMIR})},
volume    = {3},
number    = {1},
year      = {2020},
pages     = {180--192},
doi       = {10.5334/tismir.68},
url-demo  = {https://www.audiolabs-erlangen.de/resources/MIR/MTD},
url-details = {https://transactions.ismir.net/articles/10.5334/tismir.68/},
url-pdf   = {2020_ZalkowBAM20_MTD_TISMIR.pdf}
}

Frank Zalkow, Stefan Balke, and Meinard Müller
Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings
In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 331–335, 2019. PDF Details DOI Presentation

@inproceedings{ZalkowBM19_SalienceRetrieval_ICASSP,
author      = {Frank Zalkow and Stefan Balke and Meinard M{\"u}ller},
title       = {Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings},
booktitle   = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
address     = {Brighton, United Kingdom},
year        = {2019},
pages       = {331--335},
url         = {https://ieeexplore.ieee.org/document/8683609},
url-pdf     = {https://ieeexplore.ieee.org/document/8683609},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2019-ICASSP-BarlowMorgenstern/},
url-presentation = {2019_poster_ZalkowBM_SalienceThemeRetrieval_ICASSP.pdf},
doi         = {10.1109/ICASSP.2019.8683609}
}

Meinard Müller and Frank Zalkow
FMP Notebooks: Educational Material for Teaching and Learning Fundamentals of Music Processing
In Proceedings of the International Conference on Music Information Retrieval (ISMIR): 573–580, 2019. PDF Details Presentation

@inproceedings{MuellerZ19_FMP_ISMIR,
author    = {Meinard M{\"u}ller and Frank Zalkow},
title     = {{FMP} Notebooks: {E}ducational Material for Teaching and Learning Fundamentals of Music Processing},
booktitle = {Proceedings of the International Conference on Music Information Retrieval ({ISMIR})},
address   = {Delft, The Netherlands},
month     = {November},
year      = {2019},
pages     = {573--580},
url-pdf   = {2019_MuellerZalkow_FMP_ISMIR.pdf},
url-details = {https://www.audiolabs-erlangen.de/FMP},
url-presentation = {2019_poster_MuellerZalkow_FMP_ISMIR.pdf}
}

Prachi Govalkar, Johannes Fischer, Frank Zalkow, and Christian Dittmar
A Comparison of Recent Neural Vocoders for Speech Signal Reconstruction
In Proceedings of the ISCA Speech Synthesis Workshop (SSW): 7–12, 2019. PDF Details DOI

@inproceedings{GovalkarFZD19_ComparisionVocoders_SSW,
author    = {Prachi Govalkar and Johannes Fischer and Frank Zalkow and Christian Dittmar},
title     = {A Comparison of Recent Neural Vocoders for Speech Signal Reconstruction},
booktitle = {Proceedings of the ISCA Speech Synthesis Workshop ({SSW})},
address   = {Vienna, Austria},
month     = {September},
year      = {2019},
doi       = {10.21437/SSW.2019-2},
pages     = {7--12},
url-pdf   = {GovalkarFZD19_ComparisionVocoders_SSW.pdf},
url-details = {https://www.audiolabs-erlangen.de/resources/NLUI/2019-SSW-NeuralVocoders/}
}

Frank Zalkow, Angel Villar Corrales, TJ Tsai, Vlora Arifi-Müller, and Meinard Müller
Tools for Semi-Automatic Bounding Box Annotation of Musical Measures in Sheet Music
In Demos and Late Breaking News of the International Society for Music Information Retrieval Conference (ISMIR), 2019. PDF Details Presentation

@inproceedings{2019_ZalkowVTAM_MeasureAnnotation_ISMIR-LBD,
author      = {Frank Zalkow and Angel Villar Corrales and TJ Tsai and Vlora Arifi-M{\"u}ller and Meinard M{\"u}ller},
title       = {Tools for Semi-Automatic Bounding Box Annotation of Musical Measures in Sheet Music},
booktitle   = {Demos and Late Breaking News of the International Society for Music Information Retrieval Conference ({ISMIR})},
address     = {Delft, The Netherlands},
year        = {2019},
url-pdf     = {2019_ZalkowVTAM_BoundingBox_ISMIR-LBD.pdf},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2019-ISMIR-LBD-Measures},
url-presentation = {2019_poster_ZalkowVTAM_BoundingBox_ISMIR-LBD.pdf}
}

Meinard Müller, Helmut Hedwig, Frank Zalkow, and Stefan Popescu
Constraint-Based Time-Scale Modification of Music Recordings for Noise Beautification
Applied Sciences, 8(3), 2018. PDF Demo DOI

@article{MuellerHZP18_NoiseBeauty_AppliedSciences,
author  = {Meinard M{\"u}ller and Helmut Hedwig and Frank Zalkow and Stefan Popescu},
journal = {Applied Sciences},
title   = {Constraint-Based Time-Scale Modification of Music Recordings for Noise Beautification},
year    = {2018},
month   = {March},
volume  = {8},
number  = {3},
articlenumber = {436},
url-pdf  = {2018_MuellerHZP_NoiseBeauty_AppliedSciences_PrintedVersion.pdf},
url      = {http://www.mdpi.com/2076-3417/8/3/436},
url-demo = {https://www.audiolabs-erlangen.de/resources/MIR/2018-MRI-NoiseBeauty},
doi      = {10.3390/app8030436},
ISSN     = {2076-3417}
}

Frank Zalkow and Meinard Müller
Vergleich von PCA- und Autoencoder-basierter Dimensionsreduktion von Merkmalssequenzen für die effiziente Musiksuche
In Proceedings of the Deutsche Jahrestagung für Akustik (DAGA): 1526–1529, 2018. PDF Presentation

@inproceedings{ZalkowM18_VergleichAutoencoderPCA_DAGA,
author    = {Frank Zalkow and Meinard M{\"u}ller},
title     = {Vergleich von {PCA}- und {A}utoencoder-basierter {D}imensionsreduktion von {M}erkmalssequenzen f{\"u}r die effiziente {M}usiksuche},
booktitle = {Proceedings of the {D}eutsche {J}ahrestagung f{\"u}r {A}kustik ({DAGA})},
address   = {M{\"u}nchen, Germany},
year      = {2018},
pages     = {1526--1529},
url-pdf   = {ZalkowM18_VergleichAutoencoderPCA_DAGA.pdf},
url-presentation = {2018_presentation_ZalkowM_VergleichAutoencoderPCA_Daga.pdf}
}

Frank Zalkow, Sebastian Rosenzweig, Johannes Graulich, Lukas Dietz, El Mehdi Lemnaouar, and Meinard Müller
A Web-Based Interface for Score Following and Track Switching in Choral Music
In Demos and Late Breaking News of the International Society for Music Information Retrieval Conference (ISMIR), 2018. PDF Details Presentation

@inproceedings{ZalkowRGDMM18_Carus_ISMIR-LBD,
author      = {Frank Zalkow and Sebastian Rosenzweig and Johannes Graulich and Lukas Dietz and El Mehdi Lemnaouar and Meinard M{\"u}ller},
title       = {A Web-Based Interface for Score Following and Track Switching in Choral Music},
booktitle   = {Demos and Late Breaking News of the International Society for Music Information Retrieval Conference ({ISMIR})},
address     = {Paris, France},
year        = {2018},
url-pdf     = {2018_ZalkowRGDMM_Carus_ISMIR-LBD.pdf},
url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2018-ISMIR-LBD-Carus},
url-presentation = {2018_poster_ZalkowRGDMM_Carus_ISMIR-LBD.pdf}
}

Frank Zalkow, Christof Weiß, Thomas Prätzlich, Vlora Arifi-Müller, and Meinard Müller
A Multi-Version Approach for Transferring Measure Annotations Between Music Recordings
In Proceedings of the AES International Conference on Semantic Audio: 148–155, 2017. PDF DOI Presentation

@inproceedings{ZalkowWPAM17_MeasureTransfer_AES,
author    = {Frank Zalkow and Christof Wei{\ss} and Thomas Pr{\"a}tzlich and Vlora Arifi-M{\"u}ller and Meinard M{\"u}ller},
title     = {A Multi-Version Approach for Transferring Measure Annotations Between Music Recordings},
booktitle = {Proceedings of the {AES} International Conference on Semantic Audio},
pages     = {148--155},
address   = {Erlangen, Germany},
year      = {2017},
doi       = {10.17743/aesconf.2017.978-1-942220-15-2},
url       = {http://www.aes.org/e-lib/browse.cfm?elib=18772},
url-pdf   = {ZalkowWPAM17_MeasureTransfer_AES.pdf},
url-presentation = {2017_poster_ZalkowWPAM_Triple_AES.pdf}
}

Christof Weiß, Frank Zalkow, Meinard Müller, Stephanie Klauk, and Rainer Kleinertz
Versionsübergreifende Visualisierung harmonischer Verläufe: Eine Fallstudie zu Wagners Ring-Zyklus
In Proceedings of the Jahrestagung der Gesellschaft für Informatik (GI): 205–217, 2017. DOI

@inproceedings{WeissZMKK17_WagnerRing_GI,
author    = {Christof Wei{\ss} and Frank Zalkow and Meinard M{\"u}ller and Stephanie Klauk and Rainer Kleinertz},
title     = {{V}ersions{\"u}bergreifende {V}isualisierung harmonischer {V}erl{\"a}ufe: {E}ine {F}allstudie zu {W}agners {R}ing-{Z}yklus},
booktitle = {Proceedings of the Jahrestagung der Gesellschaft f{\"u}r Informatik ({GI})},
address   = {Chemnitz, Germany},
year      = {2017},
pages     = {205--217},
url       = {https://dl.gi.de/handle/20.500.12116/3903},
doi       = {10.18420/in2017_14}
}

Frank Zalkow, Christof Weiß, and Meinard Müller
Exploring Tonal-Dramatic Relationships in Richard Wagner’s Ring Cycle
In Proceedings of the International Conference on Music Information Retrieval (ISMIR): 642–648, 2017. PDF Presentation

@inproceedings{ZalkowWM17_WagnerHarmony_ISMIR,
author    = {Frank Zalkow and Christof Wei{\ss} and Meinard M{\"u}ller},
title     = {Exploring Tonal-Dramatic Relationships in {R}ichard {W}agner's {R}ing {C}ycle},
booktitle = {Proceedings of the International Conference on Music Information Retrieval ({ISMIR})},
address   = {Suzhou, China},
year      = {2017},
pages     = {642--648},
url       = {https://ismir2017.smcnus.org/wp-content/uploads/2017/10/132_Paper.pdf},
url-pdf   = {ZalkowWM17_WagnerHarmony_ISMIR.pdf},
url-presentation = {2017_poster_ZalkowWM_Wagner_ISMIR.pdf}
}

Frank Zalkow, Stephan Brand, and Benjamin Graf
Musical Style Modification as an Optimization Problem
In Proceedings of the International Computer Music Conference: 206–211, 2016. PDF Presentation

@inproceedings{ZalkowBrandGraf16_StyleOpt_ICMC,
author    = {Frank Zalkow and Stephan Brand and Benjamin Graf},
title     = {Musical Style Modification as an Optimization Problem},
booktitle = {Proceedings of the International Computer Music Conference},
address   = {Utrecht, The Netherlands},
year      = {2016},
pages     = {206--211},
url       = {http://hdl.handle.net/2027/spo.bbp2372.2016.041},
url-pdf   = {ZalkowBrandGraf16_StyleOpt_ICMC.pdf},
url-presentation = {2016_poster_ZalkowBrandGraf_StyleOpt_ICMC.pdf}
}

Stephanie Klauk and Frank Zalkow
Das italienische Streichquartett im 18. Jahrhundert. Möglichkeiten der semiautomatisierten Stilanalyse
In Bericht zur Jahrestagung der Gesellschaft für Musikforschung (GfM) 2015 in Halle/Saale, 2016. PDF Presentation

@inproceedings{KlaukZalkow16_Streichq_GfM,
author    = {Stephanie Klauk and Frank Zalkow},
title     = {{D}as italienische {S}treichquartett im 18. {J}ahrhundert. {M}öglichkeiten der semiautomatisierten {S}tilanalyse},
booktitle = {{B}ericht zur {J}ahrestagung der {G}esellschaft f{\"u}r {M}usikforschung ({GfM}) 2015 in Halle/Saale},
editor    = {Wolfgang Auhagen and Wolfgang Hirschmann},
publisher = {{S}chott {C}ampus},
address   = {Mainz, Germany},
year      = {2016},
url       = {http://schott-campus.com/das-italienische-streichquartett-im-18-jahrhundert},
url-pdf   = {http://schott-campus.com/wp-content/uploads/2016/09/klauk_zalkow_italienisches-streichquartett.pdf},
url-presentation = {2015_poster_KlaukZalkow_Streichq_GfM.pdf}
}
