Course: Speech Coding, Summer Term 2015

Past course
Please visit the education page for information on current courses.

Lecturer

Prof. Dr. Tom Bäckström

Guest Lectures

tbd

Time

Summer Term 2015, Wednesdays 16:15-17:45

Place

Am Wolfsmantel 33, Erlangen-Tennenlohe, Room 3R4.04

Registration

Please come to the first lecture on Wednesday, 15.04.2015, 16:15 - 17:45, Room 3R4.04 (Am Wolfsmantel 33). If you are unable to attend, please contact Prof. Dr. Tom Bäckström.

Content

Mobile phones – everyone has one. With 7 billion mobile phones in use, digital speech transmission is a truly global technology. Your grandma has one, Prince Charles has one and the poorest village in Africa has one. While the technology clearly works already, with such a market, the smallest improvement, when multiplied by 7 billion, has a huge impact worldwide.

Speech coding refers to digital compression and transmission of speech. This course provides an in-depth perspective on ACELP, the most commonly used speech coding algorithm. We will study the speech production models on which it is based, the perceptual models which are used for its optimization, and most importantly, go through the theory and practice of the most important concepts: linear prediction (LP), long-term prediction (LTP), algebraic codebooks, line spectral frequencies (LSFs) and windowing. In addition, we will look at the big picture, that is, the additional challenges that emerge when building a commercial speech coding product.

The goal of this course is to provide a strong foundation for researchers, engineers, and graduate students who are interested in the problem of speech coding.

Tentative Schedule

  • 15.4 -- Introduction
  • 22.4 -- cancelled, no lecture
  • 29.4 -- Speech Production and Perception
  • 6.5 -- Envelopes
  • 13.5 -- Windowing
  • 20.5 -- Residual Modelling & Fundamental Frequency
  • 27.5 -- Quality Evaluation
  • 3.6 -- Relaxed Modelling (RCELP)
  • 10.6 -- Systems Design, Constraints and Implementation
  • 17.6 -- Voice Activity Detection
  • 24.6 -- Packet Loss
  • 1.7 -- Speech Coding Standards
  • 8.7 -- On reserve
  • 15.7 -- On reserve

Course requirements

This course is the most advanced course offered by the university on this topic, and serves as an excellent basis from which to commence research in the area. Various aspects of the course bring students up to date with the very latest developments in the field, as seen in recent international standards, conferences and journals. This course builds on Sprach- und Audiosignalverarbeitung (by Prof. Kellermann), and is well complemented by Mensch-Maschine-Schnittstelle (by Prof. Rabenstein), Praxis der Audiodatenkompression (Dr. Grill), Speech Enhancement (Prof. Habets) and Selected Topics in Perceptual Audio Coding (Prof. Herre), which deal with many other signal processing methods and give an understanding of human auditory perception (also a key part of speech coding) and audio compression techniques.

Prerequisites: students must be familiar with signals and systems as well as basic linear algebra and statistics.

Course material

If you are missing handouts or chapter printouts, please contact the course assistant (yet to be chosen) or the lecturer (tom.backstrom@audiolabs-erlangen.de).
