Automatic Speech Recognition

Automatic speech recognition, also known as speech-to-text, is the transcription of speech into (machine-readable) text by a computer.

Target users and customers

  • Researchers
  • Developers
  • Integrators

Application sectors

Automatic speech recognition has so many uses that they are hard to list exhaustively. The main uses today are customer interaction over the telephone, healthcare dictation, and voice control in car navigation systems and on smartphones. As the technology improves, these applications will extend to audio mining, speech translation, and wider use of speech in human-computer interaction.


Automatic speech recognition is a very hard problem in computer science, but the technology is more mature than machine translation.

After a wave of media hype at the end of the 1990s, the technology has improved continuously and has been adopted by the market, e.g. in large deployments in the customer contact sector, in the automation of radiology dictation, and in voice-enabled navigation systems in the automotive sector.

Public awareness has increased through its use on smartphones, in particular through Siri. The research community concentrates on problems such as the recognition of spontaneous speech and the easy acquisition of new languages.

Technical requirements:

Speech recognition is a computationally and memory-intensive process, so the typical set-up is to have one or several computers on the internet serving the speech recognition requirements of many users.
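
The server-based set-up described above can be sketched as follows. This is only an illustration under stated assumptions, not the RWTH recognizer or its API: the names recognize, RecognitionServer and serve_all are hypothetical, the recognizer is a dummy placeholder, and a thread pool stands in for the one or several server machines shared by many users.

```python
from __future__ import annotations

from concurrent.futures import Future, ThreadPoolExecutor


def recognize(audio: bytes) -> str:
    """Hypothetical stand-in for a real recognizer; returns a dummy transcript."""
    return f"<transcript of {len(audio)} bytes>"


class RecognitionServer:
    """Assumed model of a shared service: a few workers serve many users."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, user: str, audio: bytes) -> tuple[str, Future]:
        # Each request is queued and handled by the next free worker.
        return user, self._pool.submit(recognize, audio)

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


def serve_all(requests: dict[str, bytes], workers: int = 2) -> dict[str, str]:
    """Send every user's audio to the shared server and collect the transcripts."""
    server = RecognitionServer(workers)
    try:
        pending = [server.submit(user, audio) for user, audio in requests.items()]
        return {user: future.result() for user, future in pending}
    finally:
        server.shutdown()
```

The point of the sketch is the design choice: the expensive recognition work stays on a small number of shared machines, while each user only sends audio and receives text.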

Conditions for access and use:

RWTH provides an open-source speech recognizer free of charge for academic use.
Other usage should be subject to a bilateral agreement.

  • RWTH Aachen

Contact details:

Volker Steinbiss

RWTH Aachen University
Lehrstuhl für Informatik 6
Templergraben 55
52072 Aachen