Automatic speech recognition, also known as speech-to-text, is the transcription of speech into machine-readable text by a computer.
Automatic speech recognition has too many uses to list exhaustively here. The main applications today are customer interaction over the telephone, healthcare dictation, and use in car navigation systems and on smartphones. As the technology improves, these applications will extend to audio mining, speech translation, and broader human-computer interaction via speech.
Automatic speech recognition is a very hard problem in computer science, but the technology is more mature than machine translation.
After a wave of media hype at the end of the 1990s, the technology has improved continuously and has been adopted by the market, e.g. in large deployments in the customer contact sector, in the automation of radiology dictation, and in voice-enabled navigation systems in the automotive sector.
Public awareness has increased through its use on smartphones, in particular Siri. The research community concentrates on problems such as the recognition of spontaneous speech and the easy acquisition of new languages.
Speech translation is a computationally intensive and memory-intensive process, so the typical set-up is to have one or several computers on the internet serving the speech translation requirements of many users.
RWTH provides an open-source speech recognizer free of charge for academic use.
Other usage should be subject to a bilateral agreement.