DeepSpeech

C++ · ★ 27k · updated 1y ago · archived

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

DeepSpeech was Mozilla's open-source offline speech-to-text engine that ran entirely on-device, even on a Raspberry Pi, but is now discontinued and no longer maintained.

C++ · Python · TensorFlow · setup: hard · complexity 4/5

DeepSpeech was Mozilla's open-source speech-to-text engine: software that listens to audio and converts spoken words into written text entirely on-device, without sending anything to the cloud. Because it ran offline, it was attractive for privacy-sensitive applications and for situations where internet access was unavailable.

A key technical achievement was its ability to run on low-power hardware: it could transcribe speech in real time on a Raspberry Pi 4 (a credit-card-sized computer costing around $35) as well as on high-powered GPU servers. This range made it suitable for everything from embedded smart home devices to large-scale transcription pipelines.
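One rough way to check a "real time" claim like the one above on a given device is to compute the real-time factor: processing time divided by audio duration, where a value below 1.0 means transcription keeps up with the audio. The sketch below is illustrative only. The `transcribe` argument is a hypothetical callable that wraps whichever speech-to-text engine is being measured; it is not part of DeepSpeech's API, and the helper names are assumptions made for this example.

```python
import time
import wave
from typing import Callable


def audio_duration_seconds(wav_path: str) -> float:
    """Return the length of a WAV file in seconds."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(transcribe: Callable[[str], str], wav_path: str) -> float:
    """Time one call to `transcribe` and divide by the audio duration.

    `transcribe` is a hypothetical wrapper around a speech-to-text engine:
    it takes a WAV path and returns the recognized text.
    """
    start = time.perf_counter()
    transcribe(wav_path)
    elapsed = time.perf_counter() - start
    return elapsed / audio_duration_seconds(wav_path)
```

A single measurement is noisy, so in practice it would make sense to average over several recordings of different lengths before drawing conclusions about a device.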

Note: this project has been discontinued by Mozilla and is no longer actively maintained. Mozilla's work here influenced several successor projects, but developers looking for similar capability today have largely moved to alternatives such as Whisper (from OpenAI). The code and pre-trained models remain available for historical reference or for projects that need to build on the existing foundation, but you should not start a new project expecting ongoing updates or support.

Where it fits