gitmyhub

asr-hotword

Python ★ 28 updated 9d ago

The best hotword solution for ASR post-processing: hotword replacement based on phoneme edit distance.

A Python post-processing library that fixes speech recognition errors for brand names and technical terms by matching sounds (phonemes) instead of letters, then swapping in the correct word automatically.

Python · pypinyin · rapidfuzz · setup: easy · complexity 2/5

Speech recognition software often gets uncommon words wrong. Brand names, technical terms, and proper nouns tend to come out garbled: "CapsWriter" might be transcribed as "Caps Rider", "Claude" as "cloud", and a Chinese brand name might appear as an ordinary word that sounds similar. This library is a post-processing tool you run on the raw text output from any speech recognition system to fix those mistakes.

The core idea is phoneme-based matching. Instead of comparing words letter-by-letter, the library converts both your list of target words and the recognized text into sound units (pinyin syllables for Chinese, individual letters for English), then measures how similar the sounds are. If the similarity score passes a threshold you set, the misrecognized fragment is replaced with the correct word. Processing 5,000 hotwords against a single sentence takes about 20 milliseconds.
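The matching step can be illustrated with a minimal, English-only sketch. This is not the library's code: the function names (`edit_distance`, `similarity`, `to_units`, `correct`), the 0.7 threshold, and the three-word window are assumptions made for illustration, and a plain-Python Levenshtein distance stands in for rapidfuzz so the example runs without extra packages. Chinese text would need a pinyin conversion step, which is omitted here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sound units."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute x with y
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Score in [0, 1]; 1.0 means the unit sequences are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def to_units(text):
    """English stand-in for phonemes: lowercase letters and digits only."""
    return [c for c in text.lower() if c.isalnum()]


def correct(text, hotword, threshold=0.7, max_words=3):
    """Replace the best-matching run of words with the hotword, if any
    run scores at or above the threshold (hypothetical helper)."""
    words = text.split()
    target = to_units(hotword)
    best_score, best_span = 0.0, None
    for start in range(len(words)):
        for end in range(start + 1, min(start + max_words, len(words)) + 1):
            candidate = to_units(" ".join(words[start:end]))
            score = similarity(candidate, target)
            if score >= threshold and score > best_score:
                best_score, best_span = score, (start, end)
    if best_span is None:
        return text
    start, end = best_span
    return " ".join(words[:start] + [hotword] + words[end:])


print(correct("I use Caps Rider every day", "CapsWriter"))
# -> I use CapsWriter every day
```

Here "Caps Rider" becomes the units `capsrider`, which is two edits away from `capswriter` (one insertion, one substitution), giving a score of 0.8. Letters are a crude proxy for sound, so a short pair such as "cloud" and "Claude" scores only about 0.67 and would need a lower threshold in this sketch.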

You define your hotwords in a plain text file, one entry per line. Each entry has a target word followed by one or more aliases, separated by pipe characters. The aliases are alternate ways the word might get transcribed. Any alias that sounds similar enough to something in the recognized text will trigger a replacement with the first word in the line. The first word does not have to be a correction target; it can be any text you want to type quickly, like a phone number or email address. Saying the alias out loud then outputs the full expansion.
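A hotword file in this format could be read with a few lines of Python. The sketch below is an assumption about how such parsing might look, not the library's actual loader: the function name `parse_hotwords`, the stripping of whitespace around pipes, and the skipping of blank lines are all illustrative choices, and the sample entries are made up.

```python
def parse_hotwords(text):
    """Return {target: [alias, ...]} from pipe-separated lines.

    Hypothetical helper: the first field on each line is the target,
    every later field is an alias that should trigger it.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumed: blank lines are ignored
        target, *aliases = [part.strip() for part in line.split("|")]
        if target:
            table[target] = [alias for alias in aliases if alias]
    return table


# Made-up sample entries; the last one is a text expansion, not a correction.
SAMPLE = """\
CapsWriter | Caps Rider
Claude | cloud | clawed

user@example.com | my email
"""

for target, aliases in parse_hotwords(SAMPLE).items():
    print(target, "<-", aliases)
```

Keeping the target as the dictionary key and the aliases as a list mirrors the rule that any alias on a line resolves to the first word on that line.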

The library is extracted from a larger project called CapsWriter-Offline, which is a voice typing tool for Windows. It supports mixed Chinese and English text. Installation requires two Python packages: pypinyin (for Chinese phoneme conversion) and rapidfuzz (for fuzzy string matching). The repository includes a sample hotword file and a demo script so you can test corrections against example inputs right away.

Where it fits