
kekedubing

Python ★ 14 updated 29d ago

Local AI video translation, dubbing, subtitles, and blur editing with faster-whisper, Argos Translate, Supertonic, FFmpeg, and yt-dlp

A local web app that transcribes, translates, and re-voices any video using offline AI tools. Upload a file or paste a URL from one of 1,800 supported sites, pick your languages, and get a dubbed MP4 with burned-in subtitles, all running on your own computer.

Python · FFmpeg · Docker · setup: moderate · complexity 3/5

Kekedubing (the name is Korean for a dubbing tool) is a local web application that translates and dubs videos using AI tools that run entirely on your own computer. You open it in a browser, upload a video or paste a URL from one of 1,800 supported video sites, choose source and target languages, pick a voice, and it produces a dubbed MP4 file with subtitles burned in.

The pipeline works in steps: the tool first transcribes the spoken audio using faster-whisper (a speech-to-text tool), then translates the transcript using Argos Translate (a translation library that downloads and runs language models locally), then generates new spoken audio in the target language using Supertonic (a text-to-speech system), and finally stitches everything together with FFmpeg (a widely used video processing tool). The dubbed voice is automatically sped up or slowed down to fit within the original segment's timing, staying between 85% and 175% of normal speed to keep speech natural.
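The timing fit described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the helper name `fit_speed` and the choice to use the ratio of synthesized duration to original segment duration are assumptions. Only the 85% and 175% bounds come from the description.

```python
MIN_SPEED = 0.85  # lower bound from the description above
MAX_SPEED = 1.75  # upper bound from the description above


def fit_speed(synth_seconds: float, slot_seconds: float) -> float:
    """Return a playback-speed factor that tries to fit synthesized speech
    into the original segment's time slot (hypothetical helper)."""
    if synth_seconds <= 0 or slot_seconds <= 0:
        return 1.0  # nothing sensible to fit; leave the speed unchanged
    ratio = synth_seconds / slot_seconds  # > 1 means the dub is too long
    # Clamp so the voice is never slowed or sped up beyond the stated range.
    return max(MIN_SPEED, min(MAX_SPEED, ratio))
```

With these assumptions, a 3-second dubbed line for a 2-second slot gets a factor of 1.5, while a 10-second line for the same slot is clamped to 1.75 and would still overrun. A factor like this could then be handed to an audio tempo filter such as FFmpeg's `atempo`, though whether the project does so is not stated.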

Beyond the standard dubbed video output, there is a Live Interpreter tab. You paste a video URL, and the app plays it muted while streaming translated and dubbed audio in near real time, grouping transcript segments into longer chunks so the dubbing sounds less choppy.
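One way to picture the chunk grouping is the sketch below. It is an assumption-laden illustration rather than the app's implementation: the `Segment` type, the `group_segments` function, and the `max_gap` and `max_duration` thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def group_segments(segments, max_gap=0.6, max_duration=8.0):
    """Merge consecutive transcript segments into longer chunks.

    max_gap and max_duration are illustrative thresholds, not values
    taken from the project.
    """
    chunks = []
    for seg in segments:
        last = chunks[-1] if chunks else None
        if (
            last is not None
            and seg.start - last.end <= max_gap
            and seg.end - last.start <= max_duration
        ):
            # Close enough in time and still short: extend the current chunk.
            last.end = seg.end
            last.text = f"{last.text} {seg.text}"
        else:
            # Start a new chunk; copy so the input list is not mutated.
            chunks.append(Segment(seg.start, seg.end, seg.text))
    return chunks
```

Longer chunks give the speech synthesizer whole phrases instead of fragments, at the cost of a little extra latency before each chunk can be spoken.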

The app also has a few editing features: you can draw a rectangular blur zone on the preview video to obscure faces or other content before rendering, choose from 35 subtitle style presets, adjust subtitle position and size, and switch fonts. Translation models for each language pair are downloaded on demand from the settings screen.
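A rectangular blur zone like the one described can be expressed as an FFmpeg filter graph that crops the region, blurs it, and overlays it back in place. The sketch below only builds that filter string. The function name `blur_region_filter`, the default radius, and the use of the `crop`, `boxblur`, and `overlay` filters are assumptions about one common approach, not a description of how the app renders its blur.

```python
def blur_region_filter(x: int, y: int, w: int, h: int, radius: int = 10) -> str:
    """Build an FFmpeg filter graph that blurs one rectangle of the video.

    Hypothetical helper: x, y are the top-left corner in pixels, and
    w, h are the size of the blur zone.
    """
    if w <= 0 or h <= 0:
        raise ValueError("blur zone must have positive width and height")
    return (
        # Cut out the rectangle and blur only that piece.
        f"[0:v]crop={w}:{h}:{x}:{y},boxblur={radius}[blurred];"
        # Paste the blurred piece back over the original frame.
        f"[0:v][blurred]overlay={x}:{y}[out]"
    )
```

The resulting string could be passed to FFmpeg's `-filter_complex` option with `-map "[out]"` selecting the blurred video. Very small zones may need a smaller radius, since `boxblur` limits the radius relative to the input size.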

Docker is the recommended way to run it: two commands clone and start the app, and it is accessible at a local HTTPS address. Scripts for running it directly on Mac or Windows without Docker are also included. The README notes that long videos will be slow because transcription, speech synthesis, and video rendering are all CPU-heavy operations. The project is released under the MIT license.

Where it fits