fasterliveportrait-mlx

Python ★ 25 updated 27d ago

Apple MLX port of FasterLivePortrait for Apple Silicon

Animate portrait photos and videos on Apple Silicon Macs using AI. Feed it a webcam stream, a video, an audio clip, or text to make still images move and speak. No discrete GPU is required: it runs on the Mac's unified memory via Apple's MLX framework.

Python · MLX · Gradio · FastAPI · Hugging Face · ffmpeg · uv
setup: moderate · complexity 2/5

This project lets you animate portrait photos and videos on an Apple Silicon Mac. You supply a source image (a portrait photo or an animal picture) and a driving input (a video clip, a live webcam feed, an audio recording, or even a text prompt), and the tool transfers the motion or expression from the driving input onto the subject in your photo. The result is a video in which your still image moves and speaks according to whatever drove it.

Under the hood it uses Apple's MLX framework, which runs on the CPU and integrated GPU of Apple Silicon chips through their unified memory, so no separate GPU is required. All the core AI models (face analysis, motion estimation, image warping and blending) are converted to MLX format and downloaded automatically from Hugging Face when you first run the app. You do not need to fetch or convert any weights by hand for normal use.

The web interface is built with Gradio and opens at a local address in your browser. From there you pick your source portrait, choose what will drive the animation, and click Generate. The CLI offers the same features through command flags, which is handy for scripting or batch work. An experimental FastAPI endpoint is also included for programmatic access. Multiple quality profiles let you trade fidelity for speed, which matters most when driving with a live webcam, where latency is immediately noticeable.

Beyond single-face human portraits, the tool supports animating animal subjects and can detect up to three faces in one source image, animating all of them simultaneously from a single driving face. Audio and text driving are marked experimental: you can feed an audio clip or a typed sentence and the system will generate corresponding lip and head motion, relying on additional models for the voice and motion conversion.

Setup requires an Apple Silicon Mac, Python 3.11 or newer, ffmpeg, and the uv package manager. After installing those, one command installs the Python environment and another starts the web UI. The project is derived from warmshao/FasterLivePortrait, which is itself based on the LivePortrait work from KwaiVGI.
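The prerequisites listed above can be verified before installing anything. The following is a minimal sketch, not part of the project: the function name `check_prerequisites` and its parameters are hypothetical, and it assumes that `ffmpeg` and `uv` are expected on the PATH under those names. It uses only the Python standard library.

```python
import platform
import shutil
import sys


def check_prerequisites(system=None, machine=None, python_version=None,
                        which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration. The arguments exist only so the
    checks can be exercised with fake values; by default the real machine
    is inspected.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if python_version is None:
        python_version = tuple(sys.version_info[:2])

    problems = []

    # A native Python on Apple Silicon reports Darwin/arm64. A Python
    # running under Rosetta reports x86_64 and is flagged here as well.
    if system != "Darwin" or machine != "arm64":
        problems.append(f"needs an Apple Silicon Mac (found {system}/{machine})")

    # The summary states Python 3.11 or newer.
    if tuple(python_version) < (3, 11):
        found = ".".join(str(part) for part in python_version)
        problems.append(f"needs Python 3.11 or newer (found {found})")

    # Assumption: both tools are invoked by these names from the PATH.
    for tool in ("ffmpeg", "uv"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print("missing:", issue)
    else:
        print("all prerequisites found")
```

Running the script on the target machine prints either a list of missing prerequisites or a confirmation. It checks only for presence, not for specific ffmpeg or uv versions, since the summary does not state any.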

Where it fits

Bring a still profile photo to life by driving it with your webcam in real time.
Generate a talking-head video from a portrait photo using an audio clip or typed sentence.
Animate an animal photo or a group shot with up to three faces using a single driving video.
Batch-process portrait animations via the CLI or FastAPI endpoint for automated workflows.
