HunyuanVideo-1.5
HunyuanVideo-1.5: A leading lightweight video generation model
HunyuanVideo-1.5 is a video generation model from Tencent's Hunyuan team. You give it a text description, an image, or both, and it produces a short video clip. It can work in text-to-video mode (describe a scene in words and get a video) or image-to-video mode (start from a still image and animate it). The model was released in November 2025 as an open-source project with code, weights, and a training pipeline included.
The model has 8.3 billion parameters, which puts it on the small side compared with some other video generation systems. According to the README, the small size is deliberate: the goal is to run on consumer-grade graphics cards rather than expensive server hardware. A step-distilled version released in December 2025 cuts generation time by about 75% relative to the standard model, producing a video in around 75 seconds on a single high-end consumer GPU. It achieves this by running only 8 or 12 generation steps, fewer than the standard model uses.
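The two timing figures above imply a rough baseline for the standard model. The short sketch below works through that arithmetic. It assumes the 75% reduction is measured against the non-distilled model on the same hardware. The resulting baseline is an inference from the two quoted numbers, not a figure taken from the README, and the function name is purely illustrative.

```python
def implied_baseline_seconds(distilled_seconds: float, reduction_fraction: float) -> float:
    """Back out the pre-distillation time from the distilled time.

    If distillation removes `reduction_fraction` of the runtime, then
    distilled = baseline * (1 - reduction_fraction), so
    baseline = distilled / (1 - reduction_fraction).
    """
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return distilled_seconds / (1.0 - reduction_fraction)


# Figures quoted in the text: about 75 seconds after a roughly 75% reduction.
baseline = implied_baseline_seconds(75.0, 0.75)
speedup = baseline / 75.0

print(f"Implied baseline: about {baseline:.0f} seconds")
print(f"Implied speedup: about {speedup:.0f}x")
```

Because both inputs are approximate ("about 75%", "around 75 seconds"), the result is best read as an order-of-magnitude estimate: roughly five minutes per video before distillation, roughly a fourfold speedup after.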
The project is written in Python and integrates with several existing tools. It works with Hugging Face Diffusers, a widely used open-source library for running diffusion models. It also supports ComfyUI, a visual node-based interface for building generation workflows by connecting blocks together without writing code. Several community-built plugins and lightweight inference frameworks have added support for the model as well.
The training code is open-sourced, and the README provides instructions for fine-tuning the model with LoRA (low-rank adaptation), a technique that adapts the model to new styles or subjects by training a small set of added weights instead of retraining everything from scratch. The Tencent team trains with an optimizer called Muon, which they have also released publicly.
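To make the LoRA idea mentioned above concrete, here is a minimal pure-Python sketch of the general technique. It is not the project's training code. It assumes the common formulation in which a frozen weight matrix W is adjusted by a low-rank product, W' = W + (alpha / r) * B * A, and only A and B are trained. The layer dimensions, rank, and function names are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora): trainable parameters for full fine-tuning of one
    d_out x d_in weight matrix versus a rank-`rank` LoRA adapter on it."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)  # B is d_out x rank, A is rank x d_in
    return full, lora


def lora_effective_weight(w, a, b, alpha, rank):
    """Compute W' = W + (alpha / rank) * (B @ A) without modifying W."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical layer size and rank, not taken from HunyuanVideo-1.5.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full fine-tuning: {full} parameters, LoRA adapter: {lora} parameters")

# Tiny example: a 2 x 2 frozen weight with a rank-1 adapter.
w = [[1.0, 2.0], [3.0, 4.0]]
a = [[1.0, 1.0]]      # rank x d_in
b = [[0.0], [0.0]]    # d_out x rank, commonly initialized to zero
print(lora_effective_weight(w, a, b, alpha=2.0, rank=1))
```

With B initialized to zero, the adapted weight equals the original one, so fine-tuning starts from the pretrained model's behavior. For the hypothetical 4096 x 4096 layer, the rank-8 adapter trains 65,536 values instead of 16,777,216, which is why LoRA is much cheaper than full retraining.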
System requirements, installation steps, and prompt-writing guidance are covered in the README and a companion prompt handbook.