Cola-DLM

Python ★ 242 updated 16d ago

The codebase of Cola DLM

Cola DLM (Continuous Latent Diffusion Language Model) is a research project from ByteDance Seed that introduces a new approach to generating text with AI. Most current large language models generate text one word at a time, reading left to right. Cola DLM instead works in a hidden (latent) space: it first learns to compress text into a compact mathematical representation using a component called a Text VAE, and then learns to generate those representations using a diffusion process — the same technique used in image generation models like those that create pictures from text prompts. The result is a model that separates the high-level meaning of a passage from the specific word choices, potentially giving it different strengths than standard word-by-word models. This repository provides the trained model weights and code to run Cola DLM, including a server that responds to the same API format as OpenAI's chat models, making it easy to plug into existing tools. It targets AI researchers and engineers exploring alternatives to standard text generation approaches. Python 3.9 or newer and PyTorch 2.1 or newer are required.

Open on GitHub → Full breakdown on explaingit →