audiocraft

Jupyter Notebook ★ 23k updated 3mo ago

AudioCraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

AudioCraft is a research library from Meta (Facebook Research) for generating audio and music with AI. Give it a text description such as "upbeat jazz with piano and drums" and it produces a matching audio clip, with no musical knowledge or instruments needed.

The library bundles several AI models. MusicGen generates music from text descriptions and can also follow a melody you hum or upload. AudioGen generates environmental sounds from text, such as rain, crowd noise, or footsteps. EnCodec is a neural audio compressor that converts audio into a compact sequence of discrete tokens and back, which the generation models use internally. There is also AudioSeal for adding invisible watermarks to AI-generated audio, and JASCO for music generation guided by specific chords, melodies, or drum patterns.
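To give a sense of how compact the EnCodec representation is, here is a small arithmetic sketch. The constants are assumptions taken from how MusicGen's tokenizer is usually described (32 kHz audio, 4 codebooks sampled at 50 Hz, 2048 entries per codebook); other EnCodec variants use different settings, and the function names are illustrative only.

```python
# Assumed settings for the 32 kHz EnCodec tokenizer used by MusicGen.
SAMPLE_RATE_HZ = 32_000   # audio samples per second
FRAME_RATE_HZ = 50        # token frames per second
NUM_CODEBOOKS = 4         # parallel token streams per frame
CODEBOOK_SIZE = 2048      # entries per codebook


def token_count(duration_s: float) -> int:
    """Number of discrete tokens representing a clip of the given length."""
    return int(duration_s * FRAME_RATE_HZ) * NUM_CODEBOOKS


def token_bitrate_bps() -> int:
    """Bits per second needed to store the token streams."""
    bits_per_token = (CODEBOOK_SIZE - 1).bit_length()  # 11 bits for 2048 entries
    return FRAME_RATE_HZ * NUM_CODEBOOKS * bits_per_token


def raw_pcm_bitrate_bps(bits_per_sample: int = 16) -> int:
    """Bits per second of uncompressed mono PCM at the same sample rate."""
    return SAMPLE_RATE_HZ * bits_per_sample


if __name__ == "__main__":
    print(token_count(10))          # tokens in a 10-second clip
    print(token_bitrate_bps())      # token stream bitrate
    print(raw_pcm_bitrate_bps())    # uncompressed bitrate for comparison
```

Under these assumptions a 10-second clip becomes 2,000 tokens instead of 320,000 raw samples, which is what makes it practical for a language model to generate audio token by token.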

Under the hood everything is built on PyTorch, a popular framework for deep learning research. The models are pre-trained, so you can run them without training anything yourself: install the library, load a model, and call it with your text prompt. Training code is also included for researchers who want to fine-tune or build on top of these models.
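The install-and-call workflow might look roughly like the sketch below. It assumes the `audiocraft` package is installed and that the `facebook/musicgen-small` checkpoint can be downloaded. `output_stem` and `generate_clip` are hypothetical helpers written for this example, not part of the library.

```python
import re


def output_stem(prompt: str, max_len: int = 40) -> str:
    """Turn a text prompt into a filesystem-friendly file stem (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return slug[:max_len] or "clip"


def generate_clip(prompt: str, duration_s: float = 8.0,
                  checkpoint: str = "facebook/musicgen-small") -> str:
    """Generate one clip from a text prompt and write it to a WAV file."""
    # Imported lazily so this module loads even without audiocraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(checkpoint)       # downloads weights on first use
    model.set_generation_params(duration=duration_s)  # clip length in seconds
    wav = model.generate([prompt])                    # tensor: [batch, channels, samples]

    stem = output_stem(prompt)
    # audio_write appends the file extension and applies loudness normalization.
    audio_write(stem, wav[0].cpu(), model.sample_rate, strategy="loudness")
    return stem + ".wav"


# Example call (requires audiocraft and a model download):
#   generate_clip("upbeat jazz with piano and drums")
```

Generation is much faster on a GPU. The small checkpoint is assumed here only because it is the lightest to download and run.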

You would use AudioCraft when prototyping apps that need background music generation, when doing audio research, or when experimenting with AI-generated sound design. It requires Python 3.9 and PyTorch. The code is released under the MIT license, while the model weights are under a separate CC-BY-NC 4.0 license that limits them to non-commercial use.
