demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Demucs is a Python tool from Meta's research team that separates a music track into its individual components: drums, bass, vocals, and the remaining accompaniment (what the README calls "other"). If you give it a full song, it outputs four separate audio files, one for each part. This is sometimes called stem separation or source separation, and it is useful for remixing, karaoke creation, or music analysis.
The current version (v4, Hybrid Transformer Demucs) processes the audio in two parallel representations: the raw waveform and a spectrogram (a time-frequency map of the signal). It then combines information from both using a Transformer, a neural network architecture originally developed for text processing. Earlier versions are also available if you need compatibility with older setups.
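To make the "two ways at once" idea concrete, here is a minimal sketch of the two representations of the same signal: the raw samples and a magnitude spectrogram computed with a short-time Fourier transform. This is not Demucs code. The function name `two_views`, the window size, and the hop length are illustrative assumptions, and the model's learned layers and Transformer are not shown.

```python
import numpy as np

def two_views(waveform, n_fft=4096, hop=1024):
    """Return the time-domain signal and a magnitude spectrogram of it.

    n_fft and hop are illustrative choices, not values taken from Demucs.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(waveform) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [waveform[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # One FFT per frame gives a (time, frequency) magnitude map.
    spectrogram = np.abs(np.fft.rfft(frames, axis=1))
    return waveform, spectrogram

# A one-second 440 Hz test tone at 44.1 kHz stands in for real music.
sr = 44100
n_fft = 4096
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

wave, spec = two_views(tone, n_fft=n_fft)
# The strongest frequency bin should sit within one bin width of 440 Hz.
peak_hz = spec.mean(axis=0).argmax() * sr / n_fft
print(wave.shape, spec.shape, round(peak_hz, 1))
```

The waveform view keeps exact timing, while the spectrogram view makes pitch content easy to see. A hybrid model can draw on both.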
For musicians who just want to split tracks, installation is a single pip command, and the tool runs from the terminal with one command pointing at an audio file. GPU acceleration is supported for faster processing. Stems can be saved in several sample formats, including float32 and int24. An experimental model that adds guitar and piano separation is also included, though the project notes that the piano quality is limited.
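As a rough sketch of the terminal workflow described above, the helper below assembles a `demucs` command line and lists where the stems would be expected to land. It assumes the package was installed with `python3 -m pip install -U demucs`, and that the `-n`, `-d`, `-o`, `--float32` and `--int24` options and the `separated/MODEL/TRACK/` output layout behave as described in the upstream README. The helper names `build_demucs_command` and `expected_stems` are hypothetical and not part of Demucs.

```python
from pathlib import Path

# The four stems named in the description above.
STEMS = ("drums", "bass", "other", "vocals")

def build_demucs_command(track, model=None, device=None,
                         out_dir=None, sample_format=None):
    """Build an argv list for the demucs CLI (hypothetical helper)."""
    cmd = ["demucs"]
    if model:
        cmd += ["-n", model]          # choose a pretrained model
    if device:
        cmd += ["-d", device]         # e.g. "cpu" or "cuda"
    if out_dir:
        cmd += ["-o", str(out_dir)]   # output folder
    if sample_format in ("float32", "int24"):
        cmd.append(f"--{sample_format}")
    elif sample_format is not None:
        raise ValueError(f"unsupported sample format: {sample_format}")
    cmd.append(str(track))
    return cmd

def expected_stems(track, model="htdemucs", out_dir="separated"):
    """Guess the output paths, assuming the default folder layout."""
    base = Path(out_dir) / model / Path(track).stem
    return [base / f"{stem}.wav" for stem in STEMS]

if __name__ == "__main__":
    cmd = build_demucs_command("song.mp3", device="cpu", sample_format="int24")
    print(" ".join(cmd))
    for path in expected_stems("song.mp3"):
        print(path.as_posix())
```

To run the separation, pass the returned list to `subprocess.run(cmd, check=True)` once Demucs is installed. Building the command as a list avoids shell quoting problems with file names that contain spaces.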
For researchers who want to train their own separation models, the repository provides training scripts, configuration files, and reproducibility grids matching the published paper results. The model was trained on a dataset called MUSDB HQ plus an additional 800 songs.
The original author has left Meta and notes that this repository is no longer actively maintained. A separate fork is available for ongoing bug fixes. Neither repository accepts new feature requests or general issues.