xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
xFormers is a Python library from Meta's research team that provides building blocks for Transformer models, the architecture behind most modern AI language and vision systems. It is aimed at researchers and engineers who want to experiment with or optimize these models without being limited to what ships in standard tools like PyTorch.
The library's main benefits are speed and memory efficiency. Transformer models, especially large ones, require a lot of GPU memory and computation. xFormers includes custom GPU code (called kernels) that makes certain operations faster or less memory-hungry than the standard implementations. The most prominent example is its memory-efficient attention operation, which the README claims can be up to 10 times faster than a standard implementation while still producing exact results rather than an approximation.
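To make the "exact, not an approximation" point concrete, here is a minimal pure-Python sketch of the general idea behind memory-efficient attention: process keys and values in small chunks while keeping a running softmax, so the full score vector is never held at once. This is an illustration only, not xFormers' GPU kernel; the function names naive_attention and chunked_attention and the sample data are hypothetical.

```python
import math

def naive_attention(q, keys, values):
    """Reference attention for one query: softmax(q . K^T / sqrt(d)) @ V."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]

def chunked_attention(q, keys, values, chunk_size=2):
    """Same result as naive_attention, holding only chunk_size scores at a time."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")  # largest score seen so far
    running_sum = 0.0            # softmax denominator, relative to running_max
    acc = [0.0] * dim            # unnormalized output, relative to running_max
    for start in range(0, len(keys), chunk_size):
        k_chunk = keys[start:start + chunk_size]
        v_chunk = values[start:start + chunk_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_chunk]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new reference maximum.
        correction = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * correction + sum(weights)
        acc = [a * correction + sum(w * v[j] for w, v in zip(weights, v_chunk))
               for j, a in enumerate(acc)]
        running_max = new_max
    return [a / running_sum for a in acc]

# Hypothetical sample data: one 2-dimensional query against five keys/values.
q = [1.0, 0.5]
keys = [[0.2, 0.1], [1.0, -0.3], [0.4, 0.9], [-0.7, 0.6], [0.3, 0.3]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, -1.0], [0.1, 0.2]]

exact = naive_attention(q, keys, values)
streamed = chunked_attention(q, keys, values, chunk_size=2)
print(max(abs(a - b) for a, b in zip(exact, streamed)))
```

The two functions should agree up to floating-point rounding, because the chunked version rearranges the same arithmetic instead of dropping terms. A real kernel applies the same principle across many queries and attention heads in parallel on the GPU.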
Beyond attention, the library includes optimized versions of other common operations used inside these models: layer normalization, dropout combined with activation functions, a fused linear layer, and a component called SwiGLU used in some newer architectures. These components are designed to be used independently, so you can drop one into an existing project without having to adopt the whole library.
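For readers unfamiliar with SwiGLU, the following pure-Python sketch shows the computation a SwiGLU feed-forward block is generally understood to perform: two linear projections of the input, one passed through the SiLU (Swish) activation and used to gate the other, followed by an output projection. It is a sketch under simplifying assumptions (no bias terms, a single input vector), and the names silu, linear, and swiglu are illustrative rather than xFormers' API.

```python
import math

def silu(x):
    """SiLU (Swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def linear(x, weight):
    """Bias-free linear layer; weight holds one row of coefficients per output."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

def swiglu(x, w_gate, w_value, w_out):
    """Gate one projection of x with the SiLU of another, then project out."""
    gate = [silu(g) for g in linear(x, w_gate)]
    value = linear(x, w_value)
    hidden = [g * v for g, v in zip(gate, value)]
    return linear(hidden, w_out)

# Hypothetical weights: identity projections and a summing output layer.
identity = [[1.0, 0.0], [0.0, 1.0]]
sum_out = [[1.0, 1.0]]
print(swiglu([1.0, 2.0], identity, identity, sum_out))
```

Written this way, the block takes several separate passes over the data. A fused implementation aims to combine those steps into fewer GPU kernel launches, which is where the speed and memory savings come from.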
Installation requires a compatible version of PyTorch and a CUDA-capable GPU (NVIDIA hardware on Linux or Windows). AMD GPU support is listed as experimental. The library can also be built from source if you need to pair it with a specific PyTorch version not covered by the prebuilt packages.
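Before installing, it can help to confirm that the prerequisites above are in place. The sketch below is one possible check, not part of xFormers: the helper name check_environment is hypothetical, and it relies only on the standard library plus PyTorch's torch.cuda.is_available() when PyTorch is present.

```python
import importlib.util

def check_environment():
    """Report whether PyTorch, a usable GPU backend, and xFormers are present."""
    report = {
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "xformers_installed": importlib.util.find_spec("xformers") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch
            report["cuda_available"] = bool(torch.cuda.is_available())
        except (ImportError, OSError):
            # A broken PyTorch install is treated as "no GPU available".
            report["cuda_available"] = False
    return report

print(check_environment())
```

Once the library is installed, running python -m xformers.info should list which kernels are available in the current environment.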
The project is primarily a research tool and is used across both language and vision work at Meta. It carries a BSD-style open-source license and includes attribution to several other open-source projects whose code or ideas it builds on.