executorch
On-device AI across mobile, embedded and edge for PyTorch
ExecuTorch is a tool from the PyTorch team that lets developers take AI models they have already built and trained using PyTorch, then run those models directly on phones, tablets, embedded devices, and microcontrollers, rather than requiring a connection to a cloud server. It is used inside Meta's products including Instagram, WhatsApp, the Quest 3 headset, and Ray-Ban Meta smart glasses.
The general idea is that you prepare your model once on a regular computer, and ExecuTorch converts it into a compact file (a .pte file) that a small, lightweight runtime executes on the target device. That on-device runtime has a base footprint of about 50 kilobytes, small enough to fit on tightly constrained hardware. The conversion process can also apply optimizations such as quantization, which stores model values at lower numerical precision to make the model smaller and faster, at a small cost in accuracy.
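To make the quantization trade-off concrete, here is a minimal, library-free sketch of symmetric 8-bit quantization. It illustrates the general idea only and is not ExecuTorch's own quantization flow; the function names and the sample weights are made up for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale factor; fall back to 1.0 if every value is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.031, 0.9, -0.004]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The worst-case round-trip error is bounded by half of the scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Stored as 8-bit integers, each weight takes one byte instead of the four bytes of a 32-bit float, and the price is the small rounding error measured by max_error.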
One of the notable aspects is hardware flexibility. The same PyTorch model and export workflow can target many different processor types, including chips from Apple, Qualcomm, ARM, and MediaTek, as well as standard CPUs. Switching from one hardware backend to another requires changing a single line in the export step and re-exporting, not rewriting the model.
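The export step can be sketched roughly as follows. This follows the flow shown in the project's README as best I understand it (torch.export, then to_edge_transform_and_lower with a backend partitioner); the helper name export_to_pte, its parameters, and the XNNPACK default are assumptions made for this example, and module paths may differ between releases.

```python
def export_to_pte(model, example_inputs, output_path, partitioners=None):
    """Export a PyTorch module to an ExecuTorch program file (illustrative sketch)."""
    # Imports are deferred so this file can be loaded on machines
    # that do not have torch or ExecuTorch installed.
    import torch
    from executorch.exir import to_edge_transform_and_lower

    if partitioners is None:
        # Assumed default for this sketch: the XNNPACK CPU backend.
        from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
            XnnpackPartitioner,
        )
        partitioners = [XnnpackPartitioner()]

    # 1. Capture the model graph by tracing it with example inputs.
    exported = torch.export.export(model.eval(), example_inputs)

    # 2. Lower for the target hardware. The partitioner list is the
    #    one line to change when switching backends.
    program = to_edge_transform_and_lower(
        exported, partitioner=partitioners
    ).to_executorch()

    # 3. Write the serialized program to disk for deployment.
    with open(output_path, "wb") as f:
        f.write(program.buffer)
    return output_path
```

With torch and ExecuTorch installed, a call such as export_to_pte(torch.nn.Linear(4, 2), (torch.randn(1, 4),), "linear.pte") would be a small way to try it; passing a different partitioner list is how another backend would be selected under these assumptions.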
Once a model is exported, you can run it from C++ code, from Swift on iOS, or from Kotlin on Android. The README includes short code examples for all three languages. Large language models such as Llama can also be exported and run on-device using the same workflow, with dedicated runner APIs for text generation.
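Before wiring an exported model into a C++, Swift, or Kotlin app, it can be useful to sanity-check the file on the development machine. The sketch below assumes the Python bindings in executorch.runtime are available in the installed package; run_pte and its parameters are hypothetical names for this example, and "forward" is assumed to be the exported method name.

```python
def run_pte(pte_path, inputs, method_name="forward"):
    """Load an exported ExecuTorch program and run one method (illustrative sketch)."""
    # Deferred import so this file loads even without ExecuTorch installed.
    from executorch.runtime import Runtime

    runtime = Runtime.get()                    # process-wide runtime handle
    program = runtime.load_program(pte_path)   # read the exported file
    method = program.load_method(method_name)  # look up the entry point
    # Inputs are passed as a list, in the order used at export time.
    return method.execute(list(inputs))
```

A call such as run_pte("linear.pte", [torch.randn(1, 4)]) would return the model outputs, which can then be compared against the original PyTorch model's results for the same input.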
The project is open-source and installable via pip. Full documentation lives at the PyTorch documentation site, and a Discord community is available for questions and discussion.