CodeGeeX
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
CodeGeeX is a 13-billion-parameter AI model for code completion, generation, and translation across 20+ programming languages, available as a free VS Code and JetBrains plugin with no local GPU needed for the extension.
CodeGeeX is an AI model trained to write and translate code. It has 13 billion parameters and was trained on code written in more than 20 programming languages, including Python, C++, Java, JavaScript, and Go. The model can complete code, generate functions from descriptions, and convert a piece of code written in one language into another language.
The research behind the project was published as a paper at the KDD 2023 conference. The model was trained on Huawei's Ascend AI processors but can also run on NVIDIA GPUs. The model weights, a download of around 26 gigabytes, are available after submitting a request through the project's website.
For everyday use, CodeGeeX is available as a free extension for VS Code and for JetBrains IDEs such as IntelliJ IDEA and PyCharm. The extension provides code completion, code explanation, and code summarization directly inside the editor, so you can use the model from the tool you already write code in without running the model yourself.
The repository also includes a benchmark called HumanEval-X, a set of 820 hand-written coding problems: 164 in each of five programming languages (Python, C++, Java, JavaScript, and Go). Each problem includes tests and a reference solution. The benchmark was created to give researchers a consistent way to measure how well code generation models perform across different languages, not just Python. It is available as a dataset on Hugging Face.
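Because each problem ships with tests and a reference solution, a model's output can be scored by running it against those tests. The following is a minimal sketch of that idea, not the repository's evaluation harness: the record and its field names are hypothetical stand-ins for a benchmark entry, and `passes_tests` is an illustrative helper.

```python
# Hypothetical record shaped like a benchmark problem: a prompt, a reference
# solution, and tests. Field names are illustrative assumptions.
problem = {
    "task_id": "Example/0",
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "canonical_solution": "    return a + b\n",
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1, 2) == 3\n"
        "    assert candidate(-1, 1) == 0\n"
        "\n"
        "check(add)\n"
    ),
}


def passes_tests(problem, completion):
    """Return True if prompt + completion runs the problem's tests without error."""
    program = problem["prompt"] + completion + "\n" + problem["test"]
    namespace = {}
    try:
        # exec is acceptable for this trusted toy input; real model output
        # should be executed in a sandbox with a timeout.
        exec(program, namespace)
    except Exception:
        return False
    return True


# The reference solution should pass; a deliberately wrong completion should not.
reference_ok = passes_tests(problem, problem["canonical_solution"])
buggy_ok = passes_tests(problem, "    return a - b\n")
print(reference_ok, buggy_ok)
```

Checking that the reference solution passes its own tests is a useful sanity check before scoring any generated code.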
Quantization lowers the GPU memory needed to run the model: the repository describes the requirement as dropping from 27 GB to 15 GB of GPU RAM. The README also mentions a newer version of the model, CodeGeeX2, as a separate release with support for over 100 languages.
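The memory figures are roughly what a back-of-the-envelope calculation predicts. The sketch below assumes 16-bit weights for the unquantized model and 8-bit weights after quantization. Both widths are assumptions made for illustration, and the calculation covers only the weights, not activations or other runtime buffers.

```python
PARAMS = 13e9  # parameter count stated for the model


def weight_gigabytes(num_params, bytes_per_param):
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


fp16_gb = weight_gigabytes(PARAMS, 2)  # assumed 16-bit weights, 2 bytes each
int8_gb = weight_gigabytes(PARAMS, 1)  # assumed 8-bit quantized weights, 1 byte each
print(fp16_gb, int8_gb)  # 26.0 13.0
```

Under these assumptions the weights alone take about 26 GB, matching the download size, and about 13 GB after quantization. The reported 27 GB and 15 GB sit slightly above those values, which is consistent with some additional memory being needed at run time.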
Where it fits
- Install the free VS Code or JetBrains extension to get AI-powered code completion inside your existing editor without running the model locally.
- Translate a function written in Python into Go, Java, or another language using the model's built-in code translation feature.
- Run CodeGeeX locally on a GPU server after downloading the model weights to get private, self-hosted code generation.
- Use the HumanEval-X benchmark to evaluate and compare code generation models across Python, Java, Go, C++, and JavaScript.