gitmyhub

Chinese-LLaMA-Alpaca-2

Python ★ 7.1k updated 2mo ago

中文LLaMA-2 & Alpaca-2大模型二期项目 + 64K超长上下文模型 (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)

Chinese-language versions of Meta's LLaMA-2 and Alpaca-2 AI models, adapted with expanded vocabulary and Chinese training data, available as base and chat models from 1.3B to 13B parameters with long-context support.

PythonPyTorchtransformersllama.cppvLLMsetup: hardcomplexity 4/5

This repository provides Chinese-language versions of two AI language models: LLaMA-2 and Alpaca-2. LLaMA-2 is a foundational text model released by Meta; Alpaca-2 is a version of that model further trained to follow instructions and hold conversations, similar to how a chat assistant works. The models here have been adapted to work much better with Chinese text by expanding the vocabulary they understand and training them on large amounts of Chinese data.

There are two main categories of models: base models, which are good at continuing text given a prompt, and chat or instruction models, which are better at answering questions, writing, and back-and-forth conversation. Several sizes are available, from smaller 1.3 billion parameter models to larger 13 billion parameter ones. There are also extended-context versions that can read and generate much longer passages of text, with some supporting up to 64,000 tokens of context at once, which is roughly equivalent to a short novel.

The models can be run on a personal computer using techniques that compress them to use less memory. The repository includes scripts for pre-training and fine-tuning, so researchers and developers can train their own variants. It is compatible with a range of popular tools in the AI community, such as transformers, llama.cpp, and vLLM.

Some models in this project have also been trained with a technique called RLHF, which uses human feedback to make the model's responses more aligned with human values and preferences. The README is written primarily in Chinese and the project is aimed at Chinese-language AI research and application development. The full README is longer than what was shown.

Where it fits