Relax ★ PINNED
An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale
Python ★ 429 1d ago
PIPO ★ PINNED
Implementation of an efficient LLM architecture: the Pair-In / Pair-Out Model (PIPO)
Python ★ 31 9d ago
HiSVD ★ PINNED
[ACL 2026] HiSVD: Principled Low-Rank Approximation of LLMs via Hierarchical Modeling of Information Capacity and Spectral Structure
Python ★ 2 3mo ago
hint-tuning
Official code, data, and models for "Hint Tuning: Less Data Makes Better Reasoners"
Python ★ 22 16d ago
slime ⑂
slime is an LLM post-training framework for RL Scaling.
Python ★ 0 10d ago
torch_memory_saver ⑂
Allow torch tensor memory to be released and resumed later
Python ★ 0 1mo ago
sglang ⑂
SGLang is a high-performance serving framework for large language models and multimodal models.
Python ★ 0 1mo ago
Megatron-Bridge ⑂
Training library for Megatron-based models with bidirectional Hugging Face conversion capability
Python ★ 0 29d ago
.github
Xiaohongshu AI Platform Team
★ 0 2mo ago
TransferQueue ⑂
This is the official **live** mirror of https://gitcode.com/Ascend/TransferQueue. Feel free to contribute!
Python ★ 0 8d ago
Model-Optimizer ⑂
A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
★ 0 3mo ago
dynamo ⑂
A Datacenter Scale Distributed Inference Serving Framework
Rust ★ 0 3mo ago
vllm ⑂
A high-throughput and memory-efficient inference and serving engine for LLMs
Python ★ 0 1mo ago