Qwen Business Unit ORG

@Qwen-Applications

15 repos
30 followers
0 following

Python 100%

All public repos (15)

Trace2Skill

Official codebase of the paper -- Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

Python ★ 152 1mo ago
STAR

STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models

Python ★ 49 1mo ago
MARCH

No description.

Python ★ 28 10d ago
CollectionLoRA

Implementation of "CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation"

Research code for CollectionLoRA, a technique that merges 50 or more AI image-effect adapters into one without losing quality. A single combined adapter handles many visual styles and can blend two effects at once without extra training.

Python ★ 27 19d ago
CLIPO

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

Python ★ 21 2mo ago
OpenRS

Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric

Python ★ 18 3mo ago
SSP ⑂

Search Self-Play: Pushing the Frontier of Agent Capability without Supervision

Python ★ 18 5mo ago
DIR

No description.

Python ★ 17 4mo ago
Skill-RM

Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill

Python ★ 14 11d ago
GD2PO

No description.

Python ★ 9 3d ago
SiameseNorm ⑂

Code of the research paper "SiameseNorm: Breaking the Barrier to Reconciling Pre/Post-Norm"

Python ★ 9 28d ago
EVPV-PRM

No description.

Python ★ 7 3mo ago
Proxy-GRM

No description.

Python ★ 4 3mo ago
ATP-Bench

No description.

Python ★ 3 1mo ago
RiT

RiT: Rubrics-in-Thinking Reinforcement Learning for Improved Reasoning in Large Language Models

★ 0 2mo ago