
Thinking-Free_Policy_Initialization

Python | ★ 104 | updated 4mo ago

Official code for the ICLR 2026 paper "TFPI: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners".
