Thinking-Free_Policy_Initialization
Python
★ 104
updated 4mo ago
Official code for [ICLR 2026] TFPI: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners