VinePPO
Python
★ 0
updated 7mo ago
This repository contains an experimental implementation of **Fine-Grained Credit Assignment for RL Training (CAL)** built on top of Google's Tunix framework. The project explores token-level reward assignment as a way to improve training stability and sample efficiency when applying reinforcement learning to large language models.
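To make "token-level reward assignment" concrete, here is a minimal sketch that contrasts it with sequence-level credit. It is an illustration under stated assumptions, not code from this repository or from Tunix: the function names `sequence_level_credit` and `token_level_credit` are hypothetical, the per-position value estimates are assumed to be given (for example, from averaged rollouts), and no discounting or intermediate rewards are modeled.

```python
from typing import List


def sequence_level_credit(final_reward: float, num_tokens: int) -> List[float]:
    """Coarse baseline: every token receives the same scalar reward."""
    return [final_reward] * num_tokens


def token_level_credit(values: List[float], final_reward: float) -> List[float]:
    """Fine-grained credit (hypothetical helper, not the repository's API).

    values[t] is an assumed estimate of the expected final reward before
    token t is generated. Each token is credited with the change in that
    estimate: credit[t] = values[t + 1] - values[t], where the value after
    the last token is the observed final reward.
    """
    if not values:
        return []
    next_values = values[1:] + [final_reward]
    return [nxt - cur for cur, nxt in zip(values, next_values)]


# Toy example: a 4-token response that ends with reward 0.0.
values = [0.5, 0.5, 0.9, 0.2]
coarse = sequence_level_credit(0.0, len(values))
fine = token_level_credit(values, final_reward=0.0)
```

In this toy case the coarse scheme gives every token the same signal, while the fine-grained scheme singles out the third token (where the estimate drops from 0.9 to 0.2) as the main source of the failure. The per-token credits telescope, so their sum equals the final reward minus the initial value estimate.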