PaLM-rlhf-pytorch

Python ★ 7.9k updated 21d ago

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

No plain-English explanation yet — one is being written right now. Check back in a minute.