evaluation-guidebook
Jupyter Notebook
★ 2.1k
updated 6mo ago
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!