olmes

Python ★ 380 updated 2mo ago

Reproducible, flexible LLM evaluations
