llm-evaluation-system
Python
★ 17
updated 1d ago
Agentic AI-guided evaluation system for comparing LLMs with multi-judge jury scoring