
mle-bench

Python ★ 1.6k updated 1mo ago

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering.
