ITS-bench
Python
★ 0
updated 1y ago
⑂ fork
Benchmarking inference-time scaling strategies on MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering