gitmyhub

deep-swe


Measuring frontier coding agents on original, long-horizon engineering tasks
