trustdiff
Python
★ 1
updated 9d ago
Claude Code skill that reviews AI-generated diffs for silent failures that green tests can't catch: contract drift, scope creep, fake fixes, and test masking. Ships its own blind benchmark: bare Haiku scores 83% with 25% false positives (FP); Haiku+trustdiff scores 100% with 0% FP.