confidential_llms

Python ★ 2 updated 9mo ago

Repository for the paper "Evaluating Language Model Reasoning about Confidential Information" and the PasswordEval benchmark.
