RNGBench
Python
★ 38
updated 1d ago
The official implementation of "Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games"