gitmyhub

gemma4-vllm-0201-monkeypatch

Python ★ 0 updated 1mo ago

vLLM 0.20.1 monkeypatch that adds extract_activation_layers to extract Gemma 4 (E4B-it) hidden states at the last prompt token. Includes a verification harness that compares the result against a Hugging Face reference (cosine similarity >= 0.998 in both eager and fast-prefill modes).
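The acceptance criterion in the description is a cosine-similarity check between the hidden state produced by the patched vLLM path and the one produced by the reference model. The following is a minimal sketch of such a check, not the repository's actual harness: the function names (cosine_similarity, matches_reference) and the toy vectors are hypothetical, and only the 0.998 threshold comes from the description. Real hidden states would be much longer vectors taken at the last prompt token.

```python
import math
from typing import Sequence

# Threshold quoted in the repository description; everything else here is illustrative.
COSINE_THRESHOLD = 0.998


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def matches_reference(
    candidate: Sequence[float],
    reference: Sequence[float],
    threshold: float = COSINE_THRESHOLD,
) -> bool:
    """True if the candidate hidden state is close enough to the reference."""
    return cosine_similarity(candidate, reference) >= threshold


# Toy stand-ins for a last-prompt-token hidden state from each backend (hypothetical values).
reference = [0.12, -0.53, 0.98, 0.07]
candidate = [0.121, -0.529, 0.981, 0.069]
print(matches_reference(candidate, reference))
```

Cosine similarity ignores overall scale, so a check like this tolerates small numerical drift between backends while still flagging a hidden state taken from the wrong layer or token position.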
