gemma4-vllm-0201-monkeypatch
Python
★ 0
updated 1mo ago
vLLM 0.20.1 monkeypatch that adds extract_activation_layers for extracting Gemma 4 (E4B-it) hidden states at the last prompt token, with a verification harness against a Hugging Face (HF) reference (cosine similarity >= 0.998 in both eager and fast-prefill modes).
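The verification criterion in the description can be illustrated with a minimal sketch. It is not the repository's harness: the function names (`cosine_similarity`, `last_token_vector`, `matches_reference`) are hypothetical, hidden states are assumed to be plain nested lists of shape [seq_len][hidden_dim], and only the 0.998 threshold comes from the description.

```python
import math
from typing import Sequence

# Threshold quoted in the description; everything else here is illustrative.
COSINE_THRESHOLD = 0.998


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def last_token_vector(hidden_states: Sequence[Sequence[float]]) -> Sequence[float]:
    """Pick the hidden state of the last prompt token.

    Assumes hidden_states is laid out as [seq_len][hidden_dim].
    """
    return hidden_states[-1]


def matches_reference(
    candidate: Sequence[float],
    reference: Sequence[float],
    threshold: float = COSINE_THRESHOLD,
) -> bool:
    """True if the candidate vector agrees with the reference at the threshold."""
    return cosine_similarity(candidate, reference) >= threshold
```

In this sketch, a check would take the last-token vector from the patched engine's output and from the reference model's output for the same layer, then call `matches_reference` on the pair, once per execution mode.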