llama.cpp
C++
★ 4
updated 1mo ago
llama.cpp fork with a patch for RYS-duplicated Qwen3.5/Qwen3Next models (non-uniform full_attention pattern). See the rys-qwen35 branch.