StreamIndex
Python
★ 20
updated 1mo ago
Memory-bounded compressed sparse attention via streaming top-k. Triton kernels for the DeepSeek-V4 lightning indexer. 32x regime extension on a single H200 | by RightNow https://www.rightnowai.co/
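A minimal sketch of the general streaming top-k idea named in the description, in plain Python rather than Triton. It is not the repository's kernel code. The function name `streaming_topk` and the chunked-list input are assumptions made for illustration. It shows how scores can be consumed one chunk at a time while only k candidates are kept in memory, so memory use depends on k and not on the total sequence length.

```python
import heapq
from typing import Iterable, List, Tuple


def streaming_topk(score_chunks: Iterable[List[float]], k: int) -> List[Tuple[float, int]]:
    """Return the k highest (score, global_index) pairs, seeing one chunk at a time.

    Hypothetical helper for illustration only; not part of StreamIndex.
    """
    if k <= 0:
        return []
    heap: List[Tuple[float, int]] = []  # min-heap holding at most k candidates
    offset = 0  # global index of the first score in the current chunk
    for chunk in score_chunks:
        for i, score in enumerate(chunk):
            item = (score, offset + i)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                # New candidate beats the current smallest kept score: swap it in.
                heapq.heapreplace(heap, item)
        offset += len(chunk)
    # Highest score first.
    return sorted(heap, reverse=True)


chunks = [[0.1, 0.9, 0.3], [0.7, 0.2], [0.95]]
top2 = streaming_topk(chunks, k=2)
```

Here `top2` holds the two best scores with their global positions across all chunks. A GPU kernel would typically do the per-chunk selection in parallel and merge partial results, which this sequential sketch does not attempt.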