llm-vram-calculator

TypeScript ★ 0 updated 1mo ago

Estimate whether a given language model will fit in your GPU's VRAM for inference, LoRA, or QLoRA fine-tuning. Includes per-architecture KV cache values for current open-weights models.
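
As a rough illustration of the kind of estimate described above, here is a minimal sketch of an inference-only fit check. It is not taken from the repository: the `ModelConfig` type, the function names, the 10% overhead factor, and the example model values are all assumptions made for illustration. The KV cache term uses the common approximation of 2 (keys and values) x layers x KV heads x head dimension x sequence length x batch size x bytes per element.

```typescript
// Hypothetical types and names for illustration; not the repository's API.
interface ModelConfig {
  paramCount: number; // total parameter count
  numLayers: number;  // transformer layers
  numKvHeads: number; // key/value heads (fewer than query heads under GQA)
  headDim: number;    // dimension of each attention head
}

const GIB = 1024 ** 3;

// KV cache size: one key and one value tensor per layer, per token, per sequence.
function kvCacheBytes(
  m: ModelConfig,
  seqLen: number,
  batchSize: number,
  bytesPerElement: number,
): number {
  return 2 * m.numLayers * m.numKvHeads * m.headDim * seqLen * batchSize * bytesPerElement;
}

// Weight memory: parameter count times storage width (2 for fp16, about 0.5 for 4-bit).
function weightBytes(m: ModelConfig, bytesPerParam: number): number {
  return m.paramCount * bytesPerParam;
}

// Inference-only fit check. overheadFraction is an assumed allowance for
// activations, framework buffers, and fragmentation; tune it for your setup.
function fitsInVram(
  m: ModelConfig,
  vramGiB: number,
  seqLen: number,
  batchSize: number,
  bytesPerParam: number,
  bytesPerKvElement: number,
  overheadFraction = 0.1,
): boolean {
  const needed =
    (weightBytes(m, bytesPerParam) + kvCacheBytes(m, seqLen, batchSize, bytesPerKvElement)) *
    (1 + overheadFraction);
  return needed <= vramGiB * GIB;
}

// Illustrative values for a generic 8B-parameter model with grouped-query attention.
const exampleModel: ModelConfig = {
  paramCount: 8e9,
  numLayers: 32,
  numKvHeads: 8,
  headDim: 128,
};

const kvGiB = kvCacheBytes(exampleModel, 8192, 1, 2) / GIB;
console.log(`KV cache at 8192 tokens, fp16: ${kvGiB} GiB`);
console.log(`Fits in 24 GiB: ${fitsInVram(exampleModel, 24, 8192, 1, 2, 2)}`);
console.log(`Fits in 16 GiB: ${fitsInVram(exampleModel, 16, 8192, 1, 2, 2)}`);
```

LoRA and QLoRA fine-tuning would add gradient, optimizer-state, and activation terms on top of this, which the sketch deliberately leaves out.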
