xllm-service
C++
★ 93
updated 4d ago
A flexible serving framework that delivers efficient and fault-tolerant LLM inference for clustered deployments.