xllm-service

C++ ★ 93 updated 4d ago

A flexible serving framework that delivers efficient and fault-tolerant LLM inference for clustered deployments.
