llm-inference-fast-benchmark

Python ★ 3 updated 1y ago

This repository benchmarks large language model (LLM) inference performance using an 8B role-play model (Sao10K/L3-8B-Lunaris-v1), with an average input of 4k tokens and an output of 250 tokens.
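As a rough illustration of how a workload of this shape might be summarized, here is a minimal sketch. It is not taken from the repository: `RequestResult`, `summarize`, and the sample numbers are hypothetical, and "4k" is treated as 4000 tokens purely for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestResult:
    """One completed request (hypothetical record, not the repo's format)."""
    input_tokens: int
    output_tokens: int
    latency_s: float


def summarize(results, wall_time_s):
    """Aggregate per-request results into simple throughput figures."""
    total_in = sum(r.input_tokens for r in results)
    total_out = sum(r.output_tokens for r in results)
    return {
        "requests": len(results),
        "mean_latency_s": mean(r.latency_s for r in results),
        # Generated tokens per second of wall-clock time.
        "output_tokens_per_s": total_out / wall_time_s,
        # Prompt plus generated tokens per second of wall-clock time.
        "total_tokens_per_s": (total_in + total_out) / wall_time_s,
    }


# Made-up sample data shaped like the described workload
# (about 4k input tokens, 250 output tokens); not measured results.
results = [RequestResult(4000, 250, 5.0), RequestResult(4000, 250, 5.0)]
summary = summarize(results, wall_time_s=10.0)
print(summary)
```

Reporting output tokens per second separately from total tokens per second matters for a prompt-heavy workload like this one, where input tokens outnumber output tokens by roughly 16 to 1.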

