llm-inference-fast-benchmark
Python
★ 3
updated 1y ago
This repository benchmarks large language model (LLM) inference performance on an 8B role-play model (Sao10K/L3-8B-Lunaris-v1), with an average input of 4k tokens and an output of 250 tokens.