The standard metric for LLM inference speed. Typically reported as generation (decode) speed: how fast new tokens are produced. Prompt processing (prefill) speed is usually much higher, since prompt tokens are processed in parallel, and is measured separately.
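A minimal sketch of how the metric is computed; the function name and the example numbers are illustrative, not from any particular benchmark:

```python
def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Tokens processed divided by wall-clock time, in tok/s."""
    return num_tokens / elapsed_seconds

# Hypothetical decode phase: 128 new tokens generated in 4.0 s
decode_tps = tokens_per_second(128, 4.0)      # 32.0 tok/s

# Hypothetical prefill phase: a 2048-token prompt processed in 0.5 s;
# prefill is measured separately and is typically much faster,
# because prompt tokens can be processed in parallel.
prefill_tps = tokens_per_second(2048, 0.5)    # 4096.0 tok/s
```

Reporting the two numbers separately matters: averaging them together would make a model with a long prompt look far faster than its actual generation speed.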