Q8_0


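8-bit quantization. In llama.cpp's GGUF format, Q8_0 stores weights in blocks of 32 signed 8-bit integers that share one FP16 scale, which works out to about 8.5 bits per weight. Quality is near-identical to full FP16 precision in practice, at roughly half the memory footprint of FP16. Preferred when you have the VRAM headroom; particularly attractive for 32B and smaller models on 96GB systems.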
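The "roughly half of FP16" figure can be checked with a little arithmetic. The sketch below is illustrative Python, not the llama.cpp implementation: it assumes the Q8_0 layout of 32 int8 values plus one FP16 scale per block, and the helper names (`quantize_block`, `dequantize_block`, `weight_bytes`) are hypothetical. It keeps the scale as a Python float instead of rounding it to FP16, and it treats "32B" as exactly 32e9 parameters.

```python
BLOCK_SIZE = 32                  # weights per Q8_0 block
BYTES_PER_BLOCK = 2 + BLOCK_SIZE # one FP16 scale (2 bytes) + 32 int8 values

# Effective storage cost: 34 bytes * 8 bits / 32 weights = 8.5 bits per weight.
Q8_0_BITS = BYTES_PER_BLOCK * 8 / BLOCK_SIZE


def quantize_block(values):
    """Quantize one block of floats to (scale, list of ints in [-127, 127])."""
    amax = max(abs(v) for v in values)
    if amax == 0:
        return 0.0, [0] * len(values)
    scale = amax / 127  # simplification: a real block would store this as FP16
    qs = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, qs


def dequantize_block(scale, qs):
    """Reconstruct approximate floats from a quantized block."""
    return [scale * q for q in qs]


def weight_bytes(n_params, bits_per_weight):
    """Bytes needed for the weights alone (no KV cache, no activations)."""
    return n_params * bits_per_weight / 8


if __name__ == "__main__":
    n_params = 32e9  # assumption: "32B" taken as exactly 32e9 parameters
    print(f"Q8_0: {weight_bytes(n_params, Q8_0_BITS) / 1e9:.1f} GB")
    print(f"FP16: {weight_bytes(n_params, 16) / 1e9:.1f} GB")
```

Under these assumptions a 32B model needs about 34 GB for Q8_0 weights versus 64 GB for FP16, which leaves room on a 96GB system for the KV cache and other runtime buffers. The per-block rounding error is at most half of the block's scale, which is consistent with the near-FP16 quality described above.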

llama-quant.cpp

Q6_K

Quantization

Codex

Matt Oswalt
