Title here
Summary here
8-bit quantization. Near-identical quality to full FP16 precision in practice, roughly half the memory footprint of FP16. Preferred when you have the VRAM headroom — particularly attractive for 32B and smaller models on 96GB systems.