Q6_K

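6-bit K-quant, storing about 6.56 bits per weight once block scales are included. Higher quality than 4-bit quants such as Q4_0 or Q4_K, at the cost of roughly 45% more memory for the weights. Generally considered near-lossless for most tasks. The recommended quantization for 70B models on hardware with 96 GB of GPU memory: the weights take roughly 57 GB, which leaves headroom for the KV cache and activations.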
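
The memory figures above can be checked with a little arithmetic. The sketch below is a rough estimate only: the bits-per-weight values are assumptions taken from the 256-weight super-block layouts used by llama.cpp (210 bytes per block for Q6_K, 144 bytes for Q4_K), and the helper name `weight_memory_gb` is hypothetical. It counts weights only, so KV cache, activations, and the mixed tensor types found in real model files are ignored.

```python
# Approximate bits per weight, assumed from llama.cpp's 256-weight super-blocks:
#   Q6_K: 210 bytes / 256 weights = 6.5625 bits
#   Q4_K: 144 bytes / 256 weights = 4.5 bits
BITS_PER_WEIGHT = {"Q4_K": 4.5, "Q6_K": 6.5625}


def weight_memory_gb(n_params: float, quant: str) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    n_params = 70e9  # nominal parameter count of a 70B model
    q4 = weight_memory_gb(n_params, "Q4_K")
    q6 = weight_memory_gb(n_params, "Q6_K")
    print(f"Q4_K: {q4:.1f} GB")
    print(f"Q6_K: {q6:.1f} GB")
    print(f"Q6_K uses {100 * (q6 / q4 - 1):.0f}% more memory than Q4_K")
```

Actual file sizes will differ somewhat, since the true parameter count of a "70B" model is not exactly 70 billion and some tensors are usually stored at a different precision.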