Matt Oswalt
Codex
  • Linux
    • File Descriptors
    • Networking
      • eBPF
      • Sockets
  • LLM
    • Resources
    • Inference Stack
    • Apps & Libraries
    • Model Evaluation
    • Memory
    • Glossary
  • Machine Learning
    • Deep Learning
    • Machine Learning
    • Glossary
  • Math
    • Glossary
  • Rust
    • Common Traits
    • Ownership
  • Video
    • GoPro
  • Cheat Sheets



This Glossary

  • BPW (bits per weight)
  • GGUF
  • IQ (importance-matrix quantization)
  • Q4_K_M
  • Q6_K
  • Q8_0
  • Quantization

GGUF


The file format used by llama.cpp to store quantized model weights, tokenizer data, and metadata in a single file. It is the de facto standard for local LLM inference, and it replaced the older GGML format.
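To illustrate the single-file layout, here is a minimal sketch (not llama.cpp's own reader) that parses just the fixed-size GGUF header, assuming the v2+ layout described in the GGUF specification: a 4-byte `GGUF` magic, a little-endian uint32 version, then uint64 tensor and metadata key/value counts. Everything after that (the metadata key/value pairs, tensor infos, and tensor data) follows in the same file.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (little-endian, v2+ layout):
    magic, version, tensor count, metadata key/value count."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={data[:4]!r})")
    (version,) = struct.unpack_from("<I", data, 4)
    tensor_count, metadata_kv_count = struct.unpack_from("<QQ", data, 8)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


# Synthetic header for demonstration: version 3, 2 tensors, 5 metadata pairs.
header = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
print(read_gguf_header(header))
# → {'version': 3, 'tensor_count': 2, 'metadata_kv_count': 5}
```

In practice you would read the first 24 bytes of a real `.gguf` file rather than building a synthetic header; the counts then tell a full parser how many metadata entries and tensor-info records to read next.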

  • GGUF Format Documentation

Referenced in

  • BPW (bits per weight)
  • Inference Stack
  • Model Evaluation
© 2010 - 2026 Matt Oswalt · Powered by Hugo & Hyas.