SPOTLIGHT: Developer Runs Qwen 3.5-9B on MacBook Air M4 via TurboQuant-Patched llama.cpp (7/10, 4 min read, 2 months ago)
TOOL UPDATES: TurboQuant Optimization Achieves 22.8 Percent Decode Speedup in llama.cpp by Skipping Redundant KV Dequantization (8/10, 3 min read, 2 months ago)