llama.cpp

Articles tagged with llama.cpp

2 articles

Developer Runs Qwen 3.5-9B on MacBook Air M4 via TurboQuant-Patched llama.cpp

7/10 4 min read 2 months ago

TurboQuant Optimization Achieves 22.8 Percent Decode Speedup in llama.cpp by Skipping Redundant KV Dequantization

8/10 3 min read 2 months ago
