TurboQuant Optimization Achieves 22.8 Percent Decode Speedup in llama.cpp by Skipping Redundant KV Dequantization