SWE-Rebench February Update: GPT-5.4 and Qwen3.5 Lead on Decontaminated Coding Tasks
Benchmarks | 8/10 | 4 min read | 2 months ago