A Reddit user has highlighted strong results for Nvidia’s Nemotron Cascade 2 30B-A3B model, which reportedly scored 97.6% on the HumanEval coding benchmark and 88% on ClassEval. The results were posted on the LocalLLaMA subreddit by user ilintar, who tested mradermacher’s IQ4_XS quantization of the model.
According to the post, the Nemotron Cascade 2 30B-A3B “is *not* based on the Qwen architecture despite a similar size, it’s a properly hybrid model based on Nemotron’s own arch.” The user noted that despite discussions around Nvidia’s Nemotron Super family of models, this particular model “has largely flown under the radar.”
The evaluation used the HumanEval and ClassEval benchmarks, which the tester described as “quick to run and complicated enough for most small models to still have noticeable differences.” On HumanEval, the model’s 97.6% score reportedly left “both medium Qwen3.5 models in the rear window,” though the post did not give specific scores for those models.
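For readers who want to relate the percentages to problem counts, the arithmetic can be sketched as follows. This is an illustration, not the tester's method: it assumes each score is a plain pass rate over the full published benchmark (164 problems for HumanEval, 100 class-level tasks for ClassEval) with one attempt per problem, none of which the post states. The helper name `pass_rate` is hypothetical.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the percentage of benchmark problems that passed their tests."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return 100.0 * passed / total


# Sizes of the published benchmarks. Assumption: the tester ran the full sets.
HUMANEVAL_PROBLEMS = 164
CLASSEVAL_TASKS = 100

# Back out the nearest whole number of solved problems from the reported
# percentages. These counts are inferred here, not reported in the post.
humaneval_passed = round(0.976 * HUMANEVAL_PROBLEMS)
classeval_passed = round(0.88 * CLASSEVAL_TASKS)

print(humaneval_passed, round(pass_rate(humaneval_passed, HUMANEVAL_PROBLEMS), 1))
print(classeval_passed, round(pass_rate(classeval_passed, CLASSEVAL_TASKS), 1))
```

Under those assumptions, the HumanEval figure is consistent with all but a handful of problems being solved, which helps explain why the tester pairs it with the harder ClassEval to separate models.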
The poster said the tests grew out of frustration with subjective evaluation: “I’ve been running some evals on local models lately since I’m kind of tired of the ‘vibe feels’ method of judging them.”
The poster plans further testing, writing: “I’m going to run some more tests on this model, but I feel it deserves a bit more attention.” No timeline was given for when additional benchmark results might be available.
