MiniMax M2.7 vs Qwen3.5-122B on 96GB VRAM: Community Benchmarks Show M2.7 Leads on Instruction Following
r/LocalLLaMA community benchmarks comparing MiniMax M2.7 and Qwen3.5 122B A10B on 96GB VRAM rigs show M2.7 leading on instruction following and complex reasoni…
Published on MyPrivateClaw
Apr 13, 2026, 8:37 AM UTC
Coverage date
Apr 13, 2026
Last updated
Apr 13, 2026, 8:37 AM UTC
News summary
Community benchmarks published on r/LocalLLaMA compare MiniMax M2.7 and Qwen3.5 122B A10B running at full offload on 96GB VRAM systems. The results favour M2.7 for instruction following and multi-step reasoning, while Qwen3.5 retains advantages in raw throughput at lower quantisation levels.
What Happened
A detailed benchmark post on r/LocalLLaMA tested both models on a 96GB VRAM rig (dual RTX PRO 6000 Blackwell, 48GB each) using NVFP4 quantisation for M2.7 and Q4_K_M for Qwen3.5 122B A10B. The comparison covers MMLU (200-question subset), instruction following, and agentic task completion.
Key results from the community benchmark:
MiniMax M2.7 (NVFP4, 96GB full offload): MMLU 88–89%, strong on multi-step instruction following and tool-use chains
Qwen3.5 122B A10B (Q4_K_M, 96GB full offload): MMLU 85–87%, faster token generation at lower quantisation
Token throughput: Qwen3.5 generates…
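The MMLU figures above come from a 200-question subset, so it can help to see how much sampling noise a subset of that size carries. The sketch below is not part of the original benchmark. It assumes the midpoints of the reported ranges (88.5% and 86%) as point estimates, treats each question as an independent trial, and uses the simple normal-approximation interval. The function name `accuracy_interval` is hypothetical.

```python
import math


def accuracy_interval(accuracy: float, n_questions: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) interval for an accuracy measured on n_questions.

    z = 1.96 corresponds to roughly 95% coverage.
    """
    standard_error = math.sqrt(accuracy * (1.0 - accuracy) / n_questions)
    return accuracy - z * standard_error, accuracy + z * standard_error


# Assumption: midpoints of the reported MMLU ranges, used as point estimates.
scores = {"MiniMax M2.7": 0.885, "Qwen3.5 122B A10B": 0.86}
N_QUESTIONS = 200  # subset size stated in the benchmark post

for name, acc in scores.items():
    low, high = accuracy_interval(acc, N_QUESTIONS)
    print(f"{name}: {acc:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```

Under these assumptions each interval spans several percentage points on either side of the score, so the MMLU gap on its own is best read as indicative. The instruction-following and agentic results are separate measurements and are not covered by this calculation.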