AI Coding Capability Leaderboard

Data source: official SWE-bench Bash-Only leaderboard (mini-SWE-agent v2.0.0, 500 instances, single attempt), retrieved February 2026. LMSYS Chatbot Arena and other leaderboards were temporarily unavailable due to network restrictions, so they are not reflected here.

Rank | Model                    | Vendor          | SWE-bench | Type
🥇   | Claude 4.5 Opus          | Anthropic       | 76.8%     | Closed
🥈   | Gemini 3 Flash           | Google DeepMind | 75.8%     | Closed
🥉   | MiniMax M2.5             | MiniMax         | 75.8%     | Closed
4    | Claude Opus 4.6          | Anthropic       | 75.6%     | Closed
5    | Claude 4.5 Opus (medium) | Anthropic       | 74.4%     | Closed
6    | Gemini 3 Pro Preview     | Google DeepMind | 74.2%     | Closed
7    | GLM-5                    | Z-AI            | 72.8%     | Closed
8    | GPT-5.2                  | OpenAI          | 72.8%     | Closed
9    | Claude 4.5 Sonnet        | Anthropic       | 71.4%     | Closed
10   | Kimi K2.5                | Moonshot AI     | 70.8%     | Closed
11   | DeepSeek V3.2            | DeepSeek        | 70.0%     | Open
12   | Gemini 3 Pro             | Google DeepMind | 69.6%     | Closed
13   | Claude 4 Opus            | Anthropic       | 67.6%     | Closed
14   | Claude 4.5 Haiku         | Anthropic       | 66.6%     | Closed
15   | GPT-5.1                  | OpenAI          | 66.0%     | Closed
16   | GPT-5                    | OpenAI          | 65.0%     | Closed
17   | Claude 4 Sonnet          | Anthropic       | 64.9%     | Closed
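
To put the percentages in perspective: with the 500-instance set named in the data-source note, each resolved instance is worth 0.2 percentage points, so several adjacent ranks differ by only one or two tasks. The sketch below converts a few of the listed resolve rates into approximate instance counts. The helper name `resolved_count` is hypothetical and is not part of SWE-bench or mini-SWE-agent tooling. Not every listed rate maps to a whole number (64.9% of 500 is 324.5), so the counts should be read as approximate.

```python
# Illustrative sketch only; resolved_count is a hypothetical helper,
# not an API of SWE-bench or mini-SWE-agent.

TOTAL_INSTANCES = 500  # instance count stated in the data-source note


def resolved_count(resolve_rate_percent: float, total: int = TOTAL_INSTANCES) -> float:
    """Approximate number of resolved instances implied by a resolve rate."""
    return resolve_rate_percent / 100 * total


# A few rows copied from the table above (model -> SWE-bench resolve rate in %).
scores = {
    "Claude 4.5 Opus": 76.8,
    "Gemini 3 Flash": 75.8,
    "Claude Opus 4.6": 75.6,
    "DeepSeek V3.2": 70.0,
}

for model, rate in scores.items():
    count = round(resolved_count(rate))
    print(f"{model}: about {count} of {TOTAL_INSTANCES} instances resolved")

# Gap between two neighbouring entries, expressed in instances.
gap = resolved_count(scores["Gemini 3 Flash"]) - resolved_count(scores["Claude Opus 4.6"])
print(f"Gemini 3 Flash vs Claude Opus 4.6: about {round(gap)} instance(s) apart")
```

Read this way, the 0.2-point difference between ranks 2 and 4 corresponds to roughly a single task on a single attempt.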