Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Jun 10, 2026
252,314 votes
29 models
Rank  Rank Spread  Organization · License     Score     Votes    Price (input / output)  Context
1     1-5          Anthropic · Proprietary    1507±6    32,122   $5 / $25                1M
2     1-5          Anthropic · Proprietary    1507±7    20,246   $5 / $25                1M
3     1-7          Anthropic · Proprietary    1498±7    13,878   $5 / $25                1M
4     1-8          Anthropic · Proprietary    1496±7    14,110   $5 / $25                1M
5     1-11         Anthropic · Proprietary    1495±15   1,461    $10 / $50               1M
6     3-10         Anthropic · Proprietary    1487±6    49,424   $3 / $15                1M
7     3-11         OpenAI · Proprietary       1485±7    11,789   $5 / $30                1.1M
8     4-11         OpenAI · Proprietary       1483±7    12,050   $5 / $30                1.1M
9     6-12         OpenAI · Proprietary       1474±7    24,400   $2.50 / $15             1.1M
10    5-12         Anthropic · Proprietary    1473±11   3,431    $5 / $25                1M
11    5-12         Anthropic · Proprietary    1472±11   3,223    $5 / $25                1M
12    9-15         Anthropic · Proprietary    1461±10   7,987    $5 / $25                200K
13    12-17        Moonshot · Modified MIT    1451±8    8,574    $0.95 / $4              262.1K
14    12-17        Anthropic · Proprietary    1449±7    24,162   $3 / $15                200K
15    12-23        Meta · Proprietary         1442±18   1,081    N/A                     N/A
16    13-19        Google · Proprietary       1441±6    38,090   $2 / $12                1M
17    13-21        MiniMax · Proprietary      1438±10   3,636    $0.60 / $2.40           N/A
18    15-23        Google · Proprietary       1433±9    10,748   $2 / $12                1M
19    15-23        Moonshot · Modified MIT    1429±7    16,540   $0.60 / $3              N/A
20    16-25        Google · Apache 2.0        1424±9    7,970    N/A                     N/A
21    17-26        Google · Proprietary       1420±6    24,980   $1.25 / $10             1M
22    17-26        Anthropic · Proprietary    1418±6    26,371   $1 / $5                 200K
23    16-29        Z.ai · Proprietary         1413±15   1,389    $1.20 / $4              202.8K
24    20-29        Google · Proprietary       1413±9    7,188    $0.50 / $3              1M
25    20-29        (not shown)                1410±7    14,105   $2 / $6                 2M
26    21-29        OpenAI · Proprietary       1405±9    7,096    $1.75 / $14             400K
27    23-29        OpenAI · Proprietary       1402±8    8,539    $5 / $30                1.1M
28    23-29        OpenAI · Proprietary       1401±9    8,253    $1.25 / $10             400K
29    23-29        OpenAI · Proprietary       1401±6    28,188   $1.75 / $14             400K
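
One way to read the ± values is to check whether two models' score intervals overlap. The sketch below is illustrative only: it assumes each ± value is a symmetric half-width around the score, which this page does not state, and it does not reproduce how the Rank Spread column is computed. The scores are copied from ranks 1, 2, 6, and 29; the helper names `interval` and `intervals_overlap` are hypothetical.

```python
# Scores and ± values copied from ranks 1, 2, 6, and 29 of the leaderboard.
# Assumption: each ± value is a symmetric half-width around the score.
rows = {
    1: (1507, 6),
    2: (1507, 7),
    6: (1487, 6),
    29: (1401, 6),
}


def interval(score, half_width):
    """Return (low, high) for a score with a symmetric half-width."""
    return score - half_width, score + half_width


def intervals_overlap(a, b):
    """True when two (score, half_width) intervals share at least one point."""
    a_low, a_high = interval(*a)
    b_low, b_high = interval(*b)
    return a_low <= b_high and b_low <= a_high


# Compare the top-ranked entry against three others.
for other in (2, 6, 29):
    overlap = intervals_overlap(rows[1], rows[other])
    print(f"rank 1 vs rank {other}: overlap = {overlap}")
```

Under that assumption, overlapping intervals suggest the ordering between two entries is not firmly established, which is consistent with several entries sharing the same rank spread.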

Remove Style Control Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
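
The plot titles above mention bootstrapped confidence intervals and win rates computed without ties. As a rough illustration of the general idea, here is a percentile bootstrap for a single head-to-head win rate. This is a sketch under stated assumptions, not the arena's procedure, which is not described on this page. The function name `bootstrap_win_rate_ci`, the 95% level, and the battle outcomes are all hypothetical.

```python
import random


def bootstrap_win_rate_ci(outcomes, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap interval for the mean of 0/1 battle outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(outcomes)
    rates = []
    for _ in range(n_resamples):
        # Resample n battles with replacement and record the win rate.
        sample = [outcomes[rng.randrange(n)] for _ in range(n)]
        rates.append(sum(sample) / n)
    rates.sort()
    # Take the central `level` fraction of the resampled win rates.
    low_index = int((1 - level) / 2 * n_resamples)
    high_index = int((1 + level) / 2 * n_resamples) - 1
    return rates[low_index], rates[high_index]


# Hypothetical data: 1 = model A won, 0 = model B won (ties already removed).
outcomes = [1] * 60 + [0] * 40
low, high = bootstrap_win_rate_ci(outcomes)
print(f"observed win rate: {sum(outcomes) / len(outcomes):.2f}")
print(f"95% bootstrap interval: [{low:.2f}, {high:.2f}]")
```

With fewer battles the resampled win rates vary more, so the interval widens; this matches the pattern in the table, where entries with the fewest votes carry the largest ± values.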