Models Index Q2 · 2026

Foundation models on accounting tasks.

Tools are products. Models are the capability they sit on top of. This index benchmarks 11 foundation LLMs on eight accounting-task categories. Scores are anchored to DualEntry's public 101-task eval where published, and synthesized from adjacent public signals otherwise.

Models tracked: 11
Measured: 9
Synthesized: 2
Task categories: 8

01 Models Leaderboard

Models are ranked by the composite Models Index: 70% accounting-task mean + 15% cost efficiency + 10% context + 5% speed. Accounting % is the DualEntry overall score where published; otherwise it is the mean of our eight sub-category scores.
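The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the index's actual implementation: the function names are hypothetical, and it assumes each component (accounting, cost efficiency, context, speed) has already been normalized to a 0-100 scale. The source does not say how the cost, context, and speed sub-scores are derived, so the leaderboard's Index values cannot be reproduced from the table alone.

```python
from statistics import mean
from typing import Optional, Sequence

# Weights as stated in the leaderboard description (they sum to 1.0).
WEIGHTS = {"accounting": 0.70, "cost": 0.15, "context": 0.10, "speed": 0.05}


def accounting_score(dualentry_overall: Optional[float],
                     subcategory_scores: Sequence[float]) -> float:
    """Return the published DualEntry overall score if there is one,
    otherwise the mean of the sub-category scores."""
    if dualentry_overall is not None:
        return dualentry_overall
    if not subcategory_scores:
        raise ValueError("need a DualEntry score or sub-category scores")
    return mean(subcategory_scores)


def models_index(accounting: float, cost: float,
                 context: float, speed: float) -> float:
    """Weighted sum of four component scores.

    Assumption: every component is on a 0-100 scale; the normalization
    behind cost, context, and speed is not given in the source.
    """
    components = {"accounting": accounting, "cost": cost,
                  "context": context, "speed": speed}
    return sum(WEIGHTS[name] * value for name, value in components.items())


if __name__ == "__main__":
    # Hypothetical component values, chosen only to show the arithmetic.
    acc = accounting_score(None, [60, 70, 80, 50, 60, 70, 80, 50])
    print(round(models_index(acc, cost=60, context=100, speed=40), 1))
```

Because accounting carries 70% of the weight, a cheap model can outrank a more accurate one only when its cost, context, and speed components make up the difference.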

| # | Ticker | Model | Provider | Accounting | Δ Q/Q | Cost | Ctx | Index | Description |
|---|--------|-------|----------|------------|-------|------|-----|-------|-------------|
| 01 | GPT54N | GPT-5.4-Nano | OpenAI | 75.2% | +6.9 | $0.10/$0.40 | 1M | 81.8 | OpenAI's fastest, cheapest GPT-5.4 variant: 75.2% on DualEntry, best speed-per-dollar. |
| 02 | GPT54M | GPT-5.4-Mini | OpenAI | 74.3% | +5.3 | $0.50/$2.00 | 1M | 79.3 | The price-performance workhorse: 74.3% on DualEntry at a fraction of flagship cost. |
| 03 | CL47 | Claude Opus 4.7 | Anthropic | 79.2% | +1.4 | $5.00/$25 | 1M | 78.2 | Anthropic's flagship reasoning model … |
| 04 | GPT54 | GPT-5.4 | OpenAI | 77.3% | +0.7 | $2.50/$15 | 1M | 78.1 | OpenAI's flagship: 77.3% on DualEntry, held … |
| 05 | MMX27 | MiniMax M2.7 | MiniMax | 71.3% | +5.2 | $0.80/$2.20 | 1M | 75.2 | MiniMax's frontier: 71.3% on DualEntry, competitive mid-tier pricing. |
| 06 | DSV4 | DeepSeek V4 (Synth) | DeepSeek | 68.8% | +2.7 | $0.30/$0.90 | 1M | 74.7 | 1T MoE open-weights model: synthesized ~70% accounting capability at roughly 1/50th of GPT-5.4's cost. |
| 07 | GM31P | Gemini 3.1 Pro | Google | 66.0% | +7.3 | $2.00/$12 | 1M | 71.8 | Google's flagship for agentic deployment: 66% on DualEntry, strong long-context story. |
| 08 | GRK41 | Grok 4.1 Fast (Synth) | xAI | 57.5% | +12.7 | $0.20/$0.50 | 2M | 70.2 | xAI's value tier: synthesized ~58% on DualEntry, massive 2M context at rock-bottom pricing. |
| 09 | CL46S | Claude Sonnet 4.6 | Anthropic | 63.4% | +5.8 | $3.00/$15 | 1M | 69.6 | Anthropic's workhorse: 63.4% on DualEntry, strong balance of capability and cost. |
| 10 | GLM5 | Z.ai GLM-5 | Z.ai | 72.3% | -2.2 | $0.60/$1.80 | 200K | 68.6 | Strong Chinese-origin model: 72.3% on DualEntry, aggressive price point. |
| 11 | CL45H | Claude Haiku 4.5 | Anthropic | 61.4% | +4.3 | $0.25/$1.25 | 200K | 64.5 | Anthropic's fastest and cheapest: 61.4% on DualEntry, strong for high-volume classification. |

02 Sources