Claude Sonnet 4.6
Anthropic's workhorse — 63.4% on DualEntry, strong balance of capability and cost.
Accounting overall
63.4%
Input / Output
$3.00 / $15.00 per MTok
Context
1M
Speed
~120 tok/s
Released
2025-12
Cutoff
2025-08
The eight accounting-task categories are borrowed from DualEntry's 101-task benchmark. Scores are measured where published and synthesized from adjacent benchmarks otherwise.
Sonnet 4.6 is the previous-generation mid-tier Anthropic offering: reliable, well understood by developers, and present in many production agentic accounting tools at the time this issue publishes. At 63.4% on DualEntry it trails GPT-5.4-Mini (74.3%) by 10.9 percentage points and GPT-5.4-Nano (75.2%) by 11.8 points on accounting work, despite similar per-token pricing.
For tools built on Sonnet 4.6 today, the obvious upgrade path is either Opus 4.7 (for capability-first workflows) or GPT-5.4-Mini (for cost-optimized workflows). The question for tool builders: is your current Sonnet 4.6 integration producing the accuracy your customers expect? The DualEntry numbers suggest the ceiling is lower than it feels.
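For builders weighing that question, the arithmetic behind the comparison can be sketched in a few lines. This is a minimal illustration, not a definitive costing tool: the scores and the $3.00 / $15.00 per MTok prices are the ones quoted in this profile, while the function names and the monthly token volumes are hypothetical and chosen only for the example. Prices for the other models are not given here, so the cost estimate covers Sonnet 4.6 only.

```python
# Scores quoted in this profile (DualEntry accounting benchmark, percent).
SCORES = {
    "Claude Sonnet 4.6": 63.4,
    "GPT-5.4-Mini": 74.3,
    "GPT-5.4-Nano": 75.2,
}

# Sonnet 4.6 list prices from the card above, in USD per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def gap_in_points(baseline: str, other: str) -> float:
    """Percentage-point gap between two models' DualEntry scores."""
    return round(SCORES[other] - SCORES[baseline], 1)


def monthly_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Token cost at Sonnet 4.6 list prices; ignores caching or batch discounts."""
    input_cost = (input_tokens / 1_000_000) * INPUT_USD_PER_MTOK
    output_cost = (output_tokens / 1_000_000) * OUTPUT_USD_PER_MTOK
    return round(input_cost + output_cost, 2)


if __name__ == "__main__":
    for rival in ("GPT-5.4-Mini", "GPT-5.4-Nano"):
        gap = gap_in_points("Claude Sonnet 4.6", rival)
        print(f"{rival} leads Sonnet 4.6 by {gap} points")

    # Hypothetical workload: 200M input and 20M output tokens per month.
    cost = monthly_cost_usd(200_000_000, 20_000_000)
    print(f"Sonnet 4.6 monthly token cost: ${cost:,.2f}")
```

Swapping in your own token volumes gives a baseline spend to set against the accuracy gap when evaluating a migration.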
Citations
- DualEntry benchmark (Sonnet 4.6: 63.4%): dualentry.com/blog/claude-opus-4-7-accounting-ai-benchmark-results