Gemini 3.1 Pro
Google's flagship for agentic deployment — 66% on DualEntry, strong long-context story.
Accounting overall
66.0%
Input / Output
$2.00 / $12.00 per MTok
Context
1M
Speed
~130 tok/s
Released
2026-02
Cutoff
2025-11
GDPVal-AA Elo
1314
Eight accounting-task categories drawn from DualEntry's 101-task benchmark. Scores are measured where published and synthesized from adjacent benchmarks otherwise.
Gemini 3.1 Pro trails the OpenAI and Anthropic frontier on DualEntry's accounting benchmark: 66% overall, more than 11 points behind GPT-5.4 and 13 points behind Opus 4.7. It is competitive on general-capability benchmarks, but accounting is a domain where Gemini has historically been weaker, and the gap here is consistent with that pattern.
Where Gemini 3.1 Pro shines: multimodal document extraction (invoice capture from image-heavy PDFs) and long-context retrieval across large document sets. For agentic workflows that are document-heavy rather than reasoning-heavy, it may outperform its headline DualEntry number on real work.
Artificial Analysis's GDPVal-AA Elo of 1314 places it meaningfully below the frontier, suggesting buyers evaluating Gemini for agentic accounting should pilot carefully on their specific workflows rather than rely on general-capability benchmarks alone.
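For buyers sizing up a pilot, the listed pricing translates directly into per-run cost. The sketch below uses the card's rates ($2.00 input / $12.00 output per MTok); the workload token counts are illustrative assumptions, not measured values.

```python
# Cost estimate at the listed Gemini 3.1 Pro rates.
INPUT_PER_MTOK = 2.00    # USD per million input tokens (from the card)
OUTPUT_PER_MTOK = 12.00  # USD per million output tokens (from the card)

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single run at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK

# Hypothetical example: a 40-page invoice batch at ~500 tokens/page in,
# ~2,000 tokens of structured output.
cost = run_cost(input_tokens=40 * 500, output_tokens=2_000)
print(f"${cost:.4f}")  # $0.0640 per batch at these assumed token counts
```

At these assumed volumes a document-extraction run costs fractions of a cent, so for pilot purposes accuracy on your own workflows, not price, is the deciding variable.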
Citations
- DualEntry benchmark (Gemini 3.1 Pro 66%): dualentry.com/blog/claude-opus-4-7-accounting-ai-benchmark-results
- Artificial Analysis, Gemini 3.1 Pro: artificialanalysis.ai/models/gemini-3-1-pro-preview