Gemini 3.1 Pro
Google's flagship for agentic deployment — 66% on DualEntry, strong long-context story.
Accounting overall
66.0%
Input / Output
$2.00 / $12.00 per MTok
Context
1M
Speed
~130 tok/s
Released
2026-02
Cutoff
2025-11
GDPVal-AA Elo
1314
Eight accounting-task categories drawn from DualEntry's 101-task benchmark. Scores are measured where published and synthesized from adjacent benchmarks otherwise.
Gemini 3.1 Pro trails the OpenAI and Anthropic frontier on DualEntry's accounting benchmark: 66% overall, more than 11 points behind GPT-5.4 and 13 points behind Opus 4.7. It is competitive on general-capability benchmarks, but accounting is a domain where Gemini has historically been weaker, and the gap here is consistent with that pattern.
Where Gemini 3.1 Pro shines: multimodal document extraction (invoice capture from image-heavy PDFs) and long-context retrieval across large document sets. For agentic workflows that are document-heavy rather than reasoning-heavy, it may outperform its headline DualEntry number on real work.
Artificial Analysis's GDPVal-AA Elo of 1314 places it meaningfully below the frontier, suggesting buyers evaluating Gemini for agentic accounting should pilot carefully on their specific workflows rather than rely on general-capability benchmarks alone.
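For buyers sizing up a pilot, the listed pricing translates directly into per-run cost. The sketch below uses the card's rates ($2.00 input / $12.00 output per MTok); the workload token counts are illustrative assumptions, not measured values.

```python
# Cost estimate at the listed Gemini 3.1 Pro rates.
INPUT_PER_MTOK = 2.00    # USD per million input tokens (from the card)
OUTPUT_PER_MTOK = 12.00  # USD per million output tokens (from the card)

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single run at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK

# Hypothetical example: a 40-page invoice batch at ~500 tokens/page in,
# ~2,000 tokens of structured output.
cost = run_cost(input_tokens=40 * 500, output_tokens=2_000)
print(f"${cost:.4f}")  # $0.0640 per batch at these assumed token counts
```

At these assumed volumes a document-extraction run costs fractions of a cent, so for pilot purposes accuracy on your own workflows, not price, is the deciding variable.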
Citations
- DualEntry benchmark (Gemini 3.1 Pro 66%): dualentry.com/blog/claude-opus-4-7-accounting-ai-benchmark-results
- Artificial Analysis, Gemini 3.1 Pro: artificialanalysis.ai/models/gemini-3-1-pro-preview