Anthropic's latest flagship model

Release Date: 9/29/2025

Avg. Accuracy: 77.4%

Latency: 147.07s

Performance by Benchmark

Benchmarks        Accuracy    Rankings
FinanceAgent      55.3%       1 / 42
CorpFin           69.4%       8 / 55
CaseLaw           70.4%       15 / 38
TaxEval           78.3%       8 / 72
MortgageTax       75.7%       12 / 44
AIME              88.5%       6 / 62
MGSM              94.3%       2 / 65
LegalBench        83.2%       4 / 87
MedQA             94.7%       7 / 68
GPQA              80.8%       5 / 64
MMLU Pro          87.3%       3 / 62
MMMU              79.3%       7 / 41
LiveCodeBench     73.0%       13 / 61
Terminal-Bench    61.3%       1 / 23
SWE-bench         69.8%       1 / 23
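
The page does not say how the 77.4% average accuracy is derived. As a quick check, the sketch below takes the 15 per-benchmark accuracies listed under Performance by Benchmark and computes their unweighted mean. That the average is an unweighted mean is an assumption, and the name `mean_accuracy` is made up for illustration.

```python
# Per-benchmark accuracies (percent) as listed under "Performance by Benchmark".
accuracies = {
    "FinanceAgent": 55.3,
    "CorpFin": 69.4,
    "CaseLaw": 70.4,
    "TaxEval": 78.3,
    "MortgageTax": 75.7,
    "AIME": 88.5,
    "MGSM": 94.3,
    "LegalBench": 83.2,
    "MedQA": 94.7,
    "GPQA": 80.8,
    "MMLU Pro": 87.3,
    "MMMU": 79.3,
    "LiveCodeBench": 73.0,
    "Terminal-Bench": 61.3,
    "SWE-bench": 69.8,
}


def mean_accuracy(scores):
    """Unweighted arithmetic mean (assumed, not documented by the source)."""
    return sum(scores.values()) / len(scores)


print(f"{mean_accuracy(accuracies):.1f}%")  # prints 77.4%
```

Under that assumption the result matches the reported figure after rounding to one decimal place.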


Cost Analysis

Input Cost: $3.00 / M Tokens

Output Cost: $15.00 / M Tokens
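
To make the per-million-token prices concrete, here is a minimal sketch of the cost arithmetic for a single request. The function name `request_cost` and the token counts are hypothetical, and the sketch assumes plain list pricing with no caching or batch discounts.

```python
# Listed prices in USD per one million tokens.
INPUT_USD_PER_M = 3.00
OUTPUT_USD_PER_M = 15.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 20,000 input tokens and 2,000 output tokens.
print(f"${request_cost(20_000, 2_000):.4f}")  # prints $0.0900
```

Because output tokens cost five times as much as input tokens, the 2,000 output tokens in this example account for a third of the total even though they are a tenth of the input volume.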
