Last Updated 4/5/2025

together/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8

Llama 4 Maverick

Llama 4 Maverick 17B 128E Instruct FP8

Released Date: 4/5/2025

Avg. Accuracy: 59.4%

Latency: 11.42s

Performance by Benchmark

Accuracy    Ranking
3.6%        26 / 29
57.6%       22 / 41
76.1%       44 / 69
63.3%       52 / 72
69.3%       35 / 56
72.7%       14 / 33
85.2%       27 / 52
25.2%       28 / 46
92.4%       9 / 49
77.2%       37 / 72
43.3%       49 / 49
67.7%       17 / 48
79.4%       18 / 46
47.3%       27 / 47
71.7%       13 / 30
18.4%       12 / 13

Academic Benchmarks

Proprietary Benchmarks (contact us to get access)
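
The page does not state how the average accuracy is derived. As a rough check, the sketch below assumes it is an unweighted mean of the sixteen per-benchmark accuracies listed above; the function name `mean_accuracy` is illustrative and not part of any published tooling.

```python
# Per-benchmark accuracies (percent), in the order they appear on this page.
accuracies = [
    3.6, 57.6, 76.1, 63.3, 69.3, 72.7, 85.2, 25.2,
    92.4, 77.2, 43.3, 67.7, 79.4, 47.3, 71.7, 18.4,
]


def mean_accuracy(scores):
    """Unweighted arithmetic mean of a list of percentages (an assumption
    about how the page aggregates results, not a documented method)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


avg = mean_accuracy(accuracies)
print(f"{avg:.1f}%")  # prints 59.4%
```

Under that assumption the result agrees with the 59.4% average reported for this model.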

Cost Analysis

Input Cost: $0.27 / M Tokens

Output Cost: $0.85 / M Tokens

Input Cost (per char): $0.21 / M chars

Output Cost (per char): $0.22 / M chars
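
To make the per-token prices concrete, here is a minimal sketch that estimates the cost of a single request from the listed rates. The function name `estimate_cost_usd` and the token counts in the usage line are hypothetical, and the sketch assumes billing is linear in token count with no minimums or discounts.

```python
# Listed prices in USD per million tokens (taken from the figures above).
INPUT_USD_PER_M_TOKENS = 0.27
OUTPUT_USD_PER_M_TOKENS = 0.85


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate request cost in USD, assuming simple linear per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_M_TOKENS
        + output_tokens * OUTPUT_USD_PER_M_TOKENS
    )
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
cost = estimate_cost_usd(200_000, 50_000)
print(f"${cost:.4f}")  # prints $0.0965
```

Because output tokens cost roughly three times as much as input tokens at these rates, generation-heavy workloads will be dominated by the output price.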
