Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

benchmark
06/09/2026

Public Benefits Bench Released

System

Accuracy

62.11% ± 1.26
60.69% ± 1.27
58.52% ± 1.28
57.98% ± 1.28
57.92% ± 1.28
57.58% ± 1.29
57.24% ± 1.29
53.38% ± 1.30
50.88% ± 1.30
50.07% ± 1.30
Showing the top 10 models from the benchmark. Visit the benchmark page to view more.

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.

Industry
Benchmark