Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

benchmark
05/04/2026

Vals Index and Multimodal Index Methodology Update

System

Accuracy

72.21% ± 1.95
71.00% ± 2.18
66.13% ± 2.17
66.00% ± 2.16
65.55% ± 2.14
64.52% ± 2.23
62.22% ± 2.30
59.33% ± 2.46
58.11% ± 2.19
57.92% ± 2.25
Showing the top 10 models from the benchmark. Visit the benchmark page to view more.

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.

Model Performance Over Time

Tracking how foundation models improve with each release