Model
GPT 5.4 evaluated on our full benchmark suite
Vals AI
Model
Gemini 3.1 Flash Lite evaluated across our benchmark suite
Vals AI
Benchmark
CaseLaw (v2) Benchmark Update
Vals AI
Model
Qwen 3.5 Flash on Vals Index
Vals AI
Benchmark
Vibe Code Bench v1.1 Released
Vals AI
Model
GPT 5.3 Codex Evaluated
Vals AI
Model
Gemini 3.1 Pro evaluated across our benchmark suite
Vals AI
Model
Gemini 3.1 Pro evaluated on Vals Index
Vals AI
Benchmark
MedCode Released: Can models support the medical billing process?
Vals AI
Benchmark
MedScribe Released: Can models support doctors with their administrative work?
Vals AI