Accuracy (Vals Index): 60.06% ± 1.52
Latency (Vals Index): 711.83 s
Cost/Test (Vals Index): $2.15
Context Window: 1M
Max Output Tokens: 128k
Input Modality
Hyperparameter settings
Default Provider: Anthropic
Some benchmarks may use a different provider or different parameters. Refer to the benchmark page for more information.
Temperature: 1
Top P: Default
Top K: Default
Max Output Tokens: 128,000
Thinking: On
Compute Effort: max
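
The settings listed above can be written down as a small configuration sketch. This is only an illustration: the key names (`provider`, `top_p`, `compute_effort`, and so on) and the helper `explicit_overrides` are hypothetical and do not correspond to any particular provider API. The sketch assumes that "Default" means the parameter is left unset, so the provider's own default applies.

```python
from typing import Any, Dict

# Marker the page uses for parameters that are not overridden (assumption:
# "Default" means the value is left to the provider).
DEFAULT = "Default"

# Hypothetical key names; the values are the ones listed on the page.
settings: Dict[str, Any] = {
    "provider": "Anthropic",
    "temperature": 1,
    "top_p": DEFAULT,
    "top_k": DEFAULT,
    "max_output_tokens": 128_000,
    "thinking": True,          # "On"
    "compute_effort": "max",
}


def explicit_overrides(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the settings that were explicitly set, dropping "Default"."""
    return {key: value for key, value in cfg.items() if value != DEFAULT}


overrides = explicit_overrides(settings)
print(sorted(overrides))
```

Separating explicit overrides from defaults makes it easier to compare this run against a benchmark that uses a different provider or different parameters.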
Benchmarks
Proprietary Benchmarks (contact us to get access)
Academic Benchmarks
Read about our methodology.