Llama 3.1 Instruct Turbo (405B)
Release Date: 7/23/2024
Llama 3.1 Instruct Turbo, 405B parameters with FP8 quantization and reduced context.
Accuracy (Average): 74.37%
Latency (Average): 11.09s
Avg. Cost (In/Out): 3.5 / 3.5
Context Window: 131k
Max Output Tokens: 4k
Input Modality
Hyperparameter settings
Default Provider: Meta
Some benchmarks may use a different provider or different parameters. Please refer to the benchmark page for more information.
Temperature: Default
Top P: Default
Top K: Default
Max Output Tokens: 4,096
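The settings above can be read as "leave sampling parameters unset and cap the output length". The following is a minimal sketch of that idea, not any provider's actual API: the helper `build_generation_params` and the key names (`temperature`, `top_p`, `top_k`, `max_tokens`) are assumptions chosen for illustration, and a real client may name or validate these fields differently.

```python
# Hypothetical helper: function and key names are illustrative assumptions,
# not taken from any specific provider SDK.
MAX_OUTPUT_TOKENS = 4096  # "Max Output Tokens" from the settings above


def build_generation_params(temperature=None, top_p=None, top_k=None,
                            max_output_tokens=MAX_OUTPUT_TOKENS):
    """Return only the parameters that were explicitly set.

    A value of None stands for "Default": the key is omitted so that
    whatever default the provider uses applies.
    """
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("max_output_tokens exceeds the 4,096 limit")
    candidates = {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "max_tokens": max_output_tokens,
    }
    # Drop unset entries instead of sending explicit nulls.
    return {key: value for key, value in candidates.items() if value is not None}


params = build_generation_params()
print(params)  # {'max_tokens': 4096}
```

Omitting unset keys, rather than sending explicit values, keeps the request aligned with the "Default" entries listed above even if the provider later changes its defaults.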
Benchmarks
Academic Benchmarks
Proprietary Benchmarks (contact us to get access)