Llama 3.1 Instruct Turbo (405B)
Release Date: 7/23/2024
Llama 3.1 Instruct Turbo: the 405B-parameter model, served with FP8 quantization and a reduced context window.
Avg. Accuracy: 73.62%
Latency: 19.64s
Cost (In/Out): 3.5 / 3.5
Context Window: 131k
Max Output Tokens: 4k