Llama 3.1 Instruct Turbo (405B)
Release Date: 7/23/2024
Benchmarked by
Llama 3.1 Instruct Turbo: 405B parameters, with FP8 quantization and a reduced context window.
Avg. Accuracy: 76.2%
Latency: 19.9s
Cost (In/Out): 3.50 / 3.50
Context Window: 131k
Max Output Tokens: 4k
Input Modality
Academic Benchmarks
Proprietary Benchmarks (contact us to get access)