Llama 3.1 Instruct Turbo, 8B parameters with FP8 quantization.
Release Date: 7/23/2024
Avg. Accuracy: 48.4%
Latency: 2.24s
Performance by Benchmark
Benchmarks
Accuracy
Rankings
Academic Benchmarks
Proprietary Benchmarks (contact us to get access)
Cost Analysis
Input Cost: $0.18 / M Tokens
Output Cost: $0.18 / M Tokens
Input Cost (per char): $0.05 / M chars
Output Cost (per char): $0.07 / M chars
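
As a rough illustration of how the rates listed above translate into a per-request cost, here is a minimal Python sketch. The rates are taken from the figures above; the function names and the example token and character counts are hypothetical, and actual billing may round or meter differently.

```python
# Rates copied from the cost figures above (USD per million units).
INPUT_COST_PER_M_TOKENS = 0.18
OUTPUT_COST_PER_M_TOKENS = 0.18
INPUT_COST_PER_M_CHARS = 0.05
OUTPUT_COST_PER_M_CHARS = 0.07

MILLION = 1_000_000


def estimate_cost_by_tokens(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    total = (input_tokens * INPUT_COST_PER_M_TOKENS
             + output_tokens * OUTPUT_COST_PER_M_TOKENS)
    return total / MILLION


def estimate_cost_by_chars(input_chars: int, output_chars: int) -> float:
    """Estimate the USD cost of one request from its character counts."""
    total = (input_chars * INPUT_COST_PER_M_CHARS
             + output_chars * OUTPUT_COST_PER_M_CHARS)
    return total / MILLION


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    print(f"${estimate_cost_by_tokens(2_000, 500):.6f}")
```

Because input and output tokens are priced the same for this model, the token-based estimate depends only on the total token count; the per-character rates differ, so the split between input and output matters there.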