An optimized version of Llama 3.1 8B Instruct with a 128K context window, tuned for high-throughput, low-latency inference in multilingual chat and dialogue use cases.
Input: Text
Output: Text
Providers
deepinfra
  Context: 128K
  Max Output: 128K
  Input: $0.020 / 1M tokens
  Output: $0.030 / 1M tokens
  Cache Read: —
  Cache Write: —
nebius
  Context: 128K
  Max Output: 128K
  Input: $0.030 / 1M tokens
  Output: $0.090 / 1M tokens
  Cache Read: —
  Cache Write: —
Quick Start
Use Meta Llama 3.1 8B Instruct Turbo through Helicone's AI Gateway with automatic logging and monitoring.
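As a minimal sketch of calling this model through the gateway: the example below assumes Helicone's AI Gateway exposes an OpenAI-compatible `/v1/chat/completions` endpoint and authenticates with a Helicone API key. The gateway URL, model identifier, and `HELICONE_API_KEY` environment variable name are illustrative assumptions; confirm the exact values in Helicone's documentation.

```python
import json
import os
import urllib.request

# Assumed values -- verify against Helicone's docs before use.
GATEWAY_URL = "https://ai-gateway.helicone.ai/v1/chat/completions"
MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"


def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload for the gateway."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }


if __name__ == "__main__":
    payload = build_request("Say hello in three languages.")
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Hypothetical env var name; the gateway expects your Helicone key.
            "Authorization": f"Bearer {os.environ['HELICONE_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```

Because the gateway speaks the OpenAI chat format, the same payload works with the official OpenAI SDK by pointing its `base_url` at the gateway; requests are then logged and monitored in Helicone automatically.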