meta-llama: Meta Llama 3.1 8B Instruct Turbo

llama-3.1-8b-instruct-turbo
Context: 128k
Max Output: 128k
Input: $0.020/1M
Output: $0.030/1M
An optimized version of Llama 3.1 8B Instruct with a 128K context window, designed for high-speed, high-throughput inference in multilingual chat and dialogue use cases.
Input: Text
Output: Text

Providers

deepinfra (Credits)
Context: 128k
Max Output: 128k
Input: $0.020/1M
Output: $0.030/1M

nebius (Credits)
Context: 128k
Max Output: 128k
Input: $0.030/1M
Output: $0.090/1M
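The per-token prices above make it easy to estimate what a single request costs on each provider. A minimal sketch (the token counts are illustrative, not from this page):

```python
# Estimate per-request cost from the listed per-million-token prices.
# Prices are taken from the provider list above ($/1M tokens).
PROVIDERS = {
    "deepinfra": {"input": 0.020, "output": 0.030},
    "nebius": {"input": 0.030, "output": 0.090},
}

def cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: tokens times price, scaled per million."""
    p = PROVIDERS[provider]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 2,000-token prompt with a 500-token reply.
for name in PROVIDERS:
    print(f"{name}: ${cost_usd(name, 2000, 500):.6f}")
# deepinfra: $0.000055
# nebius: $0.000105
```

At these rates even long-context requests cost fractions of a cent, so output-token pricing (where nebius is 3x deepinfra) tends to dominate the comparison for chat workloads.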

Quick Start

Use Meta Llama 3.1 8B Instruct Turbo through Helicone's AI Gateway with automatic logging and monitoring.
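Helicone's AI Gateway exposes an OpenAI-compatible chat completions endpoint, so any OpenAI-style client can target it. A minimal sketch using only the standard library; the gateway URL and the exact model identifier are assumptions here, so check Helicone's documentation for the values your account uses:

```python
# Minimal sketch: call Llama 3.1 8B Instruct Turbo through an
# OpenAI-compatible chat completions endpoint on Helicone's AI Gateway.
# The base URL and model id below are assumptions, not confirmed by this page.
import json
import urllib.request

HELICONE_API_KEY = "sk-helicone-..."  # placeholder, not a real key

def build_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the gateway."""
    payload = {
        "model": "meta-llama/llama-3.1-8b-instruct-turbo",  # assumed id
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://ai-gateway.helicone.ai/v1/chat/completions",  # assumed URL
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {HELICONE_API_KEY}",
        },
        method="POST",
    )

req = build_request("Say hello in three languages.")
# Sending the request requires a valid key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the request shape is standard OpenAI chat completions, the official OpenAI SDK can be pointed at the same endpoint by setting its base URL, and Helicone logs each request automatically once it passes through the gateway.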