Qwen3-Next-80B-A3B-Instruct is a causal language model that is instruction-optimized for chat and agent applications. It features a Mixture-of-Experts (MoE) architecture that achieves an extremely low activation ratio, drastically reducing FLOPs per token while preserving model capacity. The model supports ultra-long contexts and has a Multi-Token Prediction (MTP) mechanism to boost performance and accelerate inference.
Input: Text, Image, Video
Output: Text
Providers
deepinfra
Credits
Context262k
Max Output16k
Input$0.140/1M
Output$1.40/1M
Cache Read—
Cache Write—
Quick Start
Use Qwen3 Next 80B A3B Instruct through Helicone's AI Gateway with automatic logging and monitoring.