Qwen3 VL 235B A22B Instruct is a powerful, open-weight multimodal model from Alibaba Cloud that excels at both language and vision tasks. It integrates strong text generation with advanced visual understanding of images and video, enabling applications like visual question answering, document parsing, chart extraction, and multilingual OCR. Key features include robust perception, spatial and long-form video comprehension, and the ability to follow complex instructions in multi-turn dialogues. This model also supports agentic interactions and tool use, including visual coding and GUI automation.
Input: Text, Image, Video
Output: Text
Providers
novita
Credits
Context256k
Max Output16k
Input$0.300/1M
Output$1.50/1M
Cache Read—
Cache Write—
Quick Start
Use Qwen3 VL 235B A22B Instruct through Helicone's AI Gateway with automatic logging and monitoring.