Qwen: Qwen3.5-Flash chat

openrouter
alibaba
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the 3 series, these models deliver a leap forward in performance for both pure text and multimodal tasks, offering fast response times while balancing inference speed and overall performance.

Capabilities

Context Window 1M tokens
Max Output 65k tokens
Inputs
Outputs

Pricing (per 1M tokens)

Input $0.07
Output $0.26
Cache Read -
Cache Write -

Supported Parameters

include_reasoningmax_tokenspresence_penaltyreasoningresponse_formatseedstructured_outputstemperaturetool_choicetoolstop_p