Qwen: Qwen3.5-Flash chat
The Qwen3.5-Flash models are native vision-language models built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts (MoE) design for higher inference efficiency. Compared to the Qwen3 series, they deliver a marked improvement on both pure-text and multimodal tasks, offering fast response times while balancing inference speed against overall quality.
Capabilities
Context window: 1M tokens
Max output: 65k tokens
Inputs: text, image
Outputs: text
Pricing (per 1M tokens)
Input: $0.07
Output: $0.26
Cache read: -
Cache write: -
Supported Parameters
include_reasoning, max_tokens, presence_penalty, reasoning, response_format, seed, structured_output, temperature, tool_choice, tools, top_p
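A minimal sketch of a chat-completion request body that exercises several of the supported parameters. This assumes an OpenAI-compatible endpoint; the model slug `qwen/qwen3.5-flash` and the exact shape of the `reasoning` field are illustrative assumptions, not official identifiers.

```python
import json

# Illustrative request body for an OpenAI-compatible chat endpoint.
# The model slug and the "reasoning" field shape are assumptions.
payload = {
    "model": "qwen/qwen3.5-flash",
    "messages": [
        {"role": "user", "content": "Summarize linear attention in one sentence."}
    ],
    "max_tokens": 512,        # well under the 65k output cap
    "temperature": 0.7,
    "top_p": 0.9,
    "seed": 42,               # for reproducible sampling
    "tool_choice": "auto",    # let the model decide whether to call tools
    "reasoning": {"enabled": True},
}
print(json.dumps(payload, indent=2))
```

Sending the payload is left out so the sketch stays self-contained; in practice it would be POSTed with an API key to the provider's chat-completions URL.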