Qwen: Qwen-Max chat
Qwen-Max, based on Qwen2.5, provides the best inference performance among Qwen models, especially for complex multi-step tasks. It's a large-scale MoE model that has been pretrained on over 20 trillion...
Capabilities
Context Window 32k tokens
Max Output 8k tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input $1.04
Output $4.16
Cache Read $0.21
Cache Write -
Supported Parameters
max_tokenspresence_penaltyresponse_formatseedtemperaturetool_choicetoolstop_p