Qwen: Qwen-Max chat
Qwen-Max, based on Qwen2.5, delivers the best inference performance among Qwen models, especially on complex multi-step tasks. It is a large-scale Mixture-of-Experts (MoE) model pretrained on over 20 trillion tokens and further post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). The exact parameter count has not been disclosed.
Capabilities
Context Window: 32k tokens
Max Output: 8k tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input: $1.04
Output: $4.16
Cache Read: $0.21
Cache Write: -
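The per-1M-token rates above translate into a simple per-request cost estimate. A minimal sketch, assuming cached input tokens are billed at the cache-read rate in place of the normal input rate (the helper name and example token counts are illustrative, not from this page):

```python
def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate a request's cost in USD from the listed per-1M-token rates."""
    INPUT_RATE = 1.04        # $ per 1M input tokens
    OUTPUT_RATE = 4.16       # $ per 1M output tokens
    CACHE_READ_RATE = 0.21   # $ per 1M cached input tokens (assumption: replaces the input rate)
    billable_input = input_tokens - cached_input_tokens
    return (billable_input * INPUT_RATE
            + cached_input_tokens * CACHE_READ_RATE
            + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens
# 10,000/1M * $1.04 + 2,000/1M * $4.16 = $0.0104 + $0.00832 = $0.01872
cost = estimate_cost(10_000, 2_000)
```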
Supported Parameters
max_tokens, presence_penalty, response_format, seed, temperature, tool_choice, tools, top_p
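A sketch of a request body restricted to the parameters listed above. It assumes an OpenAI-style chat-completions payload shape; the model identifier "qwen-max" and the specific values chosen are illustrative assumptions, not taken from this page:

```python
# Parameters this endpoint accepts, per the list above.
SUPPORTED_PARAMETERS = {
    "max_tokens", "presence_penalty", "response_format", "seed",
    "temperature", "tool_choice", "tools", "top_p",
}

# Hypothetical request body (OpenAI-style chat format is an assumption).
payload = {
    "model": "qwen-max",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512,    # must stay within the 8k max-output limit
    "temperature": 0.7,
    "top_p": 0.9,
    "seed": 42,           # for more reproducible sampling
}

# Guard: every tuning knob used must be in the supported set.
extra = set(payload) - {"model", "messages"} - SUPPORTED_PARAMETERS
assert not extra, f"unsupported parameters: {extra}"
```

Validating the payload against the supported set before sending avoids silent rejection of unknown parameters by the API.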