Mercury 2 reasoning
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...
Capabilities
Context Window 128k tokens
Max Output 50k tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input $0.25
Output $0.75
Cache Read $0.02
Cache Write -
Supported Parameters
include_reasoningmax_tokensreasoningresponse_formatstopstructured_outputstemperaturetool_choicetools