Mercury 2 reasoning

Mercury 2 is an extremely fast reasoning LLM and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, it produces and refines multiple tokens in parallel, achieving over 1,000 tokens/sec on standard GPUs. Mercury 2 is more than 5x faster than leading speed-optimized LLMs such as Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. It supports tunable reasoning levels, a 128K context window, native tool use, and schema-aligned JSON output, and it is built for coding workflows where latency compounds, for real-time voice and search, and for agent loops. The API is OpenAI-compatible. Read more in the blog post.
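
Because the API is OpenAI-compatible, the standard OpenAI SDK works unchanged against OpenRouter's endpoint. A minimal sketch; the model slug `inception/mercury-2` is an assumption, so check the model page for the exact identifier:

```python
# Minimal sketch: calling Mercury 2 through OpenRouter's
# OpenAI-compatible endpoint.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible API
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="inception/mercury-2",  # hypothetical slug; verify before use
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
    max_tokens=1024,
)
print(response.choices[0].message.content)
```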

Capabilities

Context Window: 128K tokens
Max Output: 50K tokens
Inputs: Text
Outputs: Text

Pricing (per 1M tokens)

Input: $0.25
Output: $0.75
Cache Read: $0.02
Cache Write: n/a
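
To see how these rates combine, here is a back-of-the-envelope cost calculation for a single request. It assumes cached prompt tokens bill at the cache-read rate instead of the input rate, which is how prompt caching is typically priced:

```python
# Per-million-token rates from the pricing table above.
INPUT_PER_M, OUTPUT_PER_M, CACHE_READ_PER_M = 0.25, 0.75, 0.02

def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Dollar cost of one request; cached tokens bill at the cache-read rate."""
    billed_input = input_tokens - cached_tokens
    return (billed_input * INPUT_PER_M
            + output_tokens * OUTPUT_PER_M
            + cached_tokens * CACHE_READ_PER_M) / 1_000_000

# Example: 10K prompt tokens (8K of them cached) and 2K completion tokens.
print(f"${request_cost(10_000, 2_000, cached_tokens=8_000):.6f}")  # $0.002160
```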

Supported Parameters

include_reasoning, max_tokens, reasoning, response_format, stop, structured_outputs, temperature, tool_choice, tools
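
A sketch of a request exercising two of the distinctive parameters: a tunable reasoning level and schema-constrained JSON output. The `response_format` shape follows the standard OpenAI structured-outputs format; the shape of the `reasoning` field follows OpenRouter's convention and is an assumption here, as is the model slug:

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="inception/mercury-2",  # hypothetical slug; verify before use
    messages=[{
        "role": "user",
        "content": "Extract the city and country from: 'I flew to Paris, France.'",
    }],
    # Constrain the output to a JSON schema (structured_outputs / response_format).
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "location",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
                "additionalProperties": False,
            },
        },
    },
    # Tunable reasoning level, passed through as an extra body field.
    extra_body={"reasoning": {"effort": "low"}},
)
print(response.choices[0].message.content)  # e.g. {"city": "Paris", "country": "France"}
```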