MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting a hybrid attention architecture. MiMo-V2-Flash supports a...
| Context | Input Price | Cache Read Price | Cache Write Price | Output Price |
|---------|-------------|------------------|-------------------|--------------|
| 262k    | 0.10        | 0.01             | -                 | 0.30         |
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability: visual grounding, multi-step...
| Context | Input Price | Cache Read Price | Cache Write Price | Output Price |
|---------|-------------|------------------|-------------------|--------------|
| 262k    | 0.40        | 0.08             | -                 | 2.00         |
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like...
| Context | Input Price | Cache Read Price | Cache Write Price | Output Price |
|---------|-------------|------------------|-------------------|--------------|
| 1M      | 1.00        | 0.20             | -                 | 3.00         |
MiMo-V2.5 is a native omni-modal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...
| Context | Input Price | Cache Read Price | Cache Write Price | Output Price |
|---------|-------------|------------------|-------------------|--------------|
| 1M      | 0.40        | 0.08             | -                 | 2.00         |
MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....
| Context | Input Price | Cache Read Price | Cache Write Price | Output Price |
|---------|-------------|------------------|-------------------|--------------|
| 1M      | 1.00        | 0.20             | -                 | 3.00         |
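The pricing rows above can be turned into a quick per-request cost estimate. The sketch below is a minimal example, under the assumption that the listed figures are USD per 1M tokens (the catalog does not state units) and that cached input tokens bill at the cache-read rate; the function name and token counts are hypothetical.

```python
# Minimal cost-estimate sketch for the catalog prices above.
# Assumption: prices are USD per 1M tokens; the tables do not state units.

PRICES = {
    # model: (input, cache_read, output) price per 1M tokens
    "MiMo-V2-Flash": (0.10, 0.01, 0.30),
    "MiMo-V2-Omni": (0.40, 0.08, 2.00),
    "MiMo-V2.5": (0.40, 0.08, 2.00),
    "MiMo-V2-Pro": (1.00, 0.20, 3.00),
    "MiMo-V2.5-Pro": (1.00, 0.20, 3.00),
}

def estimate_cost(model: str, input_tokens: int,
                  cached_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost; cached input tokens bill at the cache-read rate."""
    inp, cache_read, out = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * inp
             + cached_tokens * cache_read
             + output_tokens * out)
    return total / 1_000_000

# Hypothetical request: 100k-token prompt, 60k served from cache, 2k output.
cost = estimate_cost("MiMo-V2-Flash", 100_000, 60_000, 2_000)
print(f"${cost:.4f}")  # → $0.0052
```

At these rates the cache-read discount dominates for long, mostly-cached prompts, which is why agentic workloads that replay large contexts benefit most from prompt caching.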