DeepSeek V3 chat

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction-following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations...

Context: 163k | Input Price: 0.32 | Cache Read Price: - | Cache Write Price: - | Output Price: 0.89

DeepSeek V3 0324 chat

DeepSeek V3 0324, a 685B-parameter mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the DeepSeek V3 model and performs really well...

Context: 163k | Input Price: 0.20 | Cache Read Price: 0.14 | Cache Write Price: - | Output Price: 0.77

DeepSeek V3.1 reasoning

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...

Context: 32k | Input Price: 0.15 | Cache Read Price: - | Cache Write Price: - | Output Price: 0.75

DeepSeek V3.1 Terminus chat

DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, further optimizing the model's...

Context: 163k | Input Price: 0.27 | Cache Read Price: 0.13 | Cache Write Price: - | Output Price: 0.95

DeepSeek V3.2 reasoning

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

Context: 131k | Input Price: 0.25 | Cache Read Price: 0.03 | Cache Write Price: - | Output Price: 0.38

DeepSeek V3.2 Exp chat

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

Context: 163k | Input Price: 0.27 | Cache Read Price: - | Cache Write Price: - | Output Price: 0.41

DeepSeek V3.2 Speciale reasoning

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context processing, then scales post-training reinforcement learning...

Context: 163k | Input Price: 0.29 | Cache Read Price: 0.06 | Cache Write Price: - | Output Price: 0.43

DeepSeek V4 Flash chat

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Context: 1M | Input Price: 0.14 | Cache Read Price: 0.00 | Cache Write Price: - | Output Price: 0.28

DeepSeek V4 Pro reasoning

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

Context: 1M | Input Price: 0.43 | Cache Read Price: 0.00 | Cache Write Price: - | Output Price: 0.87

R1 reasoning

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....

Context: 64k | Input Price: 0.70 | Cache Read Price: - | Cache Write Price: - | Output Price: 2.50

R1 0528 reasoning

May 28th update to the original DeepSeek R1. Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...

Context: 163k | Input Price: 0.50 | Cache Read Price: 0.35 | Cache Write Price: - | Output Price: 2.15

R1 Distill Llama 70B reasoning

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across...

Context: 131k | Input Price: 0.70 | Cache Read Price: - | Cache Write Price: - | Output Price: 0.80

R1 Distill Qwen 32B reasoning

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on Qwen 2.5 32B, using outputs from DeepSeek R1. It outperforms OpenAI's o1-mini across various benchmarks, achieving new...

Context: 32k | Input Price: 0.29 | Cache Read Price: - | Cache Write Price: - | Output Price: 0.29