Llama 3 70B Instruct chat

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 70B instruct-tuned version was optimized for high-quality dialogue use cases. It has demonstrated strong...

Context: 8k; Input Price: 0.51; Cache Read Price: -; Cache Write Price: -; Output Price: 0.74

Llama 3 8B Instruct chat

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high-quality dialogue use cases. It has demonstrated strong...

Context: 8k; Input Price: 0.03; Cache Read Price: -; Cache Write Price: -; Output Price: 0.04

Llama 3.1 70B Instruct chat

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high-quality dialogue use cases. It has demonstrated strong...

Context: 131k; Input Price: 0.40; Cache Read Price: -; Cache Write Price: -; Output Price: 0.40

Llama 3.1 8B Instruct chat

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast and efficient. It has demonstrated strong performance compared to...

Context: 16k; Input Price: 0.02; Cache Read Price: -; Cache Write Price: -; Output Price: 0.05

Llama 3.2 11B Vision Instruct chat

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and...

Context: 131k; Input Price: 0.24; Cache Read Price: -; Cache Write Price: -; Output Price: 0.24

Llama 3.2 1B Instruct chat

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows it to operate...

Context: 60k; Input Price: 0.03; Cache Read Price: -; Cache Write Price: -; Output Price: 0.20

Llama 3.2 3B Instruct reasoning

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...

Context: 80k; Input Price: 0.05; Cache Read Price: -; Cache Write Price: -; Output Price: 0.34

Llama 3.2 3B Instruct (free) reasoning

Context: 131k; Input Price: -; Cache Read Price: -; Cache Write Price: -; Output Price: -

Llama 3.3 70B Instruct chat

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned text-only model...

Context: 131k; Input Price: 0.12; Cache Read Price: -; Cache Write Price: -; Output Price: 0.38

Llama 3.3 70B Instruct (free) chat

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned text-only model...

Context: 65k; Input Price: -; Cache Read Price: -; Cache Write Price: -; Output Price: -

Llama 4 Maverick chat

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Context: 1M; Input Price: 0.15; Cache Read Price: -; Cache Write Price: -; Output Price: 0.60

Llama 4 Scout chat

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...

Context: 327k; Input Price: 0.08; Cache Read Price: -; Cache Write Price: -; Output Price: 0.30

Llama Guard 3 8B chat

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification)...

Context: 131k; Input Price: 0.48; Cache Read Price: -; Cache Write Price: -; Output Price: 0.03

Llama Guard 4 12B chat

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

Context: 163k; Input Price: 0.18; Cache Read Price: -; Cache Write Price: -; Output Price: 0.18
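
As a rough illustration of how the listed input and output prices might translate into a per-request cost, here is a minimal Python sketch. The listing does not state the pricing unit; the code assumes prices are quoted per 1,000,000 tokens, and the function name `estimate_cost` and its parameters are hypothetical, not part of any catalog API.

```python
def estimate_cost(input_price, output_price, input_tokens, output_tokens,
                  tokens_per_unit=1_000_000):
    """Estimate the cost of one request.

    ASSUMPTION: prices are quoted per `tokens_per_unit` tokens
    (1,000,000 by default). The listing itself does not state the unit.
    """
    if min(input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price + output_tokens * output_price
    return total / tokens_per_unit


# Example using the Llama 3 70B Instruct prices from the listing
# (Input Price 0.51, Output Price 0.74); the token counts are made up.
cost = estimate_cost(0.51, 0.74, input_tokens=2_000, output_tokens=500)
print(f"estimated cost: {cost:.6f}")
```

If the prices turn out to be quoted per a different number of tokens, pass that number as `tokens_per_unit`. Entries whose prices are shown as "-" have no listed price and cannot be passed to this function as-is.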