R1 Distill Llama 70B reasoning

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across...

Capabilities

Context Window 131k tokens

Max Output 16k tokens

Inputs

Outputs

Pricing (per 1M tokens)

Input $0.70

Output $0.80

Cache Read -

Cache Write -

Supported Parameters

frequency_penaltyinclude_reasoninglogit_biasmax_tokensmin_ppresence_penaltyreasoningrepetition_penaltyresponse_formatseedstoptemperaturetop_ktop_p