R1 Distill Llama 70B reasoning
DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across...
Capabilities
Context Window 131k tokens
Max Output 16k tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input $0.70
Output $0.80
Cache Read -
Cache Write -
Supported Parameters
frequency_penaltyinclude_reasoninglogit_biasmax_tokensmin_ppresence_penaltyreasoningrepetition_penaltyresponse_formatseedstoptemperaturetop_ktop_p