Reka: Flash 3 reasoning
Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a 32K context length and optimized through reinforcement learning (RLOO), it provides competitive performance comparable to proprietary models within a smaller parameter footprint. Ideal for low-latency, local, or on-device deployments, Reka Flash 3 is compact, supports efficient quantization (down to 11GB at 4-bit precision), and employs explicit reasoning tags ("") to indicate its internal thought process.
Reka Flash 3 is primarily an English model with limited multilingual understanding capabilities. The model weights are released under the Apache 2.0 license.
Capabilities
Context Window 65k tokens
Max Output 65k tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input $0.10
Output $0.20
Cache Read -
Cache Write -
Supported Parameters
frequency_penaltyinclude_reasoningmax_tokenspresence_penaltyreasoningseedstoptemperaturetop_ktop_p