Gemini 2.5 Flash Lite Preview 09-2025 reasoning
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.
Capabilities
Context Window 1M tokens
Max Output 65k tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input $0.10
Output $0.40
Cache Read $0.01
Cache Write $0.08
Supported Parameters
include_reasoningmax_tokensreasoningresponse_formatseedstopstructured_outputstemperaturetool_choicetoolstop_p