GPT-5.1 Chat reasoning

GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.1 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation.

Capabilities

Context Window 128k tokens

Max Output 16k tokens

Inputs

Outputs

Pricing (per 1M tokens)

Input $1.25

Output $10.00

Cache Read $0.12

Cache Write -

Supported Parameters

max_tokensresponse_formatseedstructured_outputstool_choicetools