OpenAI Expands GPT-5 Choices and Restores Old Models

OpenAI's GPT-5 launch met pushback over reduced model choices and a changed chat persona. In response, OpenAI now exposes selectable GPT-5 modes (Auto, Fast, Thinking, Thinking mini, Pro) and lets paid users re-enable legacy models like GPT-4o and GPT-4.1 via settings. The change restores control for power users and helps businesses balance speed, cost, and reasoning accuracy.

Published August 13, 2025 at 08:14 PM EDT in Artificial Intelligence (AI)

OpenAI expands GPT-5 options amid user backlash

OpenAI’s GPT-5 rollout hit an early snag: fans of older models complained that the new ChatGPT lacked the style and behavior they preferred. The bigger issue was not just voice, but choice—users wanted more control over which underlying model handled their prompts.

OpenAI designed GPT-5 to route requests automatically between a lightweight fast model and a heavier reasoning model. That default "Auto" setting remains the easiest option, but OpenAI has added clearer menu choices so users can pick the behavior they want.

What’s on the GPT-5 menu

  • Auto — Built-in router decides between fast and reasoning models; recommended for most users.
  • Fast — Routes directly to the light, low-latency model for quick, basic answers.
  • Thinking — Full reasoning model that chains steps, may use web tools, limited to 3,000 messages per week.
  • Thinking mini — A lighter reasoning option that acts as a fallback when Thinking hits limits.
  • Pro — The most capable reasoning model, currently behind a $200/month tier; Plus users may get limited trials.

Think of Auto like an automatic transmission: most people leave it on and get smooth performance. Power users who love tinkering can choose the specific "gear" they prefer.

How to get the older models back

Many users missed GPT-4o and other previous variants. Paid ChatGPT subscribers can still access GPT-4o directly from the model menu. To reach models such as GPT-4.1, o4-mini, o3 and GPT-5 Thinking mini, go to Settings and toggle on "Show additional models." OpenAI’s message to developers and businesses: you control the trade-offs between speed, cost and depth.

What this means for teams and products

Model choice matters. Faster models reduce latency and cost for chat UIs and real-time assistants. Reasoning models improve complex workflows like legal summarization, multi-step data analysis, and code reasoning but cost more and may be rate-limited. Legacy model availability also affects brand voice and continuity for customer-facing agents.

Practical steps teams should take now:

  1. Profile typical queries to decide which prompts need deep reasoning versus quick answers.
  2. Set routing policies (intent-based or cost-based) and test them in staging before a public rollout.
  3. Monitor latency, cost per query, hallucination rates and user satisfaction to tune which model serves each route.
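
The routing policy in step 2 can start as something very simple. Here is a minimal, hypothetical sketch in Python: the mode names mirror the GPT-5 menu above, but the heuristics, thresholds, and function names are illustrative assumptions to be tuned against your own query profile, not an OpenAI API.

```python
# Hypothetical intent/cost-based routing policy (step 2 above).
# Mode names mirror the GPT-5 menu; the heuristics and thresholds
# are placeholder assumptions, not production values.

REASONING_HINTS = ("analyze", "summarize", "compare", "step by step", "why")

def choose_mode(prompt: str, monthly_budget_ok: bool = True) -> str:
    """Pick a GPT-5 mode for a prompt using simple intent and cost heuristics."""
    text = prompt.lower()
    needs_reasoning = (
        len(text.split()) > 100                          # long, complex prompts
        or any(hint in text for hint in REASONING_HINTS)  # reasoning keywords
    )
    if not needs_reasoning:
        return "fast"            # low-latency model for quick, basic answers
    if monthly_budget_ok:
        return "thinking"        # full reasoning model for complex work
    return "thinking-mini"       # lighter fallback when budget or limits bite
```

In staging, log each routed prompt alongside the chosen mode so the metrics in step 3 (latency, cost per query, satisfaction) can be compared per route before a public rollout.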

OpenAI’s decision to re-expose model options is a practical win for developers and product teams. It restores agency: choose the right tool for the job, whether that means a cheap fast reply or a careful, multi-step reasoning pass.

For organizations building at scale, the takeaway is clear: treat model selection like architecture. Design routing, measure outcomes, and keep legacy behaviors available where they matter to users. That’s how you balance innovation with continuity.
