OpenAI GPT-5 Router is like Apple removing headphone jack. It sucks but everyone will follow it.
— immortal (@immortal_0698) August 14, 2025
What is the GPT-5 Router?
The GPT-5 router picks the right model for each request in real time. In plain English: easy stuff goes to the small model; complex stuff goes to the big brain. The goal is simple: better answers per dollar and per millisecond, achieved by mixing models instead of forcing a single static choice. I suspect the router will become a key component of subscription pricing.
How It Works: Routing as a Classification Problem
Understanding the router means treating it like a classifier. Suppose you have two models: a smaller no-reasoning model and a larger reasoning model. Given a user query, the router has to make a call:
- Smaller model: when the query is simple
- Larger model: when the query is complex
In reality there are more models, but for simplicity we will stick to two.
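The two-model decision above can be sketched as a thresholded classifier. The `complexity_score` heuristic and the threshold value below are illustrative stand-ins, not OpenAI's actual router, which is a learned model:

```python
# A minimal sketch of the routing decision as binary classification.
# The scoring function is a toy heuristic; real routers are trained models.

def complexity_score(query: str) -> float:
    """Toy heuristic: long, reasoning-heavy queries score as more complex."""
    signals = [
        len(query) > 200,                     # long prompts
        "step by step" in query.lower(),      # explicit reasoning request
        query.count("?") > 1,                 # multi-part questions
        any(w in query.lower() for w in ("prove", "derive", "debug")),
    ]
    return sum(signals) / len(signals)

def route(query: str, threshold: float = 0.25) -> str:
    """Send the query to the larger model when the score clears the threshold."""
    if complexity_score(query) >= threshold:
        return "larger-reasoning-model"
    return "smaller-model"

print(route("What's the capital of France?"))                   # smaller-model
print(route("Prove this invariant step by step and debug it"))  # larger-reasoning-model
```

In this framing, everything interesting lives in two places: the quality of the score and the placement of the threshold.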
The Classification Matrix
A compact way to reason about this is a confusion matrix. To keep score, call the positive class “complex” and the negative class “simple”. Rows are the router’s decision; columns are the true difficulty of the user query.
| | Actual Difficulty: Simple | Actual Difficulty: Complex |
|---|---|---|
| Route: Smaller | True Negative (TN) | False Negative (FN) |
| Route: Larger | False Positive (FP) | True Positive (TP) |
We don’t have to worry about the diagonal elements, since those are the cases where the router is correct. The off-diagonal elements are the problem: false positives and false negatives.
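Given routing logs with ground-truth difficulty labels, the four cells can be tallied directly. The decisions and labels below are made-up examples, not real routing data:

```python
from collections import Counter

# Positive class = "complex", negative class = "simple".
# Rows of the matrix are the router's decision; columns are the true difficulty.
decisions = ["smaller", "larger", "smaller", "larger", "smaller"]
truth     = ["simple",  "complex", "complex", "simple", "simple"]

def cell(decision: str, actual: str) -> str:
    """Map one (decision, truth) pair to its confusion-matrix cell."""
    predicted_complex = decision == "larger"
    actually_complex = actual == "complex"
    if predicted_complex and actually_complex:
        return "TP"
    if predicted_complex:
        return "FP"
    if actually_complex:
        return "FN"
    return "TN"

matrix = Counter(cell(d, a) for d, a in zip(decisions, truth))
print(matrix["TN"], matrix["FN"], matrix["FP"], matrix["TP"])  # 2 1 1 1
```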
Error Analysis: Both Mistakes Cost Money
False Negative (Complex → Smaller): The worst outcome
- Breaks user experience - they get a shallow answer to a deep question
- Damages trust and perceived quality
- Users complain, cancel subscriptions, bad reviews
- Cost: Customer churn and reputation damage
False Positive (Simple → Larger): The expensive mistake
- User gets a great answer but you burn unnecessary compute
- $0.05 query becomes a $0.60 query (12x cost)
- At scale, this adds up fast: 10,000 false positives at $0.55 of waste each = $5,500 in unnecessary compute
- Cost: Direct margin erosion
So the strategy becomes: bias toward false positives (overspend on compute) rather than false negatives (lose customers). You can optimize compute costs later, but you can’t win back a user who thinks your AI is “dumber than your previous model.”
This is why OpenAI initially tuned the router toward cost savings, then faced backlash when the pendulum swung too far toward false negatives. The sweet spot is narrow and expensive to find.
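The asymmetry can be made concrete with a back-of-the-envelope expected-cost comparison. The per-query compute costs come from the figures above; the churn cost per false negative and the traffic mix are assumptions for the sketch:

```python
# Compute costs per query from the article; churn cost is an assumption.
SMALL_COST, LARGE_COST = 0.05, 0.60
FN_CHURN_COST = 5.00  # assumed revenue lost per badly-answered complex query

def expected_cost(fp_rate: float, fn_rate: float, complex_share: float = 0.3) -> float:
    """Average cost per query for a router with the given error rates."""
    simple_share = 1 - complex_share
    # correctly routed traffic
    cost = simple_share * (1 - fp_rate) * SMALL_COST
    cost += complex_share * (1 - fn_rate) * LARGE_COST
    # false positives: a simple query burns large-model compute
    cost += simple_share * fp_rate * LARGE_COST
    # false negatives: cheap compute, but churn risk dominates
    cost += complex_share * fn_rate * (SMALL_COST + FN_CHURN_COST)
    return cost

print(f"FP-biased: ${expected_cost(fp_rate=0.20, fn_rate=0.02):.3f}/query")
print(f"FN-biased: ${expected_cost(fp_rate=0.02, fn_rate=0.20):.3f}/query")
```

Under these assumed numbers the FP-biased policy comes out cheaper per query despite burning far more compute, which is the whole argument for overspending rather than under-routing.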
Economic Motivation: The Subscription Squeeze
This technical complexity exists because OpenAI faces a challenging economic reality: flat subscription pricing becomes hard to sustain when usage explodes. According to Sam Altman, even the $200/month tier struggles to stay profitable.
insane thing: we are currently losing money on openai pro subscriptions! people use it much more than we expected.
— Sam Altman (@sama) January 6, 2025
Math Behind the Subscription Pricing
Here’s the math behind the subscription pricing:
- Users pay $20/month for supposedly “unlimited” access (nothing is truly unlimited)
- But big reasoning models can burn $0.50+ per query in compute costs
- Deep research runs cost ~$1+ each and take 20+ minutes
- Other features, such as memory and tool use, aren’t free either
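A quick break-even check using the numbers above shows how thin the margin is. The per-query costs come from the article; the heavy-user scenario is an assumption for illustration:

```python
# Figures from the article.
SUBSCRIPTION = 20.00          # $/month
REASONING_QUERY_COST = 0.50   # $ per reasoning-model query
DEEP_RESEARCH_COST = 1.00     # $ per deep-research run

# How much usage the flat fee actually covers:
print(SUBSCRIPTION / REASONING_QUERY_COST)  # 40.0 reasoning queries/month
print(SUBSCRIPTION / DEEP_RESEARCH_COST)    # 20.0 deep-research runs/month

# Assumed heavy user: 30 reasoning queries a day for 30 days.
heavy_user_cost = 30 * 30 * REASONING_QUERY_COST
print(heavy_user_cost)  # 450.0 -> 22.5x the subscription price
```

Forty reasoning queries is barely two days of heavy usage, which is exactly why routing cheap traffic to the small model is existential rather than optional.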
It’s not just OpenAI - other companies are facing similar challenges:
- Anthropic - Their $20/month subscription includes significant rate limiting.
- Cursor - They recently announced that after 250 Sonnet requests, they’ll meter usage and charge based on consumption.
Routers Are Going to Get Better
Creating a good router is fundamentally a data problem, and OpenAI has a massive advantage here. Every query-response pair becomes training data for router improvement:
Data Collection at Scale:
- Millions of daily interactions across different complexity levels
- User feedback signals (thumbs up/down, follow-up questions)
- Engagement metrics (time spent reading, follow-up queries)
- Cost-per-query data for model optimization
Iterative Improvement Loop:
- Router misroutes a complex query → user complains or asks follow-up
- OpenAI labels this as “should have gone to reasoning model”
- Router learns: similar queries get routed to larger model next time
- Over time, accuracy improves from 80% → 90% → 95%+
The GPT-5 Launch Backlash
When OpenAI launched GPT-5 with mandatory routing, users immediately complained about quality degradation. The router was routing too many complex queries to the smaller model, making GPT-5 seem “dumber” than GPT-4o.
User Backlash:
- Users reported shallow answers to complex prompts
- Reddit filled with complaints about the perceived downgrade
- Loss of manual model selection frustrated paid subscribers
OpenAI’s Response:
- Brought back GPT-4o access for Plus users
- Acknowledged router problems and began tuning improvements
- Added more transparency about which model responds
Conclusion / Prediction
The router will come back - but better trained. OpenAI learned that accuracy matters more than cost savings for user satisfaction. Expect:
- Higher-tier customers: Will likely get manual model selection options
- Free/basic tiers: Will live with the router, but a much-improved version
- Industry trend: Other AI companies will adopt similar routing strategies as costs mount
The economics make routers inevitable, but OpenAI’s rough launch showed that execution quality determines success or failure.