GPT-5 Rumoured for Q2 2025: What Builders Should Expect
OpenAI's next frontier model is reportedly in testing. Here's what the leaks suggest, why the timeline matters for your roadmap, and how to prepare your AI stack.

The rumour: Multiple sources suggest OpenAI has GPT-5 in late-stage testing, with an announcement potentially as early as Q2 2025. Sam Altman has been characteristically coy, saying only that "something exciting is coming" at a recent investor event.
Why this matters: Every frontier model jump reshapes what's possible. GPT-4 enabled reliable function calling and longer context. GPT-5 could make current limitations - hallucinations, reasoning depth, multimodal understanding - feel quaint.
The builder's question: Should you wait? Build now? How do you architect systems that benefit from model improvements without requiring rewrites?
What we think we know
Piecing together investor presentations, researcher comments, and API telemetry patterns, here's the emerging picture:
Expected capabilities
Reasoning improvements
Multiple sources mention significant advances in logical reasoning and mathematical problem-solving. The gap between o1-style "thinking" models and standard chat models may shrink dramatically.
One researcher (speaking anonymously) described internal benchmarks showing "PhD-level performance on complex multi-step problems that GPT-4 struggles with."
Longer effective context
GPT-4 technically supports 128K tokens, but quality degrades significantly past 32K. GPT-5 reportedly maintains coherence across much longer contexts - some rumours mention 1M+ tokens with minimal degradation.
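Until that happens, the practical move is to budget against the context you can trust rather than the context on the spec sheet. Here's a minimal sketch of that idea; the 4-characters-per-token heuristic and the helper names are illustrative assumptions, not any official API:
```typescript
// Rough token estimate (~4 characters per token for English text).
// An assumption for illustration - use a real tokeniser (e.g. tiktoken) in production.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Keep the most recent messages that fit inside the window we actually trust,
// not the advertised maximum.
function trimToReliableContext(
  messages: ChatMessage[],
  reliableContextTokens = 32_000
): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const message of [...messages].reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > reliableContextTokens) break;
    kept.unshift(message);
    used += cost;
  }
  return kept;
}
```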
Better tool use
Function calling reliability could improve substantially. Current GPT-4 occasionally hallucinates function names or generates invalid JSON. If GPT-5 delivers guaranteed schema compliance natively, wrapper libraries become optional.
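In the meantime, most production systems defend against both failure modes explicitly. A rough sketch of the kind of guard that might become unnecessary; the tool names and helper here are hypothetical:
```typescript
// Hypothetical tool registry - guard against the model inventing function names.
const KNOWN_TOOLS = new Set(['search_orders', 'create_ticket']);

interface ToolCall {
  name: string;
  arguments: string; // JSON string produced by the model
}

function validateToolCall(call: ToolCall): { name: string; args: unknown } | null {
  if (!KNOWN_TOOLS.has(call.name)) return null; // hallucinated function name
  try {
    return { name: call.name, args: JSON.parse(call.arguments) };
  } catch {
    return null; // invalid JSON arguments
  }
}
```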
Multimodal native
Rather than separate vision and text models, GPT-5 may handle images, audio, and video natively in a unified architecture. This could simplify multimodal agent development considerably.
Timeline speculation
| Milestone | Rumoured timing |
|---|---|
| Red team testing | Ongoing (started Q1 2025) |
| Limited API preview | Q2 2025 |
| General availability | Q3-Q4 2025 |
| Developer tier pricing | Unknown |
These timelines have slipped before. GPT-4 took longer than expected. But OpenAI faces competitive pressure from Anthropic's rapid Claude iteration and Google's Gemini improvements.
Pricing expectations
Frontier models historically launch expensive. GPT-4 was 10-20x pricier than GPT-3.5 at launch. Expect GPT-5 to follow a similar pattern:
- Launch pricing: 2-5x current GPT-4o
- 6 months later: Price drops as optimisation improves
- 12 months later: Approaches current GPT-4o pricing
Budget for higher costs initially, then benefit as prices normalise.
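To make that budgeting concrete, here is a small projection sketch. The per-token price and launch multiplier are placeholder assumptions, not real GPT-5 pricing:
```typescript
// Illustrative numbers only - substitute real per-token prices once they are published.
const CURRENT_PRICE_PER_1M_TOKENS = 5; // assumed GPT-4o-class blended price (USD)
const ASSUMED_LAUNCH_MULTIPLIER = 3;   // mid-point of the rumoured 2-5x range

function projectMonthlyCost(tokensPerMonth: number, multiplier = ASSUMED_LAUNCH_MULTIPLIER): number {
  return (tokensPerMonth / 1_000_000) * CURRENT_PRICE_PER_1M_TOKENS * multiplier;
}

console.log(projectMonthlyCost(200_000_000, 1)); // ~$1,000/month at today's assumed pricing
console.log(projectMonthlyCost(200_000_000));    // ~$3,000/month at assumed launch pricing
```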
"AI-assisted development isn't about replacing developers - it's about amplifying them. The best engineers are shipping 3-5x more code with AI tools while maintaining quality." - Kelsey Hightower, Principal Engineer at Google Cloud
How to prepare your stack
Don't wait, but build adaptably. Here's the practical approach:
1. Abstract your model calls
If you're calling openai.chat.completions.create() directly throughout your codebase, you're locked in. Wrap model calls in your own abstraction:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI();
const getDefaultModel = () => 'gpt-4o'; // swap here when GPT-5 lands
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

async function callLLM(
  messages: Message[],
  options?: { model?: string; temperature?: number }
): Promise<string> {
  const model = options?.model ?? getDefaultModel();
  // When GPT-5 launches, change one line
  const completion = await openai.chat.completions.create({ model, messages, temperature: options?.temperature });
  return completion.choices[0]?.message?.content ?? '';
}
```
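Call sites then stay model-agnostic. A short usage sketch against the wrapper above:
```typescript
// No model name at the call site - the default lives in exactly one place.
const summary = await callLLM(
  [{ role: 'user', content: 'Summarise this support ticket in two sentences: ...' }],
  { temperature: 0.2 }
);
```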
2. Design for capability detection
Don't hardcode assumptions about what models can do:
```typescript
const MODEL_CAPABILITIES = {
  'gpt-4o': {
    maxContext: 128000,
    reliableContext: 32000,
    nativeFunctions: true,
    vision: true
  },
  'gpt-5': { // Add when available
    maxContext: 1000000,
    reliableContext: 500000,
    nativeFunctions: true,
    vision: true
  }
};
```
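The map only pays off if runtime code consults it instead of hardcoding limits. A small sketch of capability-aware behaviour, assuming the MODEL_CAPABILITIES object above; the fallback values and helper names are illustrative:
```typescript
function getCapabilities(model: string) {
  // Conservative fallback for model names we haven't catalogued yet
  const fallback = { maxContext: 8_000, reliableContext: 4_000, nativeFunctions: false, vision: false };
  return MODEL_CAPABILITIES[model as keyof typeof MODEL_CAPABILITIES] ?? fallback;
}

// Example: decide whether to compress history before sending a long conversation
const shouldCompressHistory = (model: string, promptTokens: number) =>
  promptTokens > getCapabilities(model).reliableContext;
```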
3. Budget for experimentation
Set aside 10-15% of your AI budget for testing new models when they launch. Early access often comes with quirks, but also opportunities - you might find GPT-5 solves a problem that's been blocking your roadmap.
4. Document your model-specific workarounds
Every hack you've implemented to work around current limitations? Document it. When GPT-5 launches, you'll want to audit which workarounds can be removed:
```typescript
// TODO: GPT-4 sometimes returns tool_calls with invalid JSON
// Remove this when upgrading to GPT-5 if schema compliance improves
const parsedArgs = safeJSONParse(toolCall.function.arguments);
```
What this means for competitive positioning
If you're building on OpenAI
You'll likely get access to GPT-5 through existing API keys. Plan for a testing sprint when preview access becomes available. First-mover advantage on new capabilities can differentiate your product.
If you're building on competitors
Anthropic and Google will respond. Within 3-6 months of a GPT-5 launch, expect Claude and Gemini updates. Multi-provider architectures become more valuable - you can switch to whoever has the best price/performance at any moment.
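One lightweight way to keep that option open is a routing table behind the same wrapper described earlier. The provider entries, model IDs, and prices below are placeholder assumptions:
```typescript
// Placeholder routing table - swap in real model IDs and per-token prices as they change.
const PROVIDERS = {
  openai: { model: 'gpt-4o', pricePer1MTokens: 5 },
  anthropic: { model: 'claude-sonnet-placeholder', pricePer1MTokens: 4 },
  google: { model: 'gemini-pro-placeholder', pricePer1MTokens: 3 },
} as const;

// Pick whichever provider currently offers the best price for this workload.
function cheapestProvider(): keyof typeof PROVIDERS {
  return Object.entries(PROVIDERS).reduce((best, current) =>
    current[1].pricePer1MTokens < best[1].pricePer1MTokens ? current : best
  )[0] as keyof typeof PROVIDERS;
}
```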
If you're self-hosting
Open-source won't match GPT-5 immediately, but the gap typically closes faster with each generation. Meta's Llama team tends to release competitive models 6-12 months after frontier launches. Plan your self-hosting roadmap accordingly.
The "should I wait?" question
No. Here's why:
- Timeline uncertainty: GPT-5 might slip to late 2025 or 2026. You can't plan a business around rumoured release dates.
- Current models are capable enough: GPT-4o handles the vast majority of business use cases well. If you're blocked by model limitations, you're probably trying to solve the wrong problem.
- Architecture matters more than models: Systems built on good abstractions improve automatically when models improve. Build well now, upgrade seamlessly later.
- Competition isn't waiting: Your competitors are shipping products on current models. Waiting for "the next thing" is a losing strategy.
What we're watching
- OpenAI's API announcements: Any new beta features or capability flags often precede model launches.
- Pricing signals: Major API pricing changes sometimes indicate capacity building for new models.
- Researcher departures: Whether key researchers leave or stay on after a major project can hint at how close it is to completion.
- Anthropic/Google responses: If competitors suddenly accelerate their own timelines, they may have caught wind of what OpenAI is planning.
Bottom line
GPT-5 is coming - probably within the next 6-12 months. It will likely offer meaningful improvements in reasoning, context handling, and reliability. But these improvements won't invalidate well-architected systems built on current models.
Build now with clean abstractions. Budget for model experimentation. Plan to test new capabilities when they arrive. Don't let rumours of future models paralyse your current roadmap.
The best time to build was yesterday. The second best time is today - on GPT-4o, with an architecture ready for whatever comes next.
---
Further reading:
- /blog/anthropic-claude-vs-openai-gpt4-vs-google-gemini
- /blog/multi-agent-orchestration-implementation-guide
- OpenAI Model Release Notes
---
Frequently Asked Questions
Q: Will AI replace software developers?
AI is augmenting developers, not replacing them. The most likely scenario is that developers become more productive, handling more complex work while AI handles routine coding tasks. Demand for senior engineering judgment is increasing, not decreasing.
Q: What's the security risk of AI-generated code?
AI models can introduce vulnerabilities or insecure patterns. Treat AI-generated code with the same scrutiny as any external code contribution - security scanning, code review, and testing are essential regardless of the code's source.
Q: How do I integrate AI coding tools into my workflow?
Start with code completion and documentation assistance, then gradually adopt more autonomous features. Establish review practices for AI-generated code and track quality metrics to ensure standards are maintained.