The Model Price Collapse: What to Build Now That You Couldn't Six Months Ago
Ronnie Miller
November 28, 2025
Ten days. That's how long it took for two major releases to change the cost math on building with frontier AI models.
On November 18th, Google launched Gemini 3 with a "Deep Think" reasoning mode and day-one deployment across 2 billion Search users. Six days later, Anthropic released Claude Opus 4.5, priced 67% below the previous Opus generation ($5 per million input tokens, $25 per million output) and scoring 80.9% on SWE-bench Verified.
I'm not writing a benchmarks recap. You can find those anywhere. I want to talk about what the pricing shift means for the economics of building with AI, because the business case for several use cases just changed significantly.
What 67% Cheaper Actually Means
Frontier model capability has historically come with frontier model pricing. That created a real constraint: the models capable of handling complex, multi-step reasoning were too expensive for high-volume applications. Teams made compromises. They used weaker models for most requests and reserved the frontier models for the cases where quality mattered most. Or they decided a use case wasn't economically viable and shelved it.
A 67% price reduction on frontier capability doesn't just make existing use cases more profitable. It moves previously uneconomical applications across the viability threshold. The math that didn't work six months ago works now.
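To make that concrete, here's a rough sketch. The new prices are the published Opus 4.5 rates; the previous-generation figures of $15 input / $75 output per million tokens follow from the 67% reduction; the per-request workload and value numbers are illustrative assumptions, not anyone's real traffic.
```python
# Back-of-the-envelope viability check: the same hypothetical workload priced
# at the previous Opus rates vs. Opus 4.5 rates. Workload numbers are made up.

OLD_PRICING = {"input": 15.00, "output": 75.00}  # $ per million tokens (previous generation)
NEW_PRICING = {"input": 5.00, "output": 25.00}   # $ per million tokens (Opus 4.5)

def request_cost(input_tokens: int, output_tokens: int, pricing: dict) -> float:
    """Model cost in dollars for one request."""
    return (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000

# Hypothetical use case: each request reads ~30k tokens of context, writes ~3k
# tokens of analysis, and is worth about $0.40 to the business.
VALUE_PER_REQUEST = 0.40

for label, pricing in [("previous pricing", OLD_PRICING), ("Opus 4.5 pricing", NEW_PRICING)]:
    cost = request_cost(30_000, 3_000, pricing)
    verdict = "viable" if cost < VALUE_PER_REQUEST else "under water"
    print(f"{label}: ~${cost:.2f}/request -> {verdict}")
# ~$0.68/request before (under water) vs. ~$0.23/request now (viable).
```
Same workload, same value per request; only the pricing changed, and the verdict flipped.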
As Simon Willison noted on launch day, the Opus 4.5 release makes it harder than ever to benchmark AI systems because by the time you've run your evals, the price-to-performance ratio has already changed. That's a real problem for AI teams trying to make stable architectural decisions. It's also an opportunity if you're willing to revisit assumptions.
Use Cases That Now Pencil Out
Here's where I'd be looking if I were re-evaluating a shelved project or reconsidering a cost-driven compromise.
AI code review on every pull request
Serious code review (the kind that catches logic errors, security issues, and architectural problems, not just style violations) requires a capable model. At previous frontier pricing, reviewing every PR in a team with high commit volume was expensive enough that most teams either skipped it, used a weaker model that missed things, or ran it only on certain types of changes.
At $5 per million input tokens and $25 per million output, a thorough review of a medium-sized diff costs a few cents. For a team shipping 50 PRs a month, you're looking at a few dollars. The economics are no longer the reason not to do it.
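Here's that arithmetic, with the diff and review sizes as assumptions about what a "medium-sized" PR looks like:
```python
# Rough per-PR and per-month cost of frontier-model code review at Opus 4.5
# pricing. Diff and review token counts are assumptions for a medium-sized PR.

INPUT_PRICE = 5.00 / 1_000_000    # $ per input token
OUTPUT_PRICE = 25.00 / 1_000_000  # $ per output token

diff_tokens = 6_000     # the diff plus enough surrounding file context to judge it
review_tokens = 1_500   # the written review the model produces

cost_per_pr = diff_tokens * INPUT_PRICE + review_tokens * OUTPUT_PRICE
print(f"per PR: ~${cost_per_pr:.2f}")   # about 7 cents

for prs_per_month in (50, 200, 500):
    print(f"{prs_per_month} PRs/month: ~${cost_per_pr * prs_per_month:.2f}")
# 50 PRs is a few dollars; even 500 PRs a month stays in the tens of dollars.
```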
Document intelligence at scale
Legal, insurance, and financial services organizations have enormous volumes of documents that require genuine comprehension: contracts, filings, claims, policy documents. The analysis that matters (finding the clause that creates liability, flagging the inconsistency between two sections of a policy) needs a model that can actually read and reason, not one that pattern-matches on keywords.
The previous cost structure pushed teams toward either weaker models (with accuracy they couldn't rely on for consequential decisions) or careful volume limits (process only the highest-value documents). The new pricing opens up the full document volume for more organizations.
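An order-of-magnitude sketch of what "full volume" means in dollars, with the corpus size and document lengths as stand-ins rather than anyone's real backlog:
```python
# Order-of-magnitude cost of running a frontier model over an entire document
# backlog at Opus 4.5 pricing. Corpus size and document lengths are assumptions.

INPUT_PRICE = 5.00 / 1_000_000    # $ per input token
OUTPUT_PRICE = 25.00 / 1_000_000  # $ per output token

documents = 20_000            # e.g. a year of contracts or claims files
tokens_per_document = 15_000  # a long contract runs to tens of pages
findings_tokens = 1_000       # structured findings extracted per document

cost_per_document = tokens_per_document * INPUT_PRICE + findings_tokens * OUTPUT_PRICE
print(f"per document: ~${cost_per_document:.2f}")               # about 10 cents
print(f"full backlog: ~${cost_per_document * documents:,.0f}")  # about $2,000
```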
Complex customer support agents
Most AI customer support deployments are doing the easy tier: routing, FAQs, simple lookups. The hard tier (genuine troubleshooting, multi-step problem resolution, situations requiring real judgment about what a customer needs) has typically required human agents because the AI capable of handling it cost too much to deploy at scale.
At current pricing, a complex 20-minute support interaction that works through 50,000 billed tokens costs roughly fifty cents in model costs, and even a context-heavy agent loop that bills twice that stays under a dollar. For issues that would otherwise consume 30 minutes of a human agent's time, that math works.
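Where those numbers come from: an agent conversation resends the accumulated context as input on every turn, so billed input grows faster than the raw transcript. A rough sketch, with the turn counts and per-turn sizes as assumptions:
```python
# Rough model cost of a multi-turn support conversation at Opus 4.5 pricing.
# Every turn resends the accumulated context as input, so billed input grows
# faster than the raw transcript. All the per-turn sizes are assumptions.

INPUT_PRICE = 5.00 / 1_000_000
OUTPUT_PRICE = 25.00 / 1_000_000

system_and_tools = 3_000        # system prompt, policies, tool definitions
turns = 12                      # back-and-forth over a ~20-minute session
user_tokens_per_turn = 300
assistant_tokens_per_turn = 700

billed_input = 0
context = system_and_tools
for _ in range(turns):
    context += user_tokens_per_turn
    billed_input += context                  # the whole history is resent this turn
    context += assistant_tokens_per_turn

billed_output = turns * assistant_tokens_per_turn
cost = billed_input * INPUT_PRICE + billed_output * OUTPUT_PRICE
print(f"billed input: {billed_input:,} tokens, output: {billed_output:,} tokens")
print(f"model cost: ~${cost:.2f}")   # roughly 75 cents for this shape of conversation
```
Prompt caching would pull the input side down further; the sketch ignores it.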
Specialized knowledge-worker tools
There's a category of AI product that isn't consumer chat and isn't internal automation: domain-specific tooling for knowledge workers. Think AI for analysts writing investment memos, for engineers doing root cause analysis on complex system failures, for researchers synthesizing literature. These tools need frontier capability because the work is genuinely hard.
Building a product in this category previously meant either pricing it at a level most users wouldn't accept, or taking a margin hit that made the business unworkable. The price reduction on frontier models changes both of those conversations.
What to Check Before You Build
Cost isn't the only thing that changed. Before you restart a shelved project based on new economics, check a few things.
Does your latency budget still work? More capable models are often slower. Opus 4.5's reasoning improvements come with inference time that may or may not fit your use case. Run the numbers on response time, not just token cost.
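A crude way to run those numbers, with the time-to-first-token and throughput figures as placeholder assumptions you'd replace with measured values:
```python
# Crude response-time estimate for a single model call. The time-to-first-token
# and throughput figures are placeholders -- measure real values for the model,
# settings, and region you actually deploy.

time_to_first_token_s = 2.0   # longer if extended thinking runs before visible output
output_tokens = 1_500         # length of the response you expect
tokens_per_second = 60.0      # sustained streaming throughput (assumed)

estimated_seconds = time_to_first_token_s + output_tokens / tokens_per_second
print(f"estimated response time: ~{estimated_seconds:.0f}s")  # ~27s for these numbers
```
Thirty-ish seconds is fine for a PR review and a non-starter for an interactive chat turn; know which one you're building.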
Is the use case still the right one? A project shelved because of cost might have had other problems that were easier to blame on budget. Go back to the original decision and look honestly at what else was flagged.
What's your eval infrastructure? Lower costs don't help if you can't tell whether the outputs are actually good. Before you scale a use case, you need a way to measure quality that doesn't involve humans reviewing every response. If you don't have that, build it first.
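The minimum viable version is a small golden set plus automatic checks. A sketch follows; run_system, the cases, and the checks are all placeholders for your own.
```python
# Minimal golden-set eval sketch: run the system over fixed cases and apply
# automatic checks, so quality gets measured without a human reading every
# output. `run_system` stands in for however you actually call the model;
# the cases and checks here are placeholders.

from typing import Callable
from dataclasses import dataclass

@dataclass
class EvalCase:
    name: str
    prompt: str
    checks: list  # each entry is a callable: output str -> bool

def must_mention(snippet: str):
    return lambda output: snippet.lower() in output.lower()

CASES = [
    EvalCase(
        name="refund_after_window",
        prompt="Customer asks for a refund 45 days after purchase.",
        checks=[must_mention("30-day"), must_mention("escalate")],
    ),
    # add cases for every scenario you actually care about
]

def run_eval(run_system: Callable[[str], str]) -> float:
    """Returns the pass rate across the golden set."""
    passed = 0
    for case in CASES:
        output = run_system(case.prompt)
        if all(check(output) for check in case.checks):
            passed += 1
        else:
            print(f"FAIL: {case.name}")
    return passed / len(CASES)
```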
What's the competitive window? Price reductions tend to be industry-wide over time. If a use case is now viable because of pricing, it's viable for your competitors too. The question is how long you have before the window closes.
The Broader Pattern
November 2025 wasn't unusual. It was a more compressed version of something that's been happening for three years: frontier capability getting cheaper faster than most organizations can update their assumptions about what's economical to build.
The teams that consistently capture value from these shifts aren't the ones that react fastest. They're the ones with a clear map of what they'd build if cost weren't the constraint, so when cost stops being the constraint they know exactly where to go.
If you're working through the economics of a specific use case (what it would actually cost to build and run, whether the model capabilities are there, and whether the business case holds), get in touch. That's a conversation worth having before you commit to an architecture.
Need help making this real?
We build production AI systems and help dev teams go AI-native. Let's talk about where you are and what's next.