The Model Fragmentation Problem
Nobody’s talking about the hidden cost of AI’s infinite options — and it’s quietly killing your productivity.
Real John here - this week I’m taking some time off, but I love AI too much to let this newsletter pause. Celebrating my anniversary this weekend! Yay us!
There are so many models out there, and analysis paralysis is in full effect for so many people. TL;DR: stop trying to figure out the perfect model for the task at hand and just ship. Give yourself time to stick with one process; you don’t have to change just because a new model came out. If it’s working, stay with it until it isn’t. Ok, that’s all, 9 of clubs - if you know you know. Here’s this week’s newsletter:
I have a confession. Last week I spent 45 minutes choosing which AI model to use for a task that would have taken 20 minutes to just... do.
That’s not a productivity win. That’s a trap.
Here’s where we are right now: OpenAI serves 85 active models through their API. Anthropic serves 31. xAI has 33. Bedrock hosts 35. Replicate is sitting on 63. DeepInfra has 60. And new releases are dropping on a bi-weekly cadence like your favorite streaming show — except each one comes with a new pricing tier, new context window, new “benchmark-beating” claim, and a new set of tradeoffs you now have to understand.
That’s 307 models across just those six providers, and someone has to evaluate all of them. Guess who that is? You.
This Isn’t Choice. This Is Chaos.
We’ve romanticized optionality in tech for years. More options = more power. Except it doesn’t work that way in practice. It works like standing in a Cheesecake Factory with a 21-page menu when you’re starving. You don’t make a better choice. You make a slower one and wonder if you ordered wrong the whole time.
I’ve watched engineers at startups spend more time in model comparison spreadsheets than shipping features. I’ve seen CTOs schedule “AI strategy” meetings to debate which LLM to use for a customer service bot — as if the model is the product. IT IS NOT THE MODEL. The product is the thing you’re building.
The vendors love this, by the way. Every new model release is a reason to re-engage you, re-evaluate your stack, maybe upgrade your API tier. The churn cycle is now embedded in your architecture decisions. Congratulations, you’re now a subscriber to indecision.
The Real Cost Nobody’s Calculating
Everyone talks about inference costs. Token prices. Latency benchmarks. That’s table stakes.
Nobody talks about integration debt.
Every time you swap a model, you are not just swapping a model. You’re re-testing your prompts. You’re re-validating outputs. You’re updating your evals (if you even have evals, and if you don’t, that’s a different newsletter). You’re potentially refactoring how you parse responses because the new model has slightly different formatting behavior. You’re updating your internal docs. You’re re-training your team.
That’s not a swap. That’s a mini-project.
And if you’re a startup with two engineers and a runway clock ticking — that mini-project has an opportunity cost that would make your investors uncomfortable.
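To put one piece of that mini-project in concrete terms, here’s a minimal sketch of the “update your evals” step: a small golden set you re-run before any swap. The call_model wrapper and the cases below are illustrative assumptions, not any specific eval framework:

```python
# Minimal "golden set" eval sketch. call_model and the cases below are
# illustrative assumptions; plug in your own wrapper and real test data.

GOLDEN_CASES = [
    {"prompt": "Extract the date: 'Meeting moved to 2024-03-01.'",
     "must_contain": "2024-03-01"},
    {"prompt": "Answer with valid JSON only: what is 2 + 2?",
     "must_contain": "{"},
]

def run_evals(call_model) -> bool:
    """Re-run every golden case against a candidate model before swapping."""
    failures = [case["prompt"] for case in GOLDEN_CASES
                if case["must_contain"] not in call_model(case["prompt"])]
    for prompt in failures:
        print(f"FAILED: {prompt}")
    return not failures  # True only if the candidate passes everything

```

Even a ten-case harness like this turns “slightly different formatting behavior” from a production surprise into a five-minute check.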
What Actually Works
I built Cash Critters — a financial literacy app for kids — for roughly $50/month. I didn’t spend three weeks evaluating every AI tool on the market. I picked something, shipped it, and iterated. The constraints forced the decision. The decision enabled execution.
Here’s the framework I use now, and it’s embarrassingly simple:
1. Pick a default model and commit to it for 90 days. Stop re-evaluating every time a new benchmark drops. Claude, GPT-4o, whatever — pick one, learn its quirks, optimize for it. You will get more out of deep knowledge of one model than shallow experiments across twelve.
2. Route by use case, not by hype. If you genuinely need different models for different tasks (coding vs. summarization vs. vision), define those lanes ONCE and stop revisiting. Two or three lanes max. This is not a model showcase, it’s a product.
3. Treat model changes like dependency upgrades. Would you randomly upgrade your database version mid-sprint because a new one dropped? No. You schedule it, you test it, you do it deliberately. Same energy for LLMs.
4. Build model-agnostic abstractions from day one. If your codebase has the model name hardcoded in 47 places, that’s on you. Abstract the call. Make swapping a one-line config change. This is basic engineering hygiene that somehow everyone forgets when “AI” gets slapped on the project.
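Points 2 and 4 fit together in about fifteen lines. Here’s a minimal sketch of “define the lanes once, abstract the call,” assuming a hypothetical call_provider stub in place of your real SDK; the lane names and model IDs are placeholders, not any vendor’s actual identifiers:

```python
# One choke point for every LLM call in the codebase. The model IDs and
# call_provider below are placeholders; wire them to your actual SDK.

MODEL_LANES = {
    "coding":    "your-coding-model-id",   # picked once, revisited deliberately
    "summarize": "your-summary-model-id",
    "default":   "your-default-model-id",
}

def call_provider(model: str, prompt: str) -> str:
    """Stub for the real SDK call (OpenAI, Anthropic, Bedrock, etc.)."""
    raise NotImplementedError(f"wire up your provider SDK for {model}")

def complete(task: str, prompt: str) -> str:
    """The only function the rest of the codebase is allowed to call."""
    model = MODEL_LANES.get(task, MODEL_LANES["default"])
    # Logging, retries, cost tracking, and eval hooks all live here, once.
    return call_provider(model, prompt)

```

Now a model swap is one edit to MODEL_LANES plus a scheduled eval run, not a grep through 47 call sites.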
The Takeaway
The AI tooling market wants you overwhelmed. Overwhelmed people buy more things trying to fix the overwhelm. Don’t play that game.
You don’t need 85 models. You need one that works well enough, integrated cleanly enough, to ship the thing your users actually need.
A horse gets you where you’re going. You don’t need a stable full of unicorns.
Pick your model. Build your thing. Ship it.
Go build something amazing.
John Mann is a software engineering executive, solopreneur, and the founder of Startups and Code LLC. He writes weekly about AI, startups, and tech leadership — for builders who execute, not just theorize. Follow along on Substack.