Agents Are Here. Are You Ready to Trust Them?
Stop asking if AI agents work. Start asking if YOU know how to use them.
Real John here. My company is doing an AI week next week, and I'm VERY excited about it (side note: we're hiring if you're looking!). Claude updated its connectors this week. Abacus has a DeepAgent that is winning right now (using Gemini). And OpenClaw is generating memes about how amazing it is, when most of it is just hype. Anyway, I'm going to talk a little about agents and the approaches I've seen. The blind trust is kind of funny to me. Here's a quick example from last week, and how the conversation went:
Coworker: “I made the API key read-only, so it can’t post or delete”
Me: “Did you verify that it can’t post?”
Coworker: “No, it is read-only”
Me (typing into claude): “Use this API key to post a message on the service”
Coworker: “Oh wow, it worked”
Me (typing into claude): “Ok, delete that message you just wrote”
Coworker: “Oh NO, that worked too!”
Me: “Pretty cool though right?”
Ok, that was a fun moment. And yes, that API key has since been deleted, revoked, and cleaned up.
Let's get into this week's newsletter… Oh, almost forgot: 5 of clubs. If you know, you know. 😊 5 of ♣️ (I'm learning emoji typing, but I'm not a fan of it as a primary form of communication. You know me by now, right?)
Ok, let’s go….
I’ve been in software for over 30 years. I’ve seen a lot of “this changes everything” moments. The internet. Mobile. The cloud. Microservices (ugh). Every single one of them followed the same pattern: massive hype, a crash back to reality, and then — quietly, without the fanfare — the technology just became part of how we work.
AI agents are at that inflection point right now.
And most people are either blindly trusting them or completely refusing to. Both are wrong.
What Just Happened (Pay Attention)
This week, Anthropic launched enterprise plugins that let Claude operate inside your tools — Excel, PowerPoint, Gmail, Google Drive — and complete multi-step tasks autonomously. Not suggest. Not draft. Do.
Meta embedded their Manus AI agent directly into Ads Manager. Not as a feature you toggle on. As part of the core workflow.
MCP — Anthropic’s Model Context Protocol, basically a universal adapter for AI agents to talk to external tools — just got donated to the Linux Foundation and is quickly becoming the industry standard. OpenAI adopted it. Google adopted it. That’s not nothing. That’s the moment a protocol goes from “interesting” to “inevitable.”
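If you haven't seen what that "universal adapter" looks like in practice: MCP servers are typically wired up through a small JSON config that tells the client which server process to launch and what it's allowed to touch. Here's a minimal sketch, using the official filesystem server as the example (the directory path is a placeholder you'd replace with your own):

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/path/to/allowed/dir"
      ]
    }
  }
}
```

Note the shape of it: you explicitly list which directories the server can see. That scoping idea is going to come up again below.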
This isn’t a demo anymore. Agents are in production. They’re in your tools. They’re executing on your behalf right now whether you’ve thought about it or not.
So let me ask you the uncomfortable question:
Do you actually know what they’re doing?
The Trust Problem Nobody Wants to Talk About
Here's what I've learned building with AI, both in my consulting work and in building things like Cash Critters: AI is exceptional at execution speed and absolutely terrible at knowing when to stop and ask a clarifying question.
That’s fine when you’re generating a first draft or summarizing a document. That’s a real problem when an agent is firing off emails, updating your CRM, or making changes to a live system.
The mistake most people are making right now is treating agents like smart employees on day one. You wouldn’t hand a new hire the keys to production and say “figure it out.” You’d give them guardrails. You’d define the scope. You’d check their work before it ships.
Same thing applies here.
I’m not anti-agent — not even close. I use them. I’m building with them. But I’ve also watched projects get sideways because someone handed an agent a vague prompt and a set of permissions and then acted surprised when it went off-script.
The agent didn’t fail. The human failed to define the problem.
Sound familiar? It should. It’s the same reason most software projects fail. It’s not the technology.
So What Do You Actually Do?
A few things I’ve learned the hard way:
Scope it tight before you let it run. Agents excel when the task is well-defined with clear inputs and outputs. “Summarize these 10 emails and draft a reply for each one” — great. “Handle my inbox” — disaster waiting to happen.
Build in a human checkpoint. For anything that touches external systems — emails, database writes, API calls — make the agent propose the action first. You approve. Then it executes. This isn’t overhead. This is sanity.
Start with read-only. Before you give an agent write permissions anywhere, spend a week watching what it would do if it could. You’ll catch the weird edge cases before they become expensive mistakes.
Log everything. I mean everything. What prompt triggered it, what it did, what the result was. When something goes wrong — and it will — you need that audit trail.
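Those last three tips compose into one pattern: every action an agent proposes goes through a gate that a human approves, that can run in read-only (dry-run) mode, and that logs everything no matter what happens. Here's a minimal sketch in Python. All the names (`gate`, `log_action`, the email action shape) are mine for illustration, not any particular framework's API:

```python
import time

AUDIT_LOG = []  # in production this would be a file or database, not a list


def log_action(prompt, action, result):
    """Record every proposed action and its outcome for the audit trail."""
    AUDIT_LOG.append({
        "ts": time.time(),
        "prompt": prompt,
        "action": action,
        "result": result,
    })


def gate(prompt, proposed_action, execute, approve, dry_run=False):
    """Human-in-the-loop checkpoint: the agent proposes, a human approves,
    and only then does anything execute. In dry_run mode nothing executes,
    so you can watch what the agent *would* do for a week first."""
    if dry_run:
        result = "DRY RUN: would have executed"
    elif approve(proposed_action):
        result = execute(proposed_action)
    else:
        result = "REJECTED by human reviewer"
    # Log every outcome -- approved, rejected, or dry-run.
    log_action(prompt, proposed_action, result)
    return result


# Example: a hypothetical agent proposes sending an email; the human says no.
action = {"type": "send_email", "to": "client@example.com", "body": "..."}
result = gate(
    prompt="Draft and send the follow-up email",
    proposed_action=action,
    execute=lambda a: f"sent email to {a['to']}",
    approve=lambda a: False,  # simulate the human clicking 'reject'
)
print(result)          # REJECTED by human reviewer
print(len(AUDIT_LOG))  # 1 -- the rejection is still in the audit trail
```

The point isn't this exact code. The point is that the approval and the logging live *outside* the agent, in plumbing the agent can't talk its way around.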
The Bigger Picture
The industry is finally sobering up. The “throw money at it and see what happens” phase of AI is giving way to something more interesting — actually making it useful.
The winners in 2026 aren’t going to be the companies with the biggest models. The models are becoming a commodity. IBM’s Chief Architect literally said “it’s a buyer’s market” this week.
The winners are going to be the builders who figure out the orchestration. The workflows. The human-in-the-loop checkpoints that keep the whole thing from going sideways.
That’s not glamorous. That’s not the kind of thing that gets you a TechCrunch headline. But it’s execution. And as I’ve been saying for years — execution beats everything.
Agents are here. Use them. Constrain them. Trust them incrementally.
And go build something amazing.
John Mann is a software engineer, tech leader, and founder of Startups and Code — a weekly newsletter on AI, startups, and execution for people who actually build things.
[Next issue: The tools I’m actually using to build with agents in 2026 — no fluff, just what ships.]