Stop Hiring AI. Start Building It.

A body shop owner I work with said something to me last month I haven’t been able to put down. He’d spent six months trying to “onboard” AI for his eight-person shop. Sent the announcement. Wrote a usage policy. Did the onboarding playbook the same way he’d onboarded every receptionist for thirty years.

It half-worked, then dribbled out. Sound familiar?

What he said when he called me wasn’t the usual “AI isn’t worth the hype.” He said, “I think I’ve been doing this wrong from word one. I keep trying to onboard it. I should have been building it.”

Right. That word swap — onboarding to building — is the whole shift. Not a tweak. The shift.

Here’s the thing. If you’ve watched an AI tool quietly die in your business and chalked it up to “this stuff isn’t ready yet,” consider another possibility. You weren’t deploying it. You were trying to hire it. And AI isn’t a hire.

The deeper habit you haven’t noticed

Bigger frame first, because if we fix the verb without fixing the worldview, you’ll fix one tool and stay stuck on the rest.

For a hundred years, businesses have been pyramids. Boss at top. Workers at bottom. Work pushes down, decisions push up. When you needed more capacity you hired a person. When the person didn’t work you re-hired or trained. That shape has been the only shape for so long that we don’t even see it anymore — it’s just how a business works.

When AI showed up, you did the only thing the pyramid lets you do. You tried to fit it into the org chart. Some operators slot it under “tools.” Some under “team.” Either way, the move is the same — find the seat, fill the seat, hire the thing.

A pyramid has no native slot for an agent. The shape is wrong for what you’re trying to do.

The shape that fits AI is a Station Plan. You as the Chef: judgment, decisions, relationships, taste. An orchestrator as the Pass: routing work between agents, managing handoffs, translating direction up and down. Agents on the Line as stations, each one owning a function. (There’s a visual of it here if you want to see what I mean.)

A Station Plan doesn’t run on hires. It runs on built things. Stations get engineered. They don’t get onboarded. The verb is different. The whole posture is different.

Where the management metaphor came from — and where it’s incomplete

Ethan Mollick at Wharton has been making the case for the last year that the people best positioned to get value out of AI aren’t engineers — they’re managers. Owners. People who’ve spent decades scoping work, defining what done looks like, and giving useful feedback. His phrase — “Management as AI Superpower.” His research found experienced managers consistently outperforming AI experts at getting useful work out of these tools. Reason was simple. The things that make a manager good — clear scope, examples, well-set boundaries, feedback — are the same things that make AI useful.

He was right. He’s still right.

But the metaphor is two years old now, and a gap has been hiding inside it.

Here’s the catch. Managers know how to onboard people. They don’t know how to build agents. Same instincts, completely different application. Onboarding assumes the other party will learn, develop judgment, and grow into the role over months. None of that is true of AI. AI doesn’t onboard — it gets configured. It doesn’t ramp — it gets calibrated. It doesn’t develop instincts — it runs on the spec you wrote.

The translation Mollick’s frame needs in 2026: take the management instincts (clear scope, examples, feedback) and apply them to engineering, not HR. Same skill set. Different category of object.

That’s the move this whole piece is about. Take the seven things you’d give a new hire. Stop giving them to a hire. Build them into a recipe.

The seven things you’d build into a recipe

Same seven you’d give a new receptionist in her first month. The shift isn’t what. It’s how. You’re not telling, training, or hoping. You’re writing specs.

1. Calibration (you used to call this “training”)

A new hire gets context — who you are, who your customers are, what tone you use, what you stand for. Most operators give AI a blank chat window and a question. Fix isn’t to “onboard AI better.” It’s to build the calibration into the system itself. A system prompt, a custom GPT, a Claude project that knows your business persistently — built once, reused on every task. You’re not training a person. You’re calibrating an instrument.

2. Context (the input feed)

Calibration is what’s true about your business every day. Context is what’s true about this customer, this job, this moment. Most people just type the question — the recipe is processing nothing. Build the habit (or the automation) of feeding context. Paste in the customer file, the job notes, the email thread. The station runs on what you feed it. Empty in, empty out.

3. Guardrails (the safety stops)

A new hire develops judgment over months. AI never will. So the boundaries can’t be “use good judgment.” They have to be hard-coded into the spec. Never quote a price. Never reply to a service complaint without me. Always include the warranty disclaimer. Stop if a customer mentions injury. A receptionist with no guardrails is a liability. A station with no safety stops is a system failure waiting to happen, and the recipe has none of the self-preservation instincts a person does.
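
If it helps to see what “hard-coded” can look like, here is a minimal Python sketch of a guardrail check that runs on a draft before anyone hits send. Everything in it is an assumption for illustration: the function name check_draft, the patterns, and the disclaimer wording are made up, and keyword matching this simple will miss plenty. The point is only that the rules live in the system, not in anyone’s judgment.

```python
import re

# Hypothetical rules for a body shop's email station (illustrative only).
INJURY = re.compile(r"\b(injur(y|ies|ed)|hurt|hospital)\b", re.IGNORECASE)
PRICE = re.compile(r"\$\s?\d")               # anything that looks like a dollar amount
WARRANTY_LINE = "Warranty terms apply."      # assumed disclaimer wording


def check_draft(customer_message: str, draft: str) -> list[str]:
    """Return the guardrails this draft trips. An empty list means none tripped."""
    problems = []
    if INJURY.search(customer_message):
        problems.append("stop: customer mentions injury, route to owner")
    if PRICE.search(draft):
        problems.append("never quote a price")
    if WARRANTY_LINE not in draft:
        problems.append("missing warranty disclaimer")
    return problems


# A draft that quotes a price and forgets the disclaimer trips two rules.
flags = check_draft("My bumper is loose.", "We can fix that for $250.")
```

A draft that comes back with an empty list still gets your eyes if the task calls for it. The check just makes sure the four rules above never depend on the AI remembering them.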

4. Examples (reference outputs)

Highest-leverage move on the list, and the one almost nobody does. You don’t tell a new hire “write good emails.” You show her five emails that landed and two that missed. With a hire, you do this once and she internalizes it. With a recipe, you do it once and bake it into the spec permanently. Three or four pieces of work you’d be proud of, pasted into the system prompt with “this is the standard.” The reason your AI sounds generic is that you’ve never shown it your work. Show it. Once. Forever.

5. Output over process (the spec sheet)

Hardest one to internalize. Hiring teaches you to give instructions like a checklist — do this, then this, then this. Works for people who can think around the edges of the steps. AI walks the steps you describe, literally, and the output is exactly as mediocre as the steps. Fix is to write a spec, not a procedure. “Write an email that resolves the complaint, names a specific concession, and invites them back” beats “Step 1 acknowledge. Step 2 apologize. Step 3 offer remedy” every single time. A spec describes the destination. A procedure cages the recipe.

6. Diagnostic readouts (you used to call this “measurement”)

A new hire gets a thirty-day check-in — informal, vibe-based, sometimes that’s enough. AI doesn’t drift gradually like a person. It drifts silently and consistently. Pick two or three things you actually want to measure — hours saved on customer email per week, quotes drafted before review, complaints about tone. Then check in monthly. If you’re not measuring, you don’t have data on the recipe. You have a feeling. Feelings are how good recipes get scrapped.

7. Maintenance cycle (the feedback loop)

When a hire gets it wrong, you tell her. She incorporates the lesson because she’s a person with continuity. When AI gets it wrong, telling it doesn’t work — you’re talking to a session that ends. Feedback has to go back into the system. Update the calibration, add the failure as a new example, tighten a guardrail. The word loop matters because it’s cyclical. If you don’t close the loop into the system itself, the same mistake keeps happening — and you keep concluding the AI is dumb. AI isn’t dumb. The station is unmaintained.
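
For readers who think in code, here is one rough way the ingredients could hang together as a built thing. This is a sketch under assumptions, not a real product’s API: RecipeSpec, close_loop, and build_request are hypothetical names, and the returned dictionary is just text you would hand to whatever AI tool you use. Diagnostic readouts (ingredient 6) sit outside it, since those are numbers you track, not text in the prompt.

```python
from dataclasses import dataclass, field


@dataclass
class RecipeSpec:
    """One station's spec. Field names are illustrative, not a standard."""
    calibration: str                                       # 1. true about the business every day
    output_spec: str                                       # 5. the destination, not the steps
    guardrails: list[str] = field(default_factory=list)    # 3. hard rules
    examples: list[str] = field(default_factory=list)      # 4. reference outputs

    def system_prompt(self) -> str:
        """Assemble the persistent part of the prompt from the spec."""
        lines = ["ABOUT THE BUSINESS", self.calibration, "", "HARD RULES"]
        lines += [f"- {rule}" for rule in self.guardrails]
        lines += ["", "THIS IS THE STANDARD"]
        lines += self.examples
        lines += ["", "WHAT TO PRODUCE", self.output_spec]
        return "\n".join(lines)

    def close_loop(self, new_guardrail: str = "", new_example: str = "") -> None:
        """7. Maintenance: a miss becomes a permanent change to the spec."""
        if new_guardrail:
            self.guardrails.append(new_guardrail)
        if new_example:
            self.examples.append(new_example)


def build_request(spec: RecipeSpec, context: str, task: str) -> dict[str, str]:
    """2. Context is fed fresh on every task, next to the unchanging spec."""
    return {"system": spec.system_prompt(),
            "user": f"CONTEXT\n{context}\n\nTASK\n{task}"}


spec = RecipeSpec(
    calibration="Eight-person body shop. Plain, friendly tone.",
    output_spec="Write an email that resolves the complaint, names a specific "
                "concession, and invites them back.",
    guardrails=["Never quote a price."],
)
# The station missed once, so the fix goes into the spec, not into a chat session.
spec.close_loop(new_guardrail="Stop if a customer mentions injury.")
request = build_request(spec, context="(paste job notes and email thread here)",
                        task="Reply to this complaint.")
```

The design choice worth noticing: the spec is the only thing that persists. A correction typed into a chat is gone when the session ends. A correction passed to close_loop shows up in every request after it.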

Three principles that govern engineering

The seven ingredients are what you build. Three principles govern how you think about building. Without them you can do the seven mechanically and still get poor results.

It’s not Excel. Same prompt, different output, every time. Not a defect — it’s how AI works. Operators who treat AI like a spreadsheet get disappointed forever. Operators who treat it like a fast junior having a slightly different day every day get real leverage.

First try is rarely the last try. Prompt → output → critique → refine → output. That cycle isn’t failure. It is the work. Plan for two or three rounds, not one.

Speed beats polish. Owner who runs twenty rounds in an hour beats the owner who runs three, even if no round is sharper than the last. Perfectionism — useful when hiring people — is a liability when engineering recipes.

What I see in the field

I run audits on AI deployments at Local Nerds — small and mid-market service businesses, mostly. Pattern is unnervingly consistent. Nearly every operator scores 0 or 1 out of 3 on Examples. Same on Output over process. Both gaps come from the same root — the operator wrote a job description instead of a recipe spec. They told the AI what to be instead of what to produce. They gave it the freedom they wouldn’t trust a junior employee with — because they were thinking like a hiring manager, not an engineer.

Takeaway isn’t “AI is hard.” The gap between underperforming AI and useful AI is two specific changes most owners haven’t made, not because they couldn’t, but because they were still trying to hire.

The Chef

Here’s the role shift hiding inside all of this. You’re not the boss at the top of a pyramid pushing work down. You’re not the hiring manager filling seats. You’re at the center of a Station Plan, building the recipes that run your business. We call this the Chef.

Your job in the new shape — make the calls only you can make. Define what good looks like. Own the relationships. Engineer the recipes that carry the rest. The seven ingredients are how you build one station. Get good at all seven and you can put any function on the Station Plan.

That’s the identity to grow into. Not “owner who uses AI better.” Chef who builds.

The Monday Test

So. Pick the AI tool in your business you’ve been treating as a hire — the one you “onboarded” and walked away from.

Stop trying to manage it. Open its system prompt (or build one if it doesn’t have one) and start engineering. Run it through the seven specs:

Calibration — does it know your business persistently?
Context — does it get the situation before each task?
Guardrails — clear hard-coded boundaries?
Examples — three or four reference outputs in the spec?
Spec, not procedure — describing destinations, not steps?
Diagnostic readouts — anything specific you’re tracking?
Maintenance cycle — when it misses, does the spec get updated?

You’ll probably have two or three of the seven ingredients in shape. Don’t try to fix all the gaps this week — that’s a project, not a Monday move. Pick the gap that’s costing you the most and close just that one. (My bet — Examples. It’s almost always Examples.)

If you want to do this with a real number out of 21 instead of a feeling, we built a 10-minute self-audit worksheet for it — Is Your AI Actually Working? A 10-Minute Audit. Same logic, plumber-language. Free. Use it on every AI tool you have.
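
The arithmetic behind a number out of 21 is simple enough to sketch: seven ingredients, each scored 0 to 3. The Python below is an assumption about how such a tally could work, not the worksheet itself. The audit function and the sample scores are hypothetical, and using the lowest score as a stand-in for “the gap that’s costing you the most” is a shortcut.

```python
INGREDIENTS = [
    "Calibration", "Context", "Guardrails", "Examples",
    "Output over process", "Diagnostic readouts", "Maintenance cycle",
]


def audit(scores: dict[str, int]) -> tuple[int, str]:
    """Return (total out of 21, lowest-scoring ingredient).

    Each ingredient is scored 0 to 3. Ties go to whichever comes first in the list.
    """
    for name in INGREDIENTS:
        if name not in scores:
            raise ValueError(f"missing score for {name}")
        if not 0 <= scores[name] <= 3:
            raise ValueError(f"{name} must be scored 0 to 3")
    total = sum(scores[name] for name in INGREDIENTS)
    weakest = min(INGREDIENTS, key=lambda name: scores[name])
    return total, weakest


# Made-up scores for one tool, for illustration only.
sample = {
    "Calibration": 2, "Context": 1, "Guardrails": 2, "Examples": 0,
    "Output over process": 1, "Diagnostic readouts": 1, "Maintenance cycle": 2,
}
total, weakest = audit(sample)
print(f"{total} out of 21. Start with: {weakest}")
```

Run it once per tool and keep the numbers. The total tells you where you stand, and the weakest ingredient is a reasonable first guess at your one Monday move.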

The shift

AI isn’t underperforming because it’s incapable. It’s underperforming because you’re trying to hire it. Metaphor everyone’s been using for two years — manage AI like a new employee — is half right and half holding you back. Half that’s right — clear scope, examples, feedback. Half that’s holding you back — assuming AI will learn, ramp, and develop judgment the way a person does. It won’t. It can’t. That’s the whole shape of what’s changed.

So. Stop hiring AI. Start building it. Shape of your business changed. Catch up.

Source: Ethan Mollick, “Management as AI Superpower” — One Useful Thing (oneusefulthing.org). Worth reading the original for the underlying research.

Frameworks: The Station Plan (parent architecture). The Professional Recipe (the seven-ingredient discipline that builds each station). The AI Audit Worksheet (10-minute self-audit, plumber-readable).
