I run audits on AI deployments for business owners who think the tools aren’t working. Same pattern every time.
Owner — “We tried this. It doesn’t work for us.”
Me — “Show me how you set it up.”
They open the AI tool. Blank system prompt. No examples. No guardrails. No measurement. Just a blank chat box with a team told to “use AI more.”
I score the deployment against the seven ingredients of a real AI implementation (something we mapped out a few pieces back). Each ingredient gets 0 to 3 points, so the maximum is 21. Score comes back: 1 out of 21. Sometimes 2. Once in a while, 3.
Then I say — “Your deployment scored a 1 out of 21. And you’re concluding the tool doesn’t work.”
Owner usually goes quiet for a second, then says something like “oh. That’s not the same as the tool being broken.”
Right. It’s not.
The diagnosis mistake that kills most AI deployments
Here’s the thing — every owner who’s told me “AI doesn’t work for us” had a deployment that scored between 1 and 7 out of 21. On the scorecard, that’s called Announced. You bought AI. You didn’t deploy it. You installed software and walked away.
That’s not a tool problem. That’s a non-deployment.
Here’s what Announced looks like on the ground. Team hears on Monday that AI is available. Uses it for a week. By week three, usage is near zero except for you and maybe one early adopter. By month two, the tool is dead. Owner concludes “AI isn’t ready” or “our work is too specialized.”
Owner isn’t wrong about the pattern. They’re wrong about the diagnosis. They didn’t try AI and find it wanting. They tried 5% of a real deployment — the parts that feel self-evident (open chat, ask a question) — and concluded the whole thing was broken.
It’s like hiring a junior employee, handing them the employee handbook on day one, and firing them on day three because they’re not productive yet. The job wasn’t to fire them. The job was to onboard them. You skipped the work and then blamed the hire.
What each band actually looks like
Scorecard has four bands. Here’s what’s happening at each one.
Announced (0-7 out of 21). You have the tool. You don’t have the setup.
- Team opens a blank chat every time. No persistent knowledge of your business. No examples. No guardrails. No measurement.
- Symptom — people use it for a week, then stop.
- What’s missing — everything. The gaps are too interconnected to fix piecemeal. Right move is a clean deployment cycle.
Deployed (8-14 out of 21). You’ve started. Some ingredients are in place, others are obvious gaps.
- Tool produces some value but it’s nowhere near the leverage available. You’re getting wins on simple tasks (routine emails, summaries) but it fails on anything that requires your business’s specifics.
- Symptom — “It’s good for basic stuff, but nothing important.”
- What’s missing — usually the same two or three ingredients (Examples, Feedback Loop, Measurement). Pick the lowest-scoring gap with the highest cost in your business and close it this month.
Iterated (15-19 out of 21). Most ingredients are working. You’ve moved past deployment into refinement.
- Tool is producing real value. Team trusts it within defined boundaries. It’s part of the workflow, not a side experiment.
- Symptom — team would push back if you took it away.
- What’s missing — the operating principles. You’re under-iterating. Most teams at this band are still over-perfecting each output instead of running more cycles.
Compounding (20-21 out of 21). System is high-functioning. Framework isn’t a checklist anymore — it’s how the business operates.
- Tool gets sharper every month because the feedback loop is real. New hires get onboarded into the AI system the same way they get onboarded into anything else.
- Symptom — hard to imagine running the business without it.
- What’s missing — almost nothing. Document what you’ve built. You’re now a model.
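The four bands reduce to a simple lookup. Here is a minimal Python sketch of it, using the score ranges listed above. The names `BANDS` and `band_for` are made up for illustration and are not part of the scorecard itself.

```python
# Band boundaries as stated in the scorecard: (lowest score, highest score, band name).
BANDS = [
    (0, 7, "Announced"),
    (8, 14, "Deployed"),
    (15, 19, "Iterated"),
    (20, 21, "Compounding"),
]


def band_for(score: int) -> str:
    """Return the band name for a total score between 0 and 21."""
    if not 0 <= score <= 21:
        raise ValueError(f"score must be between 0 and 21, got {score}")
    for low, high, name in BANDS:
        if low <= score <= high:
            return name
    raise AssertionError("unreachable: the bands cover every score from 0 to 21")


print(band_for(1))   # Announced
print(band_for(12))  # Deployed
```

A deployment that scores 1 lands in Announced. One that scores 12 is in Deployed, still seven points short of Compounding's floor.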
The owner about to cancel
If you’re reading this because you’re about to cancel the tool, you’re in the Announced band or the low end of Deployed. The diagnosis you’ve probably made is “this tool doesn’t work.”
Real diagnosis is “this deployment isn’t finished.”
Those are literally different problems. One requires a new tool. The other requires two weeks of focused work.
How to score yourself
Scorecard is seven questions, one for each ingredient. For each question, pick the answer that’s true about your business today — not what you intend to do, what’s actually happening right now.
1. Does the AI know your business persistently?
- 0 = No system prompt, blank chat every time
- 1 = Some saved prompts, used inconsistently
- 2 = Documented system prompt or custom GPT, the team uses it consistently
- 3 = Everything in 2, plus reviewed and updated quarterly
2. Do you give the AI context before each task?
- 0 = Just ask the question, no background
- 1 = Sometimes paste in details, inconsistent
- 2 = Standard practice to include customer file, job notes, thread
- 3 = Everything in 2, plus tools/templates that auto-pull context
3. Have you told the AI what it can’t do?
- 0 = No guardrails at all
- 1 = Informal team understanding, nothing written
- 2 = Documented do-not rules, team knows them
- 3 = Everything in 2, plus rules baked into the system prompt
4. Has the AI seen examples of your work?
- 0 = Never shown any examples
- 1 = One or two examples, used occasionally
- 2 = Three to five real samples embedded in the system prompt
- 3 = Everything in 2, plus examples refreshed when the business evolves
5. Do you describe outcomes or just give steps?
- 0 = All step-by-step instructions, all the time
- 1 = Mix of both, no consistency
- 2 = Standard practice is describing the outcome, AI figures out steps
- 3 = Everything in 2, plus explicit guidance on when to specify steps (regulated work) vs. when to leave room
6. Do you actually track whether it’s working?
- 0 = Nothing measured, “it feels like it helps”
- 1 = Anecdotal noticing of wins
- 2 = Two to three real metrics (hours saved, items drafted, complaints reduced), checked monthly
- 3 = Everything in 2, plus metrics drive real decisions about what to expand/cut/fix
7. When the AI gets it wrong, does the system improve?
- 0 = Rewrite and forget, system never updates
- 1 = Occasional updates when something breaks badly
- 2 = Documented process — misses get logged, system gets updated on a regular cadence
- 3 = Everything in 2, plus the system is visibly sharper than it was three months ago
Total your score.
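If you would rather let a script do the adding, here is a minimal Python sketch. It assumes each answer is a whole number from 0 to 3. The ingredient labels are shorthand for the seven questions: only Examples, Measurement and Feedback Loop are named in this piece, and the other four labels are assumptions. The sample answers are hypothetical.

```python
# Shorthand labels for the seven questions, in order. Only "Examples",
# "Measurement" and "Feedback Loop" are named in the text; the rest are assumed.
INGREDIENTS = [
    "Persistent Knowledge",
    "Context",
    "Guardrails",
    "Examples",
    "Outcomes",
    "Measurement",
    "Feedback Loop",
]


def total_score(answers: dict[str, int]) -> int:
    """Sum seven answers, each 0-3, into a score out of 21."""
    missing = [name for name in INGREDIENTS if name not in answers]
    if missing:
        raise ValueError(f"missing answers for: {missing}")
    for name in INGREDIENTS:
        if answers[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2 or 3, got {answers[name]}")
    return sum(answers[name] for name in INGREDIENTS)


# Hypothetical answers for a team with a few saved prompts and little else.
example = {
    "Persistent Knowledge": 1,
    "Context": 1,
    "Guardrails": 0,
    "Examples": 0,
    "Outcomes": 1,
    "Measurement": 0,
    "Feedback Loop": 0,
}
score = total_score(example)
print(f"{score} out of {3 * len(INGREDIENTS)}")  # 3 out of 21
```

The validation is there because the score only means something if all seven questions are answered honestly on the 0-3 scale.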
What to do with it
If you scored 0-7 (Announced): the gaps are too interconnected. Don’t fix one ingredient this month. Run a clean deployment cycle on the whole stack: system prompt, examples, guardrails, measurement. One focused week. After that, you’re in Deployed and can pick your next gap strategically.
If you scored 8-14 (Deployed): you have some foundation. Find your two lowest-scoring ingredients. Close the one with the highest cost in your business this month. Then the next. Sequential beats simultaneous.
If you scored 15-19 (Iterated): most of this is working. Next move isn’t more setup. It’s more cycles. Push on speed and iteration.
If you scored 20-21 (Compounding): document what you’ve built. You’re a model now.
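For the Deployed band, finding the two lowest-scoring ingredients is a one-line sort. The Python sketch below shows one way to do it. The function name `lowest_gaps`, the ingredient labels and the sample answers are all hypothetical. Which of the two gaps costs your business more is still your call, so the code only surfaces the candidates.

```python
def lowest_gaps(answers: dict[str, int], count: int = 2) -> list[str]:
    """Return the `count` lowest-scoring ingredients, lowest first.

    Python's sort is stable, so tied ingredients keep the order
    in which they were entered.
    """
    return sorted(answers, key=answers.get)[:count]


# Hypothetical answers for a deployment that totals 10 (Deployed band).
answers = {
    "Persistent Knowledge": 2,
    "Context": 2,
    "Guardrails": 2,
    "Examples": 1,
    "Outcomes": 2,
    "Measurement": 1,
    "Feedback Loop": 0,
}
print(lowest_gaps(answers))  # ['Feedback Loop', 'Examples']
```

Here Examples and Measurement tie at 1, and Examples wins the second slot only because it was entered first. When scores tie, let cost to the business break the tie, not list order.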
Before you cancel
Most owners score themselves and go quiet for a second, just like the owners in my audits do. Then they say something like “so it’s not that the tool is broken. It’s that I never finished deploying it.”
Exactly.
And that’s good news. A broken tool requires a new tool. An incomplete deployment requires two weeks of focused work. Fix is in your hands.
So. Before you cancel that subscription, run the scorecard. Be honest about where you actually are. If you’re under 12, you don’t have an AI problem. You have a deployment problem. Fix the gap. The tool will work.
If you want to do this with a structure in front of you, we built a 10-minute self-audit worksheet that walks you through the same logic in plain language — Is Your AI Actually Working? A 10-Minute Audit. Same framework, plumber-readable. Use it.
Then decide whether you’re canceling the tool or finishing the deployment.
Framework: The AI Onboarding Scorecard (0-21 diagnostic). The AI Audit Worksheet (plumber-readable companion). See also: Stop Hiring AI. Start Building It. (the framework that produced this scorecard).
Source material: AI Onboarding Scorecard (Local Nerds original), the 0-21 diagnostic framework.
