The metrics framework
Track too many KPIs and no one looks at any of them. The minimum useful set:
Containment rate
Percentage of calls handled end-to-end by the AI without escalation. The right target depends on your business; most well-tuned setups run 60–85%. If yours is below 50%, the call flow needs work. If it's above 95%, the AI may be holding on to calls it should hand off (check the recordings).
Booking rate (if booking is in scope)
Percentage of qualified booking-intent calls that result in a confirmed appointment. Should approach or exceed your previous staffed rate within 4–6 weeks. If it doesn't, the booking flow is the bottleneck.
Average call duration
Drift up over a few weeks usually means the AI is over-explaining. Drift down often means callers are hanging up early — check whether the AI is missing intent.
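A minimal sketch of how that drift could be flagged automatically. The three-week window and the 10% tolerance are assumptions for illustration, not recommended thresholds; tune both to your own call volume.

```python
def duration_drift(weekly_avg_sec, min_weeks=3, tolerance=0.10):
    """Classify the trend in weekly average call duration.

    Returns "up" or "down" when the last `min_weeks` values move steadily in
    one direction and the total change exceeds `tolerance` (a fraction);
    otherwise returns "stable".
    """
    recent = list(weekly_avg_sec)[-min_weeks:]
    if len(recent) < min_weeks or recent[0] <= 0:
        return "stable"  # not enough history to call it a trend
    change = (recent[-1] - recent[0]) / recent[0]
    rising = all(b >= a for a, b in zip(recent, recent[1:]))
    falling = all(b <= a for a, b in zip(recent, recent[1:]))
    if rising and change > tolerance:
        return "up"    # worth checking for over-explaining
    if falling and change < -tolerance:
        return "down"  # worth checking for early hang-ups
    return "stable"
```

A flag here is only a prompt to listen to recordings; the number alone does not say why calls got longer or shorter.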
Customer-initiated escalations
Calls where the caller asked for a human. Each one is a failure to handle that intent. Tracking the trend matters more than the absolute number.
Integration error rate
CRM pushes that failed, bookings that didn't sync, webhooks that timed out. Should be near-zero in steady state; a spike means a config or API problem.
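As a rough illustration, the five numbers above can be computed from one week of call records. The `Call` record and every field name in it are hypothetical; map them to whatever your vendor's export actually provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Call:
    # Hypothetical record shape; real exports will differ.
    duration_sec: float
    escalated: bool                # handed off to a human for any reason
    caller_requested_human: bool   # the caller explicitly asked for a person
    booking_intent: bool           # qualified booking-intent call
    booked: bool                   # ended with a confirmed appointment
    integration_attempts: int = 0  # CRM pushes, calendar syncs, webhooks
    integration_failures: int = 0


def pct(numerator: int, denominator: int) -> Optional[float]:
    """Percentage, or None when there is nothing to measure."""
    return None if denominator == 0 else 100.0 * numerator / denominator


def weekly_kpis(calls: list[Call]) -> dict:
    total = len(calls)
    booking_calls = [c for c in calls if c.booking_intent]
    attempts = sum(c.integration_attempts for c in calls)
    failures = sum(c.integration_failures for c in calls)
    return {
        "containment_rate": pct(sum(not c.escalated for c in calls), total),
        "booking_rate": pct(sum(c.booked for c in booking_calls), len(booking_calls)),
        "avg_duration_sec": sum(c.duration_sec for c in calls) / total if total else None,
        "customer_initiated_escalations": sum(c.caller_requested_human for c in calls),
        "integration_error_rate": pct(failures, attempts),
    }
```

Note that the booking rate uses only booking-intent calls as its denominator, matching the definition above; dividing by all calls would understate it.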
Weekly review cadence
30 minutes a week, every week, for the first 90 days. The discipline is the whole game.
- 01 Review the 5 KPI numbers vs last week (containment, booking, duration, escalations, errors)
- 02 Listen to 5 randomly sampled call recordings end-to-end
- 03 Listen to 5 calls flagged as escalations or low-sentiment
- 04 List 1–3 specific changes to make based on what you heard
- 05 Make the changes, deploy, note them in a changelog
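The two listening steps are easier to keep honest if the sample is drawn by a script rather than by whoever is running the review. A minimal sketch, assuming you can export a list of call IDs and a list of flagged call IDs (both names are hypothetical):

```python
import random


def weekly_listening_list(call_ids, flagged_ids, n=5, seed=None):
    """Pick n flagged calls and n random calls, with no overlap between them."""
    rng = random.Random(seed)
    flagged = list(dict.fromkeys(flagged_ids))  # de-duplicate, keep order
    flagged_pick = rng.sample(flagged, min(n, len(flagged)))
    excluded = set(flagged_pick)
    # Draw the random sample from everything not already on the flagged list.
    pool = [c for c in dict.fromkeys(call_ids) if c not in excluded]
    random_pick = rng.sample(pool, min(n, len(pool)))
    return {"random": random_pick, "flagged": flagged_pick}
```

Passing a fixed `seed` makes the draw reproducible, which helps if you want the changelog to record exactly which calls were reviewed.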
Common drift patterns and how to fix them
Containment rate dropped 10+ points
Almost always a new question type the AI doesn't have FAQ coverage for. Listen to escalated calls; add the new questions to the FAQ.
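If escalated calls carry an intent label, a quick tally can point to the missing FAQ entries before you start listening. This is a sketch under that assumption; the intent labels and the idea of a flat list of FAQ intents are hypothetical stand-ins for your vendor's data.

```python
from collections import Counter


def containment_dropped(last_week_pct, this_week_pct, threshold_points=10.0):
    """True when containment fell by at least `threshold_points` percentage points."""
    return (last_week_pct - this_week_pct) >= threshold_points


def uncovered_intents(escalated_intents, faq_intents, top=5):
    """Count escalated-call intents that have no FAQ entry, most frequent first."""
    covered = {intent.strip().lower() for intent in faq_intents}
    normalized = (intent.strip().lower() for intent in escalated_intents)
    return Counter(i for i in normalized if i and i not in covered).most_common(top)
```

For example, with escalations labeled `["Warranty claim", "pricing", "warranty claim", "Parking"]` and an FAQ covering `["Pricing", "Hours"]`, the warranty question surfaces first. The tally narrows down where to listen; it does not replace listening.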
Customers reporting the AI sounds 'off'
Usually means a prompt was edited badly, or the vendor pushed a model update that changed the voice. Pull a recent call, compare to a launch-week call. If the voice changed, escalate to the vendor.
Booking rate dropped
Either a calendar integration issue (check error logs) or a script change that made the booking flow more confusing. Roll back to the previous booking script and compare.
Duplicate contacts piling up in CRM
Dedup broke. Either the vendor's lookup logic regressed or your CRM field changed. Run a dedup audit; restore the lookup rule.
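A dedup audit can be as simple as grouping exported contacts by normalized phone number. The sketch below assumes each contact is a dict with hypothetical `id` and `phone` keys, and it compares on the last 10 digits, which only makes sense for North American numbers.

```python
import re
from collections import defaultdict


def normalize_phone(raw):
    """Keep digits only and compare on the last 10 (assumes North American numbers)."""
    digits = re.sub(r"\D", "", raw or "")
    return digits[-10:]


def dedup_audit(contacts):
    """Return ({phone: [contact ids]}, duplicate rate in percent)."""
    groups = defaultdict(list)
    for contact in contacts:
        key = normalize_phone(contact.get("phone"))
        if key:  # contacts without a phone number cannot be matched this way
            groups[key].append(contact["id"])
    duplicates = {k: ids for k, ids in groups.items() if len(ids) > 1}
    extra = sum(len(ids) - 1 for ids in duplicates.values())
    rate = 100.0 * extra / len(contacts) if contacts else 0.0
    return duplicates, rate
```

If the duplicate groups all start on the same date, that date is a good first place to look for the lookup regression or the field change.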
After-hours escalations going to voicemail
On-call rotation may have changed without updating the AI's escalation routing. Verify the rotation in the AI's config matches reality.
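One way to verify this is to compare the real rotation against the AI's routing schedule at a few after-hours timestamps. The sketch assumes both can be exported as lists of shifts with hypothetical `start`, `end`, and `phone` keys; your vendor's config format will likely look different.

```python
from datetime import datetime


def on_call_phone(shifts, at):
    """Return the phone number scheduled at time `at`, or None if no shift covers it."""
    for shift in shifts:
        if shift["start"] <= at < shift["end"]:
            return shift["phone"]
    return None


def routing_mismatches(real_rotation, ai_routing, check_times):
    """Return (time, expected, configured) for each time the two schedules disagree."""
    mismatches = []
    for at in check_times:
        expected = on_call_phone(real_rotation, at)
        configured = on_call_phone(ai_routing, at)
        if expected != configured:
            mismatches.append((at, expected, configured))
    return mismatches
```

An empty result means the two schedules agree at the times you checked, not that routing is correct everywhere; pick check times on both sides of each handover.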
Quarterly deeper review
Every 90 days, take a longer look:
- Pull the full quarter of call data — top intents, escalation reasons, drop-off points
- Re-evaluate FAQ coverage against the last quarter's actual questions
- Audit CRM data hygiene — duplicate rate, custom field fill rate
- Review pricing vs realized ROI (did usage stay within your plan's tier, and what did overages cost?)
- Review the vendor's product changelog: anything new you should adopt?
- Stakeholder check: ask the customer success / sales team what they're hearing about the AI
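Two of the data pulls above, top intents and custom field fill rate, reduce to a few lines once the quarter's data is exported. A sketch, assuming calls carry an intent label and CRM records are dicts; the field names in the example are hypothetical.

```python
from collections import Counter


def top_counts(values, n=5):
    """Most common non-empty values, e.g. intents or escalation reasons."""
    return Counter(v for v in values if v).most_common(n)


def field_fill_rate(records, fields):
    """Percentage of records with a non-empty value for each field."""
    total = len(records)
    rates = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        rates[field] = 100.0 * filled / total if total else 0.0
    return rates
```

For example, `field_fill_rate(records, ["service_type", "budget"])` returns one percentage per field. A fill rate that falls quarter over quarter usually means the AI stopped asking for that field or the mapping to the CRM changed.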
When 'switch vendors' is actually the right call
Most 'we should switch' instincts are tuning problems. But sometimes it's actually a vendor problem:
- The vendor pushed a model update that meaningfully degraded the voice and won't roll back
- Pricing changed unfavorably and there's no negotiation room
- A critical integration the vendor promised was 'on the roadmap' is now indefinitely deferred
- Support response time has degraded to the point of breaking incident response
- The vendor's company is in trouble (layoffs, missed payroll, sudden leadership exit)
Before switching, write down exactly what would have to be true of the new vendor for it to be better, and verify each item in a demo. Switching for novelty usually means rebuilding the same problems.