Methodology · How we score
Every vendor on CallTreo is scored against the same nine-dimension weighted rubric. This page documents exactly how that score is computed, what we refuse to score on, and how often we update. If you find a vendor we've scored wrong — email hello@calltreo.com and we'll re-review.
The rubric
Each dimension is scored 0–10 by an editor; the overall score is the weighted average. The weights below are what live in our scoring code today: not aspirational, not a future plan. A minimal sketch of the computation follows the table.
| Dimension | Weight | What it means |
|---|---|---|
| Call handling | 20% | Conversation quality, routing reliability, edge-case handling. Does the AI gracefully handle off-script inputs, accents, and noisy lines? Does it sound natural or robotic? This is the largest single weight because it's the most common reason a deployment fails post-launch. |
| Integrations | 15% | Native CRM and calendar fit, API flexibility, webhook robustness. We weight native > webhook > Zapier > none. Specific integration depth gets evaluated on the relevant integration pages. |
| Automation | 15% | Booking, qualification, follow-up triggers. Can the AI close the loop without a human? We score actual workflow completion, not feature checkboxes. |
| Ease of setup | 10% | Time to launch, technical complexity, ongoing tuning effort. A vendor that takes 6 weeks to launch is scored lower than one that ships in 1 week, even if the 6-week vendor is more flexible. |
| Pricing value | 10% | Not lowest price — best value for the segment. Predictable pricing scores higher than opaque pricing. Hidden setup fees and aggressive overage rates get penalized. |
| Vertical fit | 10% | Templates and signals for specific industries (legal, medical, home services). A vendor optimized for SaaS sales won't fit an HVAC dispatch use case; we score against the segment, not the vendor's marketing. |
| Human handoff | 10% | Quality of escalation: live transfer, callback workflow, urgent routing. Vendors that hand off cleanly to your team beat vendors that pretend AI can handle everything. |
| Reporting | 5% | Call summaries, transcripts, dashboards. Useful for both ops and rep coaching. Lower weight because it's table-stakes for most vendors at this point. |
| Support | 5% | Onboarding quality, response time, ongoing partnership. Important when something breaks; less differentiated than the rest of the rubric in steady-state. |
| Total | 100% | Maps to a 0–10 overall score. |
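For readers who want it spelled out, here's a minimal sketch of that weighted average in Python. The weights mirror the rubric table exactly; the function and field names are illustrative, not our production scoring code.

```python
# Minimal sketch of the weighted-average scoring described above.
# Weights mirror the rubric table; names are illustrative, not the
# actual CallTreo scoring code.

WEIGHTS = {
    "call_handling": 0.20,
    "integrations": 0.15,
    "automation": 0.15,
    "ease_of_setup": 0.10,
    "pricing_value": 0.10,
    "vertical_fit": 0.10,
    "human_handoff": 0.10,
    "reporting": 0.05,
    "support": 0.05,
}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must total 100%

def overall_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of 0-10 dimension scores, rounded to one decimal."""
    if set(dimension_scores) != set(WEIGHTS):
        raise ValueError("every dimension must be scored exactly once")
    return round(sum(WEIGHTS[d] * s for d, s in dimension_scores.items()), 1)

# Example: strong call handling can't fully offset a weak handoff.
print(overall_score({
    "call_handling": 9, "integrations": 7, "automation": 8,
    "ease_of_setup": 6, "pricing_value": 7, "vertical_fit": 8,
    "human_handoff": 4, "reporting": 6, "support": 7,
}))  # -> 7.2
```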
What we refuse to do
We don't score on logo prestige.
A vendor's Series C funding round doesn't make their AI better. We've kept under-the-radar vendors high in the rankings when the product warranted it, and dropped well-funded names when it didn't.
We don't fabricate what we can't verify.
Pricing, HIPAA status, specific integration support: when we can't confirm something publicly, we render "On request" or "—". We'd rather show a gap than invent a fact.
We don't let referral fees move ranks.
Some vendors pay us a referral fee. None of them have moved a rank because of it. The rubric is the rubric. See our disclosure for full detail.
We don't write sponsored reviews.
No vendor pays for placement in editor verdicts or comparison pages. Pricing-page snapshots, integration pages, and vendor reviews are all editorially independent.
How we gather data
When sources conflict, hands-on testing beats reviews beats docs beats marketing (see the sketch after the source list below). Anything we can't verify is rendered as a gap, not invented.
Vendor materials
Feature claims, integration lists, and posted pricing. Treated as marketing material; we verify when we can.
Hands-on testing
Where possible, we sit through a real demo and test the integration the vendor claims. Live tests carry the most weight in our scoring.
Third-party reviews
G2, Capterra, Reddit, industry-specific forums. Useful for spotting patterns; we don't take any single review as gospel.
Integration testing
For the integration pages (HubSpot, Salesforce, Calendly, etc.), we test the integration where we have access, and we clearly flag where we're relying on the vendor's documentation.
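To make the precedence rule concrete, here's a minimal sketch of how a single conflicting fact might resolve. The source labels and the resolve() helper are hypothetical; this mirrors the hierarchy described above, not our actual pipeline.

```python
# Sketch of the source-precedence rule: hands-on testing beats reviews
# beats docs beats marketing, and unverified facts render as a gap.
# The resolve() helper is hypothetical, not CallTreo's pipeline code.

PRECEDENCE = ["hands_on_test", "reviews", "docs", "marketing"]  # high -> low trust

def resolve(claims: dict[str, str]) -> str:
    """Return the value from the most trusted source that has one,
    or a gap marker if nothing could be verified."""
    for source in PRECEDENCE:
        if source in claims:
            return claims[source]
    return "—"  # show a gap rather than invent a fact

# Marketing claims native sync; our live test found webhook-only.
# The live test wins.
print(resolve({
    "marketing": "native Salesforce sync",
    "hands_on_test": "webhook-only Salesforce sync",
}))  # -> webhook-only Salesforce sync
print(resolve({}))  # -> — (nothing verified)
```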
How we update
Quarterly editor pass
Every published vendor gets re-reviewed at least once per quarter — scores updated, new product changes incorporated, deprecated features dropped.
Ad-hoc updates on major changes
Vendor ships a major feature, pricing change, or new integration → we update sooner. Same goes for de-listing if a vendor folds or pivots away.
Versioned rubric
We've kept the nine-dimension weighting stable since launch. If we materially change the weights, we'll publish a versioned rubric note explaining why and how scores shift.
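As a rough sketch of what that versioning could look like (the RUBRIC_VERSIONS structure and score_under() helper are hypothetical, not our actual data model):

```python
# Hypothetical shape for a versioned rubric: each version pins its own
# weights, so any published score can be traced to the rubric that
# produced it. Illustrative only, not CallTreo's actual data model.

RUBRIC_VERSIONS = {
    "v1": {  # live since launch; matches the table above
        "call_handling": 0.20, "integrations": 0.15, "automation": 0.15,
        "ease_of_setup": 0.10, "pricing_value": 0.10, "vertical_fit": 0.10,
        "human_handoff": 0.10, "reporting": 0.05, "support": 0.05,
    },
    # A future "v2" would land here, alongside a note on why the
    # weights changed and how existing scores shift.
}

def score_under(version: str, scores: dict[str, float]) -> float:
    """Recompute an overall score under a specific rubric version."""
    weights = RUBRIC_VERSIONS[version]
    return round(sum(weights[d] * s for d, s in scores.items()), 1)
```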
Editorial
Scores and editor verdicts on this site are written by humans who've been on the buying side of receptionist software. We push every page through a buyer-perspective read before publishing, asking "is this what we'd want to know on the demo call?" If a page fails that bar, it doesn't ship.
See something wrong? Email hello@calltreo.com. Vendor corrections, factual disputes, or coverage requests are all welcome — we'll respond and update where the correction is valid.