Voice AI Pricing in India Doesn't Tell You What You'll Actually Pay

Voice AI vendors price per minute, per call, per credit, or per agent seat. None of these pricing models captures the metric that matters: cost per qualified outcome. To understand the gap between what you're quoted and what you'll actually pay, you have to look at the hidden costs that each pricing model obscures.

Three Pricing Models and Their Hidden Costs

Per Minute Pricing

The most transparent model. You pay ₹3–15 per connected minute. The problem: per-minute pricing punishes long calls — which are often the ones that work. A 4-minute conversation that results in an enrollment is more valuable than a 15-second hang-up, but you pay 16x more for it.
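To make the 16x figure concrete, here is a minimal sketch of the arithmetic. It assumes per-second pro rata billing and a rate of ₹9 per minute (a value inside the ₹3–15 range above). Vendors that round up to whole minutes would produce a different ratio, and the function name is illustrative only.

```python
def call_cost(duration_seconds: float, rate_per_minute: float) -> float:
    """Cost of one connected call, billed pro rata by the second (assumption)."""
    return duration_seconds / 60 * rate_per_minute


RATE = 9.0  # rupees per connected minute; assumed for illustration

enrollment_call = call_cost(4 * 60, RATE)  # the 4-minute conversation
hang_up = call_cost(15, RATE)              # the 15-second hang-up

print(f"Enrollment call: ₹{enrollment_call:.2f}")
print(f"Hang-up: ₹{hang_up:.2f}")
print(f"Ratio: {enrollment_call / hang_up:.0f}x")
```

The ratio depends only on the two durations, so it holds at any per-minute rate.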

The deeper problem: per-minute pricing also makes you pay for context-free agents that waste time on every call. If a voice agent takes 90 seconds to re-establish context that should have carried forward, you're paying for 90 seconds of overhead on every connected call. At the scale of Alchemyst's Kathan engine, which now powers more than 500,000 calls daily, overhead of that size would add up to thousands of hours of wasted billing across the network.
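The overhead claim can be sized with a rough sketch like the one below. The 90 seconds comes from the paragraph above. The connected-call volume (150,000 per day) and the ₹9 rate are made-up inputs for illustration, not reported figures, and the function name is hypothetical.

```python
def overhead_per_day(
    connected_calls: int, overhead_seconds: float, rate_per_minute: float
) -> tuple[float, float]:
    """Return (billed hours, rupees) spent on context overhead per day."""
    total_overhead_seconds = connected_calls * overhead_seconds
    hours = total_overhead_seconds / 3600
    cost = total_overhead_seconds / 60 * rate_per_minute
    return hours, cost


# Hypothetical inputs: 150,000 connected calls per day at ₹9 per minute.
hours, cost = overhead_per_day(connected_calls=150_000, overhead_seconds=90, rate_per_minute=9.0)
print(f"{hours:,.0f} billed hours and ₹{cost:,.0f} of overhead per day")
```

Swapping in your own connected-call count and rate gives the overhead bill for your deployment.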

Per Credit Pricing

Opaque by design. A "credit" might cover 1 call, 1 minute, or 1 SMS — depending on the vendor. Some vendors use fractional credits for different actions. A 2-minute call might cost 3 credits while an SMS costs 0.5 credits. This makes it nearly impossible to predict monthly spend or compare vendors on a like-for-like basis.

The hidden cost: credit-based pricing often includes minimum commitments and expiration dates. You buy 10,000 credits, use 6,000, and the remaining 4,000 expire at month-end. Your effective cost per interaction is 67% higher than the sticker price.
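The 67% figure follows from dividing credits bought by credits actually used. A minimal sketch, using the numbers from the example above (the function name is illustrative):

```python
def expiry_premium(credits_bought: int, credits_used: int) -> float:
    """Fractional increase in effective cost per used credit when the rest expire."""
    if credits_used <= 0:
        raise ValueError("credits_used must be positive")
    return credits_bought / credits_used - 1


premium = expiry_premium(credits_bought=10_000, credits_used=6_000)
print(f"Effective cost is {premium:.0%} above the sticker price")
```

The same function shows how quickly the premium grows: using only half of a commitment doubles the effective price.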

Outcome-Based Pricing

The vendor charges only for qualified leads or successful conversations. Sounds ideal — you pay for results, not activity. But the definition of "qualified" is subject to negotiation and drift. A "qualified lead" might mean someone who stayed on the line for 60 seconds, not someone who expressed genuine purchase intent. The goalposts shift, and you end up paying for outcomes that don't convert downstream.
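The effect of definition drift can be illustrated with a toy example. The call records below are invented purely for illustration, as is the ₹100 fee per qualified lead. The point is only that the same call log produces very different invoices under two definitions of "qualified".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Call:
    duration_seconds: int
    expressed_intent: bool  # did the caller signal genuine purchase intent?


# Invented sample data, not taken from any real deployment.
calls = [
    Call(75, False), Call(200, True), Call(62, False),
    Call(20, False), Call(310, True), Call(95, False),
]

FEE_PER_QUALIFIED = 100  # rupees; hypothetical outcome-based fee


def count_qualified(log: list[Call], rule: Callable[[Call], bool]) -> int:
    """Count calls that satisfy a given definition of 'qualified'."""
    return sum(1 for call in log if rule(call))


loose = count_qualified(calls, lambda c: c.duration_seconds >= 60)
strict = count_qualified(calls, lambda c: c.expressed_intent)

print(f"Duration-based definition: {loose} leads, bill ₹{loose * FEE_PER_QUALIFIED}")
print(f"Intent-based definition: {strict} leads, bill ₹{strict * FEE_PER_QUALIFIED}")
```

Writing the qualification rule down as an explicit, testable predicate in the contract is one way to keep the definition from drifting.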

Model	Transparency	Predictability	Hidden Risk
Per Minute	High	Medium	Punishes long (valuable) calls; penalizes context-free overhead
Per Credit	Low	Low	Opaque unit definition; minimum commitments; expiration
Outcome-Based	Medium	Medium	Definition creep on 'qualified'; misaligned incentives

The Metric That Actually Matters

Regardless of which pricing model your vendor uses, the metric you should benchmark against is cost per meaningful interaction — defined as total spend divided by conversations that exceeded a quality threshold. The definition of "quality" depends on the use case. For sales, it might be 1+ minute of dialogue. For NPS collection, it's a completed survey.

Here's the math from two different real-world deployments on the Kathan voice OS:

Use Case 1: Enrollment Outreach (JK Shah Classes)

Component	Value
Billing rate	₹9/minute
Total spend	₹63,975
Meaningful conversations (>1 min)	2,566
Cost per meaningful interaction	₹24.93

Use Case 2: NPS Feedback (Unacademy)

Component	Value
Billing rate	₹3/minute
Total spend	₹11,963
Completed NPS responses	1,109
Cost per NPS response	₹10.79

These two deployments show why the per-minute rate alone is misleading. Unacademy paid one-third the per-minute rate of JK Shah (₹3 vs. ₹9), but the more important metric is the cost per outcome. For JK Shah, that was ₹24.93 per meaningful conversation (longer than 1 minute). For Unacademy, it was ₹10.79 per completed NPS response. The two outcomes are different, so each figure should be judged against what that outcome is worth to the business rather than against the other. A lower per-minute rate is only better if it translates to a lower cost per outcome.
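The two cost-per-outcome figures can be reproduced directly from the tables. A minimal sketch, where the function name and dictionary layout are illustrative and the spend and outcome counts are the ones reported above:

```python
def cost_per_meaningful_interaction(total_spend: float, meaningful_count: int) -> float:
    """Total spend divided by the number of interactions that met the quality bar."""
    if meaningful_count <= 0:
        raise ValueError("meaningful_count must be positive")
    return total_spend / meaningful_count


# (total spend in rupees, outcomes that met the quality threshold)
deployments = {
    "JK Shah Classes, enrollment outreach": (63_975, 2_566),
    "Unacademy, NPS feedback": (11_963, 1_109),
}

results = {
    name: cost_per_meaningful_interaction(spend, count)
    for name, (spend, count) in deployments.items()
}

for name, value in results.items():
    print(f"{name}: ₹{value:.2f} per outcome")
```

Note that the billing rate never appears in the calculation: only total spend and the outcome count matter.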

The "Context-Free Tax"

Every call made by a context-free agent includes hidden overhead: re-identification time, script redundancy, language mismatch, and missed qualification signals. This overhead inflates your effective cost per interaction regardless of the base rate, and it is especially costly when serving a diverse linguistic landscape. The Kathan voice OS is engineered to minimize it, supporting 12+ Indian languages, including Hindi, Tamil, Telugu, Gujarati, Kannada, Marathi, Bengali, Malayalam, Punjabi, Odia, Assamese, and Urdu, alongside global languages like English, Arabic, Spanish, French, Mandarin, and Japanese.

We call this the context-free tax. It's the difference between what you pay per minute and what you pay per useful minute. In a typical stateless deployment, 30–40% of connected minutes are wasted on overhead that context engineering eliminates.
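One way to put a number on the context-free tax is to divide the quoted rate by the share of minutes that are actually useful. The sketch below applies the 30–40% overhead range from the paragraph above to an assumed ₹9 per minute rate. The function name is illustrative.

```python
def rate_per_useful_minute(rate_per_minute: float, overhead_fraction: float) -> float:
    """Effective price of a useful minute when a fraction of billed time is overhead."""
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return rate_per_minute / (1 - overhead_fraction)


BASE_RATE = 9.0  # rupees per billed minute; assumed for illustration

for overhead in (0.30, 0.40):
    effective = rate_per_useful_minute(BASE_RATE, overhead)
    tax = effective - BASE_RATE
    print(f"{overhead:.0%} overhead: ₹{effective:.2f} per useful minute (tax of ₹{tax:.2f})")
```

Under these assumptions, a ₹9 rate card behaves like roughly ₹13 to ₹15 per useful minute.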

"Per-minute pricing is fine if you also track cost per meaningful interaction. The first number is what the vendor charges. The second number is what you actually pay for results."

How to Evaluate Pricing Honestly

When comparing voice AI vendors on price, ask for these four numbers — not just the rate card:

Ask For	Why It Matters
Cost per meaningful interaction from a reference deployment	Proves the vendor can deliver outcomes, not just minutes
Average call duration on connected calls	Short averages (under 30s) suggest high hang-up rates
Retarget vs. cold performance split	If retargets don't outperform cold, context isn't working
Context overhead estimate	How many seconds per call are spent on re-identification and script setup

The vendor who delivers ₹25 per meaningful interaction at ₹9/minute is a better deal than the vendor who quotes ₹5/minute but delivers ₹200 per meaningful interaction. Price per minute is the input. Cost per outcome is the output. With the Kathan voice OS (कथन), our focus is always on the output. Kathan is built in India, for the world, and our goal is to deliver meaningful outcomes at scale.
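The comparison above can be sketched as a ranking on two different keys. The vendor names and the target of 1,000 meaningful interactions are hypothetical. The rates and costs per outcome are the ones used in the example.

```python
vendors = [
    {"name": "Vendor A", "rate_per_minute": 9, "cost_per_outcome": 25},
    {"name": "Vendor B", "rate_per_minute": 5, "cost_per_outcome": 200},
]

TARGET_OUTCOMES = 1_000  # hypothetical number of meaningful interactions needed

cheapest_rate_card = min(vendors, key=lambda v: v["rate_per_minute"])
cheapest_outcome = min(vendors, key=lambda v: v["cost_per_outcome"])

print(f"Lowest rate card: {cheapest_rate_card['name']}")
print(f"Lowest cost per outcome: {cheapest_outcome['name']}")

for vendor in vendors:
    budget = vendor["cost_per_outcome"] * TARGET_OUTCOMES
    print(f"{vendor['name']}: ₹{budget:,} for {TARGET_OUTCOMES:,} meaningful interactions")
```

The two rankings disagree, which is why the rate card alone is a poor basis for choosing a vendor.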

See Alchemyst Kathan's pricing and performance data — benchmarked on real deployments, not projections.