Voice is one of the most powerful unlocks for AI application companies. It is the most frequent (and most information-dense) form of human communication, made “programmable” for the first time by AI.
For enterprises, voice AI directly replaces human labor: it’s cheaper, faster, more reliable, and often outperforms humans. Voice agents also allow businesses to be available to their customers 24/7 to answer questions, schedule appointments, or complete purchases. Customer availability and business availability no longer have to match 1:1 (ever tried to call an East Coast bank after 3 p.m. PT?); with voice agents, every business can always be online.
For consumers, we believe voice will be the first, and perhaps the primary, way people interact with AI. This interaction could take the form of an always-available companion or coach, or of democratized services, such as language learning, that were previously inaccessible.
We are just now transitioning from the infrastructure layer to the application layer of AI voice. As models improve, voice will become the wedge, not the product. We are excited about startups using a voice wedge to unlock a broader platform.
2024 was a massive year for AI voice. Since we published our last AI voice update…
Advancements in model development have streamlined the infrastructure “stack,” resulting in voice agents with lower latency and improved performance. This improvement has largely materialized in the last six months with new conversational models.
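To make “streamlined the stack” concrete, here is a minimal sketch of the two architectures. Everything below is an illustrative stub (no real SDK or model API is assumed): the older cascaded pipeline chains three models per conversational turn, while newer conversational models handle speech natively in a single hop.

```python
# Illustrative sketch only: every function is a stub standing in for a
# model call; names and stages are assumptions, not a real SDK.

def speech_to_text(audio: bytes) -> str:
    """ASR stage (stub)."""
    return "caller utterance"

def llm_respond(text: str) -> str:
    """Text LLM stage (stub)."""
    return f"agent reply to: {text}"

def text_to_speech(text: str) -> bytes:
    """TTS stage (stub)."""
    return text.encode()

def cascaded_turn(audio_in: bytes) -> bytes:
    """Old stack: three model hops per turn, each adding latency,
    with tone and emotion lost at the transcription step."""
    return text_to_speech(llm_respond(speech_to_text(audio_in)))

def speech_to_speech(audio: bytes) -> bytes:
    """Single conversational model (stub)."""
    return b"spoken agent reply"

def conversational_turn(audio_in: bytes) -> bytes:
    """New stack: one speech-native model call per turn, which is the
    main source of the recent latency and naturalness gains."""
    return speech_to_speech(audio_in)
```

Collapsing three hops into one is what cuts end-to-end latency, and it also preserves paralinguistic signal (tone, pacing, emotion) that a transcription step throws away.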
These conversational models are also becoming more affordable over time. In December 2024, OpenAI dropped the price of the GPT-4o realtime API by 60% for input (to $40/1M tokens) and 87.5% for cached input (to $2.50/1M tokens). GPT-4o mini is also now available via the realtime API.
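For a rough sense of what that means per minute of conversation: OpenAI’s original realtime API guidance worked out to roughly $0.06 per minute of audio input at the old $100/1M rate, which implies on the order of 600 audio tokens per spoken minute. Treating that token rate as an assumption, a back-of-envelope sketch:

```python
# Back-of-envelope cost per minute of audio input at the new rates.
# ASSUMPTION: ~600 audio tokens per spoken minute, inferred from
# OpenAI's earlier ~$0.06/min guidance at the original $100/1M price.
TOKENS_PER_AUDIO_MINUTE = 600
INPUT_PRICE_PER_1M = 40.00        # USD, post-December 2024 cut
CACHED_INPUT_PRICE_PER_1M = 2.50  # USD, post-December 2024 cut

def input_cost_per_minute(price_per_1m: float) -> float:
    return price_per_1m * TOKENS_PER_AUDIO_MINUTE / 1_000_000

print(f"fresh input:  ${input_cost_per_minute(INPUT_PRICE_PER_1M):.4f}/min")        # ~$0.0240
print(f"cached input: ${input_cost_per_minute(CACHED_INPUT_PRICE_PER_1M):.4f}/min")  # ~$0.0015
```

At these rates, the model cost of listening is a fraction of a cent per minute, which is part of why per-minute pricing (discussed below) is coming under pressure.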
The voice agent market exploded in H2 2024. One data point: companies building with voice represented 22% of the most recent YC class, per Cartesia.
Voice agents are also being added as a capability to more horizontal or multi-modal products.
In 2024, we saw companies at several layers of the conversational voice stack attract both funding and traction.
The most natural early categories for voice agents typically have high existing call center/BPO spend. When calls are instead taken by onshore employees as part of their standard jobs: (1) the pain point (and revenue opportunity) is typically not strong enough, unless a significant number of employees solely take or make calls; and (2) it’s difficult to quantify results and savings to “make the case.”
Each of these primary verticals (financial services, B2C, B2B, government, and healthcare) is likely to have its own core providers, similar to how each has its own systems of record.
We expect to see significant founder activity in the following categories (reach out if you’re building here!):
Outside “call center categories,” we have seen willingness to pay for AI voice agents in coaching or training use cases, largely targeted at high-salary jobs. In these industries, realistic voice agents can essentially act as “simulators” that significantly improve on-the-job performance. This can replace spend on labor (such as sales coaches) or on less effective software.
As one indicator of where early stage founders are building, we look at YC companies.
Since 2020, there have been 90 voice agent companies in YC. The pace is accelerating with each new cohort: 10 of these are in the W25 class, which has yet to be fully announced. Most of the voice agent companies from pre-2023 cohorts pivoted into the space within the past year.
YC founders building voice agents are largely concentrated in B2B- (~69%) and healthcare-focused (~18%) use cases, followed by consumer (~13%).
Within B2B, the most common sub-industries are fintech (16.9%) and ops, largely customer support (12.4%). Within healthcare, voice agents target either the front office (patient-facing) or the back office (pharmacies, insurers, etc.), focusing on general human medicine (11.2%), dental (3.4%), veterinary (2.2%), or physical therapy (1.1%).
Job interviews feel like a non-obvious early use case for voice agents, given the complexity (conducting a full interview with a human) and sensitivity (maintaining a strong candidate experience). However, we’ve seen significant early traction from several startups here — some insights below from customers:
The pain point is especially strong in staffing (43 publicly traded agencies, $650B annual revenue), which involves higher-volume, lower- to medium-skill roles (likely not the 10x engineer hire at an early stage startup). AI interviews can easily replace screening calls, or even more of the process.
"Something like 90% of the candidates we send now make it to first round [with the employer], 75-80% make it to final round. Our numbers were half that before [AI voice interviewing start-up]." —Staffing agency for Fortune 100
Many AI interview products are already performing at or above the level of a human recruiter, for a few reasons:
"The interviewee often starts gaining trust with AI in a way that they might not with the human interviewer. A recruiter may not have the experience to understand what interviewee is saying. AI can read from systems and give responses that are smarter and more engaging." —$200M annual revenue staffing agency
Many companies initially adopted a price-per-minute model, but this approach is increasingly under pressure as model costs decrease (and some customers become aware of those reductions). What will the preferred pricing model look like going forward? It will likely combine a platform fee with a usage-based component. Where does it make sense to charge for implementation or to institute minimum usage requirements?
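One way to frame the platform-fee question is a simple break-even calculation. All rates below are invented for illustration only:

```python
# Hypothetical break-even between pure per-minute pricing and a
# platform fee plus discounted usage. All rates are invented examples.
PER_MINUTE_ONLY = 0.15        # USD/min under pure usage pricing
PLATFORM_FEE = 2_000.00       # USD/month platform fee
DISCOUNTED_PER_MINUTE = 0.08  # USD/min under the platform plan

# The customer is indifferent when: fee + d * minutes == p * minutes
break_even_minutes = PLATFORM_FEE / (PER_MINUTE_ONLY - DISCOUNTED_PER_MINUTE)
print(f"Platform plan wins above ~{break_even_minutes:,.0f} minutes/month")
# ~28,571 minutes/month under these assumptions
```

Below the break-even volume, the pure per-minute plan is cheaper for the customer, which is one argument for pairing platform fees with usage minimums.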
No business or industry relies entirely on calls — email, web chat, text, etc. are important channels. How quickly should companies expand beyond calls into other modalities? Is it better to capture one workflow, end to end, or all calls first?
Many voice agents pitch the end vision of replacing the xMS (the system of record software) in their category. In which categories is this actually possible or likely? And does it matter, if many businesses already pay more to handle calls than they do for the xMS?
Many of the early voice agents we’ve seen come from highly technical teams who put in the work to learn a vertical or market after being pulled in. As the technical barriers lower, will it become more of a GTM game, where teams with less technical but deeper industry expertise are advantaged? How will this look different across verticals?
In some categories, enterprises may want to build an agent themselves using a more horizontal product, versus adopting something built for their specific market or use case. In which industries and sizes of business will this make the most sense? How can vertical products serve enterprises that operate across many verticals (and may see benefits from working with one provider)?
In many cases, AI voice agents can already outperform humans on emotional vectors. They pay better attention, are more empathetic and patient, and have (theoretically) unlimited time to spend. There are categories where this will be particularly valuable, and voice agents can help businesses build deeper relationships with their customers — but this has been relatively untapped so far. We are excited to see how founders build around this theme in the most relevant verticals.
If you're building in voice AI, I'd love to hear from you. Email me at omoore@a16z.com, or reach out on X.