ElevenLabs co-founder and CEO Mati Staniszewski says voice is becoming the next major interface for AI: the way people will increasingly interact with machines as models move beyond text and screens.
Speaking at Web Summit in Doha, Staniszewski told TechCrunch that voice models like those developed by ElevenLabs have recently moved beyond simply mimicking human speech, including emotion and intonation, to working in tandem with the reasoning capabilities of large language models. The result, he argued, is a shift in how people interact with technology.
In the years ahead, he said, "hopefully all our phones will go back in our pockets, and we will immerse ourselves in the real world around us, with voice as the mechanism that controls technology."
That vision fueled ElevenLabs' $500 million raise this week at an $11 billion valuation, and it is increasingly shared across the AI industry. OpenAI and Google have both made voice a central focus of their next-generation models, while Apple appears to be quietly building voice-adjacent, always-on technologies through acquisitions like Q.ai. As AI spreads into wearables, cars, and other new hardware, control is becoming less about tapping screens and more about speaking, making voice a key battleground for the next phase of AI development.
Iconiq Capital general partner Seth Pierrepont echoed that view onstage at Web Summit, arguing that while screens will continue to matter for gaming and entertainment, traditional input methods like keyboards are starting to feel "outdated."
And as AI systems become more agentic, Pierrepont said, the interaction itself will also change, with models gaining the guardrails, integrations, and context needed to respond with less explicit prompting from users.
Staniszewski pointed to that agentic shift as one of the biggest changes underway. Rather than spelling out every instruction, he said, future voice systems will increasingly rely on persistent memory and context built up over time, making interactions feel more natural and requiring less effort from users.
That evolution, he added, will influence how voice models are deployed. While high-quality audio models have largely lived in the cloud, Staniszewski said ElevenLabs is working toward a hybrid approach that blends cloud and on-device processing, a move aimed at supporting new hardware, including headphones and other wearables, where voice becomes a constant companion rather than a feature you decide when to engage with.
ElevenLabs is already partnering with Meta to bring its voice technology to products including Instagram and Horizon Worlds, the company's virtual reality platform. Staniszewski said he would also be open to working with Meta on its Ray-Ban smart glasses as voice-driven interfaces expand into new form factors.
But as voice becomes more persistent and embedded in everyday hardware, it opens the door to serious concerns around privacy, surveillance, and how much personal data voice-based systems will store as they move closer to users' daily lives, something companies like Google have already been accused of abusing.