The Future Of Voice Will Be Contextual Conversations And Interactive Experiences

The march of the virtual assistant towards global domination is gathering pace. A report by eMarketer said that mass adoption of virtual assistants is slowly becoming a reality. Voice-enabled device usage is forecast to show a year-on-year increase of 128.9% in 2016, with 60.5 million Americans expected to ask a virtual assistant—Alexa, Cortana et al.—to do something just through the power of their voice.

Over the next few years, there will be a significant increase in not only the number of devices available but also the voice-related capabilities of the leading platforms. Companies that have already nailed their colors to the voice-enabled mast will (unsurprisingly) dominate the market, the report said.

Google, Amazon, Microsoft and Samsung either have products available or are working with partners to build devices. Apple’s Siri is widely rumored to be getting a sleek new physical body in the not-too-distant future, which could propel the voice-activated assistant market into the stratosphere.

“Consumers are becoming increasingly comfortable with the technology, which is driving engagement,” said eMarketer’s vice president of forecasting Martín Utreras, in a blog post. “As prices decrease and functionality increases, consumers are finding more reasons to adopt these devices.”

Voice Interfaces Will Become The Norm

Voice recognition technology in the form of natural language processing has come on in leaps and bounds, so much so that devices that don’t incorporate some form of voice control or interaction will seem, well, dated.

For the moment, Amazon is the clear leader. Around 35.6 million people will use a voice-activated assistant device at least once a month in 2017, eMarketer said. And 70.6% of those people will be speaking to a member of the Echo family.

Google Home, which does have the advantage of a superior search tool in Google Assistant, will account for 23.8% of the market by the end of the year. The remaining 5.6% is likely to be shared among a variety of manufacturers that are starting their own voice-activated journey.

What is beyond question is that in the not-too-distant future voice user interfaces will become the norm. The big names in 2017 (Alexa, Google Assistant, Microsoft’s Cortana, Siri and Samsung’s butler-sounding Bixby) may be leading the field now, but there is a consensus that the time is right for voice to take center stage.

The challenge will be to provide people with voice-enabled experiences that do more than just relay news, play music or provide weather and traffic information.

According to Boston-based Earplay’s CEO Jon Myers, the mass adoption of voice-activated devices will require simulated conversations and narrative entertainment platforms. And, in Myers’s view, the increased focus on voice-activated skills, arguably kickstarted by Alexa, will validate a market that seems to have come out of nowhere.

“It’s not about the technology, it’s about user experience and design,” said Myers. “A lot of the technology behind speech recognition and natural language processing has been around for quite a while. It’s not as if the tech has taken a sudden leap that has made it so much better and everybody wants it … it’s the fact that a company like Amazon spent a lot of time making sure that the user experience was solid.”

Contextual Conversations Will Increase Adoption

The twin pillars of user experience and design are validated by the hype that currently surrounds voice-based tech.

At Google’s I/O 2017 developer conference in Mountain View, voice-activated interfaces were front and center. By the same token, the Amazon Echo now comes in a variety of shapes and sizes, all of which bring voice control to the forefront of device engagement. And let’s not forget that other leading brands are looking to jump onto the voice bandwagon.

Myers cites a change in consumer attitude to speaking out loud as one aspect that has made voice such a hot topic. On a psychological level a person can get the information they want in an intuitive and fluid way, which enhances the overall user experience. For that reason alone, companies should be looking to use contextual conversations as a basis for engagement or fun.

On the flip side, many of the skills or actions available via existing devices do require a certain degree of structure to get the right answer or provide a worthwhile experience. For example, the Earplay skill provides people with interactive storytelling content, a sort of choose-your-own-adventure that echoes the halcyon days of radio dramas. The difference is that the skill allows people to play an active role in a dynamic storyline … just by using their voice.

“Some people see us as a game … and that’s cool,” Myers said. “Early on we were trying to work out where we fitted into the app store … are we an audiobook or a game? Eventually we gave up trying to pigeonhole ourselves and said that we were voice-driven interactive audio entertainment. We invented our own medium!”

In some ways, the need for contextual capabilities in voice-activated devices mirrors the trend for chatbots. Real-time conversations with digital assistants are at the top of the list for many brands so it makes perfect sense that contextual engagement would play a role in expanding the benefits of voice interaction.

“It depends on what the company’s goals are,” Myers said. “If they’d like to enter into the space anew with their own product, then they should be aware that voice experiences are deceptively easy to construct and prototype. What follows is a difficult path and a lot of work from prototype to a high quality experience that will do well upon release. In our experience, the design often requires more time than the engineering.”

People Can Do Cool Things With Their Voice

With that in mind, there are numerous signs that voice-based interactions are about to hit the next level.

Towards the end of 2016, Google opened its “Actions on Google” platform for developers with the sole intention of giving people the ability to have a two-way dialog with its virtual assistant. Amazon Web Services has made the AI technology behind Alexa available to anybody who wants to build a conversational interface. And you don’t need a crystal ball to realize that voice can provide value to any brand that wants to “talk” to its customers.

In what should not be a shock to anyone, voice-based computing is also on the radar of venture capitalists.

Take Voicecamp. The New York-based accelerator is running an 11-week funded program for eight startups that are building conversational interfaces as part of a voice-controlled ecosystem. The program, which includes Earplay, is intended to bring developers and platforms together, with the overriding aim of making voice-based computing a “natural and frictionless end user experience.”

All of this is good news for developers and people who see voice as a perfect conduit to merge the worlds of digital and physical experience. Smartphones have ruled the roost when it comes to brand engagement on a device, but they still require people to hold things, press buttons and scroll to get the results they want. Just speaking into thin air is so much easier and (in theory) less time consuming.

“Technology in the voice space has been regularly improving for several years, but suddenly voice is emerging everywhere,” said Myers. “The primary catalyst is not any particular breakthrough in technology or a new way that we’re processing speech. Instead, it’s the monthly breakthroughs in how we enable people to use their voice to do cool new things.”


David Bolton

Former ARC Writer

Published On: May 23, 2017

