Multimodal Will Change How We View Voice

Demand for voice is surging as consumers increasingly find the medium a more natural way of experiencing products and services in their daily lives. Consumers are now completing common tasks like getting news updates, checking upcoming appointments, and placing takeout orders via voice. As they rise in popularity, voice experiences have the potential to drive increased engagement, loyalty, and even purchase behavior.

Even with voice becoming a part of consumers’ typical routines, many interactions can be improved by the addition of visuals. That is why multimodal experiences are becoming so important in connecting companies and consumers.

Multimodal experiences combine multiple interfaces (e.g., visual and voice) to provide a more holistic and natural interaction. While devices like the Amazon Echo Show and Google Home Hub immediately spring to mind, multimodal experiences have been around for far longer, and a device doesn’t need a screen to be considered multimodal. A visual feature like Alexa’s LED ring, for example, can be an effective way to communicate whether a device is listening, processing, or offline. More generally, visual elements such as images, videos, or LED signals can often convey information more succinctly, or advance the interaction more efficiently, than voice alone.

Delivering multimodal experiences brings plenty of benefits. Think about something as simple as checking the weather: it is far more efficient to state the current day’s forecast aloud and display graphics showing the entire week than to read through the details for each day individually. In fact, most voice apps would benefit from a visual interface.

A recent survey from Walker Sands revealed that nearly half of consumers (49%) are not willing to buy luxury items or food and groceries using a voice assistant. Forty-seven percent of respondents similarly ruled out buying furniture through voice-only experiences. Additional research from comScore helps explain why consumers may be unwilling to buy these types of products through voice assistants. While security remains a major deterrent to voice commerce, other concerns, such as the inability to view product details or compare products, are easily solved by adding a visual element. Even the potential for misheard or incorrect orders can be minimized with a graphical interface, as shoppers could quickly review orders before completing their purchase.

Though multimodal experiences offer their fair share of benefits to brands, their design adds another layer of complexity, particularly when it comes to testing. Multimodal experiences need to be consistent across devices even when moving between totally disparate environments, such as going from a smart speaker to a car’s infotainment system. In-lab testing simply isn’t robust enough to cover all the bases. Moreover, while testing websites or mobile apps is fairly standardized, there is much more variability with voice experiences. The only way to put these experiences through their paces is by testing in a wide range of real-world scenarios, including different combinations of users, devices, voices, languages, and dialects, all on a global scale.

As more brands see the value of multimodal experiences and start designing for them, quality will become the differentiator between the brands that win and those that lose. Learn more about the key design concerns for multimodal experiences in our new eBook.


Emerson Sklar

Tech Evangelist and Solution Architect

Published On: April 4, 2019

Reading Time: 3 min
