Last year it was the robots; now avatar-enabled voice assistants are coming
CES, the Consumer Electronics Show, remains one of the best places in the world to see what’s new today. More and more tech companies join the fray, showing off digital and AI consumer products on the show floor. Increasingly, CES has also become a place for companies that sell to other companies rather than directly to consumers. As a result, there are now many invite-only side events where one-to-one meetings between decision makers at companies are held non-stop for four days. It’s very efficient.
This mix of deep tech and business-to-business activity makes CES a great event for getting a reading on technology trends that are not quite obvious yet but are about to hit the limelight. Last year, I noticed a lot more social robots on show and predicted that 2024 would see a big boom in social robotics. In particular, I argued that there were now far more robotics companies built to solve a specific end-user pain point, whereas previous generations of robotics companies went to market with a generic platform and left it to someone else to figure out how to turn it into a commercially viable product that actually adds value to people’s lives. And I dare say that is exactly what happened!
So, having walked the many, many miles of CES early this January, here’s one big observation I’m making for 2025: visual avatars will start to complement the many, many virtual assistants that are currently voice-only. Or, put differently: make way, voice assistants, the visual show-stoppers are here to grab people’s attention!
On the CES show floor, perhaps the clearest signal of this came from the automotive Tier 1 suppliers and OEMs who are creating avatar-enabled voice assistants. Renault created a joyous little avatar aptly named Reno. And Harman showed off a cute mouse-like avatar called Luna, with the stated goal of creating intelligent, empathetic interactions. You can just imagine these avatars appearing as holograms in the future. Wouldn’t that be cool? There is of course some concern that a visual avatar in a car will be a distraction, but viewed in the context of ever-increasing self-driving capabilities, I doubt that will be a barrier to adoption even in the medium term. Most luxury cars already come with a voice assistant as standard, and their makers will now have to reassess whether to make it avatar-enabled too.
Where it gets really interesting, and with an impact way bigger than just cars, is in the behind-the-scenes conversations that are going on. LLMs have made it so easy to create a virtual assistant that a lot of brain-power and design thinking has shifted from ‘How can I create a basic web assistant answering simple questions?’ to ‘How can I create a strong, empathetic bond with my customers?’. And one of the answers that keeps coming up is to use avatars. Basically, every single place where you currently have a text- or voice-based AI assistant is up for grabs. And that is a very, very big market.
Adding avatars that are socially and emotionally aware will not happen everywhere, as doing so comes with cost and development risk. Building an empathetic relationship matters more in some areas than in others. Where trust is paramount for successful value delivery, fully empathetic avatar-enabled voice assistants may well be the way forward: think healthcare, life insurance, debt management, mortgage advice, and so on. Take Hippocratic AI as a healthcare example. Today, they have an amazing range of voice assistants that handle tasks like post-care check-ins, care transitions, and lifestyle counselling. These are sensitive tasks. And while patients say the assistants perform these tasks even better than real nurses do, they don’t feel the assistants listen to them as well. And that’s where an empathetic avatar comes in.
It’s well known that humans are visually oriented creatures, so it should be no surprise that a virtual assistant with a visual representation (technically called an ‘embodied conversational agent’, or ECA) is superior to a voice-only solution. But it’s not just that avatars grab attention. A virtual face opens up a whole new channel of communication: facial expressions of emotion, social signals indicating intention or confusion, and cues that support turn-taking for more fluent conversations. Having a visual nexus of attention also makes camera-based analysis of the user’s facial expressions easier. While my Alexa can hear me from the other room, simply adding a camera to it doesn’t mean it can always see me. A visible avatar that the user naturally turns towards addresses that issue.
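To make that last point concrete, here is a minimal sketch of what the extra channel could look like in practice. It assumes a webcam co-located with the avatar’s display and uses OpenCV’s stock face detector; the estimate_expression() and avatar_respond() functions are hypothetical placeholders standing in for an expression-analysis model and an avatar renderer, not any particular product’s API.

```python
# Minimal sketch: because the user naturally faces the avatar, a camera
# co-located with the display can watch the user's face and feed
# expression estimates into the interaction loop.
# Requires: pip install opencv-python
import cv2

def estimate_expression(face_img) -> str:
    """Hypothetical placeholder: a real expression-analysis SDK would
    return an expression label or valence/arousal from the face crop."""
    return "neutral"

def avatar_respond(expression: str) -> None:
    """Hypothetical placeholder: drive the avatar's own behaviour,
    e.g. mirror a smile or show concern when the user looks confused."""
    print(f"user looks {expression}; adjusting avatar behaviour")

# OpenCV's bundled frontal-face Haar cascade
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
cap = cv2.VideoCapture(0)  # the camera next to the avatar's display

try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            # Crop the detected face and pass it to the expression analyser
            avatar_respond(estimate_expression(frame[y:y + h, x:x + w]))
        cv2.imshow("avatar camera view", frame)
        if cv2.waitKey(30) & 0xFF == 27:  # press Esc to quit
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```

In a real deployment the expression estimates would of course come from a dedicated model or SDK and feed into the dialogue manager rather than a print statement, but the loop structure, camera in, expression out, avatar behaviour adjusted, stays the same.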
Over the last two years, we’ve all been amazed by what Large Language Models like the one behind ChatGPT can do. I think it’s only the start. The future of AI is fully natural multimodal interactivity, going well beyond voice. And as THE experts in facial expression analysis, we are super excited by that!
BLUESKEYE AI provides technology to measure people’s social, emotional, and medically relevant face and voice behaviour. Its product B-Social is an SDK designed specifically to help social robots and embodied conversational agents create better interactions.
Are you thinking of integrating an avatar-enabled virtual assistant into your solution but don’t know exactly where to start or which technology components you need? Request a demo and we’ll get in touch; we’ve got a wealth of information on this topic!