When Alexa reads a child a story… in the voice of his deceased grandmother

In recent years, Alexa, the artificial intelligence that powers the e-commerce giant Amazon’s connected devices, has become a fixture in the daily lives of its users. Alexa receives more than a billion requests per week, according to Amazon, which counts more than 100,000 Alexa “skills”. The American giant used its re:MARS conference, dedicated to machine learning, automation, robotics and space, to present a demonstration that was disturbing, to say the least.

The demonstration features a boy talking to an Amazon Echo speaker. “Alexa,” asks the boy, “can grandma finish reading me The Wizard of Oz?” A woman’s voice then begins to speak. You guessed it: it is the voice of the child’s deceased grandmother.

“What surprised me the most about Alexa is the companionship relationship we have with it,” says Rohit Prasad, SVP and Chief Scientist of Alexa AI. “The human attributes of empathy and affect are key to building trust. They have become even more important in these pandemic times, when so many of us have lost loved ones. While AI can’t take away the pain of loss, it certainly can make the memories last.” Amazon has not given an exact availability date for this feature.

An intrusive AI

Many questions have already been raised about the ethics of reproducing a real person’s voice. Interviewed by ZDNet, Amazon official Nate Michel explained, however, that the technology is still “exploratory at this stage”. Generating such a voice is indeed a technical challenge, as Rohit Prasad explains, because a high-quality voice has to be produced from less than a minute of recording, rather than from hours of studio recording. The Alexa AI teams took on the challenge by treating it as a voice conversion task rather than a speech generation task.

To make Alexa even more human, Rohit Prasad explains that Amazon is building generalizable intelligence into the tool. Generalizable intelligence includes three key attributes: learning across many different tasks, continuously adapting to user environments, and learning new concepts through self-supervision.

Amazon is working on approaches such as “think before you speak”, in which Alexa uses “common sense tacit knowledge” (built using a large language model and a common sense knowledge graph) to generate responses to a user. For example, if a customer says on Valentine’s Day, “Alexa, I want to buy flowers for my wife”, Alexa could use its knowledge of the world to advise the user by replying, “maybe you should get her some red roses”. That, too, would no doubt be a disturbing experience.

