This AI can imitate your voice from a few seconds of recording, and it’s scary

OpenAI has unveiled a revolutionary new capability: the ability to create highly realistic synthetic voices by training on just 15 seconds of audio recording.


OpenAI has just lifted the veil on Voice Engine, a new AI that can imitate any voice after listening to it for just 15 seconds. The Voice Engine system is not entirely new: OpenAI first developed it at the end of 2022, and a first version already powers the speech synthesis features of the company’s popular ChatGPT assistant. However, this is the first time OpenAI has spoken about it publicly.

As described in a recent OpenAI blog post, Voice Engine allows users to create stunningly realistic synthetic voices that can read any provided text “in an emotive and realistic way.” The company has shared some examples of voice clones, which demonstrate an impressive naturalness, although there is still a slightly artificial edge to some of them.

OpenAI wants to revolutionize the market with Voice Engine

OpenAI cites several promising real-world use cases for the technology, such as educational tools, translation of podcasts into new languages, access to remote communities and even communication assistance for non-verbal people. The company has already launched “a small-scale preview” with selected partners who received early access.

Age of Learning, an education company, used Voice Engine to generate scripted voiceovers, while AI visual storytelling app HeyGen gives users the ability to create fluent translations of audio with the voice and accent of the original speaker.

The most striking example is undoubtedly that of researchers who were able to “restore the voice” of a young woman who lost the ability to speak following a brain tumor, by training Voice Engine on just 15 seconds of an old recording.

OpenAI is already warning of the dangers of such technology

However, despite these potentially revolutionary use cases, OpenAI is taking a deliberately cautious stance on wider release of the voice cloning system. The company cites the urgent need to guard against misuse of the technology for malicious purposes, such as spreading false information and cloning voices without the speaker’s consent.

It is easy to imagine some people spreading fake messages attributed to famous personalities on social networks. Scammers have also already used AI to imitate people’s loved ones and ask for money over the phone.

The implications of using voice cloning AI for disinformation campaigns are particularly significant given major elections in the US and UK this year. As generative AI tools become more and more sophisticated in the areas of audio, text, images and video, it is increasingly difficult to distinguish real content from artificial content. For example, we recently saw Sora, another AI from OpenAI that can generate very realistic videos in no time.

OpenAI acknowledged that it is essential to start building “societal resilience” in the face of the challenges posed by these technologies. It encouraged measures such as phasing out voice authentication for sensitive accounts, and called for policies to protect individuals’ voices and for educating the public about the capabilities of AI.

Currently, all Voice Engine samples created by OpenAI partners are digitally watermarked to help trace their origin. The company also said it requires explicit consent from the original speaker and does not allow the recreation of political candidates’ voices during election periods.
