Telecommuting by videoconference from the wreck of the Titanic


A researcher has just brought videoconferencing technology to one of the most remote places on Earth: the wreck of the RMS Titanic, which lies on the seabed 4,000 meters below the surface.

“It’s as if we can now make video conferences from the abyss,” says Alex Waibel, a researcher at Carnegie Mellon University and the Karlsruhe Institute of Technology.


Radio signals do not work well underwater

Alex Waibel is an expert in text-to-speech technology. Currently, the only way for researchers exploring the wreck of the Titanic – or other deep-sea wrecks – from aboard submersibles to communicate with the surface is to send text messages via sonar.

Radio signals do not travel well underwater – a communication problem that scientists have been devising solutions for since the Second World War.

During a recent expedition with OceanGate Expeditions, Alex Waibel narrated his dive and used speech recognition technology to convert what he said into transmittable text messages. On the surface, technology developed by the researcher and his team then resynthesized those raw text messages into video using artificial intelligence.
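The pipeline described above – speech recognized as text on the submersible, sent over a narrow sonar link, then resynthesized topside – can be sketched roughly as follows. This is a hypothetical illustration only: the function names, message format, and stubbed model stages are invented for clarity and are not the actual system.

```python
# Hypothetical sketch of the low-bandwidth speech pipeline described above.
# The real system uses trained ASR and video-synthesis models; here each
# stage is stubbed so the data flow is clear.

def recognize_speech(audio_label: str) -> str:
    """Stand-in for on-board speech recognition (audio -> text)."""
    # A real ASR model would transcribe the audio; we pretend it already did.
    transcripts = {"dive_log_001": "approaching the bow section now"}
    return transcripts[audio_label]

def send_via_sonar(text: str) -> bytes:
    """Encode the transcript as a small payload for the acoustic link."""
    return text.encode("utf-8")  # tens of bytes instead of megabytes of audio

def resynthesize(payload: bytes, speaker: str) -> dict:
    """Surface side: turn the text back into voice and lip-synced video."""
    text = payload.decode("utf-8")
    # A real system would drive TTS and a talking-head model here.
    return {"speaker": speaker, "spoken_text": text,
            "video": f"<frames for '{text}'>"}

message = send_via_sonar(recognize_speech("dive_log_001"))
result = resynthesize(message, speaker="Alex Waibel")
print(len(message), "bytes sent over sonar")
print(result["spoken_text"])
```

The point of the design is that only the text crosses the bandwidth-starved link; everything expensive (voice and video synthesis) happens on the surface, where bandwidth and compute are plentiful.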

Towards a consumer use case

The result is a near real-time video that uses Alex Waibel's voice, his lips moving in sync with the words. These efforts aim to facilitate natural communication in extreme environments, but could also have potential for the general public. The researcher, a research fellow at Zoom, advises the company on AI research and language technology development.

“By interpreting and recreating natural voice communication, we are trying to reduce the workload of scientists and pilots in such missions in a natural way, despite the challenges imposed by water, operational stress, conversational dialogue and poor acoustic conditions,” Alex Waibel told CMU’s Aaron Aupperlee.

The voice recognition market, more generally, is entering an accelerated phase of development and adoption across a number of key sectors. Alex Waibel's work builds on this trend with a delivery mechanism that uses a low-bandwidth channel (in this case, sonar) to deliver full, albeit synthesized, video to the end user.
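A rough back-of-the-envelope comparison shows why text is the right payload for an acoustic link. The figures below are assumed, order-of-magnitude values for illustration (acoustic modems typically offer on the order of kilobits per second, while even a modest compressed video stream needs hundreds of kilobits per second); they are not specifications of the equipment used on the expedition.

```python
# Illustrative bandwidth comparison (assumed, order-of-magnitude figures).
sonar_link_bps = 5_000       # a few kbit/s is typical for acoustic modems
video_stream_bps = 500_000   # a modest compressed video stream

# A spoken transcript: ~150 words/min at ~6 bytes/word,
# expressed as bits per second of speech.
text_bps = 150 / 60 * 6 * 8  # = 120 bits/s

print(f"text fits the sonar link:  {text_bps <= sonar_link_bps}")
print(f"video fits the sonar link: {video_stream_bps <= sonar_link_bps}")
```

Under these assumptions the transcript needs roughly three orders of magnitude less bandwidth than the video it is reconstructed into, which is the entire premise of synthesizing the video on the receiving end.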

The technology uses a synthesized voice that sounds like the speaker, building on advances in AI-powered text-to-speech technology. Another potential application is rapid translation from one language to another, where the end user sees a video in a language they understand but the speaker does not know.

Source: ZDNet.com




