Laser shark, Mona Lisa yawning: Google impresses with VideoPoet


Google has demonstrated a new generative AI. This time, it is about creating short two-second videos with great flexibility.

Not a day goes by without a new generative AI being demonstrated. This time it is Google's turn, with a project called VideoPoet whose first demonstrations are impressing Internet users. This generative AI focuses on producing short 2-second videos in vertical format. VideoPoet's strength is its ability to adapt to the user's needs: it can create videos from simple text, but also from an image or another video, and it can even add audio to a video.

The difficult terrain of video generation

While text and image generation have already proven themselves with ChatGPT and MidJourney, generating video is a little more complicated. On paper, a video is nothing more than a succession of images, but in practice each image must be consistent with the previous one and the next one to create a logical scene. This is the whole difficulty for AIs: they are good at creating images that are very different from each other, but not at making logical micro-adjustments to an image that has already been drawn. This is why, at this stage, AI video generation is often limited to very short videos. In the case of Google VideoPoet, that means no more than 2 seconds.

But the demos presented by Google remain impressive, especially since the firm can generate video without providing any base image to the AI. We thus discover “a shark which shoots a laser coming from its mouth” or “an origami fox which walks in a forest”. Perhaps the most amusing example is “a woman who yawns”, a prompt paired with the image of the Mona Lisa.

The audio generation is also quite impressive. The AI is given a video without sound and is then able to generate an audio track corresponding to what it understands of the scene, without the slightest help from text. Given a video of a cat playing the piano, the AI adds a few notes played on the piano. Given a steam train moving on rails, it adds the characteristic noise of such a vehicle.

To round off its show of force, Google had fun pairing its new VideoPoet AI with Google Bard. Bard generated a script in the form of a series of prompts to send to VideoPoet, which created around thirty 2-second videos. The firm then stitched the videos together to try to tell a 60-second story.

This demonstration also highlights the limits of this AI. We move from one scene to another without much consistency, and the generated images are not at the level of MidJourney's latest still images.

Still, video generation seems to be the next level for AI champions to reach, and Google seems to have taken a small lead in the field with VideoPoet. Looking further ahead, we can only imagine what this type of generative AI will look like when combined with YouTube or platforms like TikTok.

