By tampering with the Gemini demo, Google shot itself in the foot


Twenty-four hours after the announcement of the Gemini multimodal language model, what went on behind the scenes of the demo video released by Google is starting to come to light. Google recorded fake questions to make it appear as if Gemini was guessing the context. In reality, the questions Google actually asked steered the AI to the right answers.

Did Google lie about the intelligence of Gemini, its new language model presented as superior to OpenAI’s GPT-4? While nothing today calls the web giant’s claims into question, it is clear that Google made a mistake in the way it presented things.

In a 6-minute, 23-second video, released on December 6, 2023 and widely shared in the media and on social networks, Google presented Gemini as a super-AI in the style of Jarvis (Iron Man), with an incredible ability to understand its surroundings. Gemini seemed capable of seeing the world in real time and commenting on it, with logic worthy of a human being. Its human interlocutor made do with brief spoken prompts each time, which seemed sufficient for the AI to guess what was being asked and provide complete answers. In reality, Gemini will probably never behave like this.

As several Internet users have noticed, Google’s own site indicates that several elements of the demonstration video were altered, making Gemini appear to be something it is not. There’s nothing really serious here (in the end, Gemini really is capable of responding as it does in the video), but manipulating this demo risks doing Google a disservice.

Gemini cannot comment on the world in real time

In the Google video, the Gemini Ultra language model, which will be available in early 2024, is presented in its multimodal form. It seems able to listen to what is said to it, see the world in real time and respond with text spoken by a synthetic voice. The super-AI of science fiction films finally seems ready, with the potential to revolutionize the world. Google also presents Gemini as an “ally” rather than software.

In reality, the multimodality of Gemini Ultra has nothing to do with Google’s demonstration.

A bit like Google Bard, the Gemini Ultra shown in the demonstration is a conversational agent that accepts text and image requests. Google never asked it any questions aloud; everything was typed. As for the content to analyze, Gemini Ultra did not see it live, but was simply sent photos… Tasks that ChatGPT can also handle. Google also indicates that it tightened up the edit by shortening response times. In other words, Gemini takes time to formulate its text. The synthetic voice was also added manually.

In the description of the video on YouTube, Google indicates: “For the purposes of this demonstration, latency has been reduced and Gemini responses have been shortened for brevity”.

In this example, Google pretended that Gemini was commenting on a drawing in real time. In fact, it was sent a new photo at each stage of the drawing. // Source: Google

Even more problematic: the prompts were manipulated. When Google claims to ask Gemini “Which car goes the fastest?”, the reality is that it was asked: “Based on the aerodynamics of these cars, which car will go faster between the one on the left and the one on the right? Explain why and detail your answer”. This hint allowed it to give a complete answer that mentioned aerodynamics, but the answer was not spontaneous.

Another example: when the humans play the cup game, Gemini cannot see the movements. It was told in text that “cup 1 has taken the place of cup 2”. The sequence on the order of the celestial bodies in the solar system is also misleading: Google didn’t just ask “is this the right order” but “is this the correct order taking into account the distance from the Sun. Explain your reasoning”. The demo no longer bears much relation to what actually took place.

Gemini’s answers are real, but the questions are fake. Google helped its AI appear intelligent. // Source: Google

In a blog post published on its developer site, Google details some of the prompts sent to Gemini. The company confirms that its demonstration consists of a text chat with images. The requests are much longer than those spoken in the video, which helped Gemini provide answers that seem very intelligent. Google’s goal was undoubtedly to make ChatGPT look outdated… at the risk of presenting a situation that does not exist.

In short, as it stands, Gemini Ultra is not Jarvis. It’s just a fancier version of Google Bard, with excellent image understanding.

Google sends the wrong message

Should we conclude from this that Gemini is not as intelligent as advertised? The answer is no. The responses from the language model are stunning and suggest that Google has probably caught up with OpenAI, after a complicated year during which Google often gave the impression of having been caught completely off guard by the creator of ChatGPT. Gemini appears to be a major step into the future, with the potential to add contextual understanding to Google’s services.

Despite this, it is impossible not to see a major error in Google’s communication. After impressing the world, the Californian company is now disappointing it. Gemini Ultra seems incapable of carrying out new kinds of tasks. Ultimately, it works similarly to ChatGPT with its Vision module (which allows it to see images). It’s unfortunate to give the impression that Gemini is capable of watching live feeds and speaking with a human in real time, since the demo video ultimately does a disservice to Google’s true technological progress.

With its ChatGPT Voice feature, OpenAI is closer to the Gemini demo than Google is. ChatGPT Voice reads ChatGPT responses aloud with a synthetic voice and makes it seem like the AI can speak. Google could probably do the same one day, but Gemini is incapable of doing so today. Why make people believe otherwise?

In 2024, the tech world should keep competing to advance generative artificial intelligence. Google is positioning itself more than ever as OpenAI’s strongest rival. But it now needs to develop concrete, genuine use cases that people will actually want (such as the integration of a lite version of Gemini into Pixel smartphones, with local processing). In this area, OpenAI has taken the lead.

