Gemini: Google lied to you about its capabilities!


Naim Bada

December 17, 2023 at 2:30 p.m.


Summary
  • A manufactured demo and a rushed launch?
  • Bard with Gemini, yet another taste of the artificial intelligence promised by Google

Sundar Pichai unveils Gemini at the 2023 Google I/O conference © Google

After almost a year of rumors, Gemini is finally here! Announced as the killer of GPT-4 and OpenAI, Bard’s new model still has work to do.

Without warning, Google announced and released its new model, Gemini, this week. Gemini’s performance in benchmarks is particularly impressive. According to Google, Gemini Pro outperformed GPT-3.5 in the majority of tests, while Gemini Ultra outperformed GPT-4, OpenAI’s most advanced model, in almost every area evaluated. These results suggest that Gemini could soon become a dominant player in the LLM space, but what is the reality?

A manufactured demo and a rushed launch?

Google’s recent Gemini presentation was met with a mixture of astonishment and skepticism. Eager to position itself as a leader in the field, Google unveiled Gemini as its most advanced artificial intelligence model. However, revelations that the demo was manipulated have raised crucial questions about Google’s integrity and transparency in the AI race.


Google’s announcement of Gemini was a moment of apparent triumph. This model, touted as a major breakthrough, is designed to understand and combine various types of information, including text, images and videos. Its ability to simultaneously process multimodal data positioned it as a major innovation, potentially superior to competing models such as OpenAI’s GPT-4.

However, the excitement quickly died down when reviews and analyses revealed that Gemini’s impressive demo was largely staged. According to reports, Google admitted to shortening Gemini’s responses and reducing latency in its demo video to make it more engaging. These manipulations have raised concerns about how accurately the video represents Gemini’s real abilities.

The impact of these revelations is significant. They call into question not only Gemini’s ability to perform the tasks demonstrated, but also Google’s credibility in presenting its technological advances. In an industry where trust and reliability are paramount, these actions could affect the company’s reputation.

These developments come at a time when competition in the AI field is fiercer than ever. Companies like OpenAI have gotten a head start with models like GPT-4, which have earned the trust and admiration of the public as well as the scientific community. With Gemini, Google seemed ready to join this race as a serious contender.

It is crucial to note that, despite these setbacks, Gemini still represents a potential step forward in the world of AI. Its multimodal capabilities and innovative approach deserve further recognition and exploration. However, for Gemini to fully realize its potential and gain market trust, Google must engage in more transparent and authentic communication.

What about the benchmarks presented by Google?

Gemini’s performance on MMLU, a key benchmark for large language models, has been questioned. According to critics, Gemini Ultra outperformed GPT-4 on this benchmark only under a methodology called CoT@32. This method differs from the standard 5-shot approach, under which GPT-4 keeps the lead with a score of 86.4% compared to 83.7% for Gemini.

The 5-shot methodology, widely recognized as the standard for this type of benchmark, consists of prefixing the prompt with five solved examples. Google, however, is said to have devised a different approach around CoT@32 to claim Gemini’s superiority. This method samples 32 chain-of-thought answers and keeps the majority answer only when agreement exceeds a consensus threshold, and otherwise falls back on the maximum-likelihood answer. It appears optimized for specific outcomes rather than real-world application.
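To make the difference between the two methodologies concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Google's evaluation code: the function names, the prompt layout and the 0.6 consensus threshold are all hypothetical, since the article does not give the actual threshold or prompt format.

```python
from collections import Counter


def build_few_shot_prompt(examples, question, k=5):
    """Prefix the question with the first k solved examples (k=5 gives a 5-shot prompt).

    The "Q:/A:" layout is an assumed format chosen for illustration.
    """
    shots = [f"Q: {q}\nA: {a}" for q, a in examples[:k]]
    return "\n\n".join(shots + [f"Q: {question}\nA:"])


def uncertainty_routed_answer(sampled_answers, greedy_answer, threshold=0.6):
    """Pick a final answer from several chain-of-thought samples.

    If the most common sampled answer reaches the consensus threshold,
    return it (majority vote). Otherwise fall back to the greedy,
    maximum-likelihood answer. The threshold value is an assumption.
    """
    if not sampled_answers:
        return greedy_answer
    answer, votes = Counter(sampled_answers).most_common(1)[0]
    if votes / len(sampled_answers) >= threshold:
        return answer
    return greedy_answer


# 32 samples with a clear consensus: the majority answer is kept.
confident = ["B"] * 20 + ["C"] * 8 + ["A"] * 4
print(uncertainty_routed_answer(confident, greedy_answer="C"))

# 32 samples with no consensus: the greedy answer is used instead.
split = ["B"] * 12 + ["C"] * 11 + ["A"] * 9
print(uncertainty_routed_answer(split, greedy_answer="C"))
```

The point of the sketch is the cost asymmetry: a 5-shot evaluation needs a single model call per question, while a CoT@32-style evaluation needs 32 sampled calls plus a greedy one, which is not how most users query a chatbot.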

The benchmarks published by Google in their white paper cover several areas © Google

Real-world use of LLMs does not resemble the CoT@32 methodology, which raises doubts about the practical applicability of Gemini compared to GPT-4. Criticism has emerged on social media platforms, with users expressing disappointment at what they perceive to be “misleading” promotion of Gemini. These critiques highlight the importance of transparency and a standardized methodology in the presentation of AI benchmarks. Another important point: the benchmarks were run against the June 2023 version of GPT-4. In the meantime, the much more efficient GPT-4 Turbo has been released…

Bard with Gemini, yet another taste of the artificial intelligence promised by Google

To summarize: the AI revolution promised by Google when it acquired DeepMind is still not here. Bard already looked like a rushed project when it launched earlier this year. Many expected Gemini to be the GPT killer that would finally offer credible competition to OpenAI, but it is clear that Google is not there yet, even if it is getting closer! In 2024, there is no doubt that Sundar Pichai’s firm will overtake its rival, especially given the current context at OpenAI.

In practice, the Gemini Pro that Google offers in Bard performs close to GPT-3.5. Add to that the integration of Google services and its persistent connection to the web, and we get an attractive package for free ChatGPT users (and users of some GPT-3.5-based services like Perplexity or ChatSonic).

Google Bard

  • A powerful generation model
  • A knowledge base updated in real time
  • Free and integrated into the Google ecosystem

Google Bard stands out as an AI chatbot intrinsically connected to the web, from which it draws the majority of its knowledge. The main advantage of the service is that it is completely free and offers image recognition. Its gradual integration into the Google ecosystem should make it a highly capable chatbot for a variety of tasks.



