What is Gemini? Everything you need to know about Google’s powerful new AI


Google has just released its most powerful AI model yet. But what can it do, and above all, what can it do for you? Here is everything you need to know about an announcement that, thanks to its multimodal model, challenges OpenAI’s leadership in generative AI.

What is Google Gemini?

Gemini is a new and powerful artificial intelligence model from Google that can understand not only text, but also images, videos and audio. As a multimodal model, Gemini is described as capable of performing complex tasks in mathematics, physics and other fields, as well as understanding and generating high-quality code in various programming languages.

It is currently available through integrations with Google Bard and the Google Pixel 8 smartphone and will gradually be integrated into other Google services.

“Gemini is the result of a large-scale collaboration between teams at Google, including our colleagues at Google Research,” according to Demis Hassabis, CEO and co-founder of Google DeepMind. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information, including text, code, audio, images and video.”

Who created Gemini?

Gemini was created by Google and Alphabet, Google’s parent company, and touted as the company’s most advanced AI model to date. Google DeepMind also contributed significantly to the development of Gemini.

Google describes Gemini as a flexible model that can work across everything from Google data centers to mobile devices. To achieve this scalability, Gemini is available in three sizes: Gemini Nano, Gemini Pro and Gemini Ultra.

  • Gemini Nano: The smallest of the three, Gemini Nano is designed to run on smartphones, particularly the Google Pixel 8. It is built for on-device tasks that require efficient AI processing without connecting to external servers, such as suggesting replies in chat applications or summarizing a text.
  • Gemini Pro: Running on Google’s data centers, Gemini Pro is designed to power the latest version of the company’s AI chatbot, Bard. It is capable of providing fast response times and understanding complex queries.
  • Gemini Ultra: Although not yet available for widespread use, Google describes Gemini Ultra as its highest-performing model, exceeding “current state-of-the-art results on 30 of the 32 widely used academic benchmarks” in large language model (LLM) research and development. It is designed for highly complex tasks and is expected to roll out once its current testing phase is complete.

How to access Gemini?

Gemini is now available in its Nano and Pro sizes in Google products, namely the Pixel 8 phone and the Bard chatbot, respectively. Google plans to integrate Gemini over time into its Search, Ads, Chrome and other services.

Developers and enterprise customers will be able to access Gemini Pro through the Gemini API in Google’s AI Studio and Google Cloud Vertex AI starting December 13. Android developers will have access to Gemini Nano via AICore, which will be available in preview.
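For developers who want a sense of what that access looks like, here is a minimal sketch of prompting Gemini Pro from Python. The google-generativeai package, the gemini-pro model identifier and the placeholder API key are assumptions for illustration, not details confirmed by this article.

    # Minimal sketch: prompting Gemini Pro through the Gemini API.
    # Assumes the google-generativeai SDK; the model name and API key
    # below are placeholders.
    import google.generativeai as genai

    genai.configure(api_key="YOUR_AI_STUDIO_API_KEY")  # key from Google AI Studio

    model = genai.GenerativeModel("gemini-pro")
    response = model.generate_content(
        "Summarize what makes a multimodal model different from a text-only one."
    )
    print(response.text)

The same request pattern is what Vertex AI exposes to enterprise customers, with authentication handled through Google Cloud rather than an AI Studio key.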

Is Google Gemini available in France?

Gemini Pro is already available free of charge in 170 countries – but not yet in France – and only in English for the moment.

However, Google plans to expand it to other languages and regions soon. To test it, simply use Google’s chatbot as you normally would.

How does Gemini differ from other AI models, like GPT-4?

Google’s new Gemini model appears to be one of the largest and most advanced AI models to date, although the release of the Ultra model will help determine that for sure.

Compared to other popular models currently powering AI chatbots, Gemini stands out for its native multimodality, whereas other models, like GPT-4, rely on plugins and integrations to be truly multimodal.


Gemini Ultra and Pro vs GPT-4


[Image: Comparison chart from Google showing how Gemini Ultra and Pro compare to GPT-4 and OpenAI’s Whisper, respectively. Google/ZDNET]

Compared to GPT-4, a primarily text-based model, Gemini easily performs multimodal tasks natively. While GPT-4 excels at language-related tasks, such as content creation and complex text analysis, it uses OpenAI plugins to perform image analysis and access the web, and it relies on DALL-E 3 and Whisper to generate images and process audio.
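To illustrate what “natively multimodal” means in practice, here is a hedged sketch of sending an image and a text question in a single request. It assumes the same google-generativeai SDK as above plus a gemini-pro-vision model identifier, and the image path is a placeholder.

    # Sketch of a native multimodal prompt: one request mixing an image and text.
    # Assumes the google-generativeai SDK and a "gemini-pro-vision" model name;
    # "benchmark_chart.png" is a placeholder image path.
    import PIL.Image
    import google.generativeai as genai

    genai.configure(api_key="YOUR_AI_STUDIO_API_KEY")

    model = genai.GenerativeModel("gemini-pro-vision")
    chart = PIL.Image.open("benchmark_chart.png")

    # The image and the question travel in the same prompt; no separate
    # vision plugin or external tool is involved.
    response = model.generate_content([chart, "What trend does this chart show?"])
    print(response.text)

The point of the sketch is the single call: the model ingests the image directly, rather than handing it off to a separate vision system the way a plugin-based setup does.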

Google Gemini also seems more product-focused than other models currently available. It is either already integrated into the company’s ecosystem or planned to be, as it powers Bard and the Pixel 8. Other models, like GPT-4 and Meta’s Llama, are offered more as services, made available to third-party developers to build applications and tools.


Source: “ZDNet.com”


