Grok: Elon Musk’s generative AI can now understand and analyze images


While Google’s Gemini 1.5 Pro has just gained support for audio content, xAI’s Grok is tackling image understanding and analysis.


In a blog post published on April 12, 2024, xAI, Elon Musk’s generative AI company, announces that Grok-1.5V “can now process a wide variety of visual information, including documents, charts, graphs, screenshots and photographs”.

Soon available to early testers and existing users, this new capability turns Grok into a multimodal AI model, since it now handles several data types (here, text and images).

On the performance side, xAI’s developers emphasize that Grok-1.5V “outperforms competitors in our new RealWorldQA benchmark, which assesses real-world spatial understanding”. The benchmark tests the different AI models on more than 700 images, asking them questions whose answers are “easily verifiable for each image”.

For example:

  • Which object is bigger: the pizza cutter or the scissors?
    • A. The pizza cutter is larger.
    • B. The scissors are bigger.
    • C. They are approximately the same size.
  • Given the view from our sedan’s front camera, do we have enough space to get around the gray car in front of us?
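
To make this kind of evaluation concrete, here is a minimal, hypothetical sketch of how a RealWorldQA-style multiple-choice item could be scored against a vision-language model. The `Item` structure and the `query_vision_model` callable are illustrative assumptions only; xAI’s actual evaluation harness and API are not described in the blog post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Item:
    """One multiple-choice question about a single image."""
    image_path: str        # photo, screenshot, chart, etc.
    question: str          # e.g. "Which object is bigger: the pizza cutter or the scissors?"
    choices: Dict[str, str]  # e.g. {"A": "The pizza cutter is larger.", "B": "...", "C": "..."}
    answer: str            # ground-truth letter, easily verifiable from the image


def accuracy(items: List[Item],
             query_vision_model: Callable[[str, str], str]) -> float:
    """Score a model on multiple-choice items.

    `query_vision_model(image_path, prompt)` is a placeholder for whatever
    multimodal API is being evaluated; it should return the model's reply as text.
    """
    correct = 0
    for item in items:
        # Present the question and the lettered options in a single prompt.
        options = "\n".join(f"{letter}. {text}" for letter, text in item.choices.items())
        prompt = f"{item.question}\n{options}\nAnswer with a single letter."
        reply = query_vision_model(item.image_path, prompt).strip().upper()
        # Count the item as correct if the first character matches the answer letter.
        if reply[:1] == item.answer.upper():
            correct += 1
    return correct / len(items) if items else 0.0
```

Because each answer is a single, easily checked letter, accuracy can be computed automatically without human grading, which is what makes this style of benchmark practical at the scale of several hundred images.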

The comparison table also shows results ahead of the competition on the MathVista benchmark for mathematical reasoning and on TextVQA for reading text in images.

The blog post concludes by discussing the next advancements planned by xAI regarding the Grok AI model: improving multimodal understanding and generative capabilities. “In the coming months, we plan to make significant improvements to both capabilities, across various modalities such as images, audio and video”, concludes Elon Musk’s company.



