Artificial intelligence: Ferret, Apple’s future language model, is revealed


The runaway success of ChatGPT at the end of 2022 brought artificial intelligence to the general public and caught all the big tech players off guard. Since then, we have seen Copilot from Microsoft, Bard and Gemini from Google, and Galaxy AI from Samsung, whose capabilities will be revealed on Wednesday, January 17, at the official launch of the Galaxy S24. Apple, for its part, has remained very discreet.

Zhe Gan, an artificial intelligence researcher at Apple, unveiled some advances in this area last October. This is how we learned of Ferret, a multimodal large language model (MLLM) capable of analyzing text and images, or portions of images, in order to provide information about them. The researcher adds in a post on Twitter/X that the Ferret model “often shows a more accurate understanding of small regions of the image than GPT-4”.

Two examples show the answers Ferret provides about two images:

  • The first shows two ferrets side by side and a dog, with the AI correctly identifying the presence of two different ferrets.

On the GitHub page that provides the resources needed to install the model, we see that Ferret can also describe the context of the photo: “Object (obj0) is a dog and object (obj1) is a ferret. The dog and ferret appear to be sitting together in the snow, suggesting friendly interaction or companionship”.

  • The second shows a sandwich cut into quarters on a plate, along with ingredients; the AI correctly answers a question about the steps needed to make the food on the plate using the available ingredients.

“To make a sandwich [box0] with the available ingredients, 1. You can prepare the sandwiches by placing the turkey slices [box1] on a bun, and sliced cheese [box2] on another bread. 2. To assemble the sandwiches, press the two halves of the sandwich together, and you can use the knife [box3] or your hands [box4] to hold them together if necessary. Once the sandwich is finished, you can use the knife to cut it into quarters or halves, and serve it on a plate [box5]”.

It remains to be seen how Apple will use it. As a reminder, rumors suggest that Siri will be boosted by artificial intelligence, turning the voice assistant into a conversational agent. Given these first demonstrations of Ferret, we could also imagine it being used in the Photos application, where it could provide a precise analysis of captured images or drastically improve image search.
