Unsupervised learning: how the AIs that differentiate eagles from planes are trained


Big Data & AI Exhibition 2023, Paris – Generative AI certainly dominates many corporate meetings, perhaps to the point of making people forget that AI research is also advancing far from the buzz around ChatGPT. Unsupervised learning, for example, is one of the most promising techniques. In this variant of machine learning, the data is not labeled (unlike in supervised learning): the model extracts classes or groups of objects that share common characteristics, without the help of a supervisor.

The aim of this technique is to discover the structures underlying unlabeled data, and it is also a way of testing how far artificial intelligence can go in terms of performance. At the Big Data & AI Exhibition, Armand Joulin, an artificial intelligence researcher, presented work on training visual recognition AI with an unsupervised learning technique, an approach Meta already uses for voice recognition.

“AI image recognition began with supervised learning,” he recalls. “With this technique, we give an image to a machine, and we ask it to do a task with this image, such as recognizing a subject. To do this, we label the images and we teach the machine to recognize the labels.”

How do you recognize objects in video?

“It is a very effective but time-consuming method. It also has limitations, because moving from image recognition to video, for example, requires training a new AI.” In addition to cameras, modern smartphones include more and more sensors, such as infrared. “Here too, doing infrared image recognition requires building a new AI,” the researcher notes.

Above all, classification is only one use case for visual recognition. Copy detection to fight plagiarism, style transfer, and captioning are all tasks that can be asked of a machine… provided that it is trained each time for that specific task, with a new neural network.

Armand Joulin is therefore looking for a way for a machine to learn functions that can be used anywhere, without having to train new neural networks. He sets two prerequisites:

  • During training, the machine must be “educated” on different media (video, photo, selfie, radio, etc.).
  • The training task must be simple, such as writing text from an image.

“But the problem,” he says, “is that we often describe two slightly different images in the same way.” Hence the idea of asking the machine not to describe what an image shows, but to describe the differences between two visuals. To do this, the researcher has the AI compare numerous images, which teaches it to spot the differences.
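The article does not say which training objective is used. As a rough illustration of learning by comparison, here is a minimal sketch of one common formulation, an InfoNCE-style contrastive loss, which is assumed here purely for the example. The function names and the tiny embedding vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to its positive
    (another view of the same image) than to the negatives (other images)."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp, then minus the positive logit.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]

# Hypothetical 3-dimensional embeddings, made up for illustration.
cat_view_a = [0.9, 0.1, 0.0]
cat_view_b = [0.8, 0.2, 0.1]
other_images = [[0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]

# Correct pairing: two views of the same cat photo.
good = contrastive_loss(cat_view_a, cat_view_b, other_images)
# Wrong pairing: the anchor is matched with an unrelated image.
bad = contrastive_loss(cat_view_a, other_images[0], [cat_view_b, other_images[1]])
```

In this sketch the loss is small for the correct pairing and large for the wrong one, so minimizing it pushes a model to tell images apart without anyone labeling them.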

Recognizing a photo of a cat is good, but recognizing the differences between two photos of cats to describe them is much better.

This technique, which works “without supervision”, is more effective than supervised learning, he maintains.

With this discrimination technique, an AI can group images by what makes them different or similar, without any labeling. It is a form of unsupervised classification.
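To make “classification without labels” concrete, here is a minimal sketch using plain k-means clustering. This is a generic textbook method, not necessarily the one presented in the talk, and the two-dimensional feature vectors are invented for the example.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means: group points by proximity, using no labels."""
    centers = [list(p) for p in points[:k]]  # naive deterministic initialization
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            assignment[i] = dists.index(min(dists))
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical 2-D image features: two visually similar groups, no labels given.
features = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.15, 0.15]]
groups = kmeans(features, k=2)
```

The returned list gives a group index per image: images with nearby features end up in the same group, although nothing in the input says what the groups represent.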

“With this technology, the machine identifies the most discriminative part of what we show it, whether an image or a video,” he explains.

The AI can recognize differences, but also similarities, across very different media. This is one of the capabilities of AI trained with unsupervised learning.

“We have passed the milestone of what AIs cannot do”

It is also possible to ask it what planes and birds have in common across different media, such as photos and videos, but also 3D objects.

“With this system, it is possible to do object recognition on video: the AI identifies the differences and similarities between planes and eagles,” he says.

He concludes: “We have now passed the milestone of what AI cannot do in terms of visual recognition.”


