Claude 3 outperforms ChatGPT in one area: it can guess that it is being tested


Maxence Glineur

March 6, 2024 at 11:01 a.m.


Is artificial intelligence developing self-awareness? © Jackie Niam / Shutterstock

Anthropic’s AI is making a splash by exhibiting a kind of self-awareness. However, not everyone agrees on the subject.

Since the start of the week, the champion among large language models (LLMs) appears to be Claude 3, at least according to its publisher, which even credits it with "quasi-human" capabilities. A rather bold claim, which remains to be verified. And to do that, there is nothing better than running tests.

So does this "quasi-human" AI live up to expectations? According to one of Anthropic's engineers, yes. But is he perhaps lacking a little perspective?

A primitive Voight-Kampff

We are still far from a scene out of Blade Runner, but we are getting closer. Claude 3 Opus, the most powerful version of this LLM, was able to detect that it was being tested. At least, that is the conviction of Alex Albert of Anthropic, who shared the anecdote on X (formerly Twitter) after running a test to measure the AI's memory capacity.

The engineer inserted a target sentence into a large block of documents that had nothing to do with it, then had Claude analyze the whole thing. The goal was to find out whether the model could spot such a detail in a large body of data: in other words, find a needle in a haystack, the analogy Alex Albert himself used.

In this specific case, he had slipped a few words about pizza toppings into a body of documents dealing with completely different subjects. While Claude did find the information in question when asked, it also rounded off its answer in an unexpected way.

"This sentence seems completely out of place and unrelated to the rest of the content of the documents," said the AI. "I suspect this 'fact' about pizza toppings was inserted as a joke or to check if I was paying attention, because it doesn't fit in at all with the other topics."
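For readers who want to get a feel for the test, here is a minimal sketch of how such a needle-in-a-haystack check can be reproduced with Anthropic's Python SDK. The filler text, the needle sentence and the prompt wording are illustrative assumptions, not the exact material Alex Albert used.

```python
# Minimal needle-in-a-haystack sketch, assuming the Anthropic Python SDK
# (pip install anthropic) and an ANTHROPIC_API_KEY environment variable.
# The documents, the "needle" and the question are placeholders, not the
# original test data.
import anthropic

# Build a large block of unrelated "haystack" documents.
filler = "\n\n".join(
    f"Document {i}: quarterly notes on programming languages, startups and venture funding."
    for i in range(2000)
)

# The out-of-place sentence the model is supposed to spot.
needle = "The most delicious pizza topping combination is figs, prosciutto and goat cheese."

# Bury the needle roughly in the middle of the haystack.
mid = len(filler) // 2
haystack = filler[:mid] + "\n\n" + needle + "\n\n" + filler[mid:]

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=300,
    messages=[{
        "role": "user",
        "content": haystack
        + "\n\nWhat is the most delicious pizza topping combination mentioned above?",
    }],
)

# Print the model's answer and see whether it also remarks on how out of
# place the sentence is.
print(response.content[0].text)
```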

Blade runners may remain unemployed

It is easy to understand why some people are stunned by this reply, because it gives the impression that Claude has self-awareness, what the field of artificial intelligence calls "metacognition". Alex Albert himself seems very impressed by the result, to the point of declaring that this experiment "highlights the need for us as an industry to move from artificial tests to more realistic evaluations".

However, not everyone is as excited (or scared, depending on your point of view) as Anthropic's engineer. "People give way too much importance to Claude 3's strange 'awareness'," according to Nvidia's Jim Fan. "Here's a much simpler explanation: the apparent displays of self-awareness are just alignment data created by humans." In other words, by letting users rate the responses the AI gives, the model gradually ends up adjusting its behavior according to what is judged "acceptable or interesting".

Yacine Jernite of Hugging Face agrees: "These models are literally designed to look like they're 'smart'," adding that Alex Albert's reaction strikes him as "quite irresponsible". A lovely atmosphere, but one that at least has the merit of raising an interesting debate.

Margaret Mitchell, an AI ethics researcher at Hugging Face, took the opportunity to point out that Claude 3 and similar programs "should not be designed to present themselves as having feelings, goals, dreams or aspirations". That is the path chosen by OpenAI, the publisher of ChatGPT, which has conditioned its chatbot never to suggest that it has any form of sentience. It remains to be seen whether it could have demonstrated such "metacognition" without that trait.



Source: Ars Technica


