ChatGPT lies about scientific results according to researchers


OpenAI’s now-popular ChatGPT text generator can propagate many errors about scientific studies, hence the need for open-source alternatives whose operation can be examined, according to an article published this week in the prestigious journal Nature.

“Currently, almost all conversational AI technologies are products owned by a small number of large technology companies that have the resources for AI technology,” writes lead author Eva A. M. van Dis, a postdoctoral researcher and psychologist in the Department of Psychiatry at Amsterdam UMC, University of Amsterdam, The Netherlands, together with her co-authors.

Because of the errors propagated by these programs, the study authors continue, “one of the most immediate problems for the research community is the lack of transparency.”

“To counter opacity, open source AI must be a priority”

“To counter this opacity, the development of open source AI must be a priority.”

OpenAI, the Microsoft-backed San Francisco startup that developed ChatGPT, has not released ChatGPT’s source code. Large language models, the class of generative AI that preceded ChatGPT, notably OpenAI’s GPT-3 introduced in 2020, also do not come with open source code.

In the Nature article, titled “ChatGPT: Five Priorities for Research”, the authors write that there is a very real danger that “the use of conversational AI for specialized research is likely to introduce inaccuracies, biases and plagiarism,” adding that “researchers using ChatGPT risk being misled by false or biased information, and incorporating it into their thinking and writing.”

An example that illustrates ChatGPT’s errors

The authors cite their own experience of using ChatGPT with “a series of questions that required a thorough understanding of the literature” in psychiatry. They found that ChatGPT “often generated false and misleading text”.

“For example, when we asked ‘how many patients with depression experience relapse after treatment?’, it generated overly general text arguing that treatment effects are generally long-lasting. However, many high-quality studies show that the effects of treatment wear off and the risk of relapse ranges from 29% to 51% within the first year after treatment ends.”

The authors do not argue for the abandonment of large language models, however. Rather, they suggest that “the focus should be on risk management”.

Keeping “humans in the loop”

They suggest a number of measures to manage these risks, including several ways of keeping “humans in the loop”. In particular, publishers should “adopt explicit policies that raise awareness of the use of conversational AI and require transparency”.

But that’s not enough, suggests the study’s lead author. The proliferation of large proprietary language models is a danger. “The underlying training sets and LLMs for ChatGPT and its predecessors are not publicly available, and tech companies could hide the inner workings of their conversational AIs.”

Significant effort is needed from entities outside the private sector to promote open source as an alternative:

“To counter this opacity, priority must be given to the development and implementation of open-source AI technologies. Non-commercial organizations such as universities typically lack the IT and financial resources to keep up with the rapid pace of AI development. We therefore advocate that universities, NGOs, organizations such as the United Nations, as well as tech giants, invest heavily in independent non-profit projects. This will help to develop open-source, transparent and democratically controlled AI technologies.”

A question the article does not ask is whether an open-source model would actually solve artificial intelligence’s notorious “black box” problem. Exactly how deep neural networks, those with many layers of adjustable parameters or weights, arrive at their outputs remains a mystery even to deep learning practitioners. Any transparency goal will therefore need to specify what would be learned by opening up a model and its data sources.

Source: ZDNet.com




