The dark side of ChatGPT and its kind


For the past two weeks, you’ve been spending your evenings quizzing ChatGPT, OpenAI’s formidable conversational AI. You may even be making plans to use it in your own work.

Admittedly, some answers are imprecise, even feeble. But other tests reveal impressive potential.

And now, after the broadly shared enthusiasm for a computer system that can appear “magical”, the first details are emerging about how GPT-3, ChatGPT and competing tools were actually built. Unsurprisingly, there is no magic under the hood. But there is also little ingenuity: mostly poorly paid piecework, and data scraping on a scale never seen before.

Educating AI via clickworkers

Time magazine reveals that OpenAI used Kenyan service providers to make ChatGPT less toxic. Less toxic? Yes, because GPT-3, the engine behind ChatGPT, has in the past shown an unenviable ability to spout violent, sexist and racist remarks.

Why? Because this AI was trained on hundreds of billions of words extracted from the Internet. And as you can imagine, this huge dataset contains its share of toxic content and prejudice.

To “educate” the AI, OpenAI therefore had to put an additional safety mechanism in place before it could offer a chatbot to the public.

And here, no surprise, no invention, no magic recipe. As with social networks, it was traumatized moderators who cleaned up ChatGPT’s manners before OpenAI dared to release it.

This work was itself carried out under the banner of AI: the providers’ job was to help build a moderation AI for ChatGPT. How? By supplying that AI with “labels”, i.e. annotated examples of violent content and hate speech.

To obtain these labels, OpenAI sent tens of thousands of text snippets to an outsourcing company in Kenya, named Sama, starting in November 2021. Inevitably, this dataset contained appalling material, including accounts of child sexual abuse, murder, suicide and torture.
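To make the label-then-train idea concrete, here is a minimal sketch of how annotated text examples can train a toxicity classifier. This is purely illustrative: OpenAI’s moderation models are large neural networks, not this toy word-frequency classifier, and the labeled snippets below are hypothetical.

```python
# Toy sketch: train a text moderation classifier from labeled examples.
# NOT OpenAI's pipeline -- just an illustration of how human-provided
# labels turn into an automated filter.
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(labeled_examples):
    """labeled_examples: list of (text, label) pairs, label in {'ok', 'toxic'}."""
    counts = {"ok": Counter(), "toxic": Counter()}
    totals = {"ok": 0, "toxic": 0}
    for text, label in labeled_examples:
        for word in tokenize(text):
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def classify(model, text):
    """Pick the label whose (Laplace-smoothed) word frequencies best fit the text."""
    counts, totals = model
    scores = {}
    for label in counts:
        score = 1.0
        for word in tokenize(text):
            score *= (counts[label][word] + 1) / (totals[label] + 2)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled snippets of the kind human annotators might produce
examples = [
    ("have a nice day", "ok"),
    ("thanks for your help", "ok"),
    ("i will hurt you", "toxic"),
    ("you deserve violence", "toxic"),
]
model = train(examples)
print(classify(model, "thanks nice day"))    # -> ok
print(classify(model, "violence will hurt")) # -> toxic
```

The point of the sketch is the division of labor: humans produce the labels once, and the trained model then filters content automatically at scale.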

Sama, for its part, presents itself as an “ethical AI” company and claims to have helped lift more than 50,000 people out of poverty. Yet Sama’s employees assigned to the OpenAI project were paid between $1.32 and $2 an hour.

These clickworkers, who made ChatGPT presentable, play a vital role in the AI value chain. Beyond the data scientists and data engineers, it is these armies of workers who enrich the data. Armies that are often invisible, masked by the technical innovations put forward by the tech giants.

AI and copyright

Beyond the problems of moderation and the reliance on a host of service providers paid by the task, today’s AI systems also raise a legal question: the copyright and usage rights attached to the data and information contained in the corpora, or datasets, digested by the machines are allegedly not being respected.

Thus a class action has just been filed against Stability AI, a competitor of OpenAI and creator of Stable Diffusion, along with Midjourney and the DeviantArt platform. The plaintiffs claim that these generative AIs were trained on millions of pirated works. And they are demanding compensation.

At issue here is the authors’ consent to the use of their works. The plaintiffs also denounce a gigantic act of piracy whose output is now being used to compete with the very authors it drew on.

This class action could spread and draw in bigger players. Getty Images, one of the largest image banks in the world, is threatening to sue Stability AI. The company accuses the AI specialist of having “illegally copied and processed millions of copyrighted images and associated metadata”.




