AI learning data: the CNIL joins the debate


AI professionals often repeat it: use cases are the primary levers of adoption and democratization. In this regard, ChatGPT has contributed greatly to making artificial intelligence techniques accessible.

However, these breakthrough innovations can also bring risks or abuses to light, particularly legal and ethical ones.

As the competent authority in France on AI-related issues, the CNIL does not intend to be left behind.

Strengthening the regulator’s expertise on AI

The Commission has announced the creation of a five-person service dedicated to artificial intelligence, the SIA. It is attached to the technology and innovation department, headed by Bertrand Pailhès, the former national coordinator for the French AI strategy.

The CNIL’s objective is to “strengthen its expertise on these systems and its understanding of the risks to privacy, while preparing for the entry into application of the European regulation on AI”.

The SIA's mission will also include "developing relations with the actors of the ecosystem" and familiarizing professionals and individuals alike with these technologies. The CNIL also recalls that the issue is not only technological.

Systems like ChatGPT also raise questions of law. Thus, although attached to the technology and innovation department, the SIA will also work in "close" collaboration with the "legal support" teams.

Performance hides risk

"As of now, and without waiting for the evolution of the legal framework, the creation of this service responds to a societal issue whose importance grows every day", underlines the data protection authority, attentive to current events.

So much for the long-term work. It is coupled with a shorter-term mission, this time focused on training databases. To train these AIs, their designers generally use large volumes of data.

ChatGPT illustrates this with its use of data scraping on the internet, combined with click workers. The CNIL recalls, without targeting any AI in particular, "that their operation is based on the processing of a large amount of data, very often personal".

As a result, "their implementation thus entails risks for privacy". The CNIL wishes to prevent these systems from operating in black-box mode, which is incompatible with certain regulatory requirements, in terms of explainability in particular.

Recommendations on training databases in 2023

In addition to creating the SIA, the Commission is therefore launching work on training databases. The CNIL reports that public and private organizations have been asking it about the legality of certain practices involved in building such databases.

"The purpose of future work is therefore to clarify the CNIL's position on this point and to promote good practices, in line with the requirements set by the GDPR, but also in the perspective of the proposed regulation on AI currently under discussion at European level."

The project on training databases must lead to "concrete responses" regarding the creation or use of such assets "with respect for the fundamental rights and freedoms of individuals". The CNIL's recommendations will be published, then submitted for public consultation.




