CroissantLLM, the “sovereign, open, ethical and frugal” model


There is no shortage of AI models on Hugging Face, and there is now one more. CroissantLLM is a language model designed by the research teams of the MICS laboratory at CentraleSupélec together with industry partners, including Illuin Technology.

Last month, the French unicorn Mistral AI presented Large, its latest generative AI model. The announcement sparked some criticism within the digital ecosystem. The reason: unlike the company's previous models, Large is not open.

A model resulting from an academic/industry partnership

Furthermore, for the launch of Large, Mistral AI partnered with Microsoft and its Azure cloud. CroissantLLM therefore stands out in terms of openness and also in terms of collaboration. The LLM “is the result of close collaboration between academia and industry.”

Several academic partners (Sorbonne University, INESC-ID, Instituto Superior Técnico, Carnegie Mellon University and Institut DATAIA) are involved. Industrial players also contributed to the design of the model.

Among them are Illuin Technology, Unbabel, Diabolocom and EqualAI, according to Céline Hudelot, director of the MICS laboratory. This academic/industry collaboration is one of CroissantLLM's distinguishing features, but not the only one.

Its designers describe it as a sovereign, open, ethical and frugal language model. It will be presented on March 7 at Paris La Défense as part of the AI Workshops co-organized by Le Digital Lab de CentraleSupélec and Illuin Technology.

Openness from the algorithms to the models and datasets used

Thanks to these characteristics, its developers consider the LLM suited to the needs of companies and their business processes. Rooted in “French culture” but bilingual, since it was also trained on English content, the model was trained using the computing power of the Jean Zay supercomputer.

“The datasets are also French and public, therefore known and traceable,” its designers specify. In addition, the openness is total and covers everything “from the algorithms to the models and datasets used.”

On the ethical front, “the research team ensured compliance with the rules set by the recent AI Act”. Finally, on frugality, and “without making concessions on speed”, the LLM has 1.3 billion parameters.

“It therefore does not need significant computing power to run, which allows it to run on smartphones and personal computers rather than requiring several GPUs,” the researchers emphasize.
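
To give a sense of scale, the short Python sketch below estimates how much memory the weights of a 1.3-billion-parameter model would occupy at several common numeric precisions. Only the parameter count comes from the article. The list of precisions and the helper names (`weight_memory_gb`, `estimate_all`) are illustrative assumptions, and the figures cover the weights alone, not the extra memory used while generating text.

```python
# Back-of-the-envelope estimate of weight storage for a 1.3B-parameter model.
# Assumption: memory is dominated by the weights; activations, the KV cache
# and runtime overhead are deliberately ignored here.

PARAMS = 1_300_000_000  # parameter count quoted in the article

# Bytes needed per parameter at common precisions (illustrative choices,
# not formats confirmed by the CroissantLLM team).
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Return the weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def estimate_all(params: int = PARAMS) -> dict:
    """Map each precision name to its estimated footprint in GB."""
    return {name: weight_memory_gb(params, b) for name, b in BYTES_PER_PARAM.items()}


if __name__ == "__main__":
    for name, gb in estimate_all().items():
        print(f"{name:>8}: {gb:.2f} GB")
```

At 16-bit precision this comes to roughly 2.6 GB, which is in line with the researchers' claim that the model can fit on a personal computer or a recent smartphone, although actual requirements depend on the runtime used.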

With its low energy consumption, CroissantLLM “is the most efficient French-speaking model for its size” according to the benchmarks cited by its creators. It now has to find its place on Hugging Face, which hosts nearly 500,000 models.


