SNCF Groupe GPT: how generative AI is arriving in the railways

Management has named it “SNCF Groupe GPT”. The tool lets employees query company documents through a large language model (LLM) capable of understanding questions written in natural language and searching for the relevant information in the document corpus.

Employees interact with it through a web portal designed by SNCF, which lets them query the language model from a computer or smartphone. “This is a project that we began to deploy in December, first among a small population of senior executives, but which we are gradually making available to a larger number of employees. Today, the tool is accessible to around 600 employees,” explains Julien Nicolas, digital director of SNCF.

“Currently, we have set up two experiments with support functions, but also with more technical roles such as those of SNCF Réseau employees. This allows them to access technical documents relating to maintenance or procedures.”

The idea is to allow the company’s employees to ask a specific question on a particular point of information contained in the documents, and to obtain a synthesized response in natural language based on the information analyzed. The user can also upload a new document and ask SNCF Groupe GPT to analyze it to extract information. “For the moment, these are the main uses that we see for this tool” explains Julien Nicolas.

Translation on the fly with Trad SNCF

In addition to SNCF Groupe GPT, the company also announced improvements to its translation tool, Trad SNCF. Made available to staff working in stations, it lets them capture a spoken question from a non-French-speaking traveler, obtain a translation, and reply in the traveler’s own language.

The system was inaugurated for the Rugby World Cup, which took place in France in September 2023, and it is to be extended to cover more languages for the 2024 Olympic Games, for a total of 130 supported languages.

For that event, the tool will also be made available to agents in stations and on trains, as well as to the additional staff brought in for the Olympic Games. It relies on several partner service providers in order to offer the best translation for the language of the person being addressed. The data is sent to third-party servers, but SNCF promises that the processing servers remain within the EU and that it does not carry out any analysis of the data.

Business models still to be found

On these subjects, SNCF is moving cautiously but does not want to be left behind. “For the moment, these are above all experiments with two main objectives: to familiarize employees with the technology and to bring out the most appropriate use cases,” summarizes Julien Nicolas. For SNCF Groupe GPT, the project aims to test different language models on the market, including those from OpenAI, Mistral and Anthropic.

“Initially, we chose to deploy the model on a reduced number of documents, a few hundred out of the company’s 70,000 documents” explains Julien Nicolas. These documents are analyzed through “secure enclaves” which avoid sending the documents directly to the different service providers.

“For the moment, this ‘on-demand’ operation is the simplest way for us to test the different providers. But we also have our own data centers, and it is possible to install computing power directly in these data centers,” says Julien Nicolas. Beyond performance alone, the question of cost also arises: most services in the sector bill “per token”, that is, according to the volume of text sent and generated in each request. “It does not have the same cost to load a hundred documents for a few hundred employees and to load 70,000 documents accessible to all employees. So we are really looking to identify possible returns on investment.”
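To give a sense of the scale difference mentioned in the quote, here is a minimal sketch of how a per-token bill grows with the size of the corpus. Only the 70,000-document figure comes from the article. The pilot size of 300 documents, the average document length and the price per million tokens are hypothetical placeholders, not SNCF or provider data.

```python
# Rough per-token cost estimate. All figures except the 70,000-document
# corpus size are hypothetical placeholders chosen for illustration.

def estimate_cost(num_documents, tokens_per_document, price_per_million_tokens):
    """Return the cost of processing every document once, billed per token."""
    total_tokens = num_documents * tokens_per_document
    return total_tokens / 1_000_000 * price_per_million_tokens

TOKENS_PER_DOCUMENT = 5_000       # assumed average document length
PRICE_PER_MILLION_TOKENS = 2.0    # assumed price, in euros

# "A few hundred" documents, taken here as 300 (assumption).
pilot_cost = estimate_cost(300, TOKENS_PER_DOCUMENT, PRICE_PER_MILLION_TOKENS)
# The full corpus of 70,000 documents cited in the article.
full_cost = estimate_cost(70_000, TOKENS_PER_DOCUMENT, PRICE_PER_MILLION_TOKENS)

print(f"Pilot corpus: {pilot_cost:.2f} EUR per full pass")
print(f"Full corpus:  {full_cost:.2f} EUR per full pass")
```

Under these assumptions the cost scales linearly with the number of documents, so a full pass over the whole corpus costs a few hundred times more than the pilot, before counting the questions asked by a much larger population of users.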

Limit the risks

What remains is to measure the performance of the different models and the risk of errors. The risk of LLM “hallucination”, meaning the models’ capacity to invent answers, is often mentioned, but the digital director remains confident in the models’ ability to stay faithful to the information extracted from the documents: “The suppliers have a configuration which allows us to ask the model to stick exclusively to the documents you have loaded. This is what we do when it comes to querying it on the SNCF corpus. It indicates each time the reference document as the source and if it does not find the answer, it says so.”

In the same way, the system is fully capable of scaling up, the limiting factor being not the quantity of documents to analyze but their indexing.
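The behavior described in the quote (answer only from the loaded documents, cite the source, and say so when nothing is found) can be illustrated with a toy sketch. This is not SNCF’s implementation and involves no LLM: it uses simple keyword overlap in place of real retrieval, and the document names, contents, stopword list and threshold are all invented for the example.

```python
import re

# Hypothetical mini-corpus: file names and contents are invented.
CORPUS = {
    "maintenance_guide.txt": "Brake pads are inspected every 30 days during routine maintenance.",
    "station_procedures.txt": "Lost property is recorded at the station information desk.",
}

# Very common words ignored when matching (illustrative list only).
STOPWORDS = {"a", "an", "the", "is", "are", "at", "of", "to", "in", "what", "how"}

def tokenize(text):
    """Lowercase the text and return its set of non-stopword words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS

def answer(question, corpus=CORPUS, min_overlap=2):
    """Return (passage, source) for the best-matching document,
    or (None, None) when no document shares enough words with the question."""
    question_words = tokenize(question)
    best_source, best_score = None, 0
    for source, text in corpus.items():
        score = len(question_words & tokenize(text))
        if score > best_score:
            best_source, best_score = source, score
    if best_score < min_overlap:
        return None, None  # nothing relevant found: say so instead of guessing
    return corpus[best_source], best_source

for question in ["How often are brake pads inspected?", "What is the cafeteria menu today?"]:
    passage, source = answer(question)
    if source is None:
        print(f"{question} -> no answer found in the corpus")
    else:
        print(f"{question} -> {passage} (source: {source})")
```

The point of the sketch is the contract rather than the matching method: every answer carries the name of the document it came from, and a question with no support in the corpus yields an explicit “not found” instead of an invented reply.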

SNCF is continuing the gradual rollout of SNCF Groupe GPT to its employees, without setting a deadline for large-scale deployment. “The idea is to allow everyone to test and imagine what it can bring them, and then decide whether or not we want to deploy it for a particular activity or service. What was important for us was to be ‘AI ready’, to have the infrastructure in place to use generative AI. We showed it with this SNCF GPT, and now we are opening it to different populations and, depending on the feedback, it will be generalized on such and such a site.”
