Microsoft unveiled a small but powerful artificial intelligence model on Tuesday, designed to be more resource-efficient. Its scope is narrower, but it promises to be more effective at the tasks it targets.
While some companies, notably Meta with its Llama 3, are pushing ever more powerful large language models (LLMs) with 8 or 70 billion parameters, Microsoft has opted for a more frugal approach to generative AI with a smaller model. On Tuesday, April 23, 2024, the Redmond firm presented Phi-3, a relatively small LLM designed to handle more limited tasks as capably as some of its bigger siblings. The result is lower resource use, and therefore energy savings.
Phi-3 Mini, Microsoft’s little language model that could master the art of social networks
Microsoft plans to launch a series of three small AI models. The company presented the first of them, “Phi-3 Mini”, which runs on “only” 3.8 billion parameters. The parameter count is a rough measure of a model’s capacity to learn and to handle complex instructions. In effect, Microsoft is stepping away from the excess of the numbers race to dedicate its small language models to more specific uses.
Phi-3 can thus outperform models twice its size on various benchmarks covering coding, mathematics, and language skills. More concretely, it can perform tasks such as creating posts and content for social networks, all while using less data.
Microsoft, which says it has made Phi-3 Mini available on the Azure, Ollama and Hugging Face platforms, will soon publish the two other models in this new family: Phi-3 Small (7 billion parameters) and Phi-3 Medium (14 billion). But back to our “Mini”.
Adapting models to specific tasks for cheaper, greener AI
On paper, Phi-3 would thus be as capable as GPT-3.5 (the model behind the current free version of ChatGPT), while being cheaper and less resource-intensive. The approach makes sense: everyone could come out a winner by choosing the model that best matches their use case.
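To make the resource argument concrete, here is a rough back-of-envelope sketch of the GPU memory needed just to hold each model's weights, assuming half-precision (fp16, 2 bytes per parameter) and ignoring activations, KV cache, and serving overhead; the figures are illustrative only, not Microsoft's numbers.

```python
# Rough memory footprint of model weights at fp16 (2 bytes per parameter).
# Ignores activations, KV cache, and runtime overhead.
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    return n_params * bytes_per_param / 1e9

models = {
    "Phi-3 Mini (3.8B)": 3.8e9,
    "Phi-3 Small (7B)": 7e9,
    "Phi-3 Medium (14B)": 14e9,
    "Llama 3 70B": 70e9,
}
for name, n in models.items():
    print(f"{name}: ~{weight_memory_gb(n):.1f} GB")
```

By this estimate, Phi-3 Mini's weights fit in under 8 GB, within reach of a single consumer GPU, whereas a 70-billion-parameter model needs roughly 140 GB, which explains much of the cost and energy gap between small and large models.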
Phi-3 could thus very well summarize the essence of a long document or identify industry trends from market research reports. Using a list of 3,000 words and a simple prompt, Phi-3 was also able to create children’s books.
At Anthropic, for example, Claude 3 Haiku excels at summarizing large documents that include charts. On Google’s side, the Gemma models, in 2-billion and 7-billion-parameter versions, are better suited to simple chatbots or to language understanding. Llama 3, mentioned above and powering Meta’s new chatbot, shines in coding assistance. We could thus be moving toward a sort of giant marketplace of language models.
Source: The Verge