Microsoft turns to AI to fix the bugs in AI-generated code


For Microsoft, any machine-generated code should be treated with a “mixture of optimism and caution”: while programming can be automated using large language models, the code they produce is not always correct. These large pre-trained language models include OpenAI’s Codex, Google’s BERT natural language model, and DeepMind’s work on code generation. OpenAI’s Codex, unveiled in August 2021, is available through the Microsoft-owned GitHub Copilot tool.

To address the code-quality challenge raised by these artificial intelligence (AI)-enhanced language models, Microsoft researchers have presented Jigsaw, a new tool designed to improve the performance of these models using “post-processing techniques that understand the syntax and semantics of programs, then leverage user feedback to improve future performance.”

The tool is currently designed to synthesize code for Python’s Pandas API from multimodal inputs, Microsoft says. Pandas is a popular data manipulation and analysis library for data scientists who work in Python.
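For readers unfamiliar with the library, the transformations in question are typically short data-frame manipulations. The snippet below is a generic illustration of such a Pandas operation; the data and column names are invented for this example and are not taken from Microsoft’s work.

    import pandas as pd

    # A small data frame of the kind a data scientist might manipulate with Pandas.
    df = pd.DataFrame({
        "city": ["Paris", "Paris", "Lyon", "Lyon"],
        "sales": [120, 80, 95, 60],
    })

    # A typical one-line Pandas transformation: average sales per city.
    result = df.groupby("city", as_index=False)["sales"].mean()
    print(result)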

Checking language models’ code with AI

Language models like Codex let a developer describe a piece of code in English, and the model synthesizes the intended code in Python or JavaScript. However, that code may be incorrect, or may fail to compile or run, so the developer should check it before using it, as Microsoft points out.

“The Jigsaw project aims to automate some of this verification to improve the productivity of developers who use large language models like Codex for code synthesis,” explains the Jigsaw team, from Microsoft Research.

Microsoft believes that Jigsaw can “completely automate” the process of checking that the code compiles, dealing with error messages, and testing whether the code produces what the developer intended. “Jigsaw takes as input an English description of the intended code, as well as an input/output example. In this way, it associates an input with the expected output and provides quality assurance that the output Python code will compile and produce the intended output on the provided input,” the team notes.
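Microsoft does not detail Jigsaw’s internals here, but the check it describes, making sure a candidate compiles, runs on the supplied input and reproduces the supplied output, can be sketched roughly as follows. The function name and the convention that the snippet reads a data frame `df` and writes its result to `out` are assumptions made for this illustration, not Jigsaw’s actual interface.

    import pandas as pd

    def passes_io_check(candidate_code: str,
                        input_df: pd.DataFrame,
                        expected_df: pd.DataFrame) -> bool:
        """Does the candidate compile, run on the provided input data frame
        (exposed as `df`), and leave the expected output in a variable `out`?"""
        try:
            compiled = compile(candidate_code, "<candidate>", "exec")
        except SyntaxError:
            return False                      # the code does not even compile
        scope = {"pd": pd, "df": input_df.copy()}
        try:
            exec(compiled, scope)             # run the synthesized snippet
            produced = scope["out"]
        except Exception:
            return False                      # runtime error, or no `out` produced
        return isinstance(produced, pd.DataFrame) and produced.equals(expected_df)

A candidate such as `out = df.dropna()` would then pass or fail depending on whether it actually reproduces the expected data frame on the provided input.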

No threat to human developers?

With Jigsaw, a data scientist or developer provides a description of the intended transformation in English, an input data frame, and the corresponding output data frame. Jigsaw then synthesizes the expected code.
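Concretely, the three inputs might look like the example below. This is a hypothetical illustration rather than an example from Microsoft’s paper; the final comment shows the kind of Pandas code Jigsaw would be expected to return for it.

    import pandas as pd

    # English description of the intended transformation.
    description = "Keep only the rows where age is at least 18, sorted by name"

    # Input/output example supplied alongside the description.
    input_df = pd.DataFrame({"name": ["Bea", "Ada", "Cem"],
                             "age": [17, 34, 29]})
    expected_df = pd.DataFrame({"name": ["Ada", "Cem"],
                                "age": [34, 29]})

    # Jigsaw would be expected to return Pandas code equivalent to:
    #   out = df[df["age"] >= 18].sort_values("name").reset_index(drop=True)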

According to Microsoft, these language models produce the correct result on their own in only about 30% of cases. How does Jigsaw improve on that? By pre-processing the natural-language description and other inputs before they are fed to Codex or GPT-3, post-processing the generated code, and then sending the result to a human developer for review and editing. That feedback is in turn fed back into the pre-processing and post-processing mechanisms to improve them. If the code fails, Jigsaw repeats the repair process during the post-processing stage.
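Put together, the workflow described above amounts to a loop of roughly the following shape, reusing the `passes_io_check` sketch from earlier. Everything here is schematic: `preprocess`, `query_model` and `attempt_repair` are placeholder stubs standing in for Jigsaw’s real pre-processing, the call to Codex or GPT-3, and its repair transformations.

    def preprocess(description, input_df):
        # Placeholder: Jigsaw's real pre-processing enriches the prompt before
        # it reaches the language model; here we just append the column names.
        return f"{description}\n# columns: {list(input_df.columns)}"

    def query_model(prompt):
        # Placeholder for the call to a large language model such as Codex or GPT-3.
        return "out = df  # stand-in for model-generated code"

    def attempt_repair(candidate, input_df, expected_df):
        # Placeholder: the real post-processing applies syntax- and semantics-aware
        # transformations to fix a failing candidate.
        return candidate

    def jigsaw_style_loop(description, input_df, expected_df, max_repairs=3):
        """Schematic version of the described workflow: pre-process, query the
        model, then check and repair until the candidate passes the input/output
        test or the repair budget is exhausted."""
        prompt = preprocess(description, input_df)
        candidate = query_model(prompt)
        for _ in range(max_repairs):
            if passes_io_check(candidate, input_df, expected_df):
                return candidate              # verified against the I/O example
            candidate = attempt_repair(candidate, input_df, expected_df)
        # Still failing: hand the best candidate to a human for review and editing;
        # that feedback is what gets fed back into pre- and post-processing.
        return candidate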

For the American giant, Jigsaw improves the accuracy of results to more than 60%, and user feedback pushes the accuracy of the tool to more than 80%. Microsoft notes that several challenges remain before it has a true “pair programmer”: so far, it has only evaluated the input/output quality of the synthesized code. In practice, code quality also means determining whether its performance is acceptable, whether it is free of security flaws, and whether it complies with licensing requirements.

Source: ZDNet.com




