A revolutionary technique for creating 3D scenes

George Drettakis has reason to be pleased. With his team GraphDeco at the National Institute for Research in Digital Science and Technology (Inria) in Sophia Antipolis (Alpes-Maritimes), and a colleague from the Max Planck Institute for Informatics in Saarbrücken (Germany), he has developed an algorithm that beats competitors from large companies such as Google and Nvidia in an area where they excel: computer vision. More specifically, this new program realizes an old dream: rendering three-dimensional scenes from simple photos.

In other words, a few shots of an object, a building, or a crowd are enough to then view it from any angle, zoom in, rotate around it… Special effects professionals in cinema, video game developers, architects visualizing their projects in context, real estate agencies showing houses, robotics engineers (to guide a machine, it is better to have three-dimensional “plans”)… all are fond of such a capability.

Until 2020, existing methods could produce such renderings, but they required a great deal of computing time for fairly imprecise results (missing reflections, invisible details, “holes”…). To begin, a depth map is computed from two images taken from different angles but sharing pixels in common. This generates a sparse cloud of points in space, a sort of diaphanous ghost of the scene. The cloud is then densified to flesh out the ghost. Finally, in a computationally expensive step, a mesh of small triangles is derived from these points, onto which surfaces, colors, and textures are applied to generate the shapes.
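As an illustration of that first step, here is a minimal sketch assuming OpenCV and a rectified stereo pair (the file names, focal length, and camera baseline are hypothetical): pixels shared by the two views give a disparity map, which is back-projected into a sparse point cloud. Full pipelines such as COLMAP perform complete structure-from-motion; this is only the basic idea.

    import cv2
    import numpy as np

    left = cv2.imread("view_left.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("view_right.png", cv2.IMREAD_GRAYSCALE)

    # Match pixels shared by the two views to estimate a disparity map.
    stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
    disparity = stereo.compute(left, right).astype(np.float32) / 16.0

    # Back-project: depth = f * baseline / disparity (f and baseline are
    # hypothetical values for this sketch).
    f, baseline = 700.0, 0.1          # focal length (pixels), camera spacing (m)
    valid = disparity > 0
    depth = np.zeros_like(disparity)
    depth[valid] = f * baseline / disparity[valid]

    # Each valid pixel becomes one point of the sparse cloud, the
    # "diaphanous ghost" of the scene that later steps densify and mesh.
    ys, xs = np.nonzero(valid)
    cx, cy = left.shape[1] / 2, left.shape[0] / 2
    points = np.stack([(xs - cx) * depth[ys, xs] / f,
                       (ys - cy) * depth[ys, xs] / f,
                       depth[ys, xs]], axis=1)
    print(points.shape)               # (N, 3) point cloud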

Fun demonstrations

In 2020, a Google team revolutionized the field with its NeRF (neural radiance fields) method, based on artificial neural networks. These networks, at the heart of contemporary artificial intelligence, are used to encode the scene in a very abstract way. The encoding takes time, forty-eight hours of computation for the scenes that researchers use as benchmarks to test their algorithms, but the precision of the images is much better.
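To give an idea of the principle, here is a minimal sketch assuming PyTorch: a small network maps a 3D position and a viewing direction to a color and a density, so the entire scene lives in the network’s weights. The real NeRF adds positional encoding and hierarchical sampling; the layer sizes here are illustrative.

    import torch
    import torch.nn as nn

    class TinyNeRF(nn.Module):
        def __init__(self, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + 3, hidden), nn.ReLU(),   # position + view direction
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 4),                  # RGB color + density
            )

        def forward(self, xyz, view_dir):
            out = self.net(torch.cat([xyz, view_dir], dim=-1))
            rgb = torch.sigmoid(out[..., :3])          # color in [0, 1]
            sigma = torch.relu(out[..., 3:])           # non-negative density
            return rgb, sigma

    # Querying many points along camera rays and integrating the results
    # (volume rendering) produces an image; training compares those renders
    # against the input photos, which is why the encoding takes so long.
    model = TinyNeRF()
    rgb, sigma = model(torch.rand(1024, 3), torch.rand(1024, 3))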

In the summer of 2023, the Sophia Antipolis team hit hard with its “3D Gaussian Splatting” (3DGS) method, which projects, or “splats,” three-dimensional Gaussian blobs onto the image. In thirty minutes, it obtains from a hundred photos a three-dimensional model that can then be viewed from all angles at a rate of one hundred high-quality images per second. That is a hundred times faster than Nvidia’s Instant NGP, which itself does a hundred times better than NeRF. “I’m not in the habit of putting myself forward. But I didn’t think I would encounter this situation where, after more than twenty years of work dedicated to this problem, I can almost say that it is solved,” says George Drettakis, who emphasizes that this work was publicly funded (by Inria and grants from the European Research Council, in particular).
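To make the name concrete: the method represents the scene not as a triangle mesh but as millions of small ellipsoidal “Gaussians,” each with a position, shape, color, and opacity, which are projected and blended to form each frame. Below is a minimal sketch, assuming NumPy, of what one such primitive might carry; the field names are illustrative, though the covariance construction follows the published paper.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Gaussian3D:
        mean: np.ndarray      # (3,) center of the Gaussian in the scene
        scale: np.ndarray     # (3,) axis lengths of the ellipsoid
        rotation: np.ndarray  # (3, 3) orientation of the ellipsoid
        color: np.ndarray     # (3,) RGB (the paper uses spherical harmonics)
        opacity: float        # blending weight in [0, 1]

        def covariance(self) -> np.ndarray:
            # Sigma = R S S^T R^T guarantees a valid (positive semidefinite)
            # covariance, which is how the paper parameterizes each ellipsoid.
            S = np.diag(self.scale)
            return self.rotation @ S @ S.T @ self.rotation.T

    # Rendering "splats" each Gaussian: project its mean and covariance onto
    # the image plane, then blend front to back, with no triangle mesh needed.
    g = Gaussian3D(np.zeros(3), np.array([0.1, 0.1, 0.02]),
                   np.eye(3), np.array([0.8, 0.2, 0.2]), 0.9)
    print(g.covariance())

Because each Gaussian rasterizes quickly on a graphics card and can be adjusted by gradient descent directly against the input photos, the representation allows both the thirty-minute training and the hundred-images-per-second rendering.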
