Just as we're starting to come to terms with the power of the latest AI image generators, another breakthrough arrives. Hot on the heels of DALL-E comes Point-E, an AI generator for 3D modeling that works in a similar way.
AI image generators made huge leaps forward last year, allowing anyone to create sometimes stunning images from a text prompt. For now, they can only render still 2D images, but OpenAI, the company behind one of the most popular generators, DALL-E 2, has just revealed its latest research into an AI-powered 3D modeling tool.
After DALL-E comes Point-E, a model that looks ready to bring OpenAI's text-to-image approach to 3D modeling. OpenAI says the tool, trained on millions of 3D models, can generate 3D point clouds from simple text prompts. The catch? The resolution is pretty poor.
The research paper, written by a team led by Alex Nichol, says that unlike other methods, Point-E's text-to-image model takes advantage of a large corpus of (text, image) pairs, which allows it to follow diverse and complex prompts, while its image-to-3D model is trained on a smaller dataset of (image, 3D) pairs.
It says: “To generate a 3D object from a text prompt, we first sample an image using the text-to-image model and then sample a 3D object conditioned on the sampled image.” Point-E then runs that synthetic image through a series of diffusion models to create a 3D RGB point cloud: first a coarse cloud of 1,024 points, then a finer one of 4,096 points.
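To make that two-stage flow concrete, here is a minimal Python sketch of the pipeline's shape. It is not OpenAI's code: the function names (text_to_image, image_to_coarse_cloud, upsample_cloud, generate) are hypothetical, and the diffusion models are replaced with random placeholders. Only the order of the steps and the two cloud sizes come from the paper as described above.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

COARSE_POINTS = 1024  # size of the first, coarse point cloud
FINE_POINTS = 4096    # size of the finer, upsampled point cloud


@dataclass(frozen=True)
class Point:
    """One entry of an RGB point cloud: a position plus a color."""
    x: float
    y: float
    z: float
    r: float
    g: float
    b: float


def text_to_image(prompt: str, rng: random.Random) -> List[float]:
    # Placeholder (assumption): a real text-to-image model would return a
    # synthetic view of the prompt. Here we return 64 random "pixel" values.
    return [rng.random() for _ in range(64)]


def image_to_coarse_cloud(image: List[float], rng: random.Random) -> List[Point]:
    # Placeholder (assumption): a real diffusion model would denoise points
    # conditioned on the image. Here we just draw random points.
    return [Point(*(rng.random() for _ in range(6))) for _ in range(COARSE_POINTS)]


def upsample_cloud(coarse: List[Point], rng: random.Random) -> List[Point]:
    # Placeholder (assumption): keep the coarse points and add slightly
    # jittered copies until the cloud holds FINE_POINTS points.
    fine = list(coarse)
    while len(fine) < FINE_POINTS:
        base = rng.choice(coarse)
        fine.append(Point(
            base.x + rng.gauss(0, 0.01),
            base.y + rng.gauss(0, 0.01),
            base.z + rng.gauss(0, 0.01),
            base.r, base.g, base.b,
        ))
    return fine


def generate(prompt: str, seed: int = 0) -> Tuple[List[Point], List[Point]]:
    """Text -> image -> coarse cloud -> fine cloud."""
    rng = random.Random(seed)
    image = text_to_image(prompt, rng)
    coarse = image_to_coarse_cloud(image, rng)
    fine = upsample_cloud(coarse, rng)
    return coarse, fine


if __name__ == "__main__":
    coarse, fine = generate("a red traffic cone")
    print(len(coarse), len(fine))
```

The point of the sketch is the hand-off: the text prompt only ever reaches the image step, and the 3D stages see nothing but the image and the coarse cloud.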
The sample results in the research paper may seem simplistic compared to the images DALL-E 2 can produce and to the 3D capabilities of existing systems. But creating 3D models is a very resource-hungry process. Programs like Google's DreamFusion require hours of processing using multiple GPUs.
OpenAI acknowledges that its method performs worse in terms of the quality of its results, but says it generates samples in a fraction of the time (we're talking seconds instead of hours) and requires only a single GPU, making 3D modeling more accessible. You can already try it yourself, as OpenAI has shared the source code on GitHub.