The first extends the framing of an image in any direction you want, generating new elements from scratch that blend seamlessly into the original work. The second is a brush that replaces elements in the image with whatever you type in the prompt.
If you haven’t gotten a DALL-E invite yet, the best way to experience the power of these tools is to load up this AR Instagram filter and watch what they can do.
It’s pretty simple. You load the filter, find a flat surface in your room, and set up an instant gallery of three classics: René Magritte’s The Son of Man, Johannes Vermeer’s Girl With a Pearl Earring, and Leonardo da Vinci’s Mona Lisa.
There’s nothing special about that until you get closer. Then, all of a sudden, Mona Lisa recedes to the back and you discover a whole new world around her, full of depth and detail.
The process of outpainting and inpainting
The new artwork looks as if it were painted by the original artists, but it was made by modern artists using DALL-E exclusively.
“I actually come from a traditional fine art background, like oil painting. So I really wanted to make sure that I was respectful to the painting,” says Josephine Miller, an art director and AR/XR/3D artist who created the extended universe of The Son of Man.
Using DALL-E’s image-generation powers, Miller gave its protagonist a face that perfectly matches Magritte’s style. But it wasn’t as easy as conjuring the face of a faceless man with AI magic. In fact, it required a lot of trial and error to get the face Miller imagined: “I did about 200 faces for it using the inpainting feature.”
As with the other two paintings, the process required creating different versions of some parts to produce all the elements needed to convey a feeling of depth.
In the case of The Son of Man, the face and all the other elements in the extended painting were cut out to make a two-and-a-half-dimensional environment, in which each of the cutout planes is placed at a different depth in 3D space, like the layered sets on a theater stage or the multiple planes used in multiplane camera animation.
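This 2.5D trick works because flat planes at different depths shift by different amounts as the viewer moves, which the eye reads as depth. A minimal sketch of that parallax effect, using a simple pinhole-camera model with made-up layer names and depths (this is an illustration of the general technique, not the artists' actual Spark AR setup):

```python
def parallax_offset(camera_shift: float, layer_depth: float, focal: float = 1.0) -> float:
    """Apparent on-screen shift of a flat cutout plane at layer_depth
    when the camera moves sideways by camera_shift (pinhole model)."""
    return focal * camera_shift / layer_depth

# Hypothetical layers: the face cutout up close, the bowler hat a bit
# farther back, the sky painted far behind them.
layers = {"face": 1.0, "hat": 2.0, "sky": 10.0}

# Move the phone half a unit to the side and see how much each plane shifts.
offsets = {name: parallax_offset(0.5, depth) for name, depth in layers.items()}
# Nearer planes shift more than distant ones -- that differential motion
# is what creates the illusion of depth between flat cutouts.
```

The numbers are arbitrary; what matters is the inverse relationship between a plane's depth and its on-screen motion.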
“When you see the filter for the first time, the face appears just as in the original painting. There’s no change,” says Manuel “manu.vision” Sainsily, an artist and XR design manager at Unity who made the extended version of the Mona Lisa.
It’s only when you approach that the full painting starts to reveal itself. And then, as you move your phone in any direction, you see other things, like the actual face of the person behind the apple.
“There are those thousands of men floating around. They all have individual faces, and Josephine worked on every single face. It’s insane,” Sainsily says.
The set was then programmed into a 3D space using Spark AR Studio to create the Instagram effect. The final product feels neat and satisfying, like you are accessing a secret, larger-than-life version of a painting that never was.
A lot more than just AI
Beyond the tedious manual labor of cutting out elements and building the AR filter, conceiving the entire piece is hard work in itself, nothing like typing a prompt and taking whatever comes out.
To demonstrate this, and to make the point that the technology, powerful as it is, remains just another tool in an artist’s arsenal, Sainsily says they all filmed themselves doing the work. The footage also aided the creative process: it showed the team members in charge of cutting, compositing, and programming what the final piece required.
In the end, this process—and the final result—is also a testament to the importance of humans in the creation of these works. While DALL-E, Stable Diffusion, and the rest of the text-to-image tools are fantastic, they are not yet at the point where they can make creative decisions. They just output. It takes the work of people like Miller and Sainsily to create compelling artwork with meaning. At least for now.