Daniel Restuccio
Issue: July 1, 2010


The ILM artists who created those amazing fire-bending sequences in The Last Airbender got a lot of help from the innovative application of GPU technology. For those scratching their heads wondering what that means, it’s using the graphics processing unit (GPU), rather than the central processing unit (CPU), to build frames of visual effects animation. The ability to shift this computationally intensive process to the GPU is possible thanks to the CUDA technology now embodied in Nvidia’s Quadro FX 4800 and FX 5800 pro graphics cards.

“Half the battle with rendering is all the computation,” says Dominick Spina, senior technology product manager for digital film at Nvidia and former manager of software development at Digital Domain. “One of the big problems we had in production was getting the large amount of data on and off the GPU, so there was always a bottleneck.”

What broke up that logjam was Nvidia’s CUDA, a parallel computing architecture that enables developers to take advantage of the parallel processing capabilities of the graphics chip. Particularly when simulating and rendering physical effects, programmers can use CUDA to create tools that let artists quickly view multiple iterations of complex simulations of things like fire, water and air, or anything where large amounts of data need to be transformed into behaviorally realistic effects. The researchers at ILM have used CUDA and the Nvidia Quadro pro graphics cards to create applications that are transforming their visual effects workflow.
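What makes effects like fire and smoke such a good fit for CUDA is that grid-based simulations update every cell from its neighbors’ previous values, so all the cells can be computed at once. The following is a minimal CPU sketch of that data-parallel pattern — a single explicit diffusion step over a 2D grid, illustrative only and not ILM’s code; on a GPU, a CUDA kernel would assign one thread per cell and run the inner loop body in parallel.

```python
# Hypothetical sketch: one explicit diffusion step over a 2D density grid.
# Each output cell depends only on the *previous* grid, never on freshly
# written values, so every cell can be computed independently -- the
# data-parallel pattern a CUDA kernel exploits with one thread per cell.

def diffuse_step(grid, k=0.25):
    """Return a new grid in which each interior cell moves toward the
    average of its four neighbors by factor k."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary cells copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbors = (grid[i - 1][j] + grid[i + 1][j] +
                         grid[i][j - 1] + grid[i][j + 1])
            out[i][j] = grid[i][j] + k * (neighbors - 4 * grid[i][j])
    return out

# A hot spot in the middle of a cold grid spreads to its neighbors.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 100.0
after = diffuse_step(grid)
```

Because no cell reads another cell’s updated value, the loop order is irrelevant, and the work divides cleanly across thousands of GPU threads.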

ILM first used GPU technology to build Lightspeed, a previewing system that let artists see renders that were near final quality. Toward the end of 2008, Chris Horvath and Willi Geiger built a system that could simulate and render fire for the effects in Harry Potter and the Half-Blood Prince. Their workflow used some innovative programming techniques that enabled them to treat the graphics card not as a rendering accelerator but as a high-performance general computing device. “At the end of Potter,” says Olivier Maury, ILM lead rendering engineer, “we were confident we could use GPUs reliably in production for simulation and final rendering.”

Spina explains that what effects houses are trying to do is fundamentally “change the way that they can contribute to the filmmaking process.” In the past, he explains, the movie director would request a bunch of shots with fire, and so somewhere in the post-production process there would be a bunch of shots with fire. With GPU technology they can “help the director visualize what fire actually can look like in the scene to see how it improves the storytelling. And because of that they are adding more incredibly realistic visual effects into the film.”

ILM moved forward, using CUDA to develop Plume, which is both a 3D fluid simulator/solver and a rendering application. They mostly use the Nvidia Quadro FX 5800 pro graphics cards because of their 4GB of memory. The technology is deployed both to individual artists’ workstations and to a GPU-based renderfarm.
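The 4GB framebuffer matters because a volumetric fluid grid is memory-hungry: each cell stores several floating-point fields (density, temperature, velocity components, pressure), and the whole grid has to fit on the card. A rough back-of-envelope sketch, with illustrative numbers rather than ILM’s actual figures, shows why:

```python
# Back-of-envelope estimate (illustrative, not ILM's numbers): GPU memory
# needed by a cubic fluid grid storing several float32 fields per cell.

def grid_memory_gb(resolution, fields=6, bytes_per_value=4):
    """Memory in GB for a resolution^3 grid with `fields` float values
    per cell (e.g. density, temperature, 3 velocity components, pressure)."""
    cells = resolution ** 3
    return cells * fields * bytes_per_value / (1024 ** 3)

# A 512-cubed grid with six float fields already consumes 3 GB,
# leaving little headroom even on a 4GB card.
print(round(grid_memory_gb(512), 2))  # -> 3.0
```

Halving the resolution to 256 cells per side cuts the footprint eightfold, which is why grid size is the main lever artists trade against simulation detail.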

One of the things you can quantify, continues Maury, is that since CUDA and the Nvidia Quadro pro graphics cards are being used as a general-purpose 3D solution, it is possible to compare the GPU to the CPU technology directly. Previously, “We’d have one to two iterations a day, maximum. With this solution we saw a ten-fold speed increase, and that was just on the simulation side, not even the rendering side. We went from these overnight simulation runs to having four, five, maybe six different versions a day of the same shot to show. One of the best quotes I heard from a supervisor was, ‘Hey, just do another version. Show me something else, something different.’ That was something we could not do before.”

“Using traditional CPU-based rendering technology,” says Spina, “you typically get one opportunity to render a full resolution simulation, because it takes so long. Thus, historically, it has stripped all the creativity away, because production schedules are so tight. Now, a single frame that used to take a week can be rendered in a single hour. Even pre-visualization scenes that traditionally are low-resolution are now closer to final renders because they can get that quality now.”

The next-generation CUDA architecture, known as Fermi, will enhance GPU performance by another order of magnitude. With hundreds of CUDA cores and 3.0 billion transistors, Fermi will have eight times the peak double-precision floating-point performance of its predecessor. The Fermi architecture has third-generation streaming multiprocessors, second-generation parallel thread execution, an improved memory subsystem, and a two-level, distributed thread scheduler called the GigaThread engine.