Director Robert Zemeckis and actor Tom Hanks took this groundbreaking approach to Chris Van Allsburg’s Caldecott Medal-winning storybook, with the crew at Sony Pictures Imageworks creating new processes for a new kind of moviemaking.
Leading the Imageworks team were senior visual effects supervisors Ken Ralston, a five-time Oscar winner, and Jerome Chen, nominated in 2000 for Stuart Little. "Three years ago, when we met with Bob [Zemeckis], there was no plan for how the film would be made," says Ralston, "other than he wanted it to have an artistic, otherworldly look."
Working from a script developed by Zemeckis, storyboards created by Doug Chiang, and production design by Rick Carter, the Imageworks crew developed a three-part test. In the first, Tom Hanks and a child actor were photographed on a real set. "We shot it on high-def using a camera on a crane," Ralston says. "It took forever. Bob went out of his mind."
The second test had Tom in make-up on a blue-screen stage with actors playing the conductor and the kids. Third was a motion-capture test. "It took no time at all," says Ralston. "We didn’t wait for costume, for lights, for anything. Bob loved it."
"I thought it would be an animated film," says Chen, "but it was never a possibility. Bob wanted Tom to act in it, to play the boy… and the conductor, his own father, Santa Claus, and a hobo he meets on the train. He wanted the texture of real performances – to direct actors, not animators."
A Live-Action Animated Film
In other words, Zemeckis wanted to direct a live-action film starring animated characters. Unfortunately, the technology for capturing a full performance – face and body at the same time – did not exist; 360-degree motion capture, full face and body, with audio and video had never been done with one person, let alone multiple actors. So Imageworks developed a new system to do just that.
In the end, 80 percent of the human characters were motion captured, rather than performed by animators using keyframe animation – no small feat considering there are scenes in this film depicting Tom Hanks the boy riding on the shoulders of Tom Hanks the hobo skiing on the top of a train speeding through a terrain built like a roller-coaster ride; complex scenes with 30,000 elves, reindeer, and Santa; and a riotous scene with waiters doing a moon walk on the walls and ceiling of a passenger car while serving the children hot chocolate.
The action is limited largely to a train on Christmas Eve, but it’s a magical train that moves through more than 20 environments (all of which were modeled, painted, and sometimes animated), past herds of caribou and a pack of wolves, up a mountain, through snow and fog, all the while hissing steam and belching smoke into the moonlight – thanks to Maya, Houdini, RenderMan, a new proprietary Imageworks program called SPLAT that uses a painter’s algorithm to render smoke, and Imageworks’ proprietary lighting system.
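Imageworks hasn’t described how SPLAT works beyond that phrase, but the painter’s-algorithm idea the name invokes is a standard one: sort the smoke’s particles from back to front along the camera’s view axis, then composite each particle’s soft footprint over the image so that nearer puffs naturally cover farther ones. The sketch below shows only that generic idea; the particle format, the Gaussian footprint, and every name in it are assumptions for illustration, not SPLAT itself.

```python
import numpy as np

def splat_smoke(particles, cam_pos, view_dir, width, height, project):
    """Painter's-algorithm splatting: composite particles far-to-near so that
    closer smoke covers more distant smoke. Hypothetical sketch, not SPLAT."""
    rgba = np.zeros((height, width, 4))
    # Sort by distance along the view direction, farthest first.
    ordered = sorted(particles, key=lambda p: -np.dot(p["pos"] - cam_pos, view_dir))
    for p in ordered:
        x, y, radius = project(p["pos"], p["radius"])    # world space -> pixel space
        ys, xs = np.ogrid[0:height, 0:width]
        # Soft Gaussian footprint standing in for the particle's density.
        footprint = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * radius ** 2))
        alpha = np.clip(p["opacity"] * footprint, 0.0, 1.0)[..., None]
        color = np.array(tuple(p["color"]) + (1.0,))     # RGB plus full alpha
        # "Over" compositing: the nearer splat is painted on top of what's there.
        rgba = alpha * color + (1.0 - alpha) * rgba
    return rgba
```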
"We didn’t light this movie like a live-action film," says Mark Lambert, senior CG supervisor. "The lighting is what made it not realistic. We could play up color in ways that wouldn’t happen in the real world- have three moons at one time, or make it foggy and moody. We could have beams of light coming through the dark. It’s Christmas Eve. Time is magical at that point."
However delightful the setting is, the film’s success depends on one big gamble: the application of Tom Hanks’ motion-captured performances to various characters, and of adult performers’ motions to child characters.
Capturing the Details
On the set, the motion-capture magic unspooled on three stages: two large areas managed by Giant Studios for capturing the body performances of multiple actors using well-honed techniques, and one small stage, 10 feet square and eight feet tall, for capturing close-up body and facial performances for up to four actors simultaneously.
The innovation happened on the small stage. There, in addition to 32 reflective markers on their bodies, the actors wore 151 markers on their faces. The tiny markers – 2.7 mm dots the size of a 2B pencil lead – were painstakingly glued onto the actors by 15 make-up artists. "Before every scene, people with flashlights would inspect the actors’ faces to be sure all the markers were in place," says Demian Gordon, motion-capture supervisor, "and search the ground for any that fell off." In addition, anything the actors touched had markers, even the tiny circles of paper punched out of the tickets by the conductor.
Imageworks’ character technical directors, led by Alberto Menache, the senior CG supervisor who designed the studio’s muscle-based facial performance system, mapped the placement of markers on the actors’ faces. Gordon’s team placed Vicon 1.3 megapixel motion-capture cameras on a rig out of the way of actors working on the small stage.
The cameras were configured so that each could see a minimum number of dots. Working with Imageworks, Vicon updated its system to manage the 64 streams of data, which ultimately would act like one compound eye. "No one had wired together and captured data with more than 24 cameras before," says Gordon. "But we had to zoom in quite close to get the nuances of the facial performance, so we had four 12-camera volumes stitched together for the face. Because we couldn’t ever know where the actors would stand, or whether an actor would look up or down, we had redundancy."
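That redundancy matters because optical systems such as Vicon’s reconstruct each marker’s 3D position by triangulating its 2D image across whichever cameras can see it at that instant; the more unobstructed views of a dot, the better conditioned the solve. The snippet below is the textbook direct-linear-transform version of that triangulation, included purely as a generic illustration rather than anything from Vicon’s or Imageworks’ software.

```python
import numpy as np

def triangulate_marker(projections, pixels):
    """Recover one marker's 3D position from its 2D images in two or more
    cameras (standard direct linear transform; generic, not Vicon's code)."""
    rows = []
    for P, (u, v) in zip(projections, pixels):   # P is a 3x4 camera matrix
        rows.append(u * P[2] - P[0])             # u * (third row) - first row
        rows.append(v * P[2] - P[1])             # v * (third row) - second row
    # The marker position is the null space of the stacked equations.
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    X = Vt[-1]
    return X[:3] / X[3]                          # homogeneous -> Euclidean
```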
In addition to dots on the actors’ faces and bodies, the team put markers on props and sets, which were made of chicken wire or mesh so they wouldn’t hide dots from the motion capture cameras. Sometimes the actors even wore transparent costumes.
Every set was built in two sizes. When Hanks played the conductor, the crew used a life-sized set and props. When he or the other adult actors played children, the production used larger sets and props to make them look child-sized. Props to help the actors were rolled in and out, including a door that slid open like a train door and a tilting platform for scenes on top of the train. "It was tricky to keep track of everything," says Gordon. "We had normal people and little people playing elves and children and adults acting like children."
A group of people called the triangulators kept track of eyelines, sets, props, and problems having to do with scale, such as calculating how big an actor’s steps should be when playing a child so the motion-captured performance would look correct once it was scaled down.
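The article doesn’t spell out the triangulators’ arithmetic, but the core of it is simple proportion: if an adult performer is retargeted onto a child roughly two-thirds of the performer’s height, the character is scaled down by that ratio, the on-stage set is scaled up by its inverse, and the performer’s stride has to shrink by the same factor for the result to read correctly. A toy version of that bookkeeping follows; the numbers and names are illustrative only.

```python
def scale_factors(adult_height_cm, child_height_cm):
    """Proportional bookkeeping for an adult performing a child's role.
    Illustrative only; not Imageworks' actual method."""
    ratio = child_height_cm / adult_height_cm      # e.g. 120 / 183 ~ 0.66
    return {
        "character_scale": ratio,                  # shrink the captured skeleton
        "set_scale": 1.0 / ratio,                  # oversize the on-stage props and sets
        "stride_scale": ratio,                     # target step length for the performer
    }

# A 183 cm actor playing a 120 cm child should take steps about two-thirds
# as long, on a set built roughly 1.5 times life size.
print(scale_factors(183.0, 120.0))
```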
Because adults performed all the roles of children, when Hanks acted as an adult, the scenes were motion-captured twice: once with children as stand-ins and once with the adults playing the children. When he interacted with another character he was playing – in one scene, for example, the conductor hands a ticket to Chris – Hanks worked with stand-ins. "We didn’t keep the children’s performances unless they were specifically called for," says Gordon. Actors moved from one motion-capture stage to another, depending on what the scene called for.
"The production crew had a giant tinker-toy kit that they could run in and assemble in any set at any time," says Gordon. " Bob wanted to shoot in continuity. It helped the actors keep their heads around where they were and who they were. Even though there were wireframe sets and props, the stages were still vague."
Says Chen, "The sets looked like the holodeck from Star Trek, everything grey and black with high-tech-looking lights. And then 20 feet away, inside the Widow Maker, there were all these people working on computers."
The Widow Maker
The Widow Maker was a 15-foot-wide, 50-foot-long room with desks on either side where 18 people from Giant and Imageworks checked the streaming data to make sure the performances were captured. Each take often involved as much as three gigabytes of data.
"To put this in context," says Gordon, who was also motion-capture supervisor for the last two Matrix films, "on The Matrix, six months of motion-capture data, every single file and every working version of every file I was involved in shooting totaled slightly less than 10 GB of data. On Polar, we shot 50 Gigs a day." Imageworks tweaked House of Moves’ Diva software to handle the data coming in and then tweaked Windows to handle the large file sizes. (House of Moves was acquired by Vicon last spring.)
"We could capture only three minutes at a time because there was so much data," says Chen. "Then we’d wait for a minute and a half to get confirmation that we got a good take. It was so stressful. Even though this had never been done before, everyone expected it would work. I had a walkie-talkie with an earwig to hear what was going on in the control room, but I took it off. They were always on the verge of it not working."
All the while Zemeckis was directing the actors being motion-captured, as many as 12 cameramen were filming the performances. "If you’re not filming with a real camera, it can get confusing," says Chen. "We needed to be conscious of eye lines, of screen direction. [DP] Don Burgess was on set with us making notes for us to use later."
At the end of the day, Zemeckis took the video reference from the motion-capture sessions and began picking performances. "He wasn’t thinking of camera angles yet," says Chen. "He was looking at the way Tom ran in, the way he delivered a line."
Meanwhile, at Imageworks, the tracking and integration departments applied motion-capture data that matched Zemeckis’ selected performances to digital characters. At this point, the facial data was separated from the body data. The facial data was stabilized and attached to 300 muscles in Imageworks’ procedural facial system and sent down one pipeline. The body data, whether captured by Giant studios on the large stages or by the new Imageworks system on the small stage, was attached to digital characters in Alias Motionbuilder. Those digital characters, roughly animated with the captured data, were put into a digital set in Motionbuilder that matched the chicken-wire sets and props.
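The stabilization step that split depends on can be pictured simply: solve for the rigid motion of the skull from a few markers that don’t deform (forehead, temples), then subtract that motion from every facial marker so only expression remains before the data reaches the muscle system. The sketch below uses the standard Kabsch best-fit rotation to show the idea; the marker bookkeeping and function names are assumptions, not Imageworks’ pipeline code.

```python
import numpy as np

def rigid_fit(src, dst):
    """Best-fit rotation R and translation t mapping points src onto dst
    (Kabsch algorithm)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c

def stabilize_face(frame_markers, neutral_markers, rigid_ids):
    """Remove gross head motion so only facial deformation remains.
    rigid_ids indexes markers assumed not to deform (e.g. forehead, temples)."""
    R, t = rigid_fit(frame_markers[rigid_ids], neutral_markers[rigid_ids])
    # Carry every facial marker into the skull's neutral reference frame.
    return (R @ frame_markers.T).T + t
```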
Virtual Remote Control
Zemeckis used the characters in that set to create camera moves. "The movie had to have a cinematic feel, like someone aimed a camera, not like animators keyframed a camera," says Chen. "So, because I wanted a real camera operator to operate the virtual camera, we gave Don Burgess a remote head like on a Technocrane." The resulting input device, which was connected to a computer running Motionbuilder, had one wheel that tilted the virtual camera and another wheel that panned it. The dolly move was keyframed. The entire system was dubbed Wheels.
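The Wheels rig itself isn’t documented in detail, but the mapping Chen describes is the familiar geared-head one: each wheel’s rotation accumulates into one axis of the virtual camera, pan or tilt, while the dolly position comes from a keyframed curve. A schematic version, with made-up gear ratios and names, might look like this:

```python
class VirtualWheels:
    """Geared-head style virtual camera: two wheels feed pan and tilt, the
    dolly move is keyframed. Illustrative sketch, not the actual Wheels system."""

    def __init__(self, pan_gear=0.25, tilt_gear=0.25):
        self.pan = 0.0                 # accumulated yaw, in degrees
        self.tilt = 0.0                # accumulated pitch, in degrees
        self.pan_gear = pan_gear       # degrees of camera move per degree of wheel
        self.tilt_gear = tilt_gear

    def turn_wheels(self, pan_wheel_deg, tilt_wheel_deg):
        """Accumulate wheel rotation into camera orientation, the way an
        operator rides the wheels on a real remote head."""
        self.pan += pan_wheel_deg * self.pan_gear
        self.tilt += tilt_wheel_deg * self.tilt_gear

    def camera_state(self, frame, dolly_curve):
        """Combine the live wheel orientation with the keyframed dolly position."""
        return {"position": dolly_curve(frame), "pan": self.pan, "tilt": self.tilt}
```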
When the film’s second DP, Rob Presley, operated the virtual camera in Wheels, he did so using the same devices he uses in the real world. "The only difference was that he was panning and tilting with a digital character, not a real one," Chen says. "Bob could take a master shot, close-ups, over the shoulders, all that stuff he normally would do. He shot the film like a live-action movie, but in the computer."
Because the figures in the real-time digital set initially included the body performance only, the crew put the facial performances Zemeckis had selected around the perimeter of the shot, where they appeared as disembodied heads acting at the edges of the computer screen.
"Generally, Bob chose the facial performance on the set," says Ralston. "He had to, because when Tom the conductor talks to the little boy who is also Tom, Tom needed to play to the take of the conductor that’s driving the scene."
Separately, an Imageworks crew was refining digital characters’ structure and appearance. To create the models, all the human actors were cyber-scanned in costume, including Hanks wearing make-up for the various characters he played and the 24 kids on the train. Although an actor was scanned for Chris, the star, the final character was a hybrid. The characters’ bodies and skin were subdivision surfaces; the cloth in their costumes was created with NURBS.
"The models weren’t limited by the motion-capture," says Sean Phillips, senior CG supervisor. "We could use the data from any actor to drive any character and could map the facial performance from one actor to another."
Mapping the Muscles
The motion-capture data was mapped to the digital models’ 300 facial muscles using a system developed by Menache. "With a neutral face, there is no compression in any muscle, but when the face starts moving, a muscle starts compressing and expanding and those are the values I convert the data to," he explains. "The conversion outputs all the compression values for every muscle in the face given the state of the data."
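Menache doesn’t give the formula, but his description implies something along these lines: for each muscle, compare the distance between its end markers in the current, stabilized frame with the same distance in the neutral pose, and output the fractional shortening or stretching. The sketch below is one reading of that description; the marker pairing and names are assumptions, not the facial system itself.

```python
import numpy as np

def muscle_compression(frame, neutral, muscle_markers):
    """Convert stabilized facial markers into per-muscle compression values.
    Zero means neutral; positive means the muscle has shortened, negative
    means it has stretched. Illustrative reading of Menache's description."""
    values = {}
    for muscle, (a, b) in muscle_markers.items():    # marker indices at each end
        rest = np.linalg.norm(neutral[a] - neutral[b])
        now = np.linalg.norm(frame[a] - frame[b])
        values[muscle] = (rest - now) / rest          # fraction of the rest length
    return values
```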
Each character had its own unique set of muscles. "I took 80 pictures of Tom Hanks creating different facial expressions and then traced where I thought his muscles would be on waxed paper to come up with his anatomy," Menache says. The other characters have the same number of muscles, but the placement varies. To create individual anatomies, character technical directors put digital markers on the faces of the CG models and then ran a program that created muscles to match. The same arrangement of reflective dots was applied to the actor on set.
The facial performance could always be refined by animators. "They can go deep into the muscles and move a particular control or a group of muscles," Menache says. "To lift the right side of the mouth would involve 10 muscles, but just one control."
The bodies were rigged by character set-up supervisor J.J. Blumenkranz using an animation control system in Maya that allowed animators to work "above and beyond" the motion capture curves, according to Menache.
Moving the characters’ clothes and hair was another issue. "We had 40 characters to set up at the beginning and often had three or four characters per shot throughout the entire film," says Rob Bredow, senior CG supervisor. Rather than using the Maya cloth-based simulation engine developed for Stuart Little, the team developed a new "object cloth"-based pipeline with the help of Alias that could handle layers of clothes more efficiently.
For hair, they used a proprietary system based on subsets of hairs, an approach that’s similar to typical guide-hair-based simulation systems but different in certain proprietary ways that Imageworks is unwilling to reveal. "We have several simulation engines that we use depending on whether the character is in the background or foreground," explains Bredow.
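Imageworks won’t say what makes its variant proprietary, but the generic guide-hair technique it compares itself to works like this: simulate only a sparse set of guide hairs, then build every rendered hair by blending the nearest guides, weighted by distance from the hair’s root. The sketch below shows that generic interpolation only, not Imageworks’ system.

```python
import numpy as np

def interpolate_hair(root, guide_roots, guide_curves, k=3):
    """Build one rendered hair by blending the k nearest simulated guide hairs,
    weighted by inverse distance from the hair's root. Generic technique, not
    Imageworks' proprietary variant."""
    dists = np.linalg.norm(guide_roots - root, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + 1e-6)
    weights /= weights.sum()
    # Each guide curve is an (n_points, 3) array of offsets from its own root.
    blended = sum(w * guide_curves[i] for w, i in zip(weights, nearest))
    return root + blended          # translate the blended shape to this hair's root
```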
Texture maps and shaders, including ones that created subsurface scattering of light in the skin, controlled the appearance of the characters. "We developed special shaders for the hair on 18 of the kids and six adults, and their eyes also needed their own shaders," says Lambert.
All of which contributed to the realization of storybook-style animated characters that were directed on stage by Zemeckis. "The biggest, most impossible challenge of the movie was making performances that weren’t cartoons," says Ralston. "Jerome and I spent a lot of time analyzing the faces, the eyes, how they were lit, how far we could take them without getting dead-on real, because real wasn’t the mandate. But it isn’t a cartoon, either. It’s a live-action film that was done in a computer. True human emotion had to come through.
"If you don’t think that’s hard," he adds, "just try and do it."