Balls to learning how to animate, let's film some parkour!

27 August 2016

Overlay of live footage with rotoscoping

Hallå! Welcome to my first game devlog.

I figured I'd start out talking about graphics, as they are a large source of anxiety for first-time developers. And not without reason! Everyone is a judgy bastard when it comes to how your low-budget game is presented. Oh sure, it's become easier for a solo developer to mask the familiar, rancid stench of programmer art with the equivalent of some pine-scented air freshener, but in practice this is a huge creative tradeoff. It really pays off to have a unique and consistent visual style in your work, even if you're not a proper artist.

Which I'm not. Lord no. What little visual talent I have is spread thinly between drafting/technical drawing (a survival skill picked up working as an engineer) and sandwich-grade graphic design. Oh and I can maybe do a scratchy pencil test of a background at 1/8th the speed of a regular artist. Drawing actual characters and animating them? That's just not going to happen.

But I wanted to do both of those things! So I figured the best way to fly closer to the sun would be to create an art and asset pipeline that played to my strengths, making the process for churning out Fine Art™ as close as possible to drafting. My weapon of choice for Fine Art™ is the vector editor Inkscape; very easy to pick up, and with a little practice can produce some polished results. I especially like the CAD-like pixel-perfect control you can use to make things like schematics or icons. Saved my bacon loads of times.

The only downside is that Inkscape, while wonderful, is not an animation editor. And even if it was, there was still the matter of sprouting talent from somewhere to draw realistic human movement. And I wasn't satisfied with the options of buying pre-fabbed animation packs from an asset store (artistic license strongly urges me to include a diverse range of player movements) or even commissioning someone competent to do it all (artistic license strongly urges me to be a cheap bastard). If only there was a way for dunce programmers like myself to cheat their way across this insurmountable art gap...

Rotoscoping

Running animation from Prince of Persia

Let's wind the clock back to Prince of Persia (yes that one, from 1989); critically acclaimed, and one of the best-known examples in computer game history of programmer art surpassing regular art. Along with its predecessor Karateka, POP was one of the first home video games to use rotoscoped animation. And just look at it! Even on the Apple II with only 4 colours it's mind-blowingly detailed and smooth!

Photos used for Prince of Persia rotoscoping

But how was it done? The ingenious process is detailed in Jordan Mechner's "Making of Prince of Persia" diary:

  • Mechner would film his brother David performing a movement (e.g. climbing onto a wall) freehand with a VHS camcorder.
  • The cassette was taken to a professional edit suite with a space-age VHS deck that supported freeze-frame, and for every frame of animation at 15fps, Mechner would take a film photograph of the screen.
  • After the roll of film was developed, Mechner would hand-draw a high-contrast monochrome silhouette on each photo: blacking out the background with permanent marker and using white paint to add in shape detail rendered blurry by the video.
  • Finally, the finished sheet of silhouettes was mounted in front of the camcorder. The analogue video out was fed into an Apple II fitted with a GenLock device; no fancy editing software here, all it did was replace the black background of the Apple II screen with the live video, whereupon Mechner painted over the image again, but digitally!

Cover art of Another World

Another similar use of rotoscoping was by Eric Chahi in his masterpiece Another World. The setup was similar to Mechner's in that Chahi would film himself with a tripod-mounted VHS camcorder. The main difference is that he owned a VHS deck with freeze-frame capability; he had this hooked up to his Amiga GenLock and would draw polygons over each frozen frame of the recording from the VHS cassette. In his notes Chahi describes this process as "a time attack race": a paused VHS deck stops showing the picture after a minute or so to avoid ruining the playback heads. This setup was mostly used to animate closeups and cutscenes; by this point Chahi was already an accomplished game artist, and from what I gather he drew a lot of the player movement animations freehand.

Comparison of Another World raw/rotoscoped footage

Instead of pixel-based bitmap graphics, most of the art was made with solid-filled polygons. At the time it was an innovative way to fit cinematic visuals into two floppy disks' worth of space. Indeed, publisher Delphine Software found massive success using the same style in their next game Flashback, shortly before dooming the company with Shaquille O'Neal fighting simulator Shaq-Fu.

In the Another World engine, animation is stored as a fixed-FPS series of static keyframes. That is to say, every frame is a dumb list of pre-baked polygons. This can be rendered extremely quickly; no preprocessing or transformation is required, which is really good news on pride-of-the-90s computer hardware with no floating-point processor. But this format has the same limitation as sprites: each animation has one native speed (~15fps), which looks jarring if the refresh rate of the device is higher (e.g. mobile devices run at 60fps). It can also be a gruelling process for the animator, who has to spend much of their time redrawing frame after frame of largely the same collection of body parts.
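
To make that concrete, here's my guess at the shape of such a format (illustrative Python, not Chahi's actual data structures): every frame is literally just a list of coloured polygons, and rendering is one dumb loop.

    # A keyframe as a flat list of (palette colour, vertex list) polygons.
    # Integer coordinates only -- no floats, no transforms, 80s-friendly.
    FRAME = [
        (3, [(10, 40), (14, 22), (18, 41)]),  # e.g. a thigh
        (3, [(18, 41), (20, 60), (15, 61)]),  # e.g. a shin
    ]

    def draw_frame(frame, fill_polygon):
        """fill_polygon(colour, vertices) is whatever primitive the platform offers."""
        for colour, vertices in frame:
            fill_polygon(colour, vertices)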

Regardless, I really liked this visual style of simple filled polygons. Most projects nowadays use flat-shaded 3D models to achieve this look (e.g. Toryansé, Kentucky Route Zero). In fact, if you have any measurable skill in the fields of modelling and rigging, this approach is considerably cheaper than creating 2D animation frames. (I'm unburdened by either, so pure expensive 2D it is!)

Skeletal animation

3D games have moved away from static animation techniques and instead rely on skeletal animation: that is, the skin of the character is created in 3D from vertices joined up as triangles, then rigged around a series of bones and joints which define the ranges of movement for each appendage. Bones and joints are just a convenient analogy to something we are already familiar with: the musculoskeletal system of an animal. In reality, they are a stack of mathematical operations used by the renderer to warp the vertices of the skin in real time.
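
In 2D the core idea is small enough to sketch in a few lines. This is a minimal toy (my own naming, not any particular engine's API): each bone stores just a rotation and a length relative to its parent, and the renderer accumulates these up the chain to place the skin every frame.

    import math

    class Bone:
        """A 2D bone: a rotation and a length, relative to its parent."""
        def __init__(self, parent, angle, length):
            self.parent, self.angle, self.length = parent, angle, length

        def world(self):
            """Absolute angle and pivot position, accumulated up the parent chain."""
            if self.parent is None:
                return self.angle, (0.0, 0.0)
            p_angle, (px, py) = self.parent.world()
            # This bone pivots at the tip of its parent.
            return p_angle + self.angle, (px + self.parent.length * math.cos(p_angle),
                                          py + self.parent.length * math.sin(p_angle))

    # Posing the arm means touching two numbers, not every skin vertex:
    upper = Bone(None, math.radians(-45), 0.30)
    fore = Bone(upper, math.radians(30), 0.25)
    angle, pivot = fore.world()  # the renderer warps the forearm's vertices with these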

What does this mean? Well, with Chahi's approach, for each animation keyframe you'd be setting the position of every vertex in the model. It's cheap to compute, but if you have lots of geometry you will waste a ton of time streaming the data in for every single frame. Also, as mentioned before, a fixed-FPS animation will look stuttery if your engine is drawing the rest of the world at a faster framerate.

Skeletal animation solves both these problems; you upload a single copy of the character model into memory, and your renderer uses a tiny list of bones to transform said model on the fly. Better still, you can make the animation buttery smooth by interpolating between frames. The traditional cheapo way of doing this was linear interpolation (storing each bone frame as two points in (x, y) space); nowadays 2D animators prefer polar interpolation (storing each bone frame as a rotation angle and a length). Here's a demo of how both styles look from the same low-FPS source material:

Comparison of linear and polar interpolation

At first it doesn't look like much of a difference, but pay close attention to the length of the character's legs around the knee joint as they swing over the bar. The linear tween looks weird and telescopy, because the lengths of the bones aren't preserved during the diagonal point-to-point movement. And as rotations and scaling are both fairly cheap and easy to do on today's graphics hardware, going for polar is a no-brainer.
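
If you want to see the telescoping with numbers, here's a back-of-the-envelope check in plain Python: tween a 1 metre bone halfway through a 90 degree swing both ways, then measure it.

    import math

    key_a = (math.radians(0), 1.0)   # (rotation angle, length) at one keyframe
    key_b = (math.radians(90), 1.0)  # same bone a keyframe later, swung 90 degrees
    t = 0.5                          # halfway between the two keyframes

    def endpoint(angle, length):
        return (length * math.cos(angle), length * math.sin(angle))

    # Linear: tween the endpoint in (x, y) space.
    (ax, ay), (bx, by) = endpoint(*key_a), endpoint(*key_b)
    lx, ly = ax + (bx - ax) * t, ay + (by - ay) * t
    print(math.hypot(lx, ly))  # ~0.707 -- the bone has telescoped nearly 30% shorter!

    # Polar: tween the angle and the length separately.
    angle = key_a[0] + (key_b[0] - key_a[0]) * t
    length = key_a[1] + (key_b[1] - key_a[1]) * t
    px, py = endpoint(angle, length)
    print(math.hypot(px, py))  # exactly 1.0 -- bone length preserved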

Writing your own tools

If you haven't guessed from all that foreshadowing, my plan is to use rotoscoping, 2D vector graphics and skeletal animation. To my advantage I have the luxury of writing my engine for today's graphics hardware, and the technology to capture and edit HD video is super cheap.

There was one missing piece: tooling. When I was starting out in 2012, what I really needed was an animation tool that could do the following:

  • Construct a 2D skeleton with named bones
  • Use a feed of video frames as a background
  • Allow binding vector assets to each bone and editing them frame-by-frame
  • Provide some simple controls for adding metadata (e.g. set the relative scale, define a fixed point in space like a ledge, define the distance covered by a walk cycle)
  • Output all of the above in some format I can use in my asset pipeline

You'll be shocked to learn that I couldn't find a tool that did these things! The closest options were Flash, which was a fairly ugly choice to contemplate, and 3D animation tools like Maya or Blender, which (after a few hours of YouTube tutorials encouraging me to unlearn all of my Inkscape know-how) I had zero interest in learning.

If I were starting out again I would probably consider Spine for skeletal animation; they seem to have fixed nearly all of the grievances people had with Flash, and it supports a lot of game engines out of the box. No idea if it supports rotoscoping, but the workflow looks pretty solid. Also worth checking out is the rotoscoping paint tool Paint of Persia, if pixel art is your forte.

In the end I wrote my own animation tool, jaiwhite. It's spartan: it has the above features and nothing else. It's not the best tool, but by writing it I learned a fair amount about animation formats and making the data usable in a game engine.

The first thing I ever animated with it was a clip from the film Caddyshack: the iconic "Start a Party With a Radio in Your Golf Bag" scene. I rotoscoped it at the full 24fps of the source material; here's a playback capture at 48fps:

Rodney Dangerfield starting a party the only way he knows how

That went quite well, if a little jittery, so I moved onto phase 2: filming parkour.

Le Parkour

I have been training as a traceur for 4 years. At some stage I intend to write a full piece about how parkour has changed my life for the better, and how much I love the philosophy behind it, and how the wider parkour community is one of the most wonderfully diverse, amazing, inclusive groups of humans you'll ever come across. For now I'll leave you with this: if you've ever thought about trying parkour, go for it. Find a local parkour association and see if they do classes. You absolutely do not have to be a tank to start out, but if you persist you will become a tank. And for the love of God ignore everything on YouTube.

Where was I? Oh yeah. Filming. I used the following:

  • 1x computer (Toshiba US build-a-PC)
  • 1x Canon Vixia HF M500 NTSC camcorder + cheap SD card (Amazon US)
  • 1x Fancier FT-6222A tripod (mystery HK seller on eBay)
  • 1x Lufkin Autolock 8m tape measure (no idea, maybe Bunnings?)

The camera I have is not the best. It films in 720p30 and has lots of motion blur, but it's more than acceptable for low-rent mocap. If you have sunk all your money into nursing an expensive photography habit, a DSLR with a rapid-fire capture feature makes nice crisp frames to animate with. I wouldn't recommend using a GoPro or other fixed-focus action camera, as the fisheye lens would distort the footage too much.

After a few practice runs I found a good rhythm for recording. I'd go to a place with an unobstructed view of an obstacle, set the camera up to frame the obstacle square-on with plenty of side margins to capture the movement's entrance and exit, mark out 1 metre on the ground with the tape measure, and then go ahead and perform the same action multiple times. After each take I would run past the camera and double-tap the record button to switch to a new file, then immediately go back for another shot.

This worked out pretty well. When you drill a movement in parkour there's a sort of bell curve for smoothness and control; the first attempts are a little rusty as you size up the obstacle, but after a few tries muscle memory will outgrow hesitation and you can focus entirely on form. Until you fatigue, at which point form is the first thing to get thrown under the bus.

Editing

Great! We have an SD card full of parkour footage. What comes next?

File manager window full of parkour videos

If your answer was "some light filing", have a biscuit. Nothing fancy, just renaming all of the takes from today to have the name of the movement. Now out of the footage collected, pick a movement and go find the least silly-looking take. Eventually you'll realise that there aren't any (because you're an adult) and just guess which one'll animate the cleanest. I use ffmpeg to convert the video to a series of still images to use as keyframes.
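
The conversion itself is basically a single ffmpeg invocation; here's the shape of it, wrapped in Python for pipeline duty (filenames are hypothetical, and the fps filter lets you thin the source framerate down at extraction time):

    import subprocess

    # "-vf fps=10" resamples the 30fps footage down to 10fps stills,
    # and %04d numbers the output frames in sequence.
    subprocess.run([
        "ffmpeg", "-i", "dash-vault-take4.mts",
        "-vf", "fps=10",
        "frames/dash-vault-%04d.png",
    ], check=True)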

For this runthrough I've picked a dash vault. On a practical level these are pretty useless; a speed vault or cat pass will do just as well for clearing an obstacle, with a much lower risk of catching both your feet on the edge and faceplanting. But they do wonders for your self-esteem and I'm about as shallow as they come! On a more technical level this is a good animation to run as a test because all of the limb movements happen exactly (if you squint) along the viewing plane, meaning a single copy of the limbs drawn in at a side-facing angle should be enough for the whole animation. Other more complicated movements (e.g. lazy vaults, rolls) will have limbs overtly pointing towards/away from the screen, which might be better handled with multiple angles.

JaiWhite editing session mid animation

Let's run through the rotoscoping process. I start out by marking some useful points on the ground: the length of the 1 metre tape so that each animation has the same scale, and the start and finish points of the animation to create a walk cycle. There are other markers for e.g. ledges, to indicate a climbing animation around a fixed point, but we don't need them here. Next, the bones are drawn in. My model has 19 bones: 4 for each limb, 2 for the back, and 1 for the head. The bones are stored as a flat tree structure, with each keyframe storing relative polar coordinates plus depth for all the bones. I set the base of the neck as the "anchor" point from which all of the bones are defined, and ground level between the feet as the (x, y) origin of the model to match the physics engine.
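
Boiled down to code, a keyframe looks roughly like this (a sketch only; the field names are illustrative, not the literal schema):

    from dataclasses import dataclass

    @dataclass
    class BonePose:
        angle: float   # rotation relative to the parent bone
        length: float  # bone length (non-fixed, so it can track the footage)
        depth: int     # draw order, so overlapping limbs layer correctly

    @dataclass
    class Keyframe:
        anchor: tuple  # (x, y) of the neck base, relative to the ground-level origin
        poses: dict    # bone name ("head", "forearm-l", ...) -> BonePose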

There's a common naming scheme for the bones (e.g. head, forearm-l) so that multiple animations can use the same skeleton tree. We can make a reverse copy of a directional animation for free by replacing "-l" with "-r" and inverting all the x coordinates. If the animation needs to span two depth planes (e.g. that lazy vault animation I showed earlier), there's a hack where you can add a non-functional "plane" bone to represent the barrier.
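
The flip itself is almost insultingly cheap. Here's the idea as a sketch (shown on Cartesian endpoints for clarity; in the polar format the equivalent is reflecting the angles):

    def mirror_name(name):
        """'forearm-l' <-> 'forearm-r'; unsuffixed bones (head, back) pass through."""
        if name.endswith("-l"):
            return name[:-2] + "-r"
        if name.endswith("-r"):
            return name[:-2] + "-l"
        return name

    def mirror_pose(points):
        """points: bone name -> (x, y). Swap left/right names and invert x."""
        return {mirror_name(name): (-x, y) for name, (x, y) in points.items()}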

After that, you pose the skeleton to match each frame of footage. Unlike Spine, jaiwhite has zero safeguards. You drag the nodes around until the pose looks okay. Plus there's no undo! I'm guessing this is very unhelpful for consistency if you're animating out of thin air? But we aren't, so... eat that, nerds! For my 30fps source video, I found that sampling at 10fps kept a lot of detail while smoothing out the jitter.

Rigging

And now the hardest part by far: drawing a vector representation of the body in Inkscape, and mapping it to the skeleton. By "drawing" I of course mean "noob tracing". All of the above was based on stuff I did over the previous couple of years. This? This is brand new. I have put it off for ages.

It's important to expand on why I have put it off for ages. I kept thinking about where I wanted to be (quickly-prototypable, easy 2D rigging for my skeletons that I can churn out) in terms of what I already had (an animation format without the ability to store hints for the vector graphics format, an animation tool with semi-working support for drawing vector graphics, some dinky Python scripts), and kept getting hung up on how it was technically possible to move forward without changing anything! Just very, very unreasonably hard. And hey, it's not like there's a time limit to any of this...

That loud screeching sound was the whole project grinding to a dead stop. To get things back on track, I wrote down all of the roadblocks and a workaround strategy for each. Here's that list in full:

  • No animation file format support for binding graphics to bones
    • Add another int32 index to each bone movement frame, denoting which graphics frame to use for each limb. Even if we start out with only one perspective (i.e. set the index to 0 for all frames), there's enough wiggle room to scale up later.
    • Once a standard set of perspectives is picked, all a skin has to do is provide frames for each orientation.
  • jaiwhite support for editing graphics on the skeleton is fiddly and weak
    • Shelve jaiwhite graphics editing for now; we don't want it to be easy to add a lot of extra graphics frames, otherwise making new character skins will be a nightmare
    • Instead, pick a number of reference poses (e.g. front facing, profile, 45deg) and create graphics for these in Inkscape, because it's a good tool and you're good at that
    • To wit, figure out a way of storing the reference poses' bone information + graphics information in an SVG file, make a template pose in Inkscape, then bodge up another Python script to export this to the engine graphics format
    • Don't worry about getting multiple poses to tessellate with the same mesh, I'm sure Future Me won't mind cleaning up that particular mess
  • No game engine support for binding graphics to bones
    • Add engine stubs to load a skin, bind a graphics file containing all limbs (i.e. a complete skin) to an animation object, and render the correct graphics frame based on the bone frame
  • No vector graphics for player character yet
    • Well trace something, you munch!

Extending the animation file format was by far the easiest part; all my asset formats were done in Protocol Buffers, where you can write the schema once and get lovely type-checked C++/Python/whatever bindings for free. Adding an extra int32 for the graphics frame meant one additional line in the .proto file.
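
For the curious, the addition looked something like this (an illustrative sketch of the schema, with field names invented to match the keyframe sketch earlier; only the last line is new):

    message BonePose {
      float angle  = 1;  // relative polar coordinates, as before
      float length = 2;
      int32 depth  = 3;
      int32 graphics_frame = 4;  // the new line: which graphics frame to draw
    }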

The rest I bodged together with scripts. I wrote one to dump a single frame from my animation format into an SVG template, with one layer per bone and the bones drawn in as arrowed lines, and another to convert it back to my graphics format. Once that was set up I traced over each limb with low-res polygons. I stress that this is pure drafting: choosing a handful of points along the curves and connecting the dots. If I did not have the photo as a base I would be reeeeally struggling. If you were feeling lazy you could make a single arm and leg then copy them across, but as all my bone lengths were non-fixed I opted to draw them in twice.
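
The dump script is mostly just string-bashing an SVG together. A stripped-down sketch of the idea, using Inkscape's layer convention (an svg:g element tagged with inkscape:groupmode="layer"), with the arrowheads and metadata left out:

    import xml.etree.ElementTree as ET

    SVG = "http://www.w3.org/2000/svg"
    INK = "http://www.inkscape.org/namespaces/inkscape"
    ET.register_namespace("", SVG)
    ET.register_namespace("inkscape", INK)

    def dump_pose_svg(bones, path):
        """bones: name -> ((x1, y1), (x2, y2)) endpoints in px. One Inkscape
        layer per bone, each seeded with a guide line to trace over."""
        svg = ET.Element("{%s}svg" % SVG, width="1000", height="1000")
        for name, ((x1, y1), (x2, y2)) in bones.items():
            layer = ET.SubElement(svg, "{%s}g" % SVG, {
                "{%s}groupmode" % INK: "layer",
                "{%s}label" % INK: name,
            })
            ET.SubElement(layer, "{%s}line" % SVG,
                          x1=str(x1), y1=str(y1), x2=str(x2), y2=str(y2),
                          style="stroke:#000;stroke-width:2")
        ET.ElementTree(svg).write(path)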

Editing the character skin in Inkscape

There we have it, a perfect likeness! Now we just go through the motions of binding those shapes to the bones we drew in earlier, load it all into the engine, and we should have a finished product that's ready to ship!

Wrongly proportioned animation plays back at weird speed

Well, maybe if you work for Activision! Hi and welcome to game development.

The thickness was a bit of a surprise. I did a thorough recheck of the source material to make sure I hadn't merely eaten too many pies that day, but it turned out I had forgotten to add the metre-autoscaling feature to the skin compiler. As in, the axis parallel to each bone was saved okay, but the perpendicular axis was saved in whatever weirdo non-metric moon units the SVG was drawn in, not the engine's scale of 1 metre : 100 px.
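
The fix was to measure a px-per-metre ratio off the drawn skeleton and divide it through both axes. Roughly this (names made up):

    ENGINE_PX_PER_METRE = 100.0  # the engine's fixed 1 metre : 100 px scale

    def svg_to_engine(points, svg_px_per_metre):
        """Rescale traced SVG points so *both* axes land in engine units.
        svg_px_per_metre comes from measuring a bone of known length in the SVG."""
        s = ENGINE_PX_PER_METRE / svg_px_per_metre
        return [(x * s, y * s) for x, y in points]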

After fixing up the scaling, colour palette, playback speed, missing arm, deformed limbs, joint seam ordering and edge lighting, it looked more like this!

Finished dash vault animation

Conclusions

I think this is an okay result, coming from a programmer with coarse drawing ability and zero prior experience of animating things. Rotoscoping is a powerful technique that hasn't aged a bit, and is incredibly cheap compared to proper motion capture. I am at a loss as to why it has dropped in popularity; maybe it's the aforementioned lack of good tools?

Okay, so perhaps this exact method for producing animation is hard to recommend. You do waste a lot of time dicking about with tooling and engine support; time that could be spent rotoscoping statically! In essence, for a tool-and-engine driven approach you need to be sure that [estimated time spent dicking about] < [estimated time agonizing over animating everything by hand].

Also you absolutely don't need to learn parkour for the filming part! [1] Rotoscoping is useful for capturing any physical object or realistic human movement you care to name.

Anyway, the actual conclusion I wanted to bring up was that you should always try and take advantage of whatever tools and skillsets you have to hand. After all, part of the fun in amateur gamedev is to try a bit of everything yourself, even the stuff you suck at.

(TIGSource thread for this post)

[1] Seriously you should though, it's super fun and useful