This post is part of a development update series about my upcoming indie game, Non-Essential Personnel. I've been writing these every Friday for a few weeks now. So far so good. =)
Why are simple things so hard?
So, last week I got rock-throwing working. Hooray! Now the game has something to do. You can break rocks. You can pick them up. And you can throw them back at the ground.
The trouble is, the rocks look really bad when thrown, and it took me a long time to figure out why. They just did. It nagged at me.
It bothered me.
Screenshots of the game look fine though. My handy-dandy new screenshooter tool built into the engine makes taking screenshots a breeze now, so here's one of a rock in flight.
That little speck between Joe and the pile of other rocks? That's the flying rock.
It's supposed to be that size, that color, and in that spot... so what's wrong?
The trouble is, what my eyes see in the game and what the screenshot shows don't seem to agree.
After driving myself crazy trying to figure out why simple things like a few moving pixels look weird, I've discovered that it's because our eyes aren't really built to work well with video game rendering.
At first, I checked all kinds of things. This kind of problem is sometimes called ghosting, which can be caused by atmospheric transmission of television signals and bad monitors, among other things. It's probably not TV signals, but let's check the math on the bad monitor option.
My LCD monitor refreshes 60 times per second. The rendering engine is pushing out 60 fps to match (with vsync on), so that gives the monitor a little over 16 ms to update its little pixel-like things when a new frame is ready. My monitor has a lovely response time of about 3 ms, so that means the image should be sitting there on my screen for about 13 ms before changing again. There's no way the monitor could be the source of the ghosting then. As far as I can tell, my display is faithfully reproducing the images I send it. If I had a nice camera, I could take a picture of it, and I bet it would look pretty close to the screenshot. The "camera" on my phone doesn't seem to be up to the job though.
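Here's that arithmetic as a tiny Python sketch, in case you want to plug in your own monitor's numbers. The 60 Hz and 3 ms figures are the ones from above; the function names are just made up for illustration.

```python
# Frame-time budget for a display. The 60 Hz and 3 ms figures are the ones
# quoted in the text; the function names are made up for this sketch.
REFRESH_HZ = 60.0     # monitor refresh rate
RESPONSE_MS = 3.0     # pixel response time


def frame_period_ms(refresh_hz):
    """How long one frame is on screen, in milliseconds."""
    return 1000.0 / refresh_hz


def stable_time_ms(refresh_hz, response_ms):
    """Time left in each frame after the pixels have finished changing."""
    return frame_period_ms(refresh_hz) - response_ms


period = frame_period_ms(REFRESH_HZ)
stable = stable_time_ms(REFRESH_HZ, RESPONSE_MS)
print(f"frame period: {period:.2f} ms, settled image: {stable:.2f} ms")
```

If the second number ever came out negative, the pixels would never finish settling before the next frame showed up.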
Fun Fact Box: You can apparently buy a monitor with a 60 Hz refresh rate and a 25 ms response time. There doesn't seem to be much point, though, unless you wanted to use it as a billboard.
Ok, so if the monitor is faithfully showing one distinct frame about every 16 ms, then why do I see afterimages when the pixels start moving at anything faster than a leisurely pace?
Eyes are weird
Strangely enough, our eyes don't work on a concept of frames per second. Human vision is a continuous thing where signals of color accumulate in the brain and leave at a certain decay rate. This accumulation/decay phenomenon has the effect of blurring together colors over certain timescales.
What timescales, you ask? Well, a random photographer on the internet that ranked highly in the search results says colors get blurred over a time window of about 33 ms. That means, at any given moment in time, your eye/brain is processing about 33 ms worth of Stuff We See. Anything that moves fast enough in that time to look different from start to finish will look blurry to you.
Could this eye-induced blurring be responsible for our ghosting problem?
Maybe? Let's check the math.
In 33 ms, our game engine shows just about 2 frames' worth of Stuff We See, and depending on where that window falls, it overlaps 2 or 3 distinct frames. In the game, I see the rock where it's supposed to be, and an extra ghost rock or two, so that matches nicely. 2-3 rocks, 2-3 frames in the blurring timescale. The flying bit of block in this case is moving at about 14 pixels per frame, and since it's only 8 pixels wide, that's enough to look significantly different between frames. We haven't shot any huge holes in our theory just yet, so let's just run with it and see what happens.
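To double-check those numbers, here's a small Python sketch. It assumes the 33 ms window, 60 fps, 14 pixels per frame, and the 8 pixel rock from above; the variable names are mine.

```python
import math

FRAME_MS = 1000.0 / 60   # frame period at 60 fps
WINDOW_MS = 33.0         # blurring window quoted above
SPEED_PX = 14            # rock displacement per frame, in pixels
ROCK_PX = 8              # rock width, in pixels

# How many frame periods fit in the blurring window.
frames_in_window = WINDOW_MS / FRAME_MS

# A window always touches at least ceil(window / period) distinct frames,
# and one more than that when it straddles an extra frame boundary.
min_ghosts = math.ceil(frames_in_window)
max_ghosts = min_ghosts + 1

# Empty space between two consecutive images of the rock.
gap_px = SPEED_PX - ROCK_PX

print(f"{frames_in_window:.2f} frame periods per window")
print(f"{min_ghosts} to {max_ghosts} distinct rock images, {gap_px} px apart")
```

The positive gap is the important bit: consecutive images of the rock don't even overlap, so they read as separate rocks instead of one smeared rock.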
So the block is moving fast enough between frames to fall within the blurring timescale of our eyes. Then why does it look ghosted instead of... well, blurry?
It's because of how bits of rock move in artificial game worlds. What would happen if you threw a rock in the real world at the same speed? Try it out. Go to your back yard and throw a rock at about 14 pixels per frame and watch what happens.
The motion of rocks (and lots of other things) is continuous in the real world, rather than discrete like in the game world. Meaning, at every instant in time in the real world, a rock moving at a constant non-zero speed occupies a unique position in space. In the game world, at every instant in time, that rock occupies some position in space, but a lot of those positions are the same.
Words are hard. This is a lot easier to explain with pictures.
In our flying rock scenario, let's say the "position" axis is the y-position of the rock. A 33 ms blurring window (placed arbitrarily) is shown too. Now let's look at which positions the rock spends most of its time in within the 33 ms window. This ends up being a kind of continuous version of a histogram, and it's a rough idea of what our eyes see after accounting for the blurring.
In this time window in the real world, the rock spends most of its time at the greatest y-positions (i.e., the apex of the rock's flight trajectory). There's no time spent at positions greater than that, and a little bit less time is spent in lower y-positions as the rock falls.
In the game world, the rock spends all of its time at just three positions. There's no time spent at any positions in between them.
Bingo! Here's where our ghosting is coming from!
We're used to seeing things like the graph on the left. Smooth. Pretty. Continuous.
Stuff like the graph on the right looks weird to our eyes. Nothing in the macro-scale real world moves via a series of teleportations. Real objects have to occupy all the points in between too. So when our eyes blur discrete motion like that, it doesn't look like what we're used to. It looks like ghosts.
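If you'd rather see the two histograms as numbers, here's a rough Python sketch. To keep it short it assumes a rock moving at a constant 14 pixels per frame instead of the arc in the graphs, and it fakes continuous time with lots of tiny time steps. The window start time and all the names are arbitrary choices of mine.

```python
from collections import Counter

FRAME_MS = 1000.0 / 60
WINDOW_MS = 33.0
SPEED_PX_PER_FRAME = 14.0
SUBSTEPS = 3300   # fine time sampling standing in for continuous time


def real_position(t_ms):
    """Continuous motion: the position changes at every instant."""
    return SPEED_PX_PER_FRAME * t_ms / FRAME_MS


def game_position(t_ms):
    """Discrete motion: the position only changes when a new frame starts."""
    frame_index = int(t_ms // FRAME_MS)
    return SPEED_PX_PER_FRAME * frame_index


def occupancy(position_fn, start_ms):
    """Milliseconds spent in each pixel-wide bin during the blurring window."""
    counts = Counter()
    dt = WINDOW_MS / SUBSTEPS
    for i in range(SUBSTEPS):
        t = start_ms + (i + 0.5) * dt
        counts[int(position_fn(t))] += dt
    return counts


start = 10.0   # arbitrary placement of the window
real = occupancy(real_position, start)
game = occupancy(game_position, start)
print("real world bins occupied:", len(real))
print("game world bins occupied:", sorted(game))
```

The real-world rock spreads its time thinly over every pixel along the path, while the game-world rock piles all of its time onto three spots.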
If we want our game engine to stop looking weird, we need to find a way to turn the right graph into a graph more like the left graph.
Post-processing to the rescue!
Since our game objects move discretely, there's no way the motion blurring done by our eyes/brain can ever look normal. Even if we had some kind of perfect continuous physics simulation, we'd still be limited by the refresh rate of the monitor, which is far too slow to make the natural motion blur look normal. That means the only way we can get objects to look like they're moving normally is to simulate motion blurring effects in our rendering pipeline.
The naive way to simulate motion blur is to just render the moving object at lots of extra points each frame and average all the resulting images together.
Actually rendering the objects multiple times is super expensive to do though. The overdraw is just horrendous, so let's just see how far we can get with approximations and hacks instead. =P
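For reference, the naive approach looks roughly like this in Python. It's a one-dimensional toy (a single row of pixels), not engine code, and the number of extra renders per frame is a number I picked for the example.

```python
WIDTH = 40        # width of the 1D "screen", in pixels
ROCK_PX = 8       # rock width
SPEED_PX = 14     # rock displacement per frame
SUBFRAMES = 14    # extra renders per frame (hypothetical choice)


def render(rock_x):
    """Render one sharp image: 1.0 where the rock covers a pixel, else 0.0."""
    return [1.0 if rock_x <= x < rock_x + ROCK_PX else 0.0 for x in range(WIDTH)]


def render_blurred(start_x):
    """Average SUBFRAMES sharp renders spread across one frame of motion."""
    accum = [0.0] * WIDTH
    for i in range(SUBFRAMES):
        x = start_x + round(i * SPEED_PX / SUBFRAMES)
        for px, value in enumerate(render(x)):
            accum[px] += value / SUBFRAMES
    return accum


blurred = render_blurred(5)
print(" ".join(f"{v:.2f}" for v in blurred))
```

The result is a nice ramp up and back down along the path, but it costs SUBFRAMES full renders for every frame shown, which is where the overdraw comes from.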
To simulate motion blur, NEP uses some post-processing shader tricks. The basic idea is to send information about color and velocity to a post processing shader, which uses some math to blur the colors based on the velocities. Getting color to the shader is easy since that's the whole point of post-processing. To send velocities to the shader, we'll just pretend velocities are actually colors and use the same idea.
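Here's one way the "velocities are colors" trick could work, sketched in Python. This is an assumption on my part for illustration: it squeezes each velocity component into an 8-bit channel, and MAX_SPEED is a made-up limit. A floating-point texture would skip the squeezing entirely.

```python
MAX_SPEED = 32.0   # hypothetical largest speed worth representing, px/frame


def encode_velocity(vx, vy):
    """Pack a velocity into two 8-bit channels (think red and green)."""
    def to_byte(v):
        clamped = max(-MAX_SPEED, min(MAX_SPEED, v))
        # Map [-MAX_SPEED, MAX_SPEED] onto [0, 255].
        return round((clamped / MAX_SPEED * 0.5 + 0.5) * 255)
    return (to_byte(vx), to_byte(vy))


def decode_velocity(r, g):
    """Recover an approximate velocity from the two channels."""
    def from_byte(b):
        return (b / 255 - 0.5) * 2.0 * MAX_SPEED
    return (from_byte(r), from_byte(g))


encoded = encode_velocity(0.0, -14.0)
decoded = decode_velocity(*encoded)
print(encoded, decoded)
```

The round trip isn't exact. With these numbers each channel step is about a quarter of a pixel per frame, so even a zero velocity comes back slightly non-zero, and the blur shader would want a small threshold instead of an exact comparison with zero.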
When the game engine renders each frame, the colors are written to a texture in a frame buffer using the usual rendering shaders. To get velocity information, we just modify that rendering shader to write the velocities to another texture in the same frame buffer. This trick is generally known as multiple render targets (MRT), and it looks something like this:
We're actually rendering to multiple targets at once thanks to the marvels of modern programmable rendering pipelines. Then, both textures are sent to another shader in a post processing step to compute the actual motion blur.
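In case the idea is easier to follow as code, here's a toy Python version with plain lists standing in for the two textures. On a real GPU the fragment shader would declare two outputs instead; everything here (names, sizes, colors) is hypothetical.

```python
WIDTH, HEIGHT = 16, 8
BACKGROUND = (0, 0, 0)


def make_target(fill):
    """Create a HEIGHT x WIDTH 'texture' filled with one value."""
    return [[fill for _ in range(WIDTH)] for _ in range(HEIGHT)]


def draw_sprite(color_target, velocity_target, x, y, w, h, color, velocity):
    """One 'draw call' that writes two outputs for every covered pixel."""
    for row in range(max(0, y), min(HEIGHT, y + h)):
        for col in range(max(0, x), min(WIDTH, x + w)):
            color_target[row][col] = color         # output 0: what it looks like
            velocity_target[row][col] = velocity   # output 1: how it is moving


color_tex = make_target(BACKGROUND)
velocity_tex = make_target((0.0, 0.0))
draw_sprite(color_tex, velocity_tex, x=4, y=2, w=3, h=3,
            color=(120, 110, 100), velocity=(2.0, -1.0))
print(color_tex[3][5], velocity_tex[3][5])
```

The point is that the velocity texture lines up pixel-for-pixel with the color texture, so the post-processing step can look up "what color is here" and "how fast is it moving" with the same coordinates.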
Motion blur in shaders
There are lots of ways to approximate motion blur. One commonly used method applies blur to a pixel if it has a non-zero velocity. The blurring itself is done by sampling the color texture multiple times along the direction of the velocity and then averaging the colors together into the output pixel. It's a very cheap and efficient technique for motion blur, but it looks just awful in our flying rock case.
We actually want the rock to look like it's smeared along the velocity direction, so we need a way to blur pixels that are outside the rock too.
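To show why the cheap method falls over here, this is a one-dimensional Python sketch of it. It's my reading of the technique described above, not NEP's shader, and the sample count and sampling direction are assumptions.

```python
WIDTH = 32
ROCK_START, ROCK_PX = 10, 8
VELOCITY = 14          # px/frame, moving toward +x
SAMPLES = 8            # color samples taken per moving pixel (assumed)

# The two "textures": rock color on a black background, and per-pixel velocity.
color = [1.0 if ROCK_START <= x < ROCK_START + ROCK_PX else 0.0 for x in range(WIDTH)]
velocity = [VELOCITY if c > 0.0 else 0 for c in color]


def blur_moving_pixels_only(color, velocity):
    """Blur each moving pixel by averaging color samples along its velocity."""
    out = list(color)
    for x in range(WIDTH):
        v = velocity[x]
        if v == 0:
            continue                      # static pixels are left untouched
        total = 0.0
        for i in range(SAMPLES):
            # Step backward along the velocity, toward where the rock came from.
            sx = x - round(i * v / SAMPLES)
            sx = max(0, min(WIDTH - 1, sx))
            total += color[sx]
        out[x] = total / SAMPLES
    return out


result = blur_moving_pixels_only(color, velocity)
print(" ".join(f"{v:.2f}" for v in result))
```

The rock itself gets dimmer because it averages in a bunch of background, but every pixel outside the rock stays exactly as it was. No trail, just a faded rock.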
One way to do this is geometry stretching, but that just seems like waaaay too much work for a 2D engine where most of the geometry is quads with pixel-perfect textures and heavy alpha blending. It's simple enough to perturb the quad vertices in a vertex shader, but vertex normals aren't treated the same way in 2D rendering as they are in 3D rendering, and you'd need to somehow undo that transformation on the texture coords so the sprites (ironically) don't get distorted. That's not the kind of distortion we're going for.
A much older and simpler technique computes motion blur by searching around each pixel for nearby pixels in motion. Then the moving pixels are blended into the output pixel. This is the basic idea behind a convolution filter, but it's notoriously expensive to implement due to the large number of texture lookups required.
Ignoring the expense for a moment, once we've found the nearby moving pixels, we implement the motion blur in basically the same way as before. For each nearby moving pixel, we ask if its velocity vector reaches the output pixel. If no, we ignore that moving pixel. If yes, we sample the color texture at that position and average it with all the samples for this output pixel.
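Here's the search-the-neighborhood idea as a one-dimensional Python sketch. The "does its velocity reach me" test assumes the smear trails behind the moving pixel, and the output pixel's own color is counted as one of the samples; both are my guesses at the details.

```python
WIDTH = 48
ROCK_START, ROCK_PX = 20, 8
VELOCITY = 14   # px/frame toward +x

color = [1.0 if ROCK_START <= x < ROCK_START + ROCK_PX else 0.0 for x in range(WIDTH)]
velocity = [VELOCITY if c > 0.0 else 0 for c in color]


def reaches(p, v, o):
    """True if the trail left by the pixel at p (moving v px/frame) covers o."""
    lo, hi = sorted((p - v, p))
    return lo <= o <= hi


def gather_blur(color, velocity, radius):
    """For each output pixel, blend in nearby moving pixels whose trail covers it."""
    out = []
    for o in range(WIDTH):
        samples = [color[o]]   # start with the output pixel's own color
        for p in range(max(0, o - radius), min(WIDTH, o + radius + 1)):
            if p == o or velocity[p] == 0:
                continue       # skip ourselves and anything standing still
            if reaches(p, velocity[p], o):
                samples.append(color[p])
        out.append(sum(samples) / len(samples))
    return out


wide = gather_blur(color, velocity, radius=10)
narrow = gather_blur(color, velocity, radius=2)
print("radius 10:", " ".join(f"{v:.2f}" for v in wide[8:30]))
print("radius  2:", " ".join(f"{v:.2f}" for v in narrow[8:30]))
```

With a radius of 10 the smear stretches well behind the rock. With a radius of 2 it only makes it a couple of pixels, because the search never gets far enough away to find the rock.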
Even implementing a medium-sized convolution filter, say 5x5 pixels, would require doing 25 texture lookups per pixel (x2 since we're doing color lookups and velocity lookups). That's asking the shader to do roughly 25 times more work accessing textures than usual, and it will only let us blur things up to 2 pixels away. If our rock is moving at roughly 14 pixels per frame, that's not nearly enough distance to make a reasonable-looking motion blur. We need to get at least a 10 pixel blur radius to smear the rock more realistically, but that means the shader would need to implement a 21x21 convolution filter (441 lookups per texture) and do roughly two orders of magnitude more work. That's way too slow for real-time rendering, even on today's super fast GPUs.
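The lookup counts are easy to check with a few lines of Python. The function name is made up; the only inputs are the blur radius and the two textures (color and velocity) mentioned above.

```python
def lookups_per_pixel(radius, textures=2):
    """Texture lookups needed per output pixel for a full square kernel."""
    side = 2 * radius + 1          # a radius of 2 means a 5x5 kernel
    return side * side * textures


for radius in (2, 10):
    side = 2 * radius + 1
    print(f"{side}x{side} kernel: {lookups_per_pixel(radius)} lookups per output pixel")
```

The cost grows with the square of the radius, which is why going from a 2 pixel reach to a 10 pixel reach hurts so much.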
That means it's approximation time!
A convolution filter based on a 21x21 kernel matrix will look at every pixel whose L-infinity (Chebyshev) distance to the output pixel is at most 10. But we probably don't need to look at each and every one of those pixels to compute a somewhat-realistic-looking motion blur. What if we just pick a few dozen or so of our favorite pixels from that grid, and see what that blur looks like?
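As an illustration of what "a few dozen favorite pixels" could look like, here's a Python sketch that picks offsets along 8 directions at a handful of distances. This particular star pattern is purely hypothetical; it is not necessarily the set of pixels NEP samples.

```python
RADIUS = 10
# Eight compass directions, as (dx, dy) steps.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
# Distances to sample along each direction (hypothetical choice).
DISTANCES = [1, 2, 4, 6, 8, 10]


def sparse_offsets():
    """Pixel offsets to check instead of the full 21x21 grid."""
    return [(dx * d, dy * d) for d in DISTANCES for dx, dy in DIRECTIONS]


offsets = sparse_offsets()
full_grid = (2 * RADIUS + 1) ** 2
print(f"{len(offsets)} of {full_grid} pixels sampled "
      f"({100.0 * len(offsets) / full_grid:.1f}%)")
```

The trade-off with any sparse pattern is that a moving pixel sitting between the sampled spots gets missed, so the blur can look a bit patchy for some directions of motion.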
It's pretty fast, but how does it look?
Ah, fast-moving things look appropriately blurry now and my eye is sufficiently fooled that the rocks are moving continuously instead of discretely. It's not perfect, but it's a pretty good approximation. My OCD can rest for a bit. =)
Now that I have a proper multi-stage rendering pipeline, I can do all kinds of fancy rendering tricks. Next, I want to do environmental reflections, i.e., making metal things look shiny. The game will eventually have lots of machines in it and it would look weird if the metal bits weren't shiny. After that, I might experiment with HDR and tone mapping. It would make caves vs outdoor rendering look better, but it's not strictly necessary, so I'm on the fence about that one. Maybe I should work more on gameplay and fixing things that look objectively bad instead of just trying to cram in more pretty.
I also announce blog posts on Twitter. Follow @cuchaz