Rendering 3d From Scratch Chapter 3 - Math!
The ultimate goal is to take a bunch of 3d shapes and draw them onto a 2d computer screen. Sounds easy, right? We’re just removing a dimension, after all. Well, it turns out it is pretty straightforward, but there are a few important pieces of math we have to internalize.
First, let’s start with how the eye works:
Now, I don’t mean ‘eye’ in the sense of a human eye. This isn’t going to be a biology lesson. I just mean eye in the sense of something that observes a 3d scene. Sometimes, the term camera will be used as well. When an eye observes a 3d scene, light rays bounce off the objects and fly straight into the eye. A real eye (or camera, for that matter) has a lens that focuses those rays, so an “upside-down” image is drawn onto the back of the eye (in a camera, onto a sensor). Your brain does the work of flipping this image, and it turns out to be pretty good at making that adjustment.
Real eyes have the disadvantage that they don’t really have any information about the world until light rays have hit them. Here, we have a leg up on the eye. We don’t need a lens, and we certainly don’t need to draw an image upside-down and then flip it. We can use the fact that light rays travel in a perfectly straight line to our advantage and we can paint our scene to an imaginary plane floating in front of our eye:
Once we’ve drawn everything onto a plane, we have a 2d image that can be drawn onto a screen!
Okay, so we’ve figured out the real problem we’re trying to solve. We’re trying to take points in 3d space and figure out where those points fall on a plane that is floating in front of the camera. We know that each one falls somewhere on a perfectly straight line from the point to the camera, but the question is how far along that line? How can we figure out where these rays intersect with our plane? First things first, let’s represent a plane.
A plane is an infinitely large, flat 2d surface somewhere in space. It’s often represented by a point and a normal. The point tells us where the plane is in 3d space, and the normal tells us what direction it’s facing. In our case, we can arbitrarily choose a point between our camera and our “point of interest” (the point that the camera is looking at). The normal of the plane is easy, too: it’s just a unit vector pointing at our camera, as we want our resulting 2d image to be facing our camera.
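As a rough sketch of that idea, the snippet below builds a plane from a camera position and a point of interest. The function name `make_view_plane` and the `fraction` parameter are made up for illustration, and plain tuples are used for the vectors to keep it short.

```python
import math

def make_view_plane(camera, target, fraction=0.5):
    """Return (point, normal) for a plane between the camera and the target.

    `fraction` is a hypothetical knob: 0.5 puts the plane halfway between
    the two. It must be greater than 0, or the plane would sit on the camera.
    """
    # A point on the plane: part of the way from the camera toward the target.
    point = tuple(c + fraction * (t - c) for c, t in zip(camera, target))
    # The normal: a unit vector pointing from the plane back at the camera.
    to_camera = tuple(c - p for c, p in zip(camera, point))
    length = math.sqrt(sum(d * d for d in to_camera))
    normal = tuple(d / length for d in to_camera)
    return point, normal

point, normal = make_view_plane(camera=(0.0, 0.0, 10.0), target=(0.0, 0.0, 0.0))
print(point)   # (0.0, 0.0, 5.0)
print(normal)  # (0.0, 0.0, 1.0)
```

With the camera on the z axis looking at the origin, the plane ends up halfway along that axis and faces straight back at the camera.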
We have 3d shapes and a 2d plane facing our camera. Let’s draw a line from a point on one of our shapes to our camera and see where it intersects our plane:
Now let’s label everything:
v = A vertex on our shape
c = The position of the camera
p = An arbitrary point on our plane
μ = The angle between VN (the perpendicular dropped from v onto the plane) and VC
θ = The angle between VC and VP
N = A unit vector normal to our plane
How do we figure out ‘x’, the intersection point of a light ray and our plane? It’s a difficult problem, but luckily there’s an amazing trigonometric tool for figuring this out: the dot product! The dot product relates the magnitudes of two vectors to the cosine of the angle between them.
$$a·b=||a||\space||b||\space{cos(\theta)}$$
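As a quick sanity check with made-up numbers, take a = (1, 0, 0) and b = (1, 1, 0). Using the component-wise form of the dot product (multiply matching components and add them up):

$$a·b=(1)(1)+(0)(1)+(0)(0)=1$$

$$||a||=1,\space\space||b||=\sqrt{2}$$

$$cos(\theta)={a·b\over||a||\space||b||}={1\over\sqrt{2}}\approx0.707$$

That gives θ = 45°, which is what you would expect for a vector along the x axis and one along the diagonal of the xy plane.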
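What we can do with the dot product is figure out how far along the line VC our plane sits, that is, the distance from ‘v’ to ‘x’. In the formulas below, ||VC|| stands for that distance (the part of the line from ‘v’ up to the plane), not the full distance to the camera. Then, all we need to do to get the point ‘x’ is take the point ‘v’ and add the normalized direction vector of VC multiplied by that distance. It looks something like this: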
$$x=v+\hat{VC}\space||VC||$$
We have to figure out ||VC||. Let’s write out some known formulas given the dot product and the above diagram:
$$VP·N=||VP||\space||N||\space{cos(\theta+\mu)}$$
N is a unit vector, so its magnitude is 1. We can get rid of it.
$$VP·N=||VP||\space\space{cos(\theta+\mu)}$$
Okay, keep that in mind. Now let’s do another one:
$$\hat{VC}·N=||\hat{VC}||\space||N||\space{cos(\mu)}$$
The hat means it’s normalized. Normalizing a vector is quite easy: you just divide it by its magnitude. A normalized vector has a magnitude of 1, so in this formula, we can get rid of everything but the cosine:
$$\hat{VC}·N={cos(\mu)}$$
If we divide the two values, we get:
$${VP·N\over\hat{VC}·N}={||VP||\space{cos(\theta+\mu)}\over{cos(\mu)}}$$
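Okay, but where does ||VC|| come in? Well, for that, we can just use basic geometry. VN, the perpendicular from ‘v’ to the plane, is the adjacent side of two right triangles: one with VP as the hypotenuse, and one with VC (up to the plane) as the hypotenuse. You recall SOHCAHTOA, right?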
$$cos(\theta+\mu)={||VN||\over||VP||}$$
$$cos(\mu)={||VN||\over||VC||}$$
Do a bit of algebra. Solve the first equation for ||VN||, substitute it into the second, and then solve for ||VC||:
$$cos(\mu)={{||VP||*cos(\theta + \mu)}\over||VC||}$$
$$||VC||={{||VP||*cos(\theta + \mu)}\over{cos(\mu)}}$$
Ah! That is exactly the right-hand side of the ratio of dot products from earlier, so that ratio is equal to ||VC||! All we have to do to determine the magnitude of VC is take the ratio of those two dot products! Amazing. Here is how you do a dot product in code:
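A minimal sketch, assuming a Vec3 class with x, y and z fields (the field and method names here are assumptions, and the class from the earlier chapters may look different):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float

    def dot(self, other: "Vec3") -> float:
        # Multiply matching components and add up the results.
        return self.x * other.x + self.y * other.y + self.z * other.z

a = Vec3(1.0, 2.0, 3.0)
b = Vec3(4.0, -5.0, 6.0)
print(a.dot(b))  # 1*4 + 2*(-5) + 3*6 = 12.0
```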
As you can see, this method lives on our Vec3 class from before. That’s a reasonable place for it. Now we can do the full ‘project face’ method. It’s a straightforward process of taking each vertex of a face and projecting it onto a plane. This is what it looks like:
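A minimal sketch of what such a method could look like, following the derivation above. The names `project_vertex` and `project_face`, the helper methods on Vec3, and the choice to represent a face as a list of vertices are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float

    def __add__(self, o):
        return Vec3(self.x + o.x, self.y + o.y, self.z + o.z)

    def __sub__(self, o):
        return Vec3(self.x - o.x, self.y - o.y, self.z - o.z)

    def scale(self, s):
        return Vec3(self.x * s, self.y * s, self.z * s)

    def dot(self, o):
        return self.x * o.x + self.y * o.y + self.z * o.z

    def length(self):
        return math.sqrt(self.dot(self))

    def normalized(self):
        # Divide by the magnitude; assumes the vector is not zero-length.
        return self.scale(1.0 / self.length())

def project_vertex(v, camera, plane_point, plane_normal):
    """Find where the line from v toward the camera crosses the plane."""
    direction = (camera - v).normalized()      # the normalized VC (v != camera)
    denom = direction.dot(plane_normal)        # cos(mu)
    if abs(denom) < 1e-9:
        return None                            # line runs parallel to the plane
    # Ratio of the two dot products: the distance from v to the plane.
    distance = (plane_point - v).dot(plane_normal) / denom
    return v + direction.scale(distance)       # x = v + direction * distance

def project_face(face, camera, plane_point, plane_normal):
    """Project every vertex of a face (a list of Vec3) onto the plane."""
    return [project_vertex(v, camera, plane_point, plane_normal) for v in face]

camera = Vec3(0.0, 0.0, 10.0)
plane_point = Vec3(0.0, 0.0, 5.0)
plane_normal = Vec3(0.0, 0.0, 1.0)  # unit vector pointing at the camera
print(project_vertex(Vec3(2.0, 0.0, 0.0), camera, plane_point, plane_normal))
# approximately Vec3(x=1.0, y=0.0, z=5.0)
```

In this example the plane sits halfway between the vertex and the camera along z, so the projected point lands halfway along the line, at roughly (1, 0, 5). The result is still a 3d point that lies on the plane.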
Now, that’s a lot to take in, so I’ll stop there. At this point, you’re armed with the fundamentals of 3d drawing. You know how to take a point in 3d space and find where it lands on a 2d plane in front of the camera. Next time, we’ll talk about converting these points on a plane into pixels on a computer screen!