Understanding 3D matrix transforms

Translation, Scaling, Rotation, and Skewing?!

In elementary school, we are taught translation, rotation, re-sizing/scaling, and reflection. The first three are used heavily in computer graphics — and they’re done using matrix multiplication.

If you’ve ever built the UI for a 2D or 3D game, you have probably encountered transformations. Tutorials elsewhere tend to describe them superficially, without diving into the mathematics on which those concepts are built. That works if you’re building fairly simple applications. At some point, though, it becomes necessary to understand the math itself.

Here, I’m going to describe how transformations apply to points (and then objects) in a coordinate space.

Redefining points & vectors to fit our needs

A simple set of rules can help in reinforcing the definitions of points and vectors:

  • In an n-dimensional space, a point can be represented as an ordered n-tuple: a pair in 2D, a triple in 3D.
  • A vector can be added to a point to get another point.

 

 

Vector <a, b> can be added to point (x, y) to get the point (x + a, y + b)

  • Similarly, the difference of two points can be taken to get a vector.
  • A vector can be “scaled”, i.e. multiplied by a scalar to increase or decrease its magnitude. If that scalar is negative, the vector is also flipped: it points in the opposite direction, which is the same as rotating it by 180 degrees.
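The rules above can be sketched in a few lines of Python. This is only an illustration: points and vectors are stored as plain tuples, and the helper names add, sub, and scale are made up for this example.

```python
# Points and vectors as plain tuples. The helper names are illustrative only.

def add(point, vector):
    """point + vector -> another point"""
    return tuple(p + v for p, v in zip(point, vector))

def sub(p2, p1):
    """point - point -> the vector leading from p1 to p2"""
    return tuple(a - b for a, b in zip(p2, p1))

def scale(vector, k):
    """scalar * vector; a negative k also reverses the direction"""
    return tuple(k * v for v in vector)

P = (1, 2)
V = (3, 4)
Q = add(P, V)              # the point (4, 6)
assert sub(Q, P) == V      # the difference of two points is a vector
print(scale(V, -2))        # (-6, -8): twice as long, pointing the other way
```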

Vector Space

Our n-dimensional vector space is described using the origin O(0, 0[, 0]). Any point can be derived as the sum of the origin O and a vector V.

 

 

The [, a] notation marks a component that exists only in one dimension higher, i.e., P(x, y[, z]) is P(x, y) in a 2D context and P(x, y, z) in a 3D context.

Transformations

Matrices

I’m going to demonstrate how matrices can be used to translate, scale, and rotate any object consisting of vertices/control-points. Each transformation is applied to each point, rather than the object as a whole.

But why should we use matrices for translation and scaling? After all, they are basic addition and multiplication operations on a 2D point. The reason is the associative property of matrix multiplication: (P x A) x B = P x (A x B). You can multiply the matrices of multiple transformations to form one resulting matrix that can be directly applied to a point.

 

 

This is one reason why GPUs are optimized for fast matrix multiplications. In computer graphics, we need to apply lots of transforms to our 3D model to display it to the end-user on a 2D monitor. Those transforms are compiled down into one matrix which is applied to all the points in the 3D world.
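A small sketch of that idea, assuming the row-vector form P x M used later in this article. The matmul helper and the two sample matrices A and B are hypothetical, chosen only to show that applying the transforms one by one gives the same result as applying their product once.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Two arbitrary 2x2 transforms, picked only for illustration.
A = [[2, 0], [0, 3]]     # stretch x by 2 and y by 3
B = [[0, 1], [-1, 0]]    # a quarter turn

P = [[1, 1]]             # a point written as a 1x2 row matrix

step_by_step = matmul(matmul(P, A), B)   # (P x A) x B
combined = matmul(A, B)                  # A x B, computed only once
all_at_once = matmul(P, combined)        # P x (A x B)

assert step_by_step == all_at_once
print(all_at_once)                       # [[-3, 2]]
```

With thousands of points, the product A x B is computed once and then reused for every point, which is where the saving comes from.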

Points as matrices

As we’re going to be using matrices, a point needs to be represented as a matrix rather than an ordered tuple.

Before going into 3D space, we’re going to handle the simpler 2D case first.

 

 

A point is essentially the multiplication of two matrices — one describing the point’s coordinates and the other describing unit vectors and origin of the vector space.

Since the second matrix is the same for every point in a given vector space, we will use just the coordinate matrix as shorthand for the point:

 

 

Shorthand
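Written out, and assuming the row-vector convention implied by P x T = P’ in the next section, the full form and its shorthand might look as follows. The symbols \hat{i} and \hat{j} are names assumed here for the unit vectors, and O is the origin.

```latex
P = \begin{bmatrix} x & y & 1 \end{bmatrix}
    \begin{bmatrix} \hat{i} \\ \hat{j} \\ O \end{bmatrix}
  = x\,\hat{i} + y\,\hat{j} + O
\qquad\Longrightarrow\qquad
P \equiv \begin{bmatrix} x & y & 1 \end{bmatrix}
```

The trailing 1 is the coefficient of the origin; it is also what allows a translation to be expressed as a matrix product.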

1. Translation

Suppose we want to translate a point P(x, y) by (δx, δy) to get to P’(x + δx, y + δy).

 

 

P to P’ in matrix form

To do this translation, we multiply P by the translation matrix:

 

 

P x T(δx, δy) = P’
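As a minimal sketch, assuming points are stored as row matrices [x, y, 1], the translation matrix carries δx and δy in its bottom row. The function names here are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translation(dx, dy):
    """2D translation matrix for row-vector points [x, y, 1] (assumed layout)."""
    return [[1, 0, 0],
            [0, 1, 0],
            [dx, dy, 1]]

P = [[2, 1, 1]]                          # the point (2, 1)
P_new = matmul(P, translation(3, -4))    # P x T(3, -4)
print(P_new)                             # [[5, -3, 1]]
```

If points were stored as columns instead, the matrix would be transposed and multiplied from the left.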

2. Scaling

It is counter-intuitive to think of “scaling” a point, rather than an object. So let’s take a rectangle centered at the origin. We want to zoom in 2x; by intuition, we will multiply the coordinates of each point by 2 here.

 

 

The inner rectangle is scaled by 2 to produce the outer rectangle

And it actually works. However, it doesn’t give the expected result for an object that isn’t centered at the origin: besides resizing the object, it also moves the whole object away from the origin.

 

 

The smaller rectangle is scaled directly 2x; the result is shifted to the top-right.

To solve this “automatic shifting” problem, we do scaling in three steps:

  • Translate the object so that its center lies on the origin. Call the translation vector V.
  • Scale the control-points one-by-one.
  • Reverse the first step, i.e. translate the object with vector -V.

Scaling transform: instead of multiplying the coordinates of each point by the scale factor, we can use the following matrix:

 

 

Scaling transform matrix

To complete all three steps, we will multiply three transformation matrices as follows:

 

 

Full scaling transformation, when the object’s barycenter lies at c(x,y)

The point c(x,y) here is the barycenter of the object. This is just the average of all the control-points.
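The three steps can be sketched as one combined matrix, again assuming row-vector points [x, y, 1]. The rectangle and the helper names are made up for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translation(dx, dy):
    return [[1, 0, 0], [0, 1, 0], [dx, dy, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def scale_about(cx, cy, sx, sy):
    """T(-c) x S x T(c): move the center to the origin, scale, move back."""
    return matmul(matmul(translation(-cx, -cy), scaling(sx, sy)),
                  translation(cx, cy))

# A rectangle that is NOT centered at the origin.
corners = [(1, 1), (3, 1), (3, 2), (1, 2)]

# Barycenter: the average of all control points, here (2.0, 1.5).
cx = sum(x for x, _ in corners) / len(corners)
cy = sum(y for _, y in corners) / len(corners)

M = scale_about(cx, cy, 2, 2)            # built once, reused for every corner
scaled = [tuple(matmul([[x, y, 1]], M)[0][:2]) for x, y in corners]
print(scaled)
```

The scaled rectangle is twice as wide and twice as tall, and its barycenter is still at (2.0, 1.5).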

3. Rotation

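2D rotation is fairly simple to visualize. It is done around the origin, where the counterclockwise direction is for positive angles (with the y-axis pointing up; on a screen whose y-axis points down, the same rotation appears clockwise).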

 

 

Rotate (2,1) by 90 degrees about the origin

High school math helps us here by telling us a point P(x, y) becomes P’(X, Y) after rotating through θ, where

 

 

Check https://matthew-brett.github.io/teaching/rotation_2d.html out for proof/explanation!
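For reference, the standard formulas, as derived on the linked page with positive angles measured counterclockwise, can be written as:

```latex
\begin{aligned}
X &= x\cos\theta - y\sin\theta \\
Y &= x\sin\theta + y\cos\theta
\end{aligned}
```

As a quick check, with θ = 90 degrees the point (2, 1) lands on (-1, 2).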

The rotation matrix is fairly simple to follow:

 

 

Rotation matrix

Again, when we are rotating an object w.r.t its center, we must first bring its center to the origin via translation.

 

 

Rotating an object around its barycenter c(x,y)
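A sketch of both cases, assuming row-vector points [x, y, 1] and counterclockwise-positive angles as on the linked page. The names rotation and rotate_about are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translation(dx, dy):
    return [[1, 0, 0], [0, 1, 0], [dx, dy, 1]]

def rotation(theta):
    """Rotation about the origin for row-vector points (angle in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0],
            [-s, c, 0],
            [0, 0, 1]]

def rotate_about(cx, cy, theta):
    """T(-c) x R(theta) x T(c): rotate around the point c instead of the origin."""
    return matmul(matmul(translation(-cx, -cy), rotation(theta)),
                  translation(cx, cy))

# Rotate (2, 1) by 90 degrees about the origin: approximately (-1, 2).
x, y, _ = matmul([[2, 1, 1]], rotation(math.pi / 2))[0]
print(round(x, 9), round(y, 9))

# Rotate (4, 1) by 90 degrees about the center (3, 1): approximately (3, 2).
u, v, _ = matmul([[4, 1, 1]], rotate_about(3, 1, math.pi / 2))[0]
print(round(u, 9), round(v, 9))
```

The results are rounded only for display, since cos(π/2) is not exactly zero in floating point.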

3D Transformations

If you work with OpenGL or WebGL, you’re going to work in a 3D vector space; hence, generalizing the previous three transforms into 3D space makes them a lot more useful.

In a 3D space, a point is represented by a 1x4 matrix: its three coordinates followed by the constant 1, just as the 2D point carried an extra 1.

 

 

1. Translation

 

 

3D Translation Matrix

2. Scaling

 

 

3D scaling matrix

Again, we must translate an object so that its center lies on the origin before scaling it.
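A minimal sketch of the 3D versions, assuming points are stored as row matrices [x, y, z, 1] so that each transform is a 4x4 matrix. The helper names and sample points are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translation(dx, dy, dz):
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [dx, dy, dz, 1]]

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]

def transform(M, p):
    """Apply the 4x4 matrix M to the 3D point p = (x, y, z)."""
    x, y, z = p
    return tuple(matmul([[x, y, z, 1]], M)[0][:3])

print(transform(translation(10, 0, -1), (1, 2, 3)))    # (11, 2, 2)

# Scale by 2 around the center c = (1, 1, 1): T(-c) x S x T(c).
M = matmul(matmul(translation(-1, -1, -1), scaling(2, 2, 2)),
           translation(1, 1, 1))
print(transform(M, (2, 3, 4)))                         # (3, 5, 7)
```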

3. Rotation

Rotation is the most complicated of the 3D transforms. Here, you need an axis around which to rotate the object.

Before generalizing the rotation for any axis, let’s do it around the x-, y-, and z-axes. After doing it with one axis, the other two will become fairly easy.

  • z-axis: Imagine a 3D coordinate system, where the x-y plane is your screen/monitor. A point on this plane is (h, k, 0); when you rotate it about the z-axis, which is pointing towards you, its z-component will still be zero, i.e. (h’, k’, 0). Hence, you can treat the rotation as happening in 2D with the x-y coordinates solely.

 

 

Rotation along z-axis

  • y-axis: Here, you are rotating in the z-x plane with y unchanged. It can be treated as 2D rotation with z-x coordinates solely.

 

 

Rotation along y-axis

  • x-axis: Again, it is in the y-z plane and x is unchanged.

 

 

Rotation along x-axis
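The three axis rotations can be sketched as 4x4 matrices. The assumptions here are row-vector points [x, y, z, 1], a right-handed coordinate system, and positive angles that look counterclockwise when the axis points towards you; other conventions flip the signs of the sine terms. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def rot_z(t):
    """Rotation in the x-y plane; z is unchanged."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rot_y(t):
    """Rotation in the z-x plane; y is unchanged."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]

def rot_x(t):
    """Rotation in the y-z plane; x is unchanged."""
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def transform(M, p):
    """Apply the 4x4 matrix M to the 3D point p = (x, y, z)."""
    x, y, z = p
    return tuple(matmul([[x, y, z, 1]], M)[0][:3])

quarter = math.pi / 2
# Under these conventions a quarter turn cycles the axes:
# about z, x goes to y; about x, y goes to z; about y, z goes to x.
for M, p in [(rot_z(quarter), (1, 0, 0)),
             (rot_x(quarter), (0, 1, 0)),
             (rot_y(quarter), (0, 0, 1))]:
    print([round(v, 9) for v in transform(M, p)])
```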

Generalization to any axis: An axis is essentially a 3D line. It can be characterized with a point A on that line and a vector L along the line.

To rotate P about an axis, we choose A to be the point where the axis meets its perpendicular through P, i.e. the orthogonal projection of P on the axis.

To do the transformation, we will first translate A to the origin and then rotate the vector L until it is aligned with one of the coordinate axes (we’ll use the z-axis here).

  1. Translate A to the origin
  2. Rotate vector L w.r.t y-axis so that it lies in the y-z plane.

 

 

The original vector (unchanged). We must rotate it as shown by the curved arrow. The vector is projected on the x-y plane for clarity.

  3. Now, we rotate the vector w.r.t the x-axis so that it is aligned with the z-axis.

 

 

Now, we have transformed our coordinates so that our axis is aligned with the z-axis. We can apply the R(z) transform directly, using the angle alpha by which we want to rotate.

After applying the R(z) rotation, we must undo the three preliminary transformations in reverse order.

Overall, the whole rotation can be written as the product of 7 matrices:

 

 

Remember, these seven transformations can be multiplied beforehand to form one matrix, which is then applied to each control point. This is the beauty of matrices in the world of graphics.
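The whole procedure can be sketched as the product T(-A) x R_y(β) x R_x(γ) x R_z(α) x R_x(-γ) x R_y(-β) x T(A). This is a sketch under stated assumptions, not a definitive implementation: points are row matrices [x, y, z, 1], the system is right-handed with counterclockwise-positive angles, and the angle names β and γ and all function names are made up here. Any point on the axis works for A, since the combined matrix does not depend on P.

```python
import math
from functools import reduce

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translation(dx, dy, dz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [dx, dy, dz, 1]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_about_axis(A, L, alpha):
    """Rotate by alpha about the line through point A with direction L."""
    ax, ay, az = A
    length = math.sqrt(sum(v * v for v in L))
    lx, ly, lz = (v / length for v in L)      # unit direction of the axis
    d = math.hypot(lx, lz)                    # length of L's shadow on the z-x plane
    beta = math.atan2(-lx, lz)                # about y: brings L into the y-z plane
    gamma = math.atan2(ly, d)                 # about x: aligns L with the z-axis
    steps = [
        translation(-ax, -ay, -az),           # 1. move A to the origin
        rot_y(beta),                          # 2. L into the y-z plane
        rot_x(gamma),                         # 3. L onto the z-axis
        rot_z(alpha),                         # 4. the rotation we actually want
        rot_x(-gamma),                        # 5. undo step 3
        rot_y(-beta),                         # 6. undo step 2
        translation(ax, ay, az),              # 7. undo step 1
    ]
    return reduce(matmul, steps)              # one matrix for all control points

def transform(M, p):
    x, y, z = p
    return tuple(matmul([[x, y, z, 1]], M)[0][:3])

# A 120-degree turn about the diagonal (1, 1, 1) sends the x-axis to the y-axis.
M = rotation_about_axis((0, 0, 0), (1, 1, 1), 2 * math.pi / 3)
print([round(v, 9) for v in transform(M, (1, 0, 0))])
```

One caveat of this sketch: when L is parallel to the y-axis, step 2 has nothing to do, and atan2 simply returns an angle that leaves L where it is.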

This method of rotation, built from rotations about the coordinate axes, can suffer from gimbal lock; hence, a more advanced method called “quaternion rotation” is often employed in real-world implementations. I’ll discuss that in a separate story!
