Reading: Nested Transforms

Previously, we looked at the instance transformation (translate, rotate and scale), which places an object in the scene. The following discussion is about more complex sequences of transformations, and about nested transformations.

Composite Objects

When building a complex object (like a bicycle or a teddy bear) or a complex scene, we rarely want to build it out of low-level vertices and faces. Instead, we'd like to build it out of high-level components, maybe cubes, spheres and cylinders, or even higher-level things, like wheels, trees, arms and legs, and so on. In this reading, we'll learn about building complex objects out of components.

Let's take a snowperson as our basic example. A snowperson is a composite object composed of three or four components: three white spheres and an optional orange cone for the nose. (No, we're not doing anything as complicated as Olaf.)

Our basic strategy will be to:

Create a container object that will be the representation of our snowperson. The spheres and such that make up the snowperson are put in this container.
The container will be an instance of THREE.Object3D, which is the parent class of THREE.Mesh. We can also use THREE.Group, which is nearly identical and is intended to "make working with groups of objects syntactically cleaner." The parent class has the attributes and methods we need for the instance transform (so we can place our snowperson in the scene), but it doesn't need vertices and faces of its own, the way a THREE.Geometry object has them.
The container has methods to add components, using the same .add() method we know from adding things to the scene. It also has properties that include the list of components (.children) and methods to find a particular component.
We can use the instance transform to place our components into the container. That is, we decide where the origin and axes will be for our container (snowperson) and arrange the components within that structure.

Technically, a coordinate system is the set of directions in which the x, y, and z axes point, and a frame is a coordinate system plus a location for the origin. Thus, a snowperson can have its own frame and we position the components in that frame.

You should usually choose frames for your composite objects that make it easy to position them within your scene (or a larger composite object that they are components of).

Demo: A Snowperson

Consider the following simple snowperson:

Here is the function that makes the snowperson:

The local function addSphere() creates a sphere geometry of a given radius, a mesh, positions the sphere at a given height above the origin, and adds it to the container, snowObject. We've done stuff like this before, adding things to a scene. snowObject is just a THREE.Object3D, created a few lines later.

Note that we need to set the position of the bottom sphere up by its radius, if we want the position (origin) of the snowperson to be at its base and not in the center of any of its spheres. It's convenient for positioning the snowperson to have the origin there, so we go to the extra effort of positioning its spheres relative to the desired origin for the snowperson. That is, the spheres are positioned in the snowperson's frame.
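As a rough illustration of positioning spheres relative to an origin at the base, here is a minimal sketch in plain JavaScript (no Three.js). The helper snowpersonCenters() is hypothetical and is not part of the reading's createSnowPerson(); it also assumes the spheres are stacked so that each one just touches the one below it, which the actual demo may not do.

```javascript
// Hypothetical helper (not the reading's createSnowPerson): computes the
// height of each sphere's center in the snowperson's frame, assuming the
// spheres are stacked so that each one just touches the one below it.
function snowpersonCenters(rbot, rmid, rtop) {
    const bottomY = rbot;                     // bottom sphere rests on y = 0
    const middleY = 2 * rbot + rmid;          // sits on top of the bottom sphere
    const topY = 2 * rbot + 2 * rmid + rtop;  // sits on top of the middle sphere
    return { bottomY, middleY, topY };
}

const centers = snowpersonCenters(3, 2, 1);
console.log(centers); // { bottomY: 3, middleY: 8, topY: 11 }
```

Because every height is measured from the base, moving the whole snowperson only requires changing the position of its container.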

Now read over the whole function, to make sure you understand all the parts.

Next, here is the code that creates and adds a snowperson named frosty to the scene. Adding a snowperson isn't really much different from adding a box.

var scene = new THREE.Scene();

// create the snowPerson and add it to the scene
var rbot = 3;
var rmid = 2;
var rtop = 1;
var frosty = createSnowPerson(rbot,rmid,rtop);
scene.add(frosty);

Here is a more elaborate version of the snowperson demo that includes GUI controls: snowperson with GUI. As you play with the GUI, notice that the instance transform (position in the global coordinate frame for the scene) applies to the whole snowperson, not just one part. That's very powerful.

Demo: Rotating a Cone

Positioning something is pretty straightforward (though we did see some trickiness with positioning the spheres in the snowperson's frame), but rotating is a bit more complicated. If we create a THREE.ConeGeometry, say with a height of 10, it is constructed in a frame with the Y axis running through the center of the cone, with the top at 5 (half the height) and the bottom at -5.

Rotations, for mathematical reasons that we'll see soon, are always performed around the origin. That's fine for certain kinds of rotations, but not so great for others. However, we can use composite objects to rotate around whatever location we want.

Consider the following demo:

Cone Rotation

Try to use the GUI to swing the cone like a bell from the top.

Here is the code that makes that scene:

function makeConeScene(scene,parameters) {
    var rb = parameters.radiusBottom;
    var height = parameters.height;
    // using globals for these objects, to allow updates from GUI.
    inner = TW.createMesh(new THREE.ConeGeometry(rb, height));
    inner.name = "inner";
    inner.position.y = parameters.positionOffset;
    inner.rotation.z = parameters.innerRotation;

    outer = new THREE.Object3D();
    outer.name = "outer";
    outer.add(inner);
    outer.rotation.z = parameters.outerRotation;
    scene.add(outer);
}

The code just creates a cone, which is the inner object, and the outer object is just a THREE.Object3D. The inner one is placed inside the outer at an offset. If we rotate the inner one, the cone rotates around its center, as we expected. If we rotate the outer one, the displaced cone rotates around the origin of the container.

If you offset the cone downward by half its height and set the inner rotation to zero, the point of the cone is at the origin of the outer frame, and we can rotate the cone around its apex. Very cool!
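To see why the apex stays put, it may help to trace a point through the two nested frames by hand. The following is a small sketch in plain JavaScript, working in the XY plane only; the function names rotateZ() and coneToScene() are made up for this illustration, and the parameter names simply mirror those in makeConeScene().

```javascript
// Rotate the 2D point [x, y] counterclockwise about the origin by `angle` radians.
function rotateZ([x, y], angle) {
    const c = Math.cos(angle), s = Math.sin(angle);
    return [c * x - s * y, s * x + c * y];
}

// Map a point given in the cone's own frame into the scene, mimicking the
// nesting in makeConeScene(): inner rotation, then the inner offset along Y,
// then the outer rotation. (Plain math only; no Three.js involved.)
function coneToScene(point, params) {
    const [x, y] = rotateZ(point, params.innerRotation);
    const inOuter = [x, y + params.positionOffset];
    return rotateZ(inOuter, params.outerRotation);
}

const height = 10;
const swing = {
    positionOffset: -height / 2,  // move the apex down to the outer origin
    innerRotation: 0,
    outerRotation: Math.PI / 2,   // swing the "bell" by 90 degrees
};
console.log(coneToScene([0, height / 2], swing));   // apex: stays at the origin
console.log(coneToScene([0, -height / 2], swing));  // base center: approximately [10, 0]
```

Whatever value outerRotation takes, the apex maps to the origin, because the offset has already moved it onto the point that the outer rotation leaves fixed.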

Demo: A Leg

Rotating around something other than the center of an object is very useful. Here's how we might model a jointed leg:

Here is the code that makes that leg:

It's longer and more complicated than the earlier code, but not conceptually harder. Still, it can help to visualize the various parts. Here's an annotated screen shot:

Leg with components shown in yellow

The yellow highlighting indicates the higher-level components:

foot: the object containing the shoe. Yes, this seems backwards, but I wanted the foot component to have the ankle joint, and, to avoid confusion, I wanted to give a different name for the component. The origin is the ankle.
lower leg: the object containing the calf and the foot. The origin is the knee.
leg: the object containing the thigh and the lower leg. The origin is the hip.

Kinematics and Inverse Kinematics

Programming with nested frames is not necessarily easy, but it is far easier than it would be without them. Imagine trying to position a soccer ball near the end of the foot: You need to know the (x,y,z) coordinates of the end of the foot, so that you can position the soccer ball there. So, all you have to do is calculate the end point of the foot, given the lengths of the limbs and the joint angles. An hour or so of trigonometry should do the trick.... The problem of figuring out where the endpoint is given a set of limb lengths and joint angles is called a kinematics problem.
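For a flavor of that trigonometry, here is a minimal sketch of the kinematics calculation for a chain of segments in a 2D plane, written in plain JavaScript. The function endOfChain(), the segment lengths, and the angle convention (angles measured from straight down, relative to the parent segment) are all assumptions made for this illustration, not the leg demo's actual code.

```javascript
// Forward kinematics for a planar chain that starts at the origin (the hip).
// Each segment has a length and a joint angle relative to its parent; an
// angle of 0 continues in the parent's direction, and the first segment's
// angle of 0 means "straight down" (the -Y direction).
function endOfChain(segments) {
    let x = 0, y = 0, heading = 0;
    for (const { length, angle } of segments) {
        heading += angle;  // a child frame inherits its parent's rotation
        x += length * Math.sin(heading);
        y -= length * Math.cos(heading);
    }
    return [x, y];
}

const leg = [
    { length: 4, angle: 0 },            // thigh, hanging straight down from the hip
    { length: 3, angle: 0 },            // calf, knee unbent
    { length: 1, angle: Math.PI / 2 },  // foot, at a right angle to the calf
];
console.log(endOfChain(leg));  // approximately [1, -7]
```

The running heading plays the role of the nested frames: each joint angle is added to everything above it, which is exactly the bookkeeping that the scene hierarchy does for us.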

Aside: Note that your body/mind is amazingly good at calculating the joint angles necessary to put your feet and hands where they need to be. Roboticists call that the inverse kinematics problem, and it's not easy! It also comes up in 3D Graphics and Animation, as we try to have graphical objects interact with each other.

Sequences of Transformations

How is the barn like a picket of a fence? They're certainly different in size, and of course a picket is tall and skinny, while a barn is relatively squat. But these are just scalings. In other words, given a scale transformation, the barn is the same as a picket. So, we can make a picket fence by drawing the barn many times!

Here's a demo that creates a fence:

Here is the code that makes that scene:

Notice the transformations in the code. By the time we get to the end of the fence, that picket has been translated numPickets times (set to 30 in the code that calls the makeFence() function).

Here's a slightly more complex example, a curved fence:

Here is the code that makes that scene:

All these rotations and translations accumulate in the object (the picket), yet the object has finite representational space. (Okay, it could keep a list of all the modifications that have happened to it, but that would be ugly and, as we'll see, unnecessary.)
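One way to convince yourself that a fixed amount of storage is enough: the sketch below lays out a curved fence on a 2D ground plane while remembering only three numbers, however many pickets it places. It is plain JavaScript rather than the demo's code; curvedFence() and its parameters are hypothetical, and the step-then-turn scheme is only an assumption about how such a fence might be built.

```javascript
// Hypothetical sketch: lays out picket positions on the ground plane (x, z)
// by repeatedly applying "step forward, then turn". The accumulated effect
// of all earlier steps and turns lives in `pose`, which never grows.
function curvedFence(numPickets, spacing, turn) {
    const pose = { x: 0, z: 0, heading: 0 };  // always just three numbers
    const pickets = [];
    for (let i = 0; i < numPickets; i++) {
        pickets.push({ x: pose.x, z: pose.z, heading: pose.heading });
        pose.x += spacing * Math.cos(pose.heading);
        pose.z += spacing * Math.sin(pose.heading);
        pose.heading += turn;
    }
    return pickets;
}

// Four pickets, one unit apart, turning 90 degrees each time: the corners
// of a unit square (up to floating-point rounding).
console.log(curvedFence(4, 1, Math.PI / 2));
```

With a turn of zero the same function produces a straight fence, so the straight and curved cases differ only in one parameter.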

Cloning

In the code above, you'll note a new method clone() that we used to create each new picket. The clone method is defined for all descendants of THREE.Object3D, and is a convenient way to make another instance of something you've built.

The clone() method takes two optional arguments, which are rarely supplied. The first is the object to put the copied stuff into; the default is to create a new THREE.Object3D. The second argument is boolean: true means a recursive copy, in which the constituent objects are cloned as well, and false copies just this object, leaving its constituent objects out of the copy. By constituent objects I mean the parts of a composite object, like the spheres of a snowperson. The default is recursive copying. (This describes the older Three.js signature; newer releases take only the recursive flag.)

Scene Graphs

The data structure we are building here is eventually added to the THREE.Scene object (itself an instance of THREE.Object3D), along with other graphical objects. The entire scene, then, is a graph of objects: a directed acyclic graph, or DAG. (In Three.js each object has at most one parent, so the graph is in fact a tree.) It is therefore called the scene graph. The scene graph for the leg example looks like this:
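In text form, the nesting described in the leg demo (leg, lower leg, foot) can be sketched with plain JavaScript objects and printed as an indented outline. The node names and the outline() helper are illustrative only; the demo's own object names may differ.

```javascript
// Plain-object stand-in for the leg's scene graph, following the component
// list in "Demo: A Leg". Node names are illustrative, not the demo's own.
const legGraph = {
    name: "leg (origin at hip)",
    children: [
        { name: "thigh", children: [] },
        {
            name: "lower leg (origin at knee)",
            children: [
                { name: "calf", children: [] },
                {
                    name: "foot (origin at ankle)",
                    children: [{ name: "shoe", children: [] }],
                },
            ],
        },
    ],
};

// Depth-first walk that returns one indented line per node.
function outline(node, depth = 0) {
    const lines = ["  ".repeat(depth) + node.name];
    for (const child of node.children) {
        lines.push(...outline(child, depth + 1));
    }
    return lines;
}

console.log(outline(legGraph).join("\n"));
```

Each level of indentation corresponds to one nested frame, so the shoe is affected by the transformations of the foot, the lower leg, and the leg.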

Representing an Affine Transformation

The preceding sections have shown you how to model complex objects as the composition of one or more components, and sets of affine transformations on the underlying geometries. Remember that an affine transformation transforms a line into a line and a triangle into a triangle, and it includes the translation (position), rotation, and scaling transformations we've used.

We've seen that many transformations might be applied to some geometry as it is being drawn. Think about the transformations for the vertices of the foot, which are affected by at least three rotations, translations, and so forth.

How does the graphics system represent an affine transformation? When it's rendering the last picket of a fence, is it remembering each transformation call, or is it somehow able to remember just the aggregate? It's the latter, but we need to do some work first to see why.

Because there's a fair amount of math, the next part of the reading is written in LaTeX, so please read the following PDF file (also posted directly on the schedule page):

affine math

Even though we won't typically be working directly with matrices in our programs, there are still good reasons to learn this math:

If you don't understand the concepts behind the math, you won't understand the terminology that OpenGL uses (matrices? CTM? homogeneous coordinates?)
Seeing how the major operations are defined as matrix multiplications is pretty cool.
Homogeneous coordinates are a neat trick.
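As a small taste of that math, the sketch below builds 4x4 translation and rotation matrices in plain JavaScript and checks that applying two transformations one after the other gives the same result as applying their single product matrix. The helper names (multiply, translation, rotationZ, apply) are made up for this illustration and are not Three.js functions.

```javascript
// 4x4 matrices are stored as arrays of four rows.
function multiply(a, b) {
    const out = [];
    for (let i = 0; i < 4; i++) {
        out.push([]);
        for (let j = 0; j < 4; j++) {
            let sum = 0;
            for (let k = 0; k < 4; k++) sum += a[i][k] * b[k][j];
            out[i].push(sum);
        }
    }
    return out;
}

function translation(tx, ty, tz) {
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]];
}

function rotationZ(angle) {
    const c = Math.cos(angle), s = Math.sin(angle);
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]];
}

// Apply matrix m to the 3D point [x, y, z], written in homogeneous form
// as [x, y, z, 1]; only the first three rows are needed for the result.
function apply(m, [x, y, z]) {
    const h = [x, y, z, 1];
    return m.slice(0, 3).map(row => row.reduce((sum, v, k) => sum + v * h[k], 0));
}

const T = translation(5, 0, 0);
const R = rotationZ(Math.PI / 2);
const RT = multiply(R, T);  // "translate, then rotate" as a single matrix
const p = [1, 0, 0];
console.log(apply(R, apply(T, p)));  // two steps: approximately [0, 6, 0]
console.log(apply(RT, p));           // one step: the same point
```

The extra fourth coordinate is what lets a translation be written as a matrix multiplication at all, and the product RT occupies the same sixteen numbers no matter how many transformations have been folded into it.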

The Teddy Bear

Here's a screenshot of the Teddy Bear:

screenshot of TeddyBear

TeddyBear with tons of modifiable parameters — play with this a bit, and look at the code.

// when true, objects are drawn in wireframe
var wireframe = false;

// Colors and materials for the teddy bear.

var bodyColor = 0xD08050;
var headColor = 0xB07040;   // like body but slightly darker
var bodyMaterial = new THREE.MeshBasicMaterial({color: bodyColor});
var headMaterial = new THREE.MeshBasicMaterial({color: headColor});
var blackMaterial = new THREE.MeshBasicMaterial({color: 0x000000});
bodyMaterial.wireframe = wireframe;
headMaterial.wireframe = wireframe;

function createNose(params) {
    var sd = params.sphereDetail || 10;
    var radius = params.noseRadius || 0.6;
    var noseGeometry = new THREE.SphereGeometry(radius,sd,sd);
    var noseMesh = new THREE.Mesh(noseGeometry, blackMaterial);
    return noseMesh;
}

function addNose(head,params) {
    /* adds a nose to the head. It's placed by creating a composite object
     * centered in the middle of the head, and positioning the nose at the
     * head radius on +Z, then rotating around X by a little. */
    var noseframe = new THREE.Object3D();
    var nose = createNose(params);
    var radius = params.headRadius || 2;
    nose.position.z = radius; // within the noseframe
    noseframe.add(nose);
    var angle = params.noseRotation || TW.degrees2radians(10);
    noseframe.rotation.x = angle;
    head.add(noseframe);
    return head;
}

function createEar(params) {
    // returns a flattened sphere; addEar() decides which side it goes on
    var sd = params.sphereDetail || 10;
    var radius = params.earRadius || 0.6;
    var earGeometry = new THREE.SphereGeometry(radius,sd,sd);
    var ear = new THREE.Mesh(earGeometry, bodyMaterial);
    // flatten the sphere along Z so that it looks more like a disk
    ear.scale.z = params.earScale || 0.5;
    return ear;
}

function addEar(head,params,side) {
    /* adds an ear to the head on the right (side=1) or left
     * (side=-1). The center of the ear is flush with the surface of the
     * head by moving it out by the radius, and rotating it around the z
     * axis to get it to the desired height. */
    var earframe = new THREE.Object3D();
    var ear = createEar(params);
    var radius = params.headRadius || 2;
    var angle = params.earAngle || Math.PI/4;
    ear.position.x = side * radius; // within the earframe
    earframe.rotation.z = side * angle;
    earframe.add(ear);
    head.add(earframe);
    return head;
}

function createEye(params) {
    var sd = params.sphereDetail || 10;
    var radius = params.eyeRadius || 0.3;
    var eyeGeometry = new THREE.SphereGeometry(radius,sd,sd);
    var eyeMesh = new THREE.Mesh(eyeGeometry, blackMaterial);
    return eyeMesh;
}

function addEye(head,params,side) {
    /* adds an eye to the head on the right (side=1) or left
     * (side=-1). The center of the eye is flush with the surface of the
     * head by moving it out along the z axis by the radius, and rotating
     * it around the x and then y axes to get it to the desired height. */
    var eyeframe = new THREE.Object3D();
    var eye = createEye(params);
    var radius = params.headRadius || 2;
    eye.position.z = radius; // within the eyeframe
    var angleX = params.eyeAngleX || -Math.PI/6;
    var angleY = params.eyeAngleY || Math.PI/6;
    eyeframe.rotation.x = angleX;
    eyeframe.rotation.y = side * angleY;
    eyeframe.add(eye);
    head.add(eyeframe);
    return head;
}

function createHead(params) {
    /* Returns a teddy bear head object, with origin in the center, and
     * eyes on the +Z side of the head, and ears on the left (-X) and
     * right (+X) sides. */
    var head = new THREE.Object3D();

    var sd = params.sphereDetail || 10;
    var radius = params.headRadius || 2;
    var headGeometry = new THREE.SphereGeometry(radius, sd, sd);
    var headMesh = new THREE.Mesh(headGeometry, headMaterial);
    head.add(headMesh);
    if(params.nose) {
        addNose(head,params);
    }
    if(params.ears) {
        addEar(head,params,1);
        addEar(head,params,-1);
    }
    if(params.eyes) {
        addEye(head,params,1);
        addEye(head,params,-1);
    }
    return head;
}

function createArm(params) {
    /* returns an Object with the center at the shoulder and the negative
     * Y axis running down the center. */
    var arm = new THREE.Object3D();
    var top = params.armRadiusTop || 0.7;
    var bot = params.armRadiusBottom || 0.6;
    var len = params.armLength || 5;
    var cd  = params.cylinderDetail || 10;
    // console.log("cd is "+cd);
    // Turns out there's an error in Three.js if cd is a non-integer
    var armGeom = new THREE.CylinderGeometry(top,bot,len,cd);
    var armMesh = new THREE.Mesh( armGeom, headMaterial );
    armMesh.position.y = -len/2;
    arm.add(armMesh);
    return arm;
}

function addArm(bear,params,side) {
    /* adds an arm to the bear on the right (side=1) or left (side=-1). */
    var arm = createArm(params);
    var radius = params.bodyRadius || 3;
    var scale = params.bodyScaleY || 2; 
    var sx = params.shoulderWidth  || radius * 0.5;
    var sy = params.shoulderHeight || scale * radius * 0.7;
    // console.log("adding arms at "+sx+","+sy);
    arm.position.set( side * sx, sy, 0 );
    arm.rotation.z = side * Math.PI/2;
    bear.add(arm);
}
    
function createLimb(radiusTop, radiusBottom, length, params) {
    /* returns an Object with the center at the top and the negative Y
     * axis running down the center. */
    var limb = new THREE.Object3D();
    var cd  = params.cylinderDetail || 10;
    // console.log("cd is "+cd);
    // Turns out there's an error in Three.js if cd is a non-integer
    var limbGeom = new THREE.CylinderGeometry(radiusTop,radiusBottom,length,cd);
    var limbMesh = new THREE.Mesh( limbGeom, headMaterial );
    limbMesh.position.y = -length/2;
    limb.add(limbMesh);
    return limb;
}

function addLeg(bear,params,side) {
    /* adds a leg to the bear on the right (side=1) or left (side=-1). */
    var top = params.legRadiusTop || 0.7;
    var bot = params.legRadiusBottom || 0.6;
    var len = params.legLength || 5;
    var leg = createLimb(top,bot,len,params);
    leg.name = (side == 1 ? "right leg" : "left leg");
    var radius = params.bodyRadius || 3;
    var scale = params.bodyScaleY || 2;
    var hx = side * params.hipWidth  || side * radius * 0.5;
    var hy = params.hipHeight || scale * radius * -0.7;
    console.log("adding "+leg.name+" at "+hx+","+hy);
    leg.position.set( hx, hy, 0 );
    // default to no rotation if the parameters are missing
    var rotX = params.legRotationX || 0;
    var rotZ = params.legRotationZ || 0;
    console.log("rotating to "+rotZ);
    leg.rotation.x = rotX;
    leg.rotation.z = side * rotZ;
    bear.add(leg);
}

function createBody(params) {
    var body = new THREE.Object3D();
    var radius = params.bodyRadius || 3;
    var sd = params.sphereDetail || 20;
    var bodyGeom = new THREE.SphereGeometry(radius,sd,sd);
    var bodyMesh = new THREE.Mesh(bodyGeom, bodyMaterial);
    var scale = params.bodyScaleY || 2;
    bodyMesh.scale.y = scale;
    body.add(bodyMesh);
    if(params.arms) {
        addArm(body,params,1);
        addArm(body,params,-1);
    }
    if(params.legs) {
        console.log("adding legs");
        addLeg(body,params,1);
        addLeg(body,params,-1);
    }
    return body;
}

function createTeddyBear(params) {
    var bear = new THREE.Object3D();
    var body = createBody(params);
    bear.add(body);
    if(params.head) {
        var head = createHead(params);
        var bs = params.bodyScaleY || 2;
        var br = params.bodyRadius || 3;
        var hr = params.headRadius || 2;   // same default as in createHead()
        // calculate position for the center of the head
        head.position.y = bs*br+hr;
        bear.add(head);
    }
    return bear;
}

There are lots of tricks in this code. Look for the following:

Positions are calculated as much as possible, so that, for example, if you make the body bigger (by changing either bodyRadius or bodyScaleY), the head still ends up in the right place.
Looking at addNose(), you can place the nose right on the surface of the head by positioning it using the head radius, and you can move it up and down on the head in an extra frame that you can rotate around the center of the head.
Similarly, in addEar(), the ears are placed on the surface of the head, symmetrically, by multiplying the x position by -1 for the left ear and +1 for the right ear. We can move them up and down the head by rotating this earframe around the Z axis.
The ear itself is a flattened sphere, done by scaling in the Z axis direction.
Eyes are just spheres, like the nose, and placed similarly, except there are two of them, so we use the -1/+1 trick for symmetry, and we have an angle around the Y axis (eye separation) and another angle around the X axis (high or low eye location).
Arms are cylinders, but the placement is a bit tougher, because if we put them right on the surface of the body, the joint will stick out. So, we do some approximating here.
We use non-uniform scaling on the body, to make the sphere into an ellipsoid.
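The helper-frame trick used for the nose and eyes can be checked with a little arithmetic. The sketch below, in plain JavaScript, computes where a feature placed at z = radius ends up in the head's frame once its helper frame is rotated. The function featureOnHead() is hypothetical, and it assumes the default Three.js Euler order ('XYZ'), under which the point is in effect rotated about Y first and then about X.

```javascript
// Where does a feature placed at (0, 0, radius) inside a helper frame end
// up in the head's frame, once the helper frame is rotated by angleX and
// angleY? Assumes the default 'XYZ' Euler order, so the point is rotated
// about the Y axis first and then about the X axis.
function featureOnHead(radius, angleX, angleY) {
    // Rotate (0, 0, radius) about the Y axis.
    const x1 = radius * Math.sin(angleY);
    const z1 = radius * Math.cos(angleY);
    // Then rotate about the X axis (the y coordinate is 0 going in).
    const y2 = -z1 * Math.sin(angleX);
    const z2 = z1 * Math.cos(angleX);
    return [x1, y2, z2];
}

const headRadius = 2;
// Nose: tilted 10 degrees about X, which lowers it slightly.
console.log(featureOnHead(headRadius, 10 * Math.PI / 180, 0));
// Right and left eyes: approximately [1, 0.866, 1.5] and [-1, 0.866, 1.5].
console.log(featureOnHead(headRadius, -Math.PI / 6, Math.PI / 6));
console.log(featureOnHead(headRadius, -Math.PI / 6, -Math.PI / 6));
```

In every case the result is exactly headRadius away from the center of the head, which is why the features stay on the surface no matter which angles are chosen.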

Summary

The key ideas from this reading are:

Build complex items out of components that are higher level than vertices and faces, such as boxes, spheres, cylinders, and of course, other complex components built out of them.
We can build up our scene out of a hierarchy of components. This is called the scene graph.
The embedded frames make it possible to use the instance transform on higher-level components: whole legs and teddy bears, instead of just a box.
The embedded frames also make it possible to put the reference point (origin of the object) and axes wherever you like, so that you can position, rotate and scale however you like.
The mathematics of a frame is captured by a transformation matrix, which uses homogeneous coordinates.
Homogeneous coordinates allow all of our affine transformations to be represented as a matrix multiplication in four-dimensional space.
Because every affine transformation is a matrix multiplication, we can represent a sequence of transformations by a single matrix, just by multiplying the transformation matrices.
In fact, each Three.js Object3D keeps track of the collection of transformations on it using a matrix. (Actually, Three.js stores the rotation as a quaternion, alongside separate position and scale vectors. Quaternions are a somewhat different representation of rotations, with similar properties and some advantages. We won't be discussing them. Three.js has a method, updateMatrix(), that combines the position, quaternion and scale into a 4x4 matrix.)
Graphics cards are really good at matrix multiplication.

Questions about Affine Transformation Matrices

Some thought questions about affine transformations.

Give an example of two rotations that are commutative. Give an example of two rotations that are not commutative.
Are translations commutative?
Do translations commute with scaling? With rotation?

Why Math?

In the past, some students have wondered why we're learning the matrix stuff and how that connects with the Graphics stuff. It's a legitimate question. The math doesn't seem to be necessary for using the Three.js functions. Here are some reasons:

Practical: While it may not be strictly necessary, it can be helpful to have an intuition of what goes on behind the scenes. It can help to explain why certain Three.js properties and methods exist, such as _modelViewMatrix and updateMatrix(), and help you to understand the terminology of the reference books, such as the OpenGL function, glTranslatef(), which "multiplies the current matrix by a translation matrix." It will make you better graphics programmers.
Conceptual: OpenGL has been the standard in graphics APIs for about 25 years. Will it still be the standard in another 25 years? Maybe. Probably not. However, whatever the new standard, it will probably still use homogeneous coordinates and affine transformations, defined as matrices or quaternions. We try to teach you concepts that will last.
Standard: As much as this course has been redesigned so that we can get to more fun things faster, there's still an expectation about what a computer graphics course is about, and the math is an important component. Some of you may go on in this field, go to grad school or work for Pixar, and we want you to have the solid ground that will take you there.
Rigor: Learning the math can be hard, but undertaking intellectually demanding tasks is part of what college is about. So, even if this is the last experience you have with computer graphics, it's still good for you to have grappled with difficult concepts. (This is the "eat your broccoli" reason.)