Reading: Introduction to OpenGL/WebGL, Three.js, and the Barn
Overview
To get started with graphics programming, we need to introduce a number of important, but somewhat disparate topics. By the end of this reading, you should be able to understand all of the example code for the barn demo. The topics we'll cover are:
- what an API is, and the three APIs we'll use in CS307: OpenGL/WebGL, Three.js, and TW
- how information is stored and processed in OpenGL/WebGL
- some geometrical objects that we'll want to be able to draw, and how they're defined in OpenGL/WebGL and Three.js
- how to create a simple scene using Three.js and TW
After all this, we'll finally be ready to read the code for the barn.
This reading also contains pointers to additional documentation for Three.js.
What is an API?
An Application Programmer Interface is a set of programming elements (types, variables, functions, and methods) that enable some new capability or allow the programmer to interact with a piece of hardware, such as a robot or a graphics card.
We have three APIs:
- WebGL. This is a standard graphics API, a subset of the full OpenGL
API that is supported by most graphics cards. (I'll use the terms
OpenGL
andWebGL
interchangeably.) OpenGL/WebGL is about modeling and rendering. The specification is documented in the online man pages, but these are difficult to read at best. Fortunately, we will use very little of this directly, because we are using Three.js. - Three.js. This is an API built on
top of WebGL, doing a lot of the modeling and rendering for you. WebGL is still there,
underneath, but we will rarely see it. Three.js is very powerful. It allows you to
ignore a lot of detailed technical concepts in Computer Graphics that plain WebGL would
force you to know, and the programming is far less work. I think that using Three.js
and learning the basic concepts is a good combination for this course. Your JavaScript
code might look like this:
var box_geom = new THREE.BoxGeometry(1,2,3); // width, height, depth
- TW. This is Scott Anderson's home-grown API that packages
up certain common
operations in our Three.js programs. It is layered on top of Three.js
— a thin layer at that! It does things like setting up a camera
for you and allowing you to toggle whether to show the coordinate axes. All of
these functions are available in the
TW
object, so your JavaScript code might look like this:var box_mesh = TW.createMesh( box_geom ); TW.cameraSetup( args );
Don't worry if you don't yet understand this code. The point is the syntax.
The TW API is written in JavaScript, Three.js and WebGL. You are always welcome and encouraged to read the code in threejs/libs/tw-sp21.js. It may change from time to time, since it's still under development, but that should be transparent to you.
The OpenGL Pipeline
The OpenGL API and the graphics card are implemented as a pipeline. (Those of you who have taken a computer architecture course will be familiar with this term.) What does that mean? It means that your calls to the OpenGL functions (or to code that eventually calls the OpenGL functions) will often put data into one end of a pipeline, where they undergo transformations of various sorts, finally emerging at the other end. The other end of the OpenGL pipeline is the monitor, or some graphics output device. You can read a lot more at Rendering Pipeline Overview.
Essentially, we put vertices of objects (and other information about materials, lights, and the camera) into one end of the pipeline and get image pixels out the other end.
Of course, that's not the only thing that happens. Some API functions modify the pipeline, such as specifying lighting, so that subsequent computations are modified in different ways.
For efficiency, the pipeline hangs onto some information and other
information just slips through, with only pixels to show for it. That is,
some data is not retained. In particular, vertices are
not retained. So, the barn
does not exist as far as the rendering
pipeline is concerned. If we want to look at the barn from a different
angle, we have to move the camera and render it again (by render,
I
mean send all the vertices down the pipeline again, converting them to
pixels).
In Three.js, we build data structures of vertices and faces, and when the scene is rendered from a particular viewpoint, the mathematics of the pipeline is re-executed with these values. The graphics card has memory, called buffers, that can be used to retain vertex values and attributes, avoiding the cost of re-sending the vertices down to the graphics card, which would otherwise be a performance bottleneck. Three.js does this for us.
Geometry Data Structures
The simplest things that OpenGL/WebGL can draw are objects that you're familiar with from the time you were first able to hold a crayon. However, some of them are a little different in OpenGL/WebGL and in Three.js. Also, how they're drawn is, perhaps, not what you'd think.
- Points/Vertices. These are just dots. Mathematically, points have location and nothing else (e.g. no area, so no color). We will use vertices to construct geometrical objects.
- Vectors. These are arrows. Mathematically, vectors have direction and magnitude and nothing else. You can't draw vectors in OpenGL, but they are used a lot. For example, vectors are used to specify the orientation of the camera, the direction of a light ray, and the orientation of a surface that the light is falling on. We won't see any vectors at first, but we'll be getting to them soon.
- Line Segments. In graphics, we are almost never interested in lines, which are infinite. Instead, we're almost always interested in line segments, defined by their endpoints. A line segment is just a pair of points/vertices.
- Triangles. For almost any real object, we don't want lines, we
want surfaces, and the representation of a surface is often
done by breaking it down into triangles, often an enormous number of
triangles. One great and important advantage of triangles is that they
are necessarily planar (flat)
and convex (no dents), and bad things can happen if you
try to draw a polygon that is either non-planar or concave. Those bad
things can't happen if you draw triangles. Three.js uses triangles as
its universal representation of geometry, even if the API call says that
you're building, say, a sphere or a cylinder. The actual representation
is a polygonal approximation of the smooth object. You can
usually specify the amount of smoothness in the approximation: a sphere
built out of 100 tiny triangles will be smoother than one built out of
10 largish triangles, as illustrated in the picture below.
Of course, the one with more triangles will take
longer to render. Usually, other factors will dominate, though, so be
thoughtful, but don't worry too much about performance. In fact, CG
experts describe the complexity of a scene in terms of the number of
triangles, and graphics cards will sometimes boast about how many
triangles they can render per second.
- Polygons. We said earlier that in graphics, we only draw line segments, not lines. Similarly, we don't draw planes, which are infinite, we draw polygons. Most of the time, though, we'll draw triangles, so we'll break up a polygon into some number of triangles.
- Polyhedra. A polyhedron is a 3D figure made up of vertices, edges and faces.
Front and Back of Triangles
Each face, of course, has two sides, just like a coin. One of these is the front and the other is the back. The (default) technical definition of the front is the side where the specified vertices of the face are given in counter-clockwise order. Here, face is defined from the front, and we will use the convention that the front corresponds to the outside of the barn.
Here's an example from the barn. The front of the barn is a polygon with five vertices (and, of course, five sides). Here it is, with the vertices numbered from zero to four (the numbering is arbitrary; we could have numbered them any way we want):
Because Three.js represents everything as triangles, we break up this polygon into three triangles. (Again, this is arbitrary — how many ways can you find to break up the polygon into triangles?)
We then define a face (a triangle) by listing its three vertices, in counter-clockwise order from the outside. So, 4,3,2 is one way to describe the top of the barn. The triangles 3,2,4 and 2,4,3 are equivalent, but 2,3,4 is not, because that would say we are currently looking at the back of the triangle.
This is important because, by default, Three.js only draws the front of triangles. That is, it skips rendering any back-facing triangles. You can see how that cuts in half the amount of rendering that has to be done, and at very little cost. For example, if we're only going to look at the outside of the barn, there's no reason to ever draw the back side of any of the faces.
Now, as it happens, in this demo I have specified to Three.js that the
barn faces are two-sided, so that Three.js will render both sides. That
way, if you dolly in so that the camera goes inside the barn, you'll be
able to see the other
side.
If you use the default, efficient setting of one-sided faces and you get your vertices in the wrong order, those faces will not be drawn — a strange and painful problem to debug.
Normal Vectors
Each face also has an associated vector that is perpendicular to the face, which mathematicians and computer graphics people call the normal vector. We'll learn that these are crucial in lighting computations.
The front of the barn lies in the z=0 plane, because all the z coordinates of its vertices are zero. Therefore, the front of barn has a normal vector that is parallel to the z axis. Take a moment to try to visualize this.
You can also try playing with this visualization of the coordinate axes to exercise your intuition. Check that the following claims make sense to you:
- The red arrow is a vector marking the X axis.
- The green arrow is a vector marking the Y axis.
- The blue arrow is a vector marking the Z axis. The Z axis initially points towards you, the viewer. You'll need to move the camera a little to see it — click and drag to move the camera.
- The axes are all perpendicular to each other. Synonymously, they are all normal to each other.
- The blue vector is the normal vector for the XY plane, which is also
the z=0 plane. Similarly, the green vector (
up
) is the normal vector for the XZ plane (theground
). Finally, the red vector is the normal vector for the YZ plane.
Fortunately, Three.js can compute normal vectors for us. Eventually, we'll learn how to do this as well.
Points and Vectors in Three.js
If we choose a suitable origin and coordinate system (more about origins and coordinate systems later), we can define a point in 3D space with just three numbers. We can also define vectors with just three numbers. Consider the following four objects:
P = (1, 5, 3)
Q = (4, 2, 8)
v = P-Q = (-3, 3, -5)
w = Q-P = (3, -3, 5)
P and Q are points in space. A vector can be thought of as an arrow between two points or even as a movement from one to the other. Therefore, the vector v from Q to P is just P-Q (subtract the three components, respectively). The vector w, from P to Q, is just the negative of vector v. In 3D, both points and vectors are represented as a triple of numbers.
How do we specify points and vectors to Three.js? Here's an example:
var P = new THREE.Vector3(1,2,3);
The authors of Three.js only defined a Vector3
class, which we'll
use for both points and vectors. However, they are different concepts, so we
will continue to use the term point
for the locations of vertices and the
term vector
for directions of rays.
Drawing in Three.js
How do we draw stuff in Three.js? It's a two-step process: we represent something and then we render it. So how do we represent stuff in Three.js? For example, how would we represent the barn that we saw on the first day of class? To answer that, we need to learn some non-obvious Three.js terminology:
- Geometry
- is a structure of vertices and faces and associated geometrical information, such as the vectors that specify the orientations of faces.
- Material
- is a set of properties that directly or indirectly
specify the color of the object. The colors of the faces can be set
directly by the material (such as,
this face is red
), or indirectly (this face interacts with light in the following ways ...
). We'll learn more about materials that interact with light a bit later. - Meshes
- A geometry is combined with a material to yield a mesh, which then has enough information to be rendered.
So, in Three.js, we'll build a geometry object, combine it with some material, and add it to the scene. Later, we'll learn about how scenes are rendered. Building a geometry looks like this:
var barnGeometry = new THREE.Geometry(); // add the front barnGeometry.vertices.push(new THREE.Vector3(0, 0, 0)); barnGeometry.vertices.push(new THREE.Vector3(30, 0, 0)); barnGeometry.vertices.push(new THREE.Vector3(30, 40, 0)); ... // front faces barnGeometry.faces.push(new THREE.Face3(0, 1, 2)); barnGeometry.faces.push(new THREE.Face3(0, 2, 3)); barnGeometry.faces.push(new THREE.Face3(3, 2, 4)); ...
The above code just shows a snippet of the construction of the barn geometry, for three of the ten vertices that define the 3D structure of the barn, and three triangular faces on the front of the barn.
Coordinates
Because we're using 3D models, all of our vertices will have 3 components: X, Y, and Z. You can set up your own camera anywhere you want, but initially the camera is such that X increases to the right and Y increases going up. The Z coordinate increases towards you, so things farther from you in the scene have smaller Z components or even negative ones. When we read the code for the barn, you'll see that the front of the barn is at Z=0 and the back is at Z=-len, when len is the length of the barn.
In OpenGL/WebGL, our coordinates can have (pretty much) any scale we want. For example, you could have all your X, Y, and Z coordinates be between 0 and 1. Or you could have them all be between 1 and 100. Or between -1000 and +1000. Furthermore, these numbers can mean anything you want, so your coordinates can be in millimeters, kilometers or light years. So if you're imagining a real barn, perhaps the numbers are in feet or meters.
The TW module can set up the camera for you if you tell it the range of your coordinates, called the scene bounding box. (This camera is often a bit too far for good realism, but is helpful for debugging.)
Creating a Simple Scene with Three.js and TW
Our introduction to the barn geometry began with specifying the
coordinates of vertices that define its 3D structure. Fortunately, Three.js
has a number of built-in geometries for common objects such as planes,
boxes, spheres, and so on. To see how a simple webpage can be constructed
that contains graphics created with Three.js and TW, view the source
code for this Scene with a Box. The
<head>
section, also shown below, specifies a canvas style
and loads three
JavaScript code files. The first JS code file is version 95 of the Three.js
library (there's also an extra string that references the previous version 80
used in this
course). The second JS code file provides additional functions to control the
camera view with your mouse (an extra string referring to older code
that used version 80 of Three.js is also included here). The third
JS code file loaded by this page is the Spring 2021 version of Scott
Anderson's TW module, which we will use this spring.
<!doctype html> <html> <head> <meta charset="utf-8"> <title>Basic Box</title> <style> /* feel free to change the canvas style. If you want to use the entire window, set width: 100% and height: 100% */ canvas { display: block; margin: 10px auto; width: 80%; height: 500px; } </style> <script src="https://cs.wellesley.edu/~cs307/threejs/libs/three-r95.all.js"> "https://cs.wellesley.edu/~cs307/threejs/libs/three-r80.min.js" </script> <script src="https://cs.wellesley.edu/~cs307/threejs/libs/OrbitControls-r95.js"> "https://cs.wellesley.edu/~cs307/threejs/libs/OrbitControls-r80.js" </script> <script src="https://cs.wellesley.edu/~cs307/threejs/libs/tw-sp21.js"> "https://cs.wellesley.edu/~cs307/threejs/libs/tw-fa18.js" </script> </head> <body> ... </body> </html>
Inside the <body>
section of the page, you'll see a header and JavaScript code to
create the scene:
<h1>Scene with a Box</h1> <script> // Create an initial empty Scene var scene = new THREE.Scene(); // Create a 3D rectangular box of a given width, height, depth var boxGeom = new THREE.BoxGeometry(4,6,8); // Create a mesh for the box // TW.createMesh() assigns a set of demo materials to object faces var boxMesh = TW.createMesh(boxGeom); // Add the box mesh to the scene scene.add(boxMesh); // Create a renderer to render the scene var renderer = new THREE.WebGLRenderer(); // TW.mainInit() initializes TW, adds the canvas to the document, // enables display of 3D coordinate axes, sets up keyboard controls TW.mainInit(renderer,scene); // Set up a camera for the scene var state = TW.cameraSetup(renderer, scene, {minx: -2.5, maxx: 2.5, miny: -3.5, maxy: 3.5, minz: -4.5, maxz: 4.5}); </script>
The key elements include:
- a
THREE.Scene
object that is a container used to store all the objects and lights for the scene we want to render - a 3D object in the scene with a geometry that is a built-in
THREE.BoxGeometry
and surface material created by theTW.createMesh()
function. This object is added to the scene using theadd()
method - a
THREE.WebGLRenderer
object that will use your graphics card to render the scene - a call to
TW.mainInit()
that performs the operations described in the comments - a call to
TW.cameraSetup()
that creates a camera to view the scene. The third argument specifies the scene bounding box, the range of x,y,z coordinates for a virtual box that encompasses the scene to be viewed.
Note that the box, by default, is centered on the origin of the 3D coordinate frame, which you can see if you view the axes by typing the 'a' key. The scene bounding box is slightly larger than the box itself, which you can see by typing the 'b' key. Drag your mouse over the figure to move the camera view.
You can imagine creating a much more complex scene, by adding more stuff to
the scene
object.
The Barn Code
We're now ready to understand the code for the barn demo. Click on the link to view a page that includes the rendered barn and the JavaScript code that created it. Carefully read through the code and comments (both the comments in the code and the additional text on the webpage). Note that there isn't a lot of complex JavaScript coding here. Most of the functions are just straightline code, calling other functions. We don't even (yet) have any loops or conditionals!
Learning More Three.js
We'll learn much more Three.js over the semester, and most of what you need to know will be covered in the online readings and lecture notes for the course. Other sources include:
- Book by Jos Dirksen, Learn Three.js — Third Edition. This subdirectory of the cs307 course directory contains Dirksen's code for all the examples in his book, which is based on version 95 of the Three.js library. Here are a few examples to give you a taste (do not worry about understanding the code at this point):
- Online Three.js documentation — this is more of a reference source than a tutorial, but includes many code examples and links to the Three.js source code on GitHub. The online CS307 course notes contain many links to specific documentation pages.