(Reading: We strongly suggest you read chapters 5 and 9 of Head First HTML with these notes.)
Today, we will build on our idea of representation to discuss how images are represented in digital form. We'll work up to it, first starting with how color is represented (which is based on the physiology of the human eye), then looking at images as rectangular arrangements of spots of pure color. Finally, we'll calculate the filesize of an image and discuss one way of compressing the file so that it is smaller and therefore faster to download. This compression is, in fact, a different representation of the information. We'll also briefly mention two other representations, but we won't spend time in class discussing those formats. You can read that on your own.
Before we see how all possible colors can be represented, let's first start with a simple list of a few built-in colors that we can specify by name.
What color names can we use? All standards-compliant browsers promise to handle (at least) the following 17 color names. (Ignore the middle column for now.) View this page on the web to see the color samples.
| Color name | #RRGGBB | Example |
|---|---|---|
| black | #000000 | |
| gray | #808080 | |
| silver | #C0C0C0 | |
| white | #FFFFFF | |
| maroon | #800000 | |
| red | #FF0000 | |
| orange | #FFA500 | |
| olive | #808000 | |
| yellow | #FFFF00 | |
| green | #008000 | |
| lime | #00FF00 | |
| teal | #008080 | |
| aqua | #00FFFF | |
| navy | #000080 | |
| blue | #0000FF | |
| purple | #800080 | |
| fuchsia | #FF00FF | |
You can use such colors in a variety of ways. For example, to make colored text, you can use the CSS properties of "color" and "background-color" in the SPAN tag. There are many other CSS properties whose values are colors.
This sentence uses too many colors!
This is accomplished by:
<span style="color: black; background-color: yellow">This</span> <span style="color: red; background-color: gray">sentence</span> <span style="color: white; background-color: olive">uses</span> <span style="color: lime; background-color: maroon">too</span> <span style="color: aqua; background-color: purple">many</span> <span style="color: fuchsia; background-color: gray">colors!</span>
For other colors, it is safest to express them numerically
(though many browsers will recognize many additional
color names). Furthermore, non-standard names will not pass the validator, as many of you discovered in earlier assignments.
Like numbers,
text characters and everything else, colors are represented by numbers in
the computer. How? For that, we need to understand additive colors and color vision.
Our retinas happen to have rod-shaped cells that are sensitive to
all light, and cone-shaped cells that come in three kinds:
red-sensitive, green-sensitive, and blue-sensitive. Therefore, there
are three (additive) primary colors: Red, Green and Blue or RGB. All
visible colors are seen by exciting these three types of cells in
various degrees. (For more information, consult these Wikipedia articles
on additive color
and color
vision.)
Color monitors and TV sets use RGB to display all their colors,
including yellow, chartreuse, you name it. So, every color is some amount
of Red, some amount of Green, and some amount of Blue.
On computers, RGB color components are standardly defined on a scale from 0 to 255,
which is 8 bits or 1 byte.
Play with the Color Pad applet to get a feel for this. Examples:
The knowledge of RGB colors comes in handy with CSS.
In CSS, you can specify a color in several ways. In the following
numerical examples, all three specify turquoise, a light blue-green color like
this.
Many browsers support more than the 17 standard names, but it is unwise
to count on all browsers supporting an odd name like "turquoise." It's
safest to use one of the numerical methods.
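As a rough illustration of how the numerical spellings relate to one another, here is a short Python sketch (Python is not part of these notes, and the function name `css_forms` is made up for this example). It takes the three RGB amounts for turquoise and produces the number, percentage, and hexadecimal forms. The hexadecimal form is explained in the next section.

```python
def css_forms(r, g, b):
    """Return three CSS spellings of one color given as 0-255 amounts."""
    as_numbers = f"rgb({r},{g},{b})"
    # A percentage is each amount's share of the maximum value, 255.
    pr, pg, pb = (round(100 * c / 255) for c in (r, g, b))
    as_percent = f"rgb({pr}%,{pg}%,{pb}%)"
    # :02X means two uppercase hexadecimal digits, with a leading zero if needed.
    as_hex = f"#{r:02X}{g:02X}{b:02X}"
    return as_numbers, as_percent, as_hex


for form in css_forms(64, 224, 208):
    print(form)
```

Note that the percentages are rounded, so the percentage form is only approximately the same color as the other two.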
People use decimal (base 10), computers use binary (base 2), but
programmers often use hexadecimal (base
16) for convenience.
Binary numerals get long very fast. It is not easy to remember 24
binary digits, but you can more easily remember 6 hexadecimal
digits. Each hexadecimal digit represents exactly four binary digits
(bits). (This is because 2⁴ = 16.)
One way to understand hexadecimal is by analogy with decimal, but
we're all so familiar with decimal numerals that our reflexes get in
the way. (In fact, humans throughout history have used many different
numeral
systems; decimal is not sacrosanct.) So, we first need to break
down decimal notation so that you can see the analogy with
hexadecimal. For now, we'll stick with two-digit numerals, but the
same ideas extend to any larger numbers.
Decimal notation works by organizing things into groups of ten,
then counting the groups and the leftovers: Suppose you had a bunch of
sticks on the ground and you bundled them all into groups of 10 with
some left over (fewer than 10). Now, use a symbol to denote the number
of bundles and another symbol to denote the number of sticks left
over. You've just invented two-digit numbers in base 10.
Hexadecimal: Do the same thing with bundles of 16, and you've invented
two-digit numbers in base 16. For example, if you had thirty-five sticks
(I'm trying to avoid decimal notation), they could be bundled into two
groups of sixteen and three left over, so the hexadecimal notation is 23.
Careful! That numeral isn't the decimal number twenty-three! It's still
thirty-five sticks, but we write it down in hexadecimal as 23.
To distinguish a decimal numeral from a hexadecimal numeral, we use
subscripts. So, to say that thirty-five sticks is written 23 in
hexadecimal, we can write:
35₁₀ = 23₁₆
Both decimal and hexadecimal notations are based on place
value. We say that 23₁₆ means 35₁₀ because
it's a "2" in the sixteens place and "3" in the ones
place, just like 35₁₀ has a "3" in the tens place
and a "5" in the ones place.
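The bundling idea can be tried out in a few lines of Python. This is only a sketch: the function name `bundle` is invented here, and it relies on Python's built-in `divmod`, which returns a quotient and a remainder together.

```python
def bundle(sticks, base):
    """Split a pile of sticks into (full bundles, leftover sticks)."""
    return divmod(sticks, base)


thirty_five = 35  # Python reads this literal as decimal

print(bundle(thirty_five, 10))  # bundles of ten
print(bundle(thirty_five, 16))  # bundles of sixteen
```

The first result is the pair of decimal digits and the second is the pair of hexadecimal digits for the same pile of sticks.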
Let's take another example. Suppose we have 26₁₀
sticks. That's one group of 16 and 10 left over. How do we write
that number in hexadecimal? Is it 110₁₆? That is, a "1"
in the sixteens place followed by a "10" in the ones
place? No; that would be confusing, since it would look like a
three-digit numeral. We need a symbol that means ten. We can't use
"10," since that's not a single symbol. Instead, we use "A"; that is,
A₁₆ = 10₁₀. Similarly, "B" means 11, "C" means
12, "D" means 13, "E" means 14, and "F" means 15. We don't need any
more symbols, because we can't have 16 things left over, since that
would make another group of 16. The following table summarizes these
correspondences and what we've done so far.
Additive (RGB) Colors and Color Vision
RGB Colors and CSS
color: rgb(64,224,208); /* three RGB numbers in the range 0-255 */
color: rgb(25%,88%,82%); /* three RGB percentages */
color: #40E0D0; /* three RGB numbers expressed as a hexadecimal triple */
color: turquoise; /* a color name supported in many, but not all, browsers */
Hexadecimal
| Decimal | 0 | 1 | ... | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | ... | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hexadecimal | 0 | 1 | ... | 9 | A | B | C | D | E | F | 10 | 11 | 12 | ... | 1C | 1D | 1E | 1F | 20 | 21 | 22 | 23 | 24 |
To convert a big decimal number to hexadecimal, just divide. For example, 230₁₀ divided by 16 is 14₁₀ with a remainder of 6₁₀. Since 14₁₀ is written E in hexadecimal, the hexadecimal numeral is E6₁₆. To convert a hexadecimal number to decimal, just multiply: E6₁₆ = E × 16 + 6 = 14 × 16 + 6 = 230₁₀.
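The divide-and-multiply procedure can be written out as a small Python sketch. The names `DIGITS`, `to_hex`, and `from_hex` are invented for this example, and the loop simply repeats the division so that it also works for numbers needing more than two hexadecimal digits.

```python
DIGITS = "0123456789ABCDEF"  # the symbol for each value from 0 to 15


def to_hex(n):
    """Convert a non-negative integer to a hexadecimal numeral by repeated division."""
    if n == 0:
        return "0"
    numeral = ""
    while n > 0:
        n, leftover = divmod(n, 16)          # bundle into groups of 16
        numeral = DIGITS[leftover] + numeral  # leftovers fill in from the right
    return numeral


def from_hex(numeral):
    """Convert a hexadecimal numeral to an integer by repeated multiplication."""
    total = 0
    for symbol in numeral.upper():
        total = total * 16 + DIGITS.index(symbol)
    return total


print(to_hex(230))
print(from_hex("E6"))
```

Python also has these conversions built in (`format(230, "X")` and `int("E6", 16)`), which is a handy way to check your answers to the exercise below.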
Try the following conversions as an in-class exercise. You can use a calculator, you can ask your neighbors, anything you like.
| Dec | Hex | Dec | Hex |
|---|---|---|---|
| 7 | | 22 | |
| 26 | | 100 | |
| 127 | | 149 | |
| 240 | | 255 | |
You can check your work with the following form:
We already know that every color in a computer is a combination of some amount of each of the three primary colors: red, green and blue. The amounts are always given in the same order: red, green, blue. The amounts are numbers from 0 to 255₁₀, or, in hexadecimal, 00₁₆ to FF₁₆. Each primary is expressed as a two-digit numeral in hexadecimal, using a leading zero if necessary so that the numeral is always two digits. Three pairs of hexadecimal digits completely specify a color. Finally, the notation for a color always starts with a pound sign (#). For example, a color like (35, 230, 10) would be written #23E60A.
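Here is a minimal Python sketch of that notation, going in both directions. The function names `rgb_to_hex` and `hex_to_rgb` are made up for this example.

```python
def rgb_to_hex(r, g, b):
    """Write three 0-255 amounts as a #RRGGBB color."""
    for amount in (r, g, b):
        if not 0 <= amount <= 255:
            raise ValueError("each amount must be between 0 and 255")
    # 02X pads each amount to two uppercase hexadecimal digits.
    return "#{:02X}{:02X}{:02X}".format(r, g, b)


def hex_to_rgb(code):
    """Read a #RRGGBB color back into three 0-255 amounts."""
    digits = code.lstrip("#")
    # Take the digits two at a time: red, then green, then blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(35, 230, 10))
print(hex_to_rgb("#800000"))
```

The second call shows why maroon is a dark red: its red amount is 128, about half of 255, and the other two primaries are zero.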
Now you can construct the colors above (gold, cornflower and DodgerBlue1). Furthermore, you can understand the middle column of the table at the top. Let's go back to that table and observe the following:
Since FF is greater than 80, the darker colors in the table substitute
80 for FF,
which is roughly half the brightness of that primary. For example,
red (#FF0000) is preceded by maroon
(#800000).
Here is a more complete color name list.
Using a web page you created previously (or this example web page), experiment with defining a color numerically. Use the SPAN tag to color some text. If you can't think of a color to try, try Chocolate. The syntax is:
<span style="color: #RRGGBB"> text </span>
That's it! It takes some practice to get the hang of computing the hexadecimal numerals, but nothing you haven't done before.
Now that we know how to represent a color, we can represent images. You can think of an image as a rectangular 2D grid of spots of pure color, each represented as RRGGBB. A spot of pure color is called a pixel, short for picture element, the atom of a picture. Pixels are better seen if you blow up an image several times; here are some examples. Check out the following description of pixels on page 44.
Every image on the computer monitor is represented with pixels, including the windows themselves! Such images are saved in files that, in addition to the image data, contain information on the size of the image, the set of colors used, the origin of the image, etc. Depending on how exactly this information is saved, we refer to them as image formats. GIF, JPG, PNG, QT, and BMP are some of the well-known image formats. We will talk more about image formats, below. For now, we will focus on the number of pixels and the representation of each pixel, and consequently, the file size of the image.
We said above that the amount of each primary color is a number from 0 to 255₁₀ or 00₁₆ to FF₁₆. It is no coincidence that this is exactly one byte (8 bits). A byte is a convenient chunk of computer memory, so one byte was devoted to representing the amount of a single primary color. Thus, it takes 3 bytes (24 bits) to represent a single spot of pure color.
Aside: with 256 values for each primary, that yields 256 x 256 x 256 = 16,777,216 colors. Humans can distinguish roughly 10 million colors, so 24-bit color is sufficient to represent more colors than humans can distinguish. All modern monitors use this so-called 24-bit color. Some old monitors used 16-bit or 8-bit color, which were relatively impoverished, being able to represent only 65,536 colors (for a 16-bit monitor) or 256 colors (for an 8-bit monitor). Of course, a black-and-white monitor can only represent two colors, which could be called 1-bit color.
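The color counts in this aside are all powers of two, which a few lines of Python can confirm. This is just a sketch; the function name `colors_for_bits` is invented here.

```python
def colors_for_bits(bits):
    """Number of distinct colors that fit in the given number of bits."""
    return 2 ** bits


for bits in (1, 8, 16, 24):
    print(f"{bits}-bit color: {colors_for_bits(bits):,} colors")
```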
Since each pixel takes 24 bits (3 bytes) to represent, even a small picture can require a surprising amount of space.
Example: A good monitor might have 100 pixels to the inch, so a picture the size of a 3x5 index card would be 300 pixels by 500 pixels. That's a total of 300x500=150,000 pixels. Since each pixel takes 3 bytes, the file size for the image is at least
300 x 500 x 3 = 450,000 bytes
This is about 450 kilobytes (abbreviated kB; the "k" is lowercase, but the
B is uppercase; see the note on abbreviations) or nearly half a megabyte. Not only is that a lot of storage space, but more importantly it takes significant time to download unless your modem is very fast. For example, if you have an old-style telephone modem that can only handle 56 kbps (56k bits per second) = 7 kBps (7k bytes per second), you will need a little over 1 minute to download it (recall that 1 byte = 8 bits). That's a lot of time.
Telephone modems? Yes, some people still use telephone modems. But faster DSL modems (ranging from 128 kbps to 1500kbps) and cable modems (ranging from 300 kbps to 6000kbps) have become very popular.
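To see how much the connection speed matters, here is a small Python sketch of the arithmetic for the 300x500 index-card picture. The function names are made up for this example, and it assumes "k" means 1000 and that the connection actually delivers its top speed, which real connections rarely do.

```python
def raw_size_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size: every pixel takes 3 bytes (24 bits)."""
    return width * height * bytes_per_pixel


def download_seconds(size_bytes, kbps):
    """Time to move the file over a link rated in kilobits per second."""
    bits = size_bytes * 8          # 1 byte = 8 bits
    return bits / (kbps * 1000)    # assumes k = 1000


size = raw_size_bytes(300, 500)
print(size, "bytes")

for label, kbps in [("telephone modem", 56), ("fast DSL", 1500), ("fast cable", 6000)]:
    print(f"{label}: {download_seconds(size, kbps):.1f} seconds")
```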
However, the advent of faster connection speeds has been accompanied by the rise of websites with content (higher-resolution photos, songs, videos) that completely consumes the additional bandwidth. So no one ever has enough network bandwidth, and it's wise to avoid squandering it. If someone in your audience finds your web site slow to download, they'll move on to another web site.
On the first day of class, Scott took approximately 30 digital pictures of students, each of which was about 2MB.
Short of making our images smaller (fewer pixels), what can we do to speed up the downloads? We can compress the files.
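There are two classes of compression techniques: lossless compression, which allows the original image to be reconstructed exactly, and lossy compression, which discards some information in exchange for a smaller file.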
We will look in detail at one kind of lossless compression, which is indexed color (GIF encoding), because it gives us a window into the kinds of ideas and techniques that matter in designing representations of information.
The idea behind indexed color is that if a particular color is used many times in an image, we can create a "shorthand" for it. In fact, if we limit the number of colors, each one can be assigned a shorthand. What will be confusing is that the colors are, of course, represented as numerals and so are the shorthands! For example, instead of saying (for the umpteenth time), color #D619E0, we'll just say, for example, color number 5. This will only work, however, if the shorthands really are shorter. They are, and we'll see exactly how much.
One way to think about indexed color is that we are creating a "paint-by-numbers" picture. We choose
Example (see this earlier example): Imagine that the 300x500 picture uses only two colors, say red and yellow. Suppose we make up a table of colors (two entries) and then represent the image with an array of "color indexes," like a paint-by-numbers set.
- What is the numbered list of colors? There are just two:

| index | color |
|---|---|
| 0 | #FF0000 |
| 1 | #FFFF00 |

- We then paint the picture using just two numbers, 0 and 1. A zero means a pixel is red, and a one means the pixel is yellow.
- How many bits does it take to represent this image? Well, there are 300x500 or 150,000 pixels, but each one is just 1 bit, so it takes 150,000 bits or 150,000/8 = 18,750 bytes or about 18.75 kB. Compare that with the 450 kB in our earlier example, and you can see this is much smaller. In fact, it's 1/24th the size, since each pixel takes 1 bit to represent rather than 24. It'll be 24 times faster to download.
- What about that table of colors? That's called the color palette, by analogy with an artist's palette. That has to be represented too. Otherwise, the browser would know there were only two colors in the picture, but wouldn't know what colors they are. There are two entries in this palette, each of which is 3 bytes (24 bits), so add at least 6 more bytes to the representation.
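The paint-by-numbers steps above can be sketched in Python. This is an illustration of the idea only, not how a real GIF encoder is written; the function name `index_image` and the tiny six-pixel "image" are invented for this example.

```python
def index_image(pixels):
    """Turn a list of '#RRGGBB' pixels into (palette, list of color indexes)."""
    palette = []   # the table of colors, in order of first appearance
    indexes = []   # one small number per pixel instead of a full color
    for color in pixels:
        if color not in palette:
            palette.append(color)
        indexes.append(palette.index(color))
    return palette, indexes


RED, YELLOW = "#FF0000", "#FFFF00"
tiny_image = [RED, RED, YELLOW, RED, YELLOW, YELLOW]

palette, indexes = index_image(tiny_image)
print(palette)
print(indexes)

# Sizes for the 300x500 two-color picture discussed above.
pixel_bytes = 300 * 500 * 1 // 8    # 1 bit per pixel
palette_bytes = len(palette) * 3    # 3 bytes per palette entry
print(pixel_bytes, "+", palette_bytes, "bytes")
```

Notice that the palette adds only a handful of bytes; almost all of the file is the array of indexes.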
You can see the general scheme at work: we create a table of all the colors used in the picture. The shorthand for a color is simply its index in the table. We will limit the table so that the shorthands will be at most 8 bits. Since the shorthands are all replacing 24-bit color specifications, the shorthand is at most one-third the size. In the example above, the shorthand is 1/24th the size.
Let's continue with the example. What is the filesize if the image uses 4 colors, say red, yellow, blue and lime? In that case, the table looks like this:
| index | color |
|---|---|
| 00 | #FF0000 |
| 01 | #FFFF00 |
| 10 | #0000FF |
| 11 | #00FF00 |

As you can see, the shorthand is now two bits instead of one. Therefore, the 150,000 pixels require 300,000 bits or 300,000/8 = 37,500 bytes or about 37.5 kB. Obviously, this is about twice the size of the previous example, since each shorthand is now twice as big. Nevertheless, it's still much smaller than the 450 kB uncompressed file.
What about the size of the palette? That's now twice as big, too. Four entries at 3 bytes each adds 12 bytes to the filesize, which is a negligible increase to the 37.5 kB.
What's the pattern here? The number of colors in the original image determines the size of the palette, which determines the number of bits in each shorthand, which then determines the size of the file as a whole. The shorthand for a color is simply the binary numeral for the row of the table that the color is in. For example, the color red in the last example was in row zero (00 in binary) and the color lime was in row 3 (11 in binary). However, the relationship between the number of colors and the size of the shorthand is not an obvious one. Let's do one more example before we state the rule.
Suppose that the same 300x500 image uses 16 colors, say sixteen of the named colors that we began this lecture with. In that case, the table looks like this:
Indexes, color names, hexadecimal values, and samples

| shorthand | Color name | #RRGGBB | Example |
|---|---|---|---|
| 0000 | black | #000000 | |
| 0001 | gray | #808080 | |
| 0010 | silver | #C0C0C0 | |
| 0011 | white | #FFFFFF | |
| 0100 | maroon | #800000 | |
| 0101 | red | #FF0000 | |
| 0110 | olive | #808000 | |
| 0111 | yellow | #FFFF00 | |
| 1000 | green | #008000 | |
| 1001 | lime | #00FF00 | |
| 1010 | teal | #008080 | |
| 1011 | aqua | #00FFFF | |
| 1100 | navy | #000080 | |
| 1101 | blue | #0000FF | |
| 1110 | purple | #800080 | |
| 1111 | fuchsia | #FF00FF | |
As you can see, the shorthand is now four bits. Therefore, the 150,000 pixels require 600,000 bits or 600,000/8=75,000 bytes or about 75 kB. Larger, but still much smaller than the 450 kB uncompressed file.
What about the size of the palette? Sixteen entries at 3 bytes each adds 48 bytes to the filesize.
You can see that the number of bits required for each pixel is the key quantity. This quantity is called bits per pixel or "bpp." It's also often called "bit depth" so that the file size of an image is just width times height times bit depth, almost as if it were a physically 3D box.
Finally, we can state the rule:
The bit depth of an image must be large enough so that the number of rows in the table is enough for all the colors. If the bit depth is d, the number of rows in the table is 2ᵈ.
Here's the exact relationship:
Mapping bit-depth to number of colors

| bit-depth | max colors |
|---|---|
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
| 5 | 32 |
| 6 | 64 |
| 7 | 128 |
| 8 | 256 |
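Going the other way, from a number of colors to the smallest bit depth that can hold them, can be sketched in Python as a search through the table. The function name `bit_depth` is invented here, and the 256-color ceiling reflects the 8-bit limit on shorthands stated earlier.

```python
def bit_depth(num_colors):
    """Smallest d (from 1 to 8) such that 2**d rows hold num_colors colors."""
    if not 1 <= num_colors <= 256:
        raise ValueError("indexed color allows between 1 and 256 colors")
    depth = 1
    while 2 ** depth < num_colors:
        depth += 1   # one more bit doubles the number of rows
    return depth


for n in (2, 4, 5, 16, 17, 256):
    print(n, "colors need a bit depth of", bit_depth(n))
```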
Consider an image that is 80 x 100.
In summary, you can reduce your image file size by using fewer colors. Of course, this may reduce the quality of your image. It's a tradeoff.
The GIF file format (i.e., image representation) is the best known example of indexed color format. Here is how it works: Imagine a mural painter who will go to your house and paint a mural on your wall, anything you want. But there's a catch: she'll only make one trip to your house, and her van only holds 256 cans of paint. She has a warehouse of 16 million cans of paint, and you can choose any 256 that you want, but you can't have a mural with more than 256 different colors in it.
This is the essential idea behind GIF images and indexed color.
We've learned how indexed color works and how it affects file size. This is important not only for the theoretical understanding of why representations matter, but also for the practical usefulness of understanding how to reduce the sizes of your images. In this section, we'll review how to compute the approximate size of an indexed-color (GIF) image. Why do we do this? Because it combines all the conceptual issues into one small calculation.
Note that there are many additional details affecting the size of GIF images that we will not cover here. One detail is a certain amount of fixed overhead for representing information like the file type (so that the computer can tell it's a GIF image and not a JPEG, PNG or even a DOC file), and a few bytes to store the dimensions of the image (its width and height and the number of colors). Another detail is that the bits representing the pixels can be compressed further (in a lossless way) using a standard bit-compression algorithm. And GIF images support transparency and animations as well. For more information see this summary and this detailed explanation of GIF file format. We will not be concerned with these details. We will focus on the relationship between the file size and the dimensions of the image, including the number of colors.
A key concept in the computation is the bit-depth of the image. Read on page 19 the definition of bit-depth. It's the number of bits necessary to represent the desired number of colors. Remember that the number of colors is 2ᵈ, where d is the bit depth. It's an exponential relationship. Adding just one bit to the bit-depth doubles the number of colors you can have.
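Recall that the GIF representation comes in two parts: the pixels, each stored as a color index, and the palette of colors.
Thus, our computation breaks down into two parts.
- Pixels, in bytes: width * height * bit-depth / 8
- Palette, in bytes: num_colors * 3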
To find the rough size of an image, we first determine the
bit-depth, then we compute the file size using the two formulas above.
(This is the rough
size because, remember, we are omitting some
fixed overhead and further compression techniques.) You can combine
them into one formula:
file size in bytes = (width * height * bitdepth) / 8 + (3 * numcolors)
Finally, because the file size will usually be large (thousands or millions of bytes), we divide by 1000 or 1,000,000 to convert to kilobytes or megabytes, as appropriate.
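The whole calculation fits in a few lines of Python. Treat this as a sketch of the rough estimate described here, not of a real GIF file: it ignores the fixed overhead and the further compression mentioned earlier, and the function names are made up for this example.

```python
def bit_depth(num_colors):
    """Smallest d (from 1 to 8) such that 2**d is at least num_colors."""
    if not 1 <= num_colors <= 256:
        raise ValueError("indexed color allows between 1 and 256 colors")
    depth = 1
    while 2 ** depth < num_colors:
        depth += 1
    return depth


def gif_size_bytes(width, height, num_colors):
    """Rough size: pixel indexes plus a palette of 3 bytes per color."""
    pixels = width * height * bit_depth(num_colors) / 8
    palette = 3 * num_colors
    return pixels + palette


for colors in (2, 4, 16):
    size = gif_size_bytes(300, 500, colors)
    print(f"{colors} colors: {size:,.0f} bytes, about {size / 1000:.1f} kB")
```

These three results match the worked examples for the 300x500 picture earlier in these notes.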
Find the file size for the following images:
Most image manipulation programs will tell you the file size in
whatever format you're working with (GIF, JPEG, PNG, BMP, TIFF ...)
and many will also estimate the download times for various network
speeds. Thus, there is no reason in practice to have to compute these
values by hand.
Nevertheless, in this course, we expect you to
understand this computation and be able to do it.
Why would we do this? The main reason is intuition. Why does adding one more color sometimes matter a lot and sometimes matter hardly at all? This formula explains why. The mathematical relationship captured by this formula is logarithmic and it goes in discrete steps instead of a smooth curve. This is probably not something you've encountered very often in your life, and so, frankly, it probably seems a bit odd right now. If you just rely on computer programs to do the arithmetic for you, the result will continue to surprise you. It's worth acquiring some intuition about this, so that you'll be more confident and effective when you're manipulating images.
Furthermore, many other relationships in computer science have this kind of stepwise logarithmic (or stepwise exponential) behavior, and so intuition about this kind of relationship is a good foundation for further exploration in the field.
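To make the stepwise behavior concrete, the following Python sketch tabulates the pixel data size of the 300x500 picture for a few color counts on either side of a power of two. The function names are invented for this example, and only the pixel part of the rough formula is shown, since the palette is negligible.

```python
def bit_depth(num_colors):
    """Smallest d (from 1 to 8) such that 2**d is at least num_colors."""
    if not 1 <= num_colors <= 256:
        raise ValueError("indexed color allows between 1 and 256 colors")
    depth = 1
    while 2 ** depth < num_colors:
        depth += 1
    return depth


def pixel_bytes(width, height, num_colors):
    """Bytes needed for the color indexes alone (palette not included)."""
    return width * height * bit_depth(num_colors) // 8


for colors in (15, 16, 17, 18, 32, 33):
    print(colors, "colors:", bit_depth(colors), "bits per pixel,",
          pixel_bytes(300, 500, colors), "bytes")
```

Going from 15 to 16 colors costs nothing, but the 17th color forces a fifth bit on every pixel. After that, nothing changes again until the 33rd color.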
(The following sections will probably not be covered in class, due to time constraints, but feel free to bring questions.)
GIF is popular, but it's not the only image format. A big limitation of GIF is that it can have a maximum of only 256 colors. Images of real-world objects have thousands of subtle shades of colors: reducing the number of colors to 256 would make it look cartoon-like.
For a real-world image, we would probably choose the JPEG image format. JPEG is a lossy compression technique. The details are complicated, but you can look at examples (say on the Wikipedia article, above) and see that blocks of similar colors are replaced with something like their average color. This reduces the number of colors, which is then compressed by a scheme somewhat like the indexed-color idea.
Another popular and important format is PNG.
| PNG | GIF | JPEG |
|---|---|---|
| (27,639 bytes) | w/ 2 colors (738 bytes) | at 5% quality (918 bytes) |
Consider the comparison above, in which the same simple image is represented in three different formats. The JPEG image looks bad, and yet it's actually bigger than the GIF. This example illustrates a situation in which GIF does better than JPEG. Of course, there are situations in which JPEG does better than GIF. Finally, although PNG is bigger than either, it can be edited by Fireworks in vector mode, not bitmap mode, which means that the red cross is an object you can manipulate, not just a collection of bits.
No one format is best