How Computers Work

Computer Components

A computer is a complex electronic machine, but its operations can be understood well enough in terms of a few interconnected components: the processor, the memory, and the input/output (I/O) unit.

The processor is further divided into two smaller components.

In the picture note that the components are connected through a set of wires called the bus.

[Figure: a computer comprising processor, memory and I/O]

The operations of any computer are as follows ad infinitum ("forever"):

  1. fetch instruction from memory
  2. decode the instruction. What is the processor supposed to do? Add? Subtract? Move some data to/from memory?
  3. execute the instruction. This may involve such things as telling the processor to add two numbers, getting some data from memory, storing some data to memory, or telling some I/O device to do something.
  4. REPEAT!

This is called the fetch/execute cycle. Note that the processor is so much faster than the I/O devices that I/O doesn't appear prominently here. A modern processor is able to execute millions of instructions while waiting for an I/O device, even a fast device like the disk or the network.
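The cycle can be sketched in a few lines of Python. This is only a toy model, not how any real processor is built: the four instructions (LOAD, ADD, STORE, HALT), the single "accumulator" register, and the HALT instruction that ends the loop are all assumptions made up for this illustration.

```python
# A toy machine. The instruction names and the accumulator are invented
# for illustration; a real computer has no HALT-and-return like this.
def run(memory):
    pc = 0    # program counter: address of the next instruction
    acc = 0   # accumulator: the processor's single working register
    while True:
        instruction = memory[pc]          # 1. fetch instruction from memory
        pc += 1
        opcode, operand = instruction     # 2. decode: what should we do?
        if opcode == "LOAD":              # 3. execute
            acc = memory[operand]         #    get some data from memory
        elif opcode == "ADD":
            acc += memory[operand]        #    add two numbers
        elif opcode == "STORE":
            memory[operand] = acc         #    store some data to memory
        elif opcode == "HALT":
            return memory                 #    stop the toy machine
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # 4. REPEAT!


# Instructions and data share the same memory.
program = [
    ("LOAD", 4),   # address 0
    ("ADD", 5),    # address 1
    ("STORE", 6),  # address 2
    ("HALT", 0),   # address 3
    2,             # address 4: data
    3,             # address 5: data
    0,             # address 6: the result goes here
]
result = run(program)
print(result[6])  # 5
```

Notice that the machine does nothing on its own: every step it takes comes from an instruction sitting in memory.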

The main point here is that the computer doesn't know how to do anything "automatically": there's always some program code (set of instructions) telling it how to do anything.

The input/output unit is connected with the usual peripherals such as keyboard, mouse, the various "drives" (such as hard drives, floppy drives, zip drives, DVD drives, CDROM drives, etc.), monitors, printers, etc. So, the components and operations of a computer are remarkably simple. Complication enters only for performance reasons, but any computer that you are likely to see these days contains the above components.

How information is represented

To complete this simple description of a computer we need to explain how the computer handles information: numbers, text, pictures, sound, movies, instructions.

The computer is an electronic device. Each of its wires can either carry electric current or... not carry current. So, like a light switch, it understands only two states. It turns out that this is enough to make the whole idea work. In fact, any system that can represent at least two states can represent information. Take, for example, the Morse code that is used in telegraphy. Morse is a sound transmission system that can carry a short beep (represented by a dot) and a long beeeeeep (represented by a dash). Any letter or number can be represented by a combination of these two symbols.

The same goes for computers. To represent a number, we use the binary number system, not the decimal number system that we use in everyday life. In the binary system, any number can be represented using only two symbols, 0 and 1. (Morse is almost, but not quite, a binary system, because of the pauses between letters. A system closely related to Morse is used by computers to do data compression. More about it later.) Here is how the binary numbers correspond to our decimal numbers:

Decimal  Binary
0        0
1        1
2        10
3        11
4        100
5        101
6        110
7        111
8        1000
9        1001
10       1010
11       1011
12       1100
13       1101
14       1110
15       1111

And so on. Both systems are positional: a great idea that we owe to Indian and Arab mathematicians, because before them, counting in Roman numerals was tough (DCCCLXXXII + CXVIII = M, you know...) and counting in Greek numerals was almost impossible (omega pi beta + rho iota eta = alpha).

Positional means that the position of each symbol within the number determines its value. Thus, 3 means three in the rightmost position, but thirty in the position immediately to its left (30). We do it without thinking, but we all know that the meaning of 1492 is:

1492 = 1*1000 + 4*100 + 9*10 + 2*1

Similarly, number 10011 in binary means 19 because

19 = 1*16 + 0*8 + 0*4 + 1*2 + 1*1
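The same expansion works for any base, and it is easy to sketch in Python. The function names to_decimal and expand are made up for this example, and the sketch assumes bases of 10 or less so that every digit is one of the characters 0 to 9.

```python
def to_decimal(digits, base):
    """Value of a string of digits written in the given base (2 to 10)."""
    value = 0
    for digit in digits:
        # Moving one position to the left multiplies a digit's worth by the base.
        value = value * base + int(digit)
    return value


def expand(digits, base):
    """Spell out the positional meaning, e.g. '1*16 + 0*8 + ...'."""
    n = len(digits)
    return " + ".join(
        f"{digit}*{base ** (n - 1 - i)}" for i, digit in enumerate(digits)
    )


print(expand("10011", 2))      # 1*16 + 0*8 + 0*4 + 1*2 + 1*1
print(to_decimal("10011", 2))  # 19
print(expand("1492", 10))      # 1*1000 + 4*100 + 9*10 + 2*1
print(to_decimal("1492", 10))  # 1492
```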

The decimal system is also called "base 10" and the binary "base 2". You probably have not realized it, but you have been using the binary system when you deal with bottles: jack, gill, chopin, pint, quart, pottle, gallon, peck, demibushel, bushel, kilderkin, barrel, hogshead, pipe, tun. Two of each unit equal one of the next unit:

2 Jacks       = 1 Gill
2 Gills       = 1 Chopin
2 Chopins     = 1 Pint
2 Pints       = 1 Quart
2 Quarts      = 1 Pottle
2 Pottles     = 1 Gallon
2 Gallons     = 1 Peck
2 Pecks       = 1 DemiBushel
2 DemiBushels = 1 Bushel
2 Bushels     = 1 Kilderkin
2 Kilderkins  = 1 Barrel
2 Barrels     = 1 Hogshead
2 Hogsheads   = 1 Pipe
2 Pipes       = 1 Tun

So, a gallon contains 8 pints and a tun contains 256 gallons.

Of course, we can have positional systems on different bases, like base 12 (AKA "a dozen") and base 7 (AKA "a week").

A byte is an 8-bit binary number. From left to right, its eight bit positions carry these place values:

128 64 32 16 8 4 2 1

To convert a byte to decimal, add up the place values of the positions that hold a 1.
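The place values 128, 64, 32, 16, 8, 4, 2 and 1 are all you need to convert between a byte and its decimal value. Here is a minimal sketch in Python; the function names byte_to_decimal and decimal_to_byte are made up for this example.

```python
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]


def byte_to_decimal(bits):
    """Add up the place values of the positions that hold a 1."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight characters, each 0 or 1")
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == "1")


def decimal_to_byte(number):
    """Work from the largest place value down, subtracting whatever fits."""
    if not 0 <= number <= 255:
        raise ValueError("a byte can only hold 0 through 255")
    bits = ""
    for value in PLACE_VALUES:
        if number >= value:
            bits += "1"
            number -= value
        else:
            bits += "0"
    return bits


print(byte_to_decimal("01000001"))  # 65
print(decimal_to_byte(65))          # 01000001
print(decimal_to_byte(255))         # 11111111
```

The largest byte, 11111111, is 255 because 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255.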

What's so grand about a positional system? Arithmetic calculations are much easier than in non-positional systems. Can you imagine what second grade would be like if you had to calculate that XLVIII + LXVII = CXV?

How text is represented

Text is represented with the so-called ASCII code. Years ago, the manufacturers of early computers decided to represent every possible character (visible or invisible, like the space or the newline) with a number. The result was (partially) the code you see below.

The ASCII character set

So, A is represented by decimal number 65 or binary number 01000001. The greeting "Hi!" is represented by the sequence 72 105 33 or in binary 010010000110100100100001. Of course some care must be taken to recognize when we are looking at a number and when we are looking at a string of characters. But that's not difficult.
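You can check the codes for "Hi!" yourself with a short Python sketch. The function names ascii_codes and ascii_bits are made up for this example. Python's built-in ord actually returns a Unicode code point, which agrees with the ASCII code for all of the characters used here.

```python
def ascii_codes(text):
    """The decimal code of each character in the text."""
    return [ord(ch) for ch in text]


def ascii_bits(text):
    """The same codes written as 8-bit binary numbers, run together."""
    return "".join(format(ord(ch), "08b") for ch in text)


print(ascii_codes("A"))    # [65]
print(ascii_bits("A"))     # 01000001
print(ascii_codes("Hi!"))  # [72, 105, 33]
print(ascii_bits("Hi!"))   # 010010000110100100100001
```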

Control Characters

You'll notice that the table above starts with ASCII code 32, which is for the space character; yes, even the space character (sometimes denoted SPC) needs to be represented. The actual code starts at 0 (which is the null character, sometimes used to represent the end of a string), but the first 32 characters are "control" characters, because they were used to control the early printers. For example, the TAB character is ASCII code 9. Since those characters are not interesting in the context of this class, we've omitted them from the table.

Line Endings

If all we had to worry about was characters, text representation would be pretty straightforward. However, text is organized into lines, and for historical reasons, one of the subtle differences among Windows, Mac and Unix/Linux is how line endings are represented. In the olden days before Windows, Mac and Unix, the early teletype printers used two control characters at the end of each line: the carriage return character to move the print head back to the left, and the linefeed character to move the paper up by one line.

The classic Mac OS (before Mac OS X) represented the end of a line with a carriage return character (CR, which is ASCII code 13). Linux, like Mac OS X, uses a line feed character (LF, also called NL, which is ASCII code 10). Windows uses both, CR followed by LF, just like the olden days.

Usually, when you transfer a text file from system to system, the FTP program (Fetch, WinSCP, or whatever) substitutes the appropriate line ending. The "text mode" of transfer says to make these substitutions; "binary mode" makes no substitutions and is more suitable for non-text files, such as images or programs. Note that HTML and CSS are both kinds of text. Most FTP programs have a "guess which mode" setting that usually works pretty well, but can occasionally make a mistake.
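The substitution that "text mode" performs can be sketched in Python. This is only an illustration of the idea, not the code of any actual FTP program; the function names to_unix and to_windows are made up for this example.

```python
CR = "\r"  # carriage return, ASCII code 13
LF = "\n"  # line feed, ASCII code 10


def to_unix(text):
    """Turn every kind of line ending into a single line feed."""
    # Replace the two-character CR LF pair first, so that its CR is not
    # mistaken for a lone carriage return and turned into an extra line.
    return text.replace(CR + LF, LF).replace(CR, LF)


def to_windows(text):
    """Turn every kind of line ending into carriage return + line feed."""
    return to_unix(text).replace(LF, CR + LF)


print(ord(CR), ord(LF))                       # 13 10
print(repr(to_unix("one\r\ntwo\rthree\n")))   # 'one\ntwo\nthree\n'
print(repr(to_windows("one\ntwo\n")))         # 'one\r\ntwo\r\n'
```

Running such a substitution on an image or a program would corrupt it, because the bytes 13 and 10 mean something else there. That is why "binary mode" leaves the file alone.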

Unicode

The original ASCII code had space for only 128 symbols (256 in its later 8-bit extensions), enough to represent all English characters, punctuation marks, numbers, etc. It turns out that there are other languages on Earth besides English ;-) and recent software is being written to accommodate those, too, via a much larger code called Unicode. You may want to read up on that in your spare time.
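To get a feel for how much larger Unicode is, here is a small Python sketch. The function names are made up for this example. UTF-8, one common way of storing Unicode characters as bytes, is not covered in these notes and is used here only as an assumption for illustration.

```python
def code_points(text):
    """The Unicode number (code point) of each character."""
    return [ord(ch) for ch in text]


def utf8_length(ch):
    """How many bytes the character takes when stored as UTF-8."""
    return len(ch.encode("utf-8"))


# A, e with an acute accent, Greek capital omega, and a Chinese character,
# written with escapes so the example does not depend on your editor.
sample = "A\u00e9\u03a9\u4e2d"

print(code_points(sample))                  # [65, 233, 937, 20013]
print([utf8_length(ch) for ch in sample])   # [1, 2, 2, 3]
```

The letter A keeps its ASCII code, 65, but the other characters have numbers too large for a single byte.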

Instructions

Once you can represent numbers and characters, you can also represent instructions! It was this observation that led von Neumann and his collaborators to create a general purpose, reprogrammable computer. Again, one needs to keep track of when one is looking at instructions versus a string of characters.

In the future we will learn how the computer represents images, sound and movies.

Bits and Bytes

A group of bits used to represent a character came to be known as a byte. The Wikipedia page on the byte also discusses the history of the word and the names for larger groups of bytes, such as the kilobyte (KB), megabyte (MB) and gigabyte (GB).

While technically a "kilobyte" is 1000 bytes, most computer scientists will mean 1024 bytes when they say "kilobyte," and so on for the other terms.

Computers these days come with huge amounts of storage space on the hard drive (often hundreds of GB), but they are usually able to process less than 1 GB of it at a time (the size of their main memory, or RAM). We will use these symbols often in future lectures.
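The difference between the two meanings of "kilobyte" grows with the size of the unit, as a short Python sketch shows. The function name size_in_bytes and the small table of units (KB, MB, GB) are made up for this example.

```python
KILO_DECIMAL = 1000  # the technical meaning of "kilo"
KILO_BINARY = 1024   # 2 ** 10, what most computer scientists mean


def size_in_bytes(amount, unit, kilo=KILO_BINARY):
    """Number of bytes in the given amount of KB, MB or GB."""
    exponent = {"KB": 1, "MB": 2, "GB": 3}[unit]
    return amount * kilo ** exponent


print(size_in_bytes(1, "KB"))                # 1024
print(size_in_bytes(1, "KB", KILO_DECIMAL))  # 1000
print(size_in_bytes(1, "GB"))                # 1073741824
print(size_in_bytes(1, "GB", KILO_DECIMAL))  # 1000000000
```

For a kilobyte the two definitions differ by 24 bytes; for a gigabyte they differ by more than 73 million bytes.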

© Computer Science 110 Staff
This work is licensed under a Creative Commons License
Date Modified: Thursday, 24-Jan-2008 12:39:00 EST