cc or gcc accepts arguments for each stage and allows you to stop and examine the result at any stage boundary.
- The C pre-processor is responsible for handling pre-processor directives (those lines beginning with #). Lines with #include are replaced by the contents of the referenced file (with different search rules for names in quotes versus those in angle brackets). Names introduced with #define are systematically replaced with their definitions throughout the program, expanding as necessary in the case of macro definitions. #if and its relatives (#ifdef, #else, #endif, and so on) are evaluated, and only the selected branches are kept. You can invoke the C pre-processor independently using the command cpp, or you may examine its result by using gcc -E.
- The actual compiler translates pre-processed source into assembly language. You may examine the assembly language output with gcc -S. Assembly language file names normally end with .s in Unix-like systems.
- The assembler converts the assembly language source to an object (.o) file. An object file is not an executable: it may require definitions from other files, including libraries. The assembler can be run separately with the as command. You can stop the compilation process here using gcc -c and get an unlinked object file.
- The linker resolves all the references in a set of object (.o) files (and libraries or archive, .a, files) and produces an executable image. The linker can be run separately using the ld command, and you can get some debugging hints by noticing when an error message is preceded by ld:, which means that what follows is a link-time error (probably a missing object file or library).
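The small file below is a sketch for trying out the stage boundaries described above. The names BUFFER_SIZE, SQUARE, BUFFER_KIND, and scaled_size are made up for illustration and do not come from any real package.

```c
#include <stddef.h>   /* angle brackets: looked up in the system include directories */

#define BUFFER_SIZE 64            /* object-like macro: plain text substitution */
#define SQUARE(x) ((x) * (x))     /* function-like macro: expanded with its argument */

#if BUFFER_SIZE > 32
#define BUFFER_KIND 1             /* kept: the condition is true */
#else
#define BUFFER_KIND 0             /* this branch is discarded by the pre-processor */
#endif

/* After pre-processing, the return statement reads:
       return ((64) * (64)) + 1;
   The compiler proper never sees the macro names. */
size_t scaled_size(void)
{
    return SQUARE(BUFFER_SIZE) + BUFFER_KIND;
}
```

Assuming the file is saved as stages.c (a hypothetical name), gcc -E stages.c prints the expanded text, gcc -S stages.c leaves the assembly in stages.s, and gcc -c stages.c leaves an object file in stages.o. Because the file has no main(), a plain gcc stages.c would get through the first three stages and then fail in the linker.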
There are two important gcc command line arguments you should get in the habit of using. -Wall tells gcc to print a broad set of warnings (despite the name, not literally every warning gcc knows about). This will often help you spot a surprising number of errors that won't stop the program from compiling but will make it run incorrectly. The other argument you should always specify is -g, which tells gcc to emit special information the gdb debugger can use to help you debug your program.
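As an illustration of the kind of error meant here, the sketch below contains a function that compiles but does the wrong thing. The function names are hypothetical. With -Wall, gcc flags the assignment used as a condition and suggests parentheses around it.

```c
/* Intended to report whether x is zero, but `=` assigns instead of comparing.
   The condition is the assigned value 0 (false), so this always returns 0. */
int is_zero_buggy(int x)
{
    if (x = 0)          /* gcc -Wall warns about this line */
        return 1;
    return 0;
}

/* The corrected comparison: returns 1 for zero and 0 otherwise. */
int is_zero_fixed(int x)
{
    if (x == 0)
        return 1;
    return 0;
}
```

The buggy version is legal C, so the build does not stop on it. The warning is the only hint that something is wrong before the program misbehaves at run time.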
A Common Confusion
As stated above, the result of each phase of compilation can be viewed using appropriate compiler arguments, but it is unusual to stop compilation after the pre-processing phase or after the assembly code has been generated. However, it is common (in fact, it is the usual routine when building practical systems) to stop the compiler after it produces object (.o) files and to use it again in a separate linking step.
Beginners get confused about this because, unfortunately, we use the same shell command (gcc, for example) both for producing object files and for linking them together. Despite the same command name, the activities are different, and what is required is different in the two cases.
Suppose a program in a file called control-panel.c needs to use a linked list package and a specialized graphics package whose source is in linked-list.c and window-toolkit.c, respectively. We want to build a control-panel program, but how do we do this?
We will proceed in two phases:
- Convert all the source files to object files, and
- Link all the object files together into an executable.
In order to do the first job, we will perform the first 3 phases of compilation on each .c file individually, and we don't need the other .c files for this. The compiler only needs to know the types of any variables or functions that are used in a particular file but defined elsewhere. For example, the code in control-panel.c will refer to list and graphics functions like cons() and resize_window(), but these definitions will be in the other .c files. Their types will be written in corresponding .h header files that are #included by control-panel.c.
I.e., control-panel.c will contain lines like this:
#include "linked-list.h"
#include "window-toolkit.h"
It is important to understand that this only provides the compiler with type information so it knows how big data values returned from external functions are, how many arguments functions take, etc. This is enough to produce the object code for control-panel.c.
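The difference between a declaration in a header and a definition in another source file can be sketched as follows. The three parts are collapsed into one file so the example compiles as it stands; in the layout described above they would live in linked-list.h, control-panel.c, and linked-list.c. The struct layout, the signature of cons(), and the helper first_value_of_new_list() are assumptions made for illustration, not the actual package.

```c
#include <stdlib.h>

/* --- Hypothetical contents of linked-list.h: types and prototypes only --- */
struct node {
    int value;
    struct node *next;
};
struct node *cons(int value, struct node *next);   /* declaration: no body */

/* --- What control-panel.c might contain: it compiles with only the
       prototype in view, because the argument and return types are known --- */
int first_value_of_new_list(void)
{
    struct node *list = cons(7, NULL);
    int value = (list != NULL) ? list->value : -1;  /* -1 if allocation failed */
    free(list);
    return value;
}

/* --- Hypothetical contents of linked-list.c: the definition that the
       linker must eventually find --- */
struct node *cons(int value, struct node *next)
{
    struct node *cell = malloc(sizeof *cell);
    if (cell != NULL) {
        cell->value = value;
        cell->next = next;
    }
    return cell;
}
```

If the last part were moved to a separate file, the first two parts would still compile to an object file with gcc -c, but the call to cons() would remain an unresolved reference until link time.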
To get the object file for control-panel.c we need to tell the compiler not to produce an executable, but to stop after compiler phase 3 by using the -c compiler switch:
gcc -Wall -g -o control-panel.o -c control-panel.c
If you omit the -c, then gcc will assume you want an executable program, but when it gets to compiler phase 4 (linking) it will find it doesn't have the actual definition of, say, cons(). You'll get an error about an undefined reference to cons(), and you'll be told that ld failed, i.e., the program could not be linked.
We will repeat this procedure for all the source files in the system we are building, and then we will have a bunch of .o files that refer to values and functions that they don't yet have access to.
The final build phase happens after all the object files are made. The executable program will need the actual definitions of externally defined items in order to run, so the object files must be linked together. That is, we need to perform phase 4 of the compilation process, the linking step. This time, we already have all the object files, but we need to resolve references among them. We don't need the header files any more, nor do we need the C source files.
gcc -Wall -g -o control-panel control-panel.o linked-list.o window-toolkit.o
We shall see that this build process, which can get very involved, can be automated.
Modified: 5 March 2008