Systems Programming

Welcome to Systems Programming!

Thank you for joining this class! As you know, Systems Programming is still a new class, and we are a small, brave band of investigators.

Have you ever been hacking away on a programming project and asked “How does that work?” only to be told “Don't worry about that!”? This course is here to answer some of these questions.

Today, we will:

  1. Establish our schedule for the term.
  2. Define systems programming.
  3. Go over the structure and mechanics of this course.
  4. Cover basics of programming in C on Unix.

What is systems programming?

Systems programming is a term that has evolved over time, and it can encompass quite a few disparate disciplines. Generally, it's understood in opposition to application programming, which is the writing of programs for regular people (not computer scientists exclusively) to use to get something done. Word processors, spreadsheets, web browsers, and payroll programs are all application programs.

Systems programs are the tools that make application programming possible. Often, this means that systems code is the intermediary between the hardware and other programs. Operating systems are a great example: They regulate access to precious resources, protect multiple users from each other, and provide a consistent interface to hardware capabilities. Device drivers are another example, often considered part of the operating system: They provide a consistent interface to the (rest of the) operating system for various pieces of hardware. All the programming language tools we use (compilers, assemblers, linkers, loaders, debuggers) are also infrastructure tools that fall into the systems programming category. Database tools, too, can be considered systems programs. Finally, a lot of systems programming (and system administration) involves the shell (on Unix systems): shell programming and use of associated tools, like awk and sed, are all in a day's work for a systems programmer.

Whether or not systems programs are part of the operating system, systems programmers must generally be aware of OS details, especially system calls. You have no doubt made heavy use of libraries in the past. Libraries are subprograms (or pre-defined classes) that implement commonly needed features that are not, strictly speaking, part of the programming language. I/O libraries (or classes) are a classic example. But eventually the program will require the OS to do something on its behalf (send output to a printer, transfer bytes from a file, terminate another process), and these requests are called system calls. You will learn about a lot of Linux system calls in this class.

As you can see, systems programming covers a broad range of very different, very challenging subjects. We will begin by developing the low-level, OS-aware programming skills necessary for writing systems code. We will then pick a few specific systems problems, including a substantial project, and explore them in some depth. In order to do this, we have to choose a particular platform to explore. We will work in Linux because it's readily available, it's a real live working example, and all the source code is available anytime you just want to peek and see how something works! I am thinking about taking a few peeks into Mac OS this term, which has a Unix-based system at its core.

All the principles you have applied to engineering application code (abstraction, modularity, etc.) apply equally to systems code. But systems programmers need to know how things work. A lot of the structure of today's systems code reflects the history of the development of these systems as well as the oddities and idiosyncrasies of hardware and the software to control it.

At the end of this course, you should be able to understand systems issues including the tradeoffs in a particular design, be facile with Linux/Unix and C programming, and be prepared to tackle any systems task you'd care to hack on!

Course Outline

The first part of the course will cover the following topics:
  1. Linux/Unix OS.
  2. C programming.
  3. File I/O.
  4. Dynamic memory management.
  5. Multi-tasking.
The first 4 elements will be woven together. For example, to understand file I/O in C, we'll have to know something about the Unix file system model, including buffering, I/O devices, notions of users and groups, permissions. Similarly, we'll have to know something about runtime memory layout and pointers in C to understand dynamic memory management.

We will spend approximately 5–6 weeks on these 5 areas. After that, you will have acquired sufficient systems programming skills to attack real problems. To gain experience in this area, there will be a class project, which is yet to be determined.

Depending on the time remaining and class interest, we may also explore:

Course requirements

Think, inquire, be curious, have no fear!
Program, program, program some more.
Take a couple of exams (at least one take-home).
Program some more.
Program a project and present the results.
See course policies and procedures for more.

All good courses should change your brain. This semester, you will learn to think like a systems person. The brain's ability to change is remarkable — it's the miracle that makes us human. But the process can be difficult, even painful. There will be times when your brain will hurt. For those times, we will provide as much support as we can: We have the course conference, instructor office hours, and personal appointments to help you. Use these resources. You should also help each other, but be aware that there is a collaboration policy that imposes limits.

C what I mean

Read Scott Anderson's introduction to C for Java programmers and browse the books and links described in the course references.

Why are we programming in C? It's not just to boost your resumes! Any systems programmer in a Unix environment must know C (and shell programming, awk, sed, etc.).

C goes back a long way, having been invented as a kind of “structured assembly language” in which to write the Unix operating system. It has a syntax many find convenient (and upon which the syntax of Java is based), yet it provides fairly low-level control over what happens in the machine. Unix systems continue to be written in C, though C++ is making inroads.

C is often maligned for its permissiveness with types, its lack of support for high-level programming (e.g., the lack of automatic garbage collection and data abstraction), and the degree to which it exposes assumptions about details of the underlying system. If you are writing or prototyping a large application, these are indeed serious shortcomings. However, C was created over 35 years ago as a way to write operating system code in something that, unlike assembly language, could be ported easily to new machines, but that allowed data manipulations at nearly the same low (assembly language) level. That is, C's designers wanted a small “semantic gap” between C programs and the actual machine. In this, it has been very successful.

Here is a brief introduction to C.


Author: Mark A. Sheldon
Modified: 27 January 2008