CS 240 Lab 11

Exploring Processes


This lab will let you explore a few things about processes, including the fork function call and the ps and top programs.

Setup

For this lab, we’ll work locally on a Linux machine in the lab, instead of connecting to the CS department server as usual. This way, if you accidentally run a “fork bomb” you won’t cause trouble for anyone else.

If you are working remotely on this lab (e.g., because your professor is sick), then you should use SSH to connect to one of the computers in L037 (see the list below) INSTEAD of to cs.wellesley.edu.


In VSCode you can use the remote develop extension, or in a terminal you can use ssh to connect to a remote computer. You can identify it by IP address, or by a “host name” if one is set up. All of the L037 computers have their own host names, and they all share the same list of authorized users and passwords. For example, since my username is pmwh, I could run ssh pmwh@thrush.wellesley.edu to connect to the computer whose host name is thrush.wellesley.edu. The names of the computers in L037 are (add .wellesley.edu to get the full host name):

  • boa
  • chimp
  • gibbon
  • goose
  • gorilla
  • lemur
  • marlin
  • perch
  • swallow

(these ones should be available right now)

  • robin (this one is available but being used for research, please DON’T use it)

And the rest which are either off or booted into Windows:

  • cardinal
  • dove
  • finch
  • gecko
  • gull
  • herring
  • jay
  • lark
  • orangutan
  • stork
  • tamarin
  • thrush
  • wren
Pick one from the top list of available computers at random to use today. If you and another group are both using the same one, that’s fine, but if either of you does end up running a fork bomb, you’ll both notice it…

Open VSCode on the local computer (NOT on your laptop), and start a terminal (don’t connect anywhere).

If you’re remote, open VSCode on your own computer instead and use the remote develop extension, but connect to one of the available computers from the list above rather than to cs.wellesley.edu.

Then run the following commands to download the starter files and unzip them in your cs240-repos folder:

cd cs240-repos
wget https://cs.wellesley.edu/~cs240/s24/lab/lab11/process.zip
unzip process.zip
cd process
ls

This should create a process directory and change into it, where you should see a Makefile and a number of example fork programs.

Now run hostname. What do you see?

Example answer: You should see the hostname of the computer you chose to connect to (or are working at), something like swallow.wellesley.edu. IMPORTANT: If you see cs.wellesley.edu then you’re connected to the department server and you need to reconnect to a different computer before proceeding with the lab.

Important: Make sure you are NOT connected to cs.wellesley.edu before continuing.

Before we use these files, we’ll first explore some tools for managing processes.

Exploring Processes

Open Firefox (if you don’t already have it open, which you probably do if you’re reading this).

If you’re remote, you can’t open Firefox on the remote computer easily. Instead, use the systemd process for the following exercises. You don’t have to launch it since it’s already running.

Then in a terminal, run top, which gives an interactive list of running processes.

Exercise 1

Looking at the “Tasks” line near the top, how many total processes are there?

Example answer: Answers will vary. Could be anywhere from a few dozen to a few hundred. This is the “total” number that’s right after the word “Tasks:”

How many tasks are running vs. sleeping?

Example answer: There will usually be 1 running process while the rest are sleeping. In some cases there might be a few stopped or zombie processes.

The %Cpu line shows what percentage of the CPU time is spent in various states, with us for “user code,” sy for “system code,” id for “idle,” and a few other categories. What percentage of the time is the CPU idle?

Example answer: There may be a lot of variance here, but assuming your system isn’t doing anything strenuous, it’s probably at 95% or more idle.

Below the highlighted line that starts with “PID” (for “process ID”) there is a list of current processes ranked by CPU usage. This will refresh itself every second or so as top re-measures things. Which processes are using the most CPU on your system?

Example answer: This will vary a lot. Firefox, VSCode, and/or various system processes like a window manager may be on top. If you do things like open a new tab in Firefox (especially one that does something like play a video), this will make it use more cycles. If you’re remote, you won’t see Firefox or VSCode, although there may be a VSCode manager process. You will probably see systemd no matter where you are, as that’s a core operating system process.

Exercise 2

Next, run the ps command. This will list processes in the current session, which by default is only the current terminal (not usually helpful). But if you run ps ux it will list all processes belonging to your user ID. Looking at the TIME column (it’s in minutes:seconds format), how many processes have used at least 1 second of CPU time?

Example answer: Another answer that obviously varies. In a lot of situations only a few processes use any significant CPU time, however. Sometimes none of them will have used that much CPU, especially if you’re remote.

In addition to ps, on Linux the /proc folder holds “files” which display information about running processes. Run ls /proc, and observe the contents. What do you think the numbered directories correspond to?

Example answer: Each running process gets a directory in /proc. The numbers are their process IDs, which you just saw with ps.

Run cat /proc/interrupts to print out the contents of that file. How many function call interrupts have occurred?

Example answer: Exact answer will vary, but numbers in the tens to hundreds of thousands are common. An observation here: we usually think of our code as running continuously on the CPU, one instruction after another. But we see here that in fact we get swapped in to run a few instructions, then get swapped out for a bit, then swapped back in, etc., running bit by bit in small slices. Mostly we don’t end up caring about this, since the OS provides a good abstraction that makes it seem like we just run straight through.

Use ps and/or top to find the PID for Firefox (or systemd if you’re remote), then cd to change into that directory within /proc (e.g., cd /proc/1179 if the Firefox PID is 1179). Now run cat status to see the status of the process. From the status output near the end, how many context switches has Firefox (or systemd) experienced?

Example answer: Results will vary, but will often be at least thousands if not tens or hundreds of thousands.

Finally, use ls task or pstree -p <PID> to look at child processes. How many child processes has Firefox (or systemd) created?

Example answer: Again this depends, but probably at least a few. If you’re inspecting systemd, it will be a huge tree with many key operating system services included like audio, networking, etc.

fork Predictions

In this section, you will simulate the execution of several forking programs by hand, then predict the printed output from a run of the program.

On your local Linux machine, you should have a directory called process (if not, run through the setup instructions at the start of the lab above). This directory contains several forking programs. If you need to, you can download a zipped copy of the process directory from here.

BEFORK you start, beware: one of the examples you will see is a so-called “fork bomb,” a program that, when run, will fork processes infinitely. Operating systems are designed to share resources among a finite number of processes – they can’t handle infinitely many processes at once. If allowed to run unfettered, a fork bomb will bring an operating system to its knees – it will consume all available resources just to keep track of existing processes. Fortunately, modern operating systems have protections against fork bombs: they limit the number of processes a given user can create. Nonetheless, running a fork bomb may still bring your login session to a crawl or completely crash it. So try to avoid running the fork bomb example!

Notes:

  • The exit function terminates the current process. (see man 3 exit for more details)
  • The waitpid function pauses the current process until the process whose PID is given as its first argument has exited. In other words, a call to waitpid(P,...,...) does not return until process P has exited. (man waitpid for more)
  • For simplicity of reading, these example programs are in poor style: they do not check to see if the fork operations succeed or fail. When writing your own code using fork, you should always check for errors!

Exercise 3

Do not run any of the programs at first; instead, for each of the files forkex*.c in your process repository, predict the result of running the program. If multiple possible outputs exist, describe them. Drawing sketches of the processes may be helpful!

Use make all to compile the programs, and run the .bin files this produces using ./ (for example, ./forkex1.bin).

  1. forkex1.c
    1. Is this the fork bomb?

      yes
      no

    2. What will it output?

      Example answer:

      “hello” will be printed four times. The first time through the loop, fork will leave two copies of the process (the parent and its new child), each about to enter the second iteration of the loop. Each of them will then fork again on that second iteration, before ending the loop and printing “hello.” The four hellos should appear all together about 2 seconds after the start of the program.

    3. When you run it does that match what you predicted?

      Example answer: Yes (I hope).

  2. forkex2.c
    1. Is this the fork bomb?

      yes
      no

    2. What will it output?

      Example answer:

      This will print “hello” eight times: four times from doit, following the same logic as forkex1.c, and then 4 more times as each of those processes separately returns to main. As before, all eight hellos should appear roughly 2 seconds after starting the program.

    3. When you run it does that match what you predicted?

      Example answer:

      I thought for a second it would only print 5 times, not 8, before double-checking myself about how the forked processes proceed, so that might be surprising…

  3. forkex3.c
    1. Is this the fork bomb?

      yes
      no

    2. What will it output?

      Example answer:

      This will print “x = 4,” “x = 3,” and “x = 2” once each, together about 1 second after the program starts. In theory, there’s no guarantee about what order the three statements will be printed in, except that 4 comes before 3 (because both of those are printed in the parent process in that order).

    3. When you run it does that match what you predicted?

      Example answer:

      When I tested this it repeatedly printed 4/3/2 in that order, which surprised me, as I expected the 2 to sometimes be first or in the middle.

  4. forkex4.c
    1. Is this the fork bomb?

      yes
      no

    2. What will it output?

      Example answer:

      The main path here just prints out a single “hello” after 1 second and then exits. But if the first fork returns 0, then we go into the if and sleep another second and then fork again, printing “hello” there and then exiting. So there will be 3 hellos printed, with the first after 1 second and the next two after about another second.

    3. When you run it does that match what you predicted?

      Example answer:

      One surprise here is that the second and third “hello” get written on top of the prompt for the next command, since main exits before either of the child processes get a chance to do their printing. This is a good demonstration of the fact that especially when dealing with multi-process code, printed output is likely to show up at weird times and/or in weird places.

  5. forkex5.c
    1. Is this the fork bomb?

      yes
      no
      sort of

    2. What will it output?

      Example answer:

      The main path here prints “Welcome from ”, waits 1 second, sees that it’s the parent, and returns.

      The child meanwhile will print “hello I am ” and then recurse, sleeping again before splitting. But the parent in this split will once again return, this time to the exit call in the conditional. The child will sleep, print, and recurse.

      So this will keep printing “hello I am ” once a second, and it will also immediately return control to the shell, where the prints will make things difficult. Eventually, one of the processes will crash when it runs out of stack space, but this may take thousands of seconds if not longer. This program is also hard to stop: normally you’d hit control-C, but the initial process has already ended. And if control-C doesn’t work, you’d look up the PID of an offending process and force-quit it (the kill command works for this). But that doesn’t work here, since each process is only around for 1 second. Shutting down that terminal and starting a new one is valid, but you could also type out “kill …” along with the first few digits of the PID it is about to print, then right after it prints that number, complete the last digit and hit ‘enter’ to kill it during the sleep before it can fork again.

      This program isn’t a fork bomb, because it only ever has 1 process active and it eventually crashes by itself, but it’s very badly-behaved.

    3. When you run it does that match what you predicted?

      Example answer:

      Here the printing over the command prompt and the difficulty in stopping it are probably surprising.

  6. forkex6.c
    1. Is this the fork bomb?

      yes
      no

    2. What will it output?

      Example answer:

      It will print “counter = 2.”

      The child process sets counter = 1 in its memory, and then exits without printing anything. The parent process is unaffected by this, and sets counter = 2 before the print happens (++ before vs. after a variable affects whether the increment happens before or after the line where the ++ appears).

    3. When you run it does that match what you predicted?

      Example answer:

      Yes (hopefully).

  7. forkex7.c
    1. Is this the fork bomb?

      yes
      no
      sort of

    2. What will it output?

      Example answer:

      If you don’t give it a command-line argument, it will tell you it needs one. Otherwise, it will convert its command-line argument to an integer n and fork into 2^n processes, each of which prints “hello” once (after sleeping for n seconds).

      If you use a small number, this is fine, but if you enter a large number (I wouldn’t go for more than ~5) it may cause issues, and a very large number will almost certainly cause issues. The forking process is slow (doubling the number of processes every second, rather than as quickly as possible) but once launched, when it splits the first time it becomes more difficult to kill because there are multiple process IDs to shut down.

      You can use control-Z to pause the process, and all its descendants (a process plus its descendants is called a “job” in shell terminology). You can then use killall forkex7.bin to kill all processes started with that command, including the forked clones. Finally, fg or bg will bring them out of suspension, at which point they’ll actually be stopped (killing a suspended process doesn’t actually stop it until it’s resumed). You can also use control-C to interrupt the process and its descendants. So this program is easier to deal with than forkex5.bin was.

    3. When you run it does that match what you predicted? (Do NOT run it with a number greater than 4 as the argument).

      Example answer:

      Sometimes you’ll see some of the ’hello’s get printed after the next prompt: the program waits to quit until the main process is done, and all of the processes should sleep for about the same amount of time, but there’s slight variation in the exact timing, and if the first parent process exits before others, the prompt may be printed before they get their turn to print.

The End

You should use remaining time in lab to get started on the concurrency assignment.