Shell
Overview
- Assign: Tuesday, 3 Nov
- Checkpoint: send a progress update Friday, 6 Nov
- Checkpoint: aim to complete IO redirection Monday, 9 Nov
- Due: Wednesday, 11 Nov
- Teams: pairs or individuals
- Submit: git add, git commit, and git push your completed code.
- Reference:
In this project, you will build a Unix-like shell. You are welcome to reference the CS 240 Shell assignment, including adapting the starter code from that assignment or your own completed code from that assignment (with self-attribution). As a word of caution, some design choices made in the provided code for the CS 240 shell may be an awkward fit for the requirements of this shell. Even if you wish to consult that code, building from scratch will help you be sure you understand all your code (and feel more accomplished!).
A top-level piece of advice is to understand the intended behavior of a feature before trying to implement it. The best way to do that is usually to experiment with that feature in the normal shell that you are already using in your work. The specification intentionally leaves lots of room for a variety of implementation choices and requires that you learn/understand how several key system calls work and interact with each other. Once you have looked at some documentation, I am quite happy to field questions as you interpret it and think about how to put these system calls together to accomplish your goal. This understanding piece will be a major component of the “programming” that you do for this project.
Requirements
Once you report your team to me, I will create a shared GitHub repository for you to use.
What to Submit
Upon submission, your repository should contain at least these files:
- README.md: A text file, formatted using Markdown, that documents:
  - How to compile your shell.
  - How to use any other items included in the repository, such as tests or a Makefile.
  - What features your shell supports as well as how to use them, including examples.
  - A brief design guide explaining the major structural components of your shell code.
  - Any assumptions, non-standard behaviors, or known limitations of your shell.
- At least one .c file containing code for your shell. You are welcome to organize your code into as many C source (.c) and header (.h) files as you like.
Additionally, it is recommended, but not required, that you include:
- Makefile: Basic rules for compiling/cleaning your shell executable with make.
- Some tests. You may wish to copy the test infrastructure from the Syscalls assignment and replace the test specifications with a series of your own tests for the shell. Using shell scripts is also a useful way to save and test a sequence of commands.
What to Implement
Your shell program should support:
- The ability to run in “interactive mode,” where the user types individual commands at a command prompt, or “script mode,” where your shell is invoked with a file containing a sequence of commands separated by newlines.
- A few standard builtin commands, including at least cd and exit.
- The ability to run simple single foreground executable commands by invoking executables outside the shell with command-line arguments (e.g., /bin/ls cs341/shell).
- Support for input and output redirection for all kinds of executable commands. (Note, builtin commands should not support redirection.)
- Support for pipes of executable commands. (Note, builtin commands should not support pipes.)
Suggestions
It is recommended that your shell support a “verbose mode” available either by changing a single static const variable (a constant) or macro in the source code or by passing a -v argument when invoking the shell. In verbose mode, the shell will print additional metadata as described below.
You are also encouraged to explore other advanced shell features of your choice if you have time.
Features
Interactive Mode and Script Mode
The shell is in interactive mode when the shell itself is invoked with zero arguments:
$ ./your-shell
your-prompt> /bin/ls
Makefile README.md your-shell.c
your-prompt> exit
$
When invoked in interactive mode, the shell should print a prompt each time it is ready to accept a new command.
The shell is in script mode when it is invoked with one argument, assumed to be the name of a file containing a sequence of commands separated by newlines.
$ cat your-script.sh
/bin/ls
echo hello world
exit
$ ./your-shell your-script.sh
your-prompt> /bin/ls
Makefile README.md your-shell.c
your-prompt> echo hello world
hello world
your-prompt> exit
$
When invoked in script mode, the shell should do the following for each line of the file whose path is given by the first argument:
- Print a prompt followed by the line from the file.
- Run the command indicated by the line from the file.
When invoked with 2 or more arguments, the shell should immediately exit in error with a message about its proper usage.
In both modes:
- The shell should exit when it encounters the exit command or the end-of-file (EOF) indicator, which can be typed at an interactive shell with Control-D.
- After each command line, the prompt for the next command line must not be shown until the previous command line has completed.
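The mode rules above can be sketched roughly as follows. This is only one possible shape, not a required design: the names open_command_source, shell_loop, and stop_on_exit are hypothetical, and stop_on_exit merely stands in for whatever "parse and run one command line" looks like in your shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for "parse and run one command line".
 * Returns nonzero when the shell should stop (here: only for "exit"). */
int stop_on_exit(char *line) {
    return strcmp(line, "exit") == 0;
}

/* Hypothetical helper: choose the command source from the argument count.
 * Zero arguments: interactive mode, read stdin. One argument: script mode.
 * Two or more: print a usage message and return NULL. */
FILE *open_command_source(int argc, char *argv[], int *script_mode) {
    assert(argc >= 1 && argv != NULL && script_mode != NULL);
    *script_mode = (argc == 2);
    if (argc == 1) return stdin;
    if (argc == 2) {
        FILE *f = fopen(argv[1], "r");
        if (f == NULL) perror(argv[1]);
        return f;
    }
    fprintf(stderr, "usage: %s [script-file]\n", argv[0]);
    return NULL;
}

/* Hypothetical main loop. Returns the number of command lines handled. */
int shell_loop(FILE *in, int script_mode, int (*run_line)(char *line)) {
    assert(in != NULL && run_line != NULL);
    char *line = NULL;
    size_t cap = 0;
    int count = 0;
    for (;;) {
        if (!script_mode) {              /* interactive: prompt, then read */
            printf("your-prompt> ");
            fflush(stdout);
        }
        ssize_t n = getline(&line, &cap, in);
        if (n < 0) break;                /* EOF, e.g. Control-D */
        if (n > 0 && line[n - 1] == '\n') line[n - 1] = '\0';
        if (script_mode) printf("your-prompt> %s\n", line);
        count++;
        if (run_line(line)) break;       /* e.g. the exit builtin */
    }
    free(line);
    return count;
}
```

Because run_line does not return until the command has finished, the next prompt cannot appear early. A real main would call open_command_source, exit with an error status if it returns NULL, and otherwise pass the result to shell_loop.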
Builtin Commands
Builtin commands invoke functions of the shell itself to change the state of the shell process or terminate it. They do not create new processes. The shell must support at least these builtins:
- The builtin command cd takes a single path as an argument and changes the working directory of the shell process to the directory given by that path, if it exists. Optionally, cd may support any other convenience features such as:
  - cd with no arguments changes the working directory to the current user’s home directory.
  - cd - changes to the previous working directory.
  - more of your choosing.
- The builtin command exit terminates the shell process itself.
Optionally, the shell may support any other builtin commands that interest you such as pushd or popd.
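As a minimal sketch of builtin dispatch, assuming the command line has already been split into a NULL-terminated argument array: the function name try_builtin and the should_exit flag are hypothetical. The chdir call is what actually changes the working directory of the shell process; no fork is involved.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if argv named a builtin (and it was
 * handled here), 0 if the caller should treat it as an executable.
 * Sets *should_exit to 1 when the exit builtin was requested. */
int try_builtin(char *argv[], int *should_exit) {
    assert(argv != NULL && should_exit != NULL);
    *should_exit = 0;
    if (argv[0] == NULL) return 0;          /* empty command line */

    if (strcmp(argv[0], "exit") == 0) {
        *should_exit = 1;
        return 1;
    }
    if (strcmp(argv[0], "cd") == 0) {
        const char *target = argv[1];
        if (target == NULL) {
            target = getenv("HOME");        /* optional convenience feature */
        }
        if (target == NULL) {
            fprintf(stderr, "cd: no directory given\n");
        } else if (chdir(target) != 0) {
            perror("cd");                   /* e.g. directory does not exist */
        }
        return 1;
    }
    return 0;
}
```

Letting the main loop decide what to do with should_exit (rather than calling exit deep inside the helper) leaves room for cleanup before the shell terminates.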
Executable Commands
Any command that is not a builtin command is assumed to be an executable command. For executable commands, the shell must create a new child process and execute the given command in that new process. Once the child process has completed (and no sooner), the shell should continue to the next command prompt.
For example, this command runs the executable from the file /bin/ls with command-line arguments /bin/ls and cs341/shell.
your-prompt> /bin/ls cs341/shell
Makefile README.md your-shell.c
your-prompt>
If the command gives an executable that does not exist, an error message to this effect should appear, and then the shell should continue to the next command prompt.
your-prompt> no-such-executable whhaaaaaat
error: could not find executable "no-such-executable"
your-prompt>
Do not implement executables; implement the logic to launch an arbitrary executable.
(Now with an orange box and new wording, since it is important!)
Note: the shell itself does not implement any executables! Neither will you implement any executables except the shell. You will not write the logic of ls or cat, etc.
The shell merely invokes existing executable files, given their path in the filesystem. The shell is a launcher of executables, not a provider of executables. This means that a single case in the shell can handle all possible executable commands, whether they be executables that came with the system (like ls) or brand new programs that we write and compile.
When running the executable command /bin/ls cs341/shell, the shell has no clue what ls does or even whether it exists. It just attempts to launch a process and exec the file at the given path, /bin/ls, with the given arguments, "/bin/ls" and "cs341/shell".
In recommended verbose mode, the shell should print a message indicating the PID, executable name (and optionally arguments), and exit status of the child process when it launches and when it completes:
your-prompt> /bin/ls cs341/shell
[Launching: 2534 /bin/ls]
Makefile README.md your-shell.c
[Completed: 2534 /bin/ls with exit code 0]
your-prompt>
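One way to launch an arbitrary executable, sketched under some assumptions: the name run_foreground and the verbose parameter are hypothetical, and the exit status 127 for "could not exec" follows a common shell convention rather than anything this specification requires. The sketch uses fork, execv, and waitpid; your design may combine them differently.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv in a child process
 * and wait for it. Returns the child's exit code, or -1 if the child could
 * not be created or did not exit normally. */
int run_foreground(char *argv[], int verbose) {
    assert(argv != NULL && argv[0] != NULL);
    fflush(NULL);                  /* do not let the child inherit unflushed output */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                /* child: becomes the executable */
        if (verbose) {
            printf("[Launching: %ld %s]\n", (long)getpid(), argv[0]);
            fflush(stdout);
        }
        execv(argv[0], argv);      /* returns only on failure */
        fprintf(stderr, "error: could not find executable \"%s\"\n", argv[0]);
        _exit(127);                /* assumed convention for "not found" */
    }

    int status = 0;                /* parent: the shell waits here */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    int code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    if (verbose) {
        printf("[Completed: %ld %s with exit code %d]\n", (long)pid, argv[0], code);
    }
    return code;
}
```

Note that the same function handles every executable: nothing in it knows what ls or cat does. The child uses _exit rather than exit after a failed execv so that it does not flush or tear down state it shares with the shell.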
Optionally, the shell may support invoking executables by name only (without a complete path) by searching for executables with that name in directories listed by the PATH environment variable. For example, ls cs341/shell would have the same behavior as /bin/ls cs341/shell, assuming that /bin is listed in the PATH environment variable. Check documentation of exec-related functions as a starting point or implement path search yourself.
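If you choose to implement path search yourself, a rough sketch might look like the following. The name find_in_path is hypothetical, and the check is simplified: access with X_OK also succeeds for directories, which a more careful version would rule out. Alternatively, execvp performs a PATH search for you.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: resolve a command name to a path to execute.
 * Writes the path into out and returns 1 on success, 0 if nothing
 * suitable was found (or the result would not fit in out). */
int find_in_path(const char *name, char *out, size_t out_size) {
    assert(name != NULL && out != NULL && out_size > 0);

    if (strchr(name, '/') != NULL) {        /* already a path: no search */
        if (access(name, X_OK) != 0) return 0;
        int n = snprintf(out, out_size, "%s", name);
        return n >= 0 && (size_t)n < out_size;
    }

    const char *path = getenv("PATH");
    if (path == NULL) return 0;

    while (*path != '\0') {
        size_t len = strcspn(path, ":");    /* length of this PATH entry */
        /* An empty entry conventionally means the current directory. */
        const char *dir = (len > 0) ? path : ".";
        int dir_len = (len > 0) ? (int)len : 1;
        int n = snprintf(out, out_size, "%.*s/%s", dir_len, dir, name);
        if (n > 0 && (size_t)n < out_size && access(out, X_OK) == 0) {
            return 1;
        }
        path += len;
        if (*path == ':') path++;           /* step over the separator */
    }
    return 0;
}
```

The resolved path can then be handed to execv while the original name stays in the argument array.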
Input and Output Redirection
For executable commands, the shell should include support for redirecting the standard input from a file (./executable < input-file.txt), redirecting the standard output to a file (./executable > output-file.txt), or both (./executable < input-file.txt > output-file.txt).
The key symbols identifying redirection are < for input redirection and > for output redirection. In both cases, the file name for redirection appears as the next token in the command line string to the right of the < or > token.
The effect of input redirection is that the standard input file descriptor (stdin) of the child process that runs the executable should be connected to the given file instead of the terminal keyboard.
your-prompt> /bin/cat input.txt
Hello world.
This is a file.
your-prompt> /bin/cat < input.txt
Hello world.
This is a file.
your-prompt>
Redirects are entirely invisible to the executable
Notice that the strings "<" and "input.txt" are not part of the argument array passed when executing /bin/cat. They are special directives to the shell indicating the shell should set up redirection of stdin for the child process in which it executes /bin/cat.
If the input file does not exist, an error should appear, the executable should not be invoked, and the shell should then provide the next command prompt.
your-prompt> /bin/ls
input.txt
your-prompt> /bin/cat < not-here.txt
error: no such file
your-prompt>
The effect of output redirection is that the standard output file descriptor (stdout) of the child process that runs the executable should be connected to the given file instead of the terminal screen.
If the output file does not exist, it should be created. If the output file does exist (and is a file), it should be overwritten. If a directory exists at the output file path, an error should appear, the executable should not be invoked, and the shell should proceed to the next command prompt.
your-prompt> /bin/ls
hello.txt
your-prompt> /bin/cat hello.txt
Hello world.
your-prompt> /bin/cat hello.txt > a.txt
your-prompt> /bin/ls
a.txt hello.txt
your-prompt> /bin/cat a.txt
Hello world.
your-prompt> /bin/echo Helloooooooooo wooooooooorld > a.txt
your-prompt> /bin/ls
a.txt hello.txt
your-prompt> /bin/cat a.txt
Helloooooooooo wooooooooorld
your-prompt>
The best way to understand the expected behavior of these features is to use them in an existing shell.
The dup/dup2 system calls will prove useful. It is important to think about which process needs to use them, and when, relative to other steps.
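To make the "which process, and when" question concrete, here is one possible arrangement, offered as a sketch rather than the expected solution. It assumes the redirect file names have already been separated from the argument array; the names redirect_fd and run_redirected are hypothetical. The files are opened in the child, after fork and before execv, so the shell's own stdin and stdout are never touched.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open path and make target_fd refer to it. */
static int redirect_fd(const char *path, int flags, int target_fd) {
    int fd = open(path, flags, 0644);       /* mode matters only with O_CREAT */
    if (fd < 0) {
        fprintf(stderr, "error: %s: %s\n", path, strerror(errno));
        return -1;
    }
    if (fd != target_fd) {
        if (dup2(fd, target_fd) < 0) {
            perror("dup2");
            close(fd);
            return -1;
        }
        close(fd);                          /* target_fd now refers to the file */
    }
    return 0;
}

/* Hypothetical helper: run argv with optional redirections (NULL = none).
 * Returns the child's exit code, or -1 on fork/wait failure. */
int run_redirected(char *argv[], const char *in_path, const char *out_path) {
    assert(argv != NULL && argv[0] != NULL);
    fflush(NULL);
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: set up redirection first; if it fails, never exec. */
        if (in_path != NULL &&
            redirect_fd(in_path, O_RDONLY, STDIN_FILENO) < 0) _exit(1);
        if (out_path != NULL &&
            redirect_fd(out_path, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO) < 0) _exit(1);
        execv(argv[0], argv);
        fprintf(stderr, "error: could not find executable \"%s\"\n", argv[0]);
        _exit(127);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With O_WRONLY, opening a directory fails, which covers the "directory exists at the output file path" case without extra checks. O_CREAT and O_TRUNC give the create-or-overwrite behavior described above.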
Optionally, the shell may support the full flexibility of redirection syntax offered by most shells:
- Spaces are not required on either side of < or >. For example: ./executable>output-file.txt
- Redirection indicators can come in any order with respect to each other or the executable. For example, > output-file.txt ./executable arg1 <input-file.txt arg2. Note that it is still the case that the next token to the right of the < or the > must be the redirect filename.
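The flexible syntax above can be handled with a small hand-written scanner. The following is a sketch under simplifying assumptions: no quoting or escaping, at most MAX_ARGS arguments, and a later redirect of the same kind silently replaces an earlier one. The struct parsed_command, MAX_ARGS, and parse_command are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 64                 /* assumed limit for this sketch */

struct parsed_command {
    char *argv[MAX_ARGS + 1];       /* NULL-terminated, ready for exec */
    int argc;
    char *in_path;                  /* file after <, or NULL */
    char *out_path;                 /* file after >, or NULL */
};

/* Splits line in place (line is modified; cmd points into it).
 * Returns 0 on success, -1 on a syntax error such as a < or > with no
 * file name after it. */
int parse_command(char *line, struct parsed_command *cmd) {
    assert(line != NULL && cmd != NULL);
    cmd->argc = 0;
    cmd->in_path = NULL;
    cmd->out_path = NULL;

    char pending = 0;               /* '<' or '>' still waiting for its file */
    char *p = line;
    for (;;) {
        p += strspn(p, " \t\n");                /* skip blanks */
        if (*p == '\0') break;
        if (*p == '<' || *p == '>') {
            if (pending) return -1;             /* e.g. "> <" */
            pending = *p++;
            continue;
        }
        char *start = p;
        p += strcspn(p, " \t\n<>");             /* end of this token */
        char stop = *p;                         /* remember before overwriting */
        *p = '\0';
        if (pending == '<') {
            cmd->in_path = start;
        } else if (pending == '>') {
            cmd->out_path = start;
        } else {
            if (cmd->argc == MAX_ARGS) return -1;
            cmd->argv[cmd->argc++] = start;
        }
        pending = 0;
        if (stop == '\0') break;
        if (stop == '<' || stop == '>') pending = stop;  /* no space before it */
        p++;
    }
    if (pending) return -1;                     /* trailing < or > */
    cmd->argv[cmd->argc] = NULL;
    return 0;
}
```

Because the redirect tokens are pulled out during scanning, the argument array that reaches the executable never contains < or > or the file names.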
Pipes
Shell pipe commands connect the standard output of one process to the standard input of another. For example, the command /bin/cat names.txt | /bin/sort launches two processes:
- The first process runs the /bin/cat executable.
- The second process runs the /bin/sort executable.
All output written to stdout (standard output) in the /bin/cat process becomes available as input readable from stdin (standard input) in the /bin/sort process. The shell continues to the next command prompt only once all processes in the pipeline have completed.
your-prompt> /bin/cat names.txt
Pendleton
Clapp
Lulu
your-prompt> /bin/sort < names.txt
Clapp
Lulu
Pendleton
your-prompt> /bin/cat names.txt | /bin/sort
Clapp
Lulu
Pendleton
your-prompt>
The pipe and dup/dup2 system calls will prove useful. It is important to think about which processes need to use them, and when, relative to other steps.
Think carefully about error cases involving pipes (such as one process in the pipeline failing in error while the others await pipe interaction). I will be happy to think through the logic with you. I will also be forgiving in grading this feature, especially if you document assumptions and the behavior you have implemented.
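A two-command pipeline could be wired up along these lines. Treat it as a sketch with assumptions: run_pipe is a hypothetical name, both commands are already parsed into argument arrays, and error handling is deliberately thin. The detail most worth studying is that every process, including the shell, closes the pipe ends it does not use; otherwise the reader may never see end-of-file.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "left | right" and wait for both.
 * Returns the exit code of the right-hand command, or -1 on failure. */
int run_pipe(char *left[], char *right[]) {
    assert(left != NULL && left[0] != NULL);
    assert(right != NULL && right[0] != NULL);

    int fds[2];                         /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0) {
        perror("pipe");
        return -1;
    }
    fflush(NULL);

    pid_t lpid = fork();
    if (lpid == 0) {                    /* left child writes into the pipe */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execv(left[0], left);
        perror(left[0]);
        _exit(127);
    }

    pid_t rpid = (lpid > 0) ? fork() : -1;
    if (rpid == 0) {                    /* right child reads from the pipe */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execv(right[0], right);
        perror(right[0]);
        _exit(127);
    }

    /* Shell: close both ends, or the right child never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    if (lpid > 0) waitpid(lpid, NULL, 0);
    if (rpid < 0) {
        fprintf(stderr, "error: could not create pipeline processes\n");
        return -1;
    }
    if (waitpid(rpid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both children are started before the shell waits for either one, so they run concurrently. Waiting for the left command before launching the right one can deadlock once the pipe's buffer fills.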
In recommended verbose mode, the shell should print a message indicating the PID, executable name (and optionally arguments), and exit status of each child process in the pipeline as that process completes:
your-prompt> /bin/cat names.txt | /bin/sort
[Launching: 3482 /bin/cat]
[Launching: 3483 /bin/sort]
[Completed: 3482 /bin/cat with exit code 0]
Clapp
Lulu
Pendleton
[Completed: 3483 /bin/sort with exit code 0]
your-prompt>
Note that the order of messages about separate processes and the output or messages from other pipeline processes may not be predictable.
your-prompt> /bin/cat names.txt | /bin/sort
[Launching: 3483 /bin/sort]
[Launching: 3482 /bin/cat]
Clapp
Lulu
[Completed: 3482 /bin/cat with exit code 0]
Pendleton
[Completed: 3483 /bin/sort with exit code 0]
your-prompt>
Optionally, the shell may support:
- Using both redirection and piping together: ./executable < input.txt | cat > output.txt
- Pipelines with more than two commands: cat names.txt | sort | uniq | grep Wellesley
More Features
If you have more time and interest, you are encouraged (but not required) to support other interesting features beyond the standard requirements. These could include (but are not limited to) the following, listed in rough order of complexity (lowest to highest):
- Command sequences with ; as a separator. The command line ./executable >output-file.txt; cat output-file.txt first runs ./executable >output-file.txt; after that command finishes, it runs cat output-file.txt. Sequencing should be usable with redirection and piping. The ; has the lowest precedence.
- Several of the optional behaviors above.
- Background jobs and job control. See typical definitions here.
- Signal-handling for Control-C (generally relevant) and Control-Z (along with fg and bg commands).
- Other combinations of the optional behaviors above.
- Other shell language features such as variables or control-flow structures.
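For the first item in the list, command sequences, the fact that ; has the lowest precedence suggests splitting on it before any other parsing. A possible sketch, assuming no quoting support (a ; inside quotes would be split too); split_sequence is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place on ';'. Stores a pointer to
 * each non-blank piece in parts (leading blanks skipped) and returns how
 * many were stored. Pieces beyond max_parts are dropped in this sketch. */
int split_sequence(char *line, char *parts[], int max_parts) {
    assert(line != NULL && parts != NULL && max_parts > 0);
    int count = 0;
    char *p = line;
    while (p != NULL && count < max_parts) {
        char *semi = strchr(p, ';');
        if (semi != NULL) *semi = '\0';     /* terminate this piece */
        p += strspn(p, " \t\n");            /* skip leading blanks */
        if (*p != '\0') parts[count++] = p; /* ignore empty pieces */
        p = (semi != NULL) ? semi + 1 : NULL;
    }
    return count;
}
```

Each piece can then go through the ordinary single-command path (including pipes and redirection), one after another, with the shell waiting for each to finish before starting the next.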
Design
You may organize the code of your shell as you wish. It is likely that you want at least these key components:
- Code for a main shell loop that deals with the repeated steps that occur in response to each new command line entered by the user.
- Code for parsing a command line string.
- Code for implementing shell builtin commands, which allow the user to invoke functions of the shell program itself.
- Code for invoking executable commands with the various shell options including input/output redirection or pipes.
Tips
Trouble
As you start manipulating processes and adding redirection and pipe features, you will probably get stuck at some point: your shell might just stop responding to input. In this event, you can try Control-C to interrupt/kill your shell (assuming you have not implemented special signal-handling behavior), but sometimes even that may not work and even if it does, you may leave other processes stuck. Log in via a second SSH connection and check out these commands and their documentation:
- ps ux
- kill, kill -9
- pgrep, pkill
You may find it helpful to include the process ID (pid) of your shell somewhere in its output to simplify this task…
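One lightweight way to do that is to put the pid in the prompt itself. A tiny sketch, where format_prompt and the exact prompt text are only illustrative choices:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write a prompt such as "your-prompt[2534]> " into
 * buf. Returns the value of snprintf (length the prompt needs). */
int format_prompt(char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "your-prompt[%ld]> ", (long)getpid());
}
```

From the second SSH session, that number can be passed directly to kill.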
History
You might find readline useful if you want to implement command-line history.
Submission and Grading
Submit by ensuring that the material you want evaluated is available on the main branch in your GitHub repository.
Evaluation will include a live demo and code review. It will focus on completeness/correctness (80%) and code clarity/style/documentation (20%).