Unix Skills¶
This course requires you to work on a Linux server (the CS department server, also known as tempest), using a command line, and your life will be much easier if you have some Unix skills. (Linux is one of many descendants of Unix, an operating system that predates DOS and Windows. Mac OS X is another descendant, so many of the commands and concepts below will work on a Mac as well.)
Motivation¶
You already have a lot of experience working with computers, but with a GUI. A GUI is a graphical user interface, meaning icons, menus, and even drag-and-drop — heaven help us. Although Mac and Windows are rather different, they both use GUIs that rely crucially on those techniques to get work done. GUIs are intuitive, user-friendly, and easy to work with. I'm asking you to give all that up in favor of a CLI (command line interface), so I'd better have a really good reason.
The main reason is that the easiest way to connect to the server is through a relatively narrow tunnel, through which it's quick and easy to send textual commands and get textual responses. (It's possible to set up an X11 tunnel and use a Linux GUI, but that would require you to become familiar with a Linux GUI, so I think it's better to learn some skills that are more universally applicable.)
Another reason, almost as important, is that these skills will make you more efficient both now and in the future, in ways that you can't yet foresee. That's because programming languages are (for the most part) textual, and there's only a very blurry line between a program to accomplish some task and a series of commands to accomplish something. In short, commands can be automated in a way that is hard or impossible for a GUI.
Consider a quick example. Using a GUI, changing the permissions on a file takes about 5 mouse clicks (not including all the navigating around to find the file). Changing the permissions on a file using a CLI takes about 20 keystrokes. If you have to change the permissions on 100 files, it'll take you 5×100=500 mouse clicks, but about 20 keystrokes will do the trick using a CLI, with judicious use of wildcards. (Plus, if you're a good typist, those 20 keystrokes can take less time than those 5 mouse clicks.)
A software developer named Lawrence D'Oliveiro talks about this in his Tumblr post entitled CLI versus GUI Deathmatch!. You should also read a post about Linux philosophy. (I'll bet you didn't know an operating system could have a philosophy!)
Directories¶
Like most filesystems, the Unix filesystem is organized as a tree of directories. To be able to refer to another directory or file, you have to understand the notation of the filesystem. Here are some notations to know about:
- /
- The directory whose name is the slash character is the root of
the tree. Every directory and file is a descendant of this
directory. You can
uniquely specify a directory or file by starting at the root. For
example, Scott's web page is:
/home/anderson/public_html/home.html
The preceding is called an absolute pathname, since it works from anywhere.
You'll notice that the different directories in the tree of directories are separated from each other with slashes. So, a slash plays two roles: it separates parent from child and also stands for the root of the ancestry tree. A side-effect of this notation is that you cannot name a directory or file with a slash in the name. (Well, you *can* if you try hard, but don't.) (On classic Mac OS, before OS X, the directory separator was a colon; on Windows, it's a backslash.)
- .
- The directory whose name is a single period (pronounced "dot") is like the pronoun "me" or the word "here": it stands for the directory you are in. Each process on a Unix system has a "current working directory" (CWD) and all relative pathnames implicitly start at the CWD. For example, the following are two equivalent ways to say "the file named foo.text in the current working directory":
./foo.text
foo.text
Relative pathnames are very useful and important because they allow code to be relocatable, meaning that a directory subtree can be copied to another location, possibly even on another machine, and all relative pathnames that stay within the subtree will still work!
- ..
- The directory whose name is a double period (pronounced "dot dot") is like the word "mom": it stands for the unique parent directory of a directory. For example, the following says "the file named bar.text in the parent of the current working directory":
../bar.text
The dot-dot syntax can be very useful in relative pathnames, to address a file that is related via an ancestor (say an uncle, or a second cousin, once removed).
- ~/
- The directory whose name is a tilde is another kind of pronoun: it means your login directory, as recorded in a kind of "database" called the password database. For example, each of you has a public_html directory in your home directory. Here is how you would address a file in that directory:
~/public_html/index.html
Remember that for each of you, that will be a different file!
Also, because it refers to the password database, the tilde is typically not available in programs, but it is a nice feature of the shell, so you will often use it in commands.
- ~user/
- Tilde has a second usage, where it is immediately followed by the
name of a user: it means the login directory of that user. For
example, the following path would be the address of a file in Scott's
account on the CS server:
~anderson/public_html/home.html
Most of the pathname concepts above may be familiar to you from URLs, since the syntax of the pathname in a URL derives from Unix pathnames.
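To make the notation concrete, here is a minimal sketch you could paste into a shell. It builds a small scratch tree in a temporary directory; the names parent, child, foo.text, and bar.text are just placeholders for illustration, not anything that exists on Tempest.

```shell
# Hypothetical scratch tree; none of these names come from Tempest.
demo=$(mktemp -d)
mkdir -p "$demo/parent/child"
echo "hello from the parent" > "$demo/parent/bar.text"
echo "hello from here" > "$demo/parent/child/foo.text"

# Make "child" the current working directory (CWD).
cd "$demo/parent/child"

# Two equivalent relative pathnames for a file in the CWD.
cat ./foo.text
cat foo.text

# Dot-dot climbs to the parent of the CWD.
cat ../bar.text

# An absolute pathname starts at the root, so it works from any CWD.
cd /
cat "$demo/parent/child/foo.text"

# The shell replaces a leading tilde with your home directory.
echo ~/public_html/index.html
```

The last line only prints the expanded pathname; it doesn't require that the file exist.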
With these concepts in mind, hopefully the following sections will be clearer. I will be more terse in these sections, so if you find a command confusing, I encourage you to make use of one of those thousands of web tutorials on Unix and Linux. At the end of this document, I have links to the man pages for these commands.
Conventions and Prompts¶
In the following sections, I will give sample input and output from interactions with a Linux machine (actually, Tempest, the CS department server).
When you are logged into a Linux machine (directly via the console or across the network via ssh), you will be running a "shell" (a shell is just a program that allows you to run commands). When the shell is ready for your command, it will print a "prompt." That prompt is wildly customizable. On Tempest, the default prompt is like this:
[user@host cwd]
That is, the shell prints three pieces of information, enclosed in square brackets: the username you're logged in with, the name of the machine you're logged into (the host), and the name of your current working directory (cwd). In the examples below, I will usually be logged into the "Wendy Wellesley" test account, so the prompt will look like this:
[wwellesl@tempest ~]
You will never type that part!
For brevity, I will sometimes replace that with just a $ prompt. Don't
type that, either.
Also note that all these examples have a typographic convention that
the stuff you're supposed to type is in bold
monospace and the responses and other output is in regular
monospace. Any tutorials you read on Linux will probably have
occurrences of a prompt (possibly very terse, such as a dollar sign or
a percent sign), and may have conventions to help you distinguish what
you type from the computer's response.
man¶
From the very beginning, Unix machines have had online "manuals" for use by everyone from novices to experts. Probably only one (unix) person in a thousand remembers more than a handful of the options for the "ls" command. So, when you're logged in, don't hesitate to use the "man" command to learn more about a command you're unfamiliar with:
$ man ls
To exit from man, type "q".
Of course, these online man pages are on the web as well; I give some links at the end of this page.
Navigating¶
As I mentioned, the shell always puts you "in" a directory, your "current working directory" (CWD, also called "." or dot). Commands to know:
- ls
- lists the files and directories in the given directory. With no arguments, lists the contents of the CWD.
- cd
- changes the CWD to the given directory. With no arguments, changes to your home directory.
- pwd
- prints the absolute pathname of the CWD, in case you forget where you are.
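Here is a short sketch of those three commands working together. It assumes nothing about your account: it makes a scratch directory first, and the names courses, cs204, and syllabus.txt are made up for the demonstration.

```shell
# Hypothetical scratch tree to wander around in.
demo=$(mktemp -d)
mkdir -p "$demo/courses/cs204"
touch "$demo/courses/syllabus.txt"

cd "$demo"        # change the CWD to the scratch directory
pwd               # print its absolute pathname
ls                # list the CWD: shows "courses"
ls courses        # list a given directory: shows "cs204" and "syllabus.txt"

cd courses/cs204  # a relative pathname, starting from the CWD
pwd
cd ..             # up one level, to "courses"
pwd
```

If you lose track of where you are partway through, pwd will always tell you.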
Moving and Copying¶
Now that you can move around, you'll want to be able to move and copy files. Commands to know:
- cp
- copies the first argument (a file) to the second argument (either a file or a directory). There are many other options; see the man page for more.
- mv
- moves the first argument (a file) to the second argument (either a file or a directory).
- rm
- removes (deletes) the file(s). Caution! This is not a reversible operation: there is no "un-rm" command.
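A minimal sketch of these commands, run in a scratch directory so that nothing of yours is at risk. The filenames (notes.txt, final.txt, backup) are placeholders I made up for the example.

```shell
# Hypothetical files in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
echo "draft" > notes.txt
mkdir backup

cp notes.txt notes-copy.txt   # second argument is a file: make a copy with that name
cp notes.txt backup/          # second argument is a directory: copy into it
mv notes-copy.txt final.txt   # second argument is a file: this is how you rename
mv final.txt backup/          # second argument is a directory: move into it
rm notes.txt                  # gone for good; there is no un-rm

ls backup                     # shows final.txt and notes.txt
```

Notice that cp and mv behave the same way with respect to their second argument; the only difference is whether the original stays behind.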
Tab completion¶
The Unix shell has many built-in conveniences for power users and poor typists. One you should know about is "tab completion." If you type part of a filename, enough to identify a unique file in the directory, and you hit the "tab" key (above caps lock on the left side of your keyboard), the shell will fill out the rest of the filename. If your prefix is not unique, the shell will fill out as much as it can, and allow you to make a choice of how to continue.
You don't have to do this, of course, but it beats typing the whole name, which is slow and error-prone.
Wildcards¶
If you want a command such as ls or rm to apply to several or many
files, you can list all of them on the command line, but that can be
tedious if there are many files. Wildcards are special characters that
match any character, allowing you to specify a pattern for the
filenames. (Like a wildcard in a card game.)
- *
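- The asterisk character matches any sequence of characters, of any length (including none at all). For example, *.html matches every filename in the directory that ends in .html.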
Just be super careful using both rm and the asterisk; it's really
easy to delete all your files!
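Here is a small sketch of the asterisk in action, again in a scratch directory with made-up filenames. One habit worth stealing from it: put echo in front of a dangerous command to see what the shell will expand the pattern to before you run it for real.

```shell
# Hypothetical files in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
touch page1.html page2.html notes.txt

ls *.html          # the shell expands this to: ls page1.html page2.html
echo *.txt         # prints: notes.txt

# Preview what a pattern matches before handing it to rm.
echo rm *.html     # prints: rm page1.html page2.html
rm *.html

remaining=$(ls)    # only notes.txt is left
echo "$remaining"
```

The expansion is done by the shell, not by ls or rm, which is why the same patterns work with every command.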
Making Files and Directories¶
To make a file, you would historically use a text editor, such as Emacs or vim. Emacs and vim are very different in usage, philosophy and user base. Emacs is slower to start up, but bloated with many features. vim is quicker to start up but is leaner. There are many other differences, but this is not the place to continue the decades-long cold war between the Emacs and vim factions.
You should know, however, that I'm firmly in the Emacs camp.
In this course, we'll be using Visual Studio Code, so you don't need to learn Emacs or vim. It's good to squirrel that knowledge in the back of your mind, though, because in a different environment, you might not have Visual Studio Code, but if you're on a Unix system, you will always have vi and almost always have Emacs. (I can think of only one time in my life when Emacs wasn't already installed, and it only took a few minutes to install it.)
To manage files and directories, use these commands:
- touch file
- creates an empty file with the given filename. You may never use this command, but it's very useful in demonstrations and experiments.
- mkdir dir
- creates the named directory.
- rmdir dir
- removes (deletes) the directory, but only if it's empty.
- rm -r dir
- recursively removes (deletes) the directory tree. Caution! This command is even more dangerous than "rm" itself. Not for the faint of heart.
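The following sketch exercises all four commands in a scratch directory. The names project, scratch, and readme.txt are hypothetical; the point is the difference between rmdir, which is cautious, and rm -r, which is not.

```shell
# Everything happens inside a throwaway directory.
demo=$(mktemp -d)
cd "$demo"

mkdir project                 # create a directory
touch project/readme.txt      # create an empty file inside it
mkdir scratch
rmdir scratch                 # works, because scratch is empty

# rmdir refuses to remove a directory that still has contents.
if rmdir project 2>/dev/null; then
    echo "unexpected: project was removed"
else
    echo "rmdir refused: project is not empty"
fi

rm -r project                 # removes project and everything under it
ls -a                         # only . and .. remain
```

Because rmdir fails safely, it's a reasonable first thing to reach for; save rm -r for when you are sure.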
zip¶
If someone wants to give you a bunch of files and directories, they could attach each of them to a mail message to you, or put them all on a web server where you could download them, but what if there were hundreds or thousands of files and directories? Handling them all one-at-a-time would be tedious at best.
One option is to use Zip. (You probably downloaded a zip file containing VSCode.) Indeed, Gradescope allows you to upload a collection of files as a zip file, so you will probably use zip in this course to upload assignments.
Zip is one of several ways to create a single file that contains a copy of a directory tree. (Note that this is a copy or snapshot of the files; if the files subsequently change, the contents of the zip file do not.) To zip up a folder:
zip -r foo.zip foo
The file foo.zip is created by that command, while foo should be
an existing subfolder of the current directory.
Permissions¶
Tempest is a multi-user machine. There are faculty accounts, course accounts, project accounts, and student accounts, including yours. Naturally, on a multi-user machine, we have to worry about security and privacy in a way that you can mostly ignore on your laptop.
The way this is done in Unix is called permissions. You can decide whether a file or folder can be read by others and whether it can be written by others. By default, your folders and files are private, meaning they can only be read/written by you. This is a good default and you should not change it.
However, you will sometimes have to change the permissions of a file or folder. In particular, in my courses, you will have to allow the web server (Apache) to read your web pages. The raw Unix command to do this is called chmod. Using chmod is a bit complicated. There are lots of tutorials out there; use web search or AI for help.
As a shortcut, I have created a Wellesley-only command called
opendir. It takes one argument, which is the name of the folder that
you want to allow others to read. Use it like this:
opendir foo
where you are in the folder that contains the subfolder foo. We will
use that command many times this semester.
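If you're curious what the raw chmod command looks like, here is a minimal sketch. To be clear, this is not a description of what opendir does internally (I'm only illustrating chmod itself), and the folder foo and file index.html are created in a scratch directory just for the demonstration. Whether "others" is the right set of permissions to open up depends on how the web server is configured, so treat this as an illustration rather than a recipe.

```shell
# Hypothetical folder and web page in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
mkdir foo
echo "<p>hello</p>" > foo/index.html

# Start from a private state: only the owner has any access.
chmod 700 foo
chmod 600 foo/index.html
ls -ld foo foo/index.html

# Let others enter and list the folder, and read the file, but not write.
chmod o+rx foo
chmod o+r foo/index.html
ls -ld foo foo/index.html
```

In the ls -l output, the last three permission characters describe "others": they change from --- to r-x for the folder and from --- to r-- for the file. Combined with a wildcard, a single command such as chmod o+r *.html would do the same for every HTML file in the current directory.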
ssh and scp¶
Often, the computer we are physically touching, using its keyboard and mouse, and looking at its screen, is not the one we want to be working with. For example, you log in to your own laptop, but to do your work, you have to log in to Tempest and modify your files there.
Visual Studio Code's remote development environment, which we will be using in this class, uses SSH and SCP behind the scenes. So, while you might not be explicitly using these commands, you will be implicitly using them, so it's a good idea to understand some of the concepts and pitfalls.
The following commands enable this remote work across the network:
- ssh user@host
- Remotely log in to the given host computer as the given user account. ssh will prompt you for the password for the account and relay it to the host. If the password is accepted, ssh will start a remote shell for you. A host is the name of a computer, such as tempest or, more precisely, tempest.wellesley.edu, which is the same as cs.wellesley.edu.
- scp local-file user@host:path/to/remote/file
- This command copies a local file to a remote file. (Notice the user@host on the destination.) This command is a lot like cp except that you can precede the filenames with user@host: to have them copied across the network to the destination host. You can use this command to copy a file from your local machine, say your laptop, to Tempest, or from your C9 workspace to Tempest.
- scp user@host:path/to/remote/file path/to/local/file
- scp can also go the other way, copying a file from the remote host to your local machine.
As an example, I logged into my Mac (station #01 in H305 where I logged in as "sanderso") to do the following. Notice the different prompt on the Mac versus Tempest.
sci-h304-01:~ sanderso$ cd Desktop/
sci-h304-01:Desktop sanderso$ ls -l mypage.html
-rw-r--r-- 1 sanderso WELLESLEY\Domain Users 0 Jan 26 12:29 mypage.html
sci-h304-01:Desktop sanderso$ scp mypage.html wwellesl@tempest:public_html/
The authenticity of host 'tempest (149.130.15.5)' can't be established.
RSA key fingerprint is ae:53:ce:76:03:10:a9:23:ee:89:14:5a:23:3f:fb:32.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'tempest,149.130.15.5' (RSA) to the list of known hosts.
wwellesl@tempest's password:
mypage.html 100% 0 0.0KB/s 00:00
Let's take a moment to look at that scary message from scp (which
you will probably get from VSCode when you first log in). The ssh and
scp programs are secure, and they protect against eavesdropping by
encrypting all traffic to and fro, and they protect against machine
"spoofing" by checking the identity of the remote host. If you've
never previously connected to that remote host from this local host,
scp can't check the identity so it asks whether you are
sure. On-campus, you can comfortably always say "yes," since LTS has
good control of the hostnames. Across the wilds of the internet, say
in a random airport wifi network, spoofing can arise, so you have to
be more thoughtful. We don't have time to get into that here, though,
so let's continue with our example.
sci-h304-01:Desktop sanderso$ ssh wwellesl@tempest
wwellesl@tempest's password:
Last login: Wed Feb 20 15:14:31 2023 from 149.130.206.217
[wwellesl@tempest ~] cd public_html/
[wwellesl@tempest public_html] ls -l mypage.html
-rw-r--r--. 1 wwellesl wwellesl 0 Jan 26 12:33 mypage.html
[wwellesl@tempest public_html] logout
Connection to tempest closed.
Did you notice that we didn't get the scary message from ssh the
second time? Did you also notice the different prompt, so that we know
where we are? This is important; it's easy to get confused when you
have different shells, all on the same screen, but logged into
different machines. (It's not uncommon for me to be logged into 3 or 4
machines from my laptop.)
Oh, and there's the logout command. I didn't teach you that; it's
pretty easy to guess what it does. You should always log out of a
machine when you're done, because connections do use up resources and
a host can't support an infinite number of them.
drop¶
Now that you know about permission bits and such, you understand a bit more deeply what prevents you from copying one of your files into a directory that I own: the permission bits on that directory don't allow you ("others") to write to that directory.
But what if I wanted to allow you to write to one of my directories in a controlled way, say as a way of submitting an assignment? An analogy would be sliding your printout under my door: you can put something of yours into something I own, but it then becomes mine and you can't pull it back out again, though you might be able to look at it.
There is no standard, built-in, Unix command to do what I've described, but I have written one for us at Wellesley.
The drop command only works on Tempest, so you need to make sure the
file is there, first.
- drop account file
- Copy the given file to the "drop" subdirectory of the given account. Actually, copy it to a special sub-directory for all your submissions, named for your account.
Here's the "drop" command in action, dropping to the cs204
course. Make the obvious substitution if you are dropping to
cs304flask or cs304node:
[wwellesl@tempest public_html] ls -l wendy.html
-rw-rw----. 1 wwellesl wwellesl 152 Jan 15 2020 wendy.html
[wwellesl@tempest public_html] drop cs204 wendy.html
Copying wendy.html (from wwellesl) to /home/cs204/drop/ (uid 7003)
/home/cs204/drop/wwellesl doesn't exist, making it.
Successful drop.
[wwellesl@tempest public_html] ls -l /home/cs204/drop/wwellesl/
total 4
-r--r-----. 1 cs204 wwellesl 152 Jan 25 18:50 wendy.html
Notice that the drop created the wwellesl subfolder of the
/home/cs204/drop folder, just for us, since this is our first drop.
By the way, if we needed to drop a whole bunch of files, we could tar them up and drop the tarfile.
Command Summary¶
There are, of course, many other useful commands, but these should get you started. Here they all are, with links to man pages, thanks to tutorialspoint.com: