Unix Skills

This course requires you to work on a Linux server (the CS department server, also known as tempest), using a command line, and your life will be much easier if you have some Unix skills. (Linux is one of many descendants of Unix, an operating system that predates DOS and Windows. Mac OS X is another descendant, so many of the commands and concepts below will work on a Mac as well.)

Motivation

You already have a lot of experience working with computers, but with a GUI. A GUI is a graphical user interface, meaning icons, menus, and even drag-and-drop — heaven help us. Although Mac and Windows are rather different, they both use GUIs that rely crucially on those techniques to get work done. GUIs are intuitive, user-friendly, and easy to work with. I'm asking you to give all that up in favor of a CLI (command line interface), so I'd better have a really good reason.

The main reason is that the easiest way to connect to the server is through a relatively narrow tunnel, through which it's quick and easy to send textual commands and get textual responses. (It's possible to set up an X11 tunnel and use a Linux GUI, but that would require you to become familiar with a Linux GUI, so I think it's better to learn some skills that are more universally applicable.)

Another reason, almost as important, is that these skills will make you more efficient both now and in the future, in ways that you can't yet foresee. That's because programming languages are (for the most part) textual, and there's only a very blurry line between a program to accomplish some task and a series of commands to accomplish something. In short, commands can be automated in a way that is hard or impossible for a GUI.

Consider a quick example. Using a GUI, changing the permissions on a file takes about 5 mouse clicks (not including all the navigating around to find the file). Changing the permissions on a file using a CLI takes about 20 keystrokes. If you have to change the permissions on 100 files, it'll take you 5×100=500 mouse clicks, but about 20 keystrokes will do the trick using a CLI, with judicious use of wildcards. (Plus, if you're a good typist, those 20 keystrokes can take less time than those 5 mouse clicks.)

A software developer named Lawrence D'Oliveiro talks about this in his Tumblr post entitled "CLI versus GUI Deathmatch!" You should also read a post about Linux philosophy. (I'll bet you didn't know an operating system could have a philosophy!)

Directories

Like most filesystems, the Unix filesystem is organized as a tree of directories. To refer to a directory or file, you have to understand the notation for pathnames. Here are the pieces to know about:

/
The directory whose name is the slash character is the root of the tree. Every directory and file is a descendant of this directory. You can uniquely specify a directory or file by starting at the root. For example, Scott's web page is:
/home/anderson/public_html/home.html

The preceding is called an absolute pathname, since it works from anywhere.

You'll notice that the different directories in the tree of directories are separated from each other with slashes. So, a slash plays two roles: it separates parent from child and also stands for the root of the ancestry tree. A side-effect of this notation is that you cannot name a directory or file with a slash in the name. (Well, you *can* if you try hard, but don't.) (On classic Mac OS, before OS X, the directory separator was a colon; on Windows, it's a backslash.)

.
The directory whose name is a single period (pronounced dot) is like the pronoun me or the word here: it stands for the directory you are in. Each process on a Unix system has a "current working directory" (CWD) and all relative pathnames implicitly start at the CWD. For example, the following are two equivalent ways to say "the file named foo.text in the current working directory":
./foo.text 
foo.text

Relative pathnames are very useful and important because they allow code to be relocatable, meaning that a directory subtree can be copied to another location, possibly even on another machine, and all relative pathnames that stay within the subtree will still work!

..
The directory whose name is a double period (pronounced dot dot) is like the word mom: it stands for the unique parent directory of a directory. For example, the following says "the file named bar.text in the parent of the current working directory":
../bar.text

The dot-dot syntax can be very useful in relative pathnames, to address a file that is related via an ancestor (say an uncle, or a second cousin, once removed).

~/
The directory whose name is a tilde is another kind of pronoun: it means your login (home) directory, which the shell looks up in a kind of "database" called the password database. For example, each of you has a public_html directory in your home directory. Here is how you would address a file in that directory:
~/public_html/index.html

Remember that for each of you, that will be a different file!

Also, because it refers to the password database, the tilde is typically not available in programs, but it is a nice feature of the shell, so you will often use it in commands.

~user/
Tilde has a second usage, where it is immediately followed by the name of a user: it means the login directory of that user. For example, the following path would be the address of a file in Scott's account on the CS server:
~anderson/public_html/home.html

Most of the pathname concepts above may be familiar to you from URLs, since the syntax of the pathname in a URL derives from Unix pathnames.
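Here is a small sketch of these notations in action. It is typed without the prompt, and it works in a throwaway directory; the names project, docs, src, and notes.text are made up for the illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
# (All of the file and directory names below are made up.)
top=$(mktemp -d)
mkdir -p "$top/project/docs" "$top/project/src"
echo "hello" > "$top/project/docs/notes.text"

cd "$top/project/src"

# Absolute pathname: starts at the root, so it works from anywhere.
cat "$top/project/docs/notes.text"

# Relative pathname: starts at the CWD; ".." climbs to the parent.
cat ../docs/notes.text

# "." is the CWD, so these two arguments name the same file.
touch foo.text
ls ./foo.text foo.text

# The shell expands an unquoted tilde to your login directory;
# inside quotes it is just an ordinary character.
echo ~
echo "~"

# Relocatable: rename the whole subtree, and the relative
# pathname from src to docs still works.
cd "$top"
mv project moved
cd moved/src
cat ../docs/notes.text
```

The last step is the point of relative pathnames: the absolute pathname of notes.text changed when the subtree moved, but ../docs/notes.text did not.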

With these concepts in mind, the following sections should be clearer. I will be terser in these sections, so if you find a command confusing, I encourage you to consult one of the thousands of web tutorials on Unix and Linux. At the end of this document, I have links to the man pages for these commands.

Conventions and Prompts

In the following sections, I will give sample input and output from interactions with a Linux machine (actually, Tempest, the CS department server).

When you are logged into a Linux machine (directly via the console or across the network via ssh), you will be running a "shell" (a shell is just a program that allows you to run commands). When the shell is ready for your command, it will print a "prompt." That prompt is wildly customizable. On Tempest, the default prompt is like this:

[user@host cwd]

That is, the shell prints three pieces of information, enclosed in square brackets: the username you're logged in with, the name of the machine you're logged into (the host), and the name of your current working directory (cwd). In the examples below, I will usually be logged into the "Wendy Wellesley" test account, so the prompt will look like this:

[wwellesl@tempest ~] 

You will never type that part!

For brevity, I will sometimes replace that with just a $ prompt. Don't type that, either.

Also note that all these examples have a typographic convention that the stuff you're supposed to type is in bold monospace and the responses and other output is in regular monospace. Any tutorials you read on Linux will probably have occurrences of a prompt (possibly very terse, such as a dollar sign or a percent sign), and may have conventions to help you distinguish what you type from the computer's response.

man

From the very beginning, Unix machines have had online "manuals" for use by everyone from novices to experts. Probably only one (unix) person in a thousand remembers more than a handful of the options for the "ls" command. So, when you're logged in, don't hesitate to use the "man" command to learn more about a command you're unfamiliar with:

$ man ls

To exit from man, type "q".

Of course, these online man pages are on the web as well; I give some links at the end of this page.

As I mentioned, the shell always puts you "in" a directory, your "current working directory" (CWD, also called "." or dot). Commands to know:

ls
lists the files and directories in the given directory. With no arguments, lists the contents of the CWD.
cd
changes the CWD to the given directory. With no arguments, changes to your home directory.
pwd
prints the absolute pathname of the CWD, in case you forget where you are.
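
A short session using these three commands might look like the following sketch. It is typed without the prompt; the scratch directory comes from mktemp, and the public_html folder inside it is just a stand-in for illustration.

```shell
# Set up a small scratch area to walk around in (names are made up).
work=$(mktemp -d)
mkdir "$work/public_html"
touch "$work/public_html/index.html"

cd "$work"          # change the CWD to the given directory
pwd                 # print the absolute pathname of the CWD
ls                  # no argument: list the contents of the CWD
ls public_html      # list the contents of the given directory
cd public_html      # a relative pathname works as an argument, too
cd ..               # go back up to the parent
cd                  # no argument: go to your home directory
pwd                 # confirm where you ended up
```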

Moving and Copying

Now that you can move around, you'll want to be able to move and copy files. Commands to know:

cp
copies the first argument (a file) to the second argument (either a file or a directory). There are many other options; see the man page for more.
mv
moves the first argument (a file) to the second argument (either a file or a directory).
rm
removes (deletes) the file(s). Caution! This is not a reversible operation: there is no "un-rm" command.
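
As a sketch of how the second argument changes the meaning of cp and mv, try something like the following in a scratch directory. The filenames are made up, and the commands are typed without the prompt.

```shell
# Scratch directory with one file and one subdirectory (names are made up).
work=$(mktemp -d)
cd "$work"
echo "draft" > report.text
mkdir archive

cp report.text backup.text     # file to file: makes a second copy
cp report.text archive/        # file to directory: the copy keeps its name
mv backup.text old.text        # file to file: this is how you rename a file
mv old.text archive/           # file to directory: moves it there
ls . archive                   # look at both directories
rm archive/old.text            # gone for good; there is no undo
ls archive
```

Note that mv with two filenames in the same directory is how you rename a file; there is no separate rename step.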

Tab completion

The Unix shell has many built-in conveniences for power users and poor typists. One you should know about is "tab completion." If you type part of a filename, enough to identify a unique file in the directory, and you hit the "tab" key (above caps lock on the left side of your keyboard), the shell will fill out the rest of the filename. If your prefix is not unique, the shell will fill out as much as it can, and allow you to make a choice of how to continue.

You don't have to do this, of course, but it beats typing the whole name, which is slow and error-prone.

Wildcards

If you want a command such as ls or rm to apply to several or many files, you can list all of them on the command line, but that can be tedious if there are many files. Wildcards are special characters that match any character, allowing you to specify a pattern for the filenames. (Like a wildcard in a card game.)

*
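The asterisk character matches any sequence of characters, of any length (including none). For example, *.html matches every filename in the current directory that ends in .html.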

Just be super careful using both rm and the asterisk; it's really easy to delete all your files!
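
One habit that may save you some grief, shown in this sketch, is to preview a wildcard with echo before handing it to rm. The shell expands the pattern before the command runs, so echo shows you exactly the filenames that rm would be given. The filenames here are made up, and the commands are typed without the prompt.

```shell
# Scratch directory with a few files to match against (names are made up).
work=$(mktemp -d)
cd "$work"
touch index.html about.html notes.text photo1.jpg photo2.jpg

ls *.html        # every name ending in .html
ls photo*        # every name starting with photo
echo rm *.jpg    # preview only: prints the rm command without running it
rm *.jpg         # now actually delete the two .jpg files
ls               # the .html and .text files are untouched
```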

Making Files and Directories

To make a file, you would historically use a text editor, such as Emacs or vim. Emacs and vim are very different in usage, philosophy and user base. Emacs is slower to start up, but bloated with many features. vim is quicker to start up but is leaner. There are many other differences, but this is not the place to continue the decades-long cold war between the Emacs and vim factions.

You should know, however, that I'm firmly in the Emacs camp.

In this course, we'll be using Visual Studio Code, so you don't need to learn Emacs or vim. It's good to squirrel that knowledge in the back of your mind, though, because in a different environment, you might not have Visual Studio Code, but if you're on a Unix system, you will always have vi and almost always have Emacs. (I can think of only one time in my life when Emacs wasn't already installed, and it only took a few minutes to install it.)

To manage files and directories, use these commands:

touch file
creates an empty file with the given filename. You may never use this command, but it's very useful in demonstrations and experiments.
mkdir dir
creates the named directory.
rmdir dir
removes (deletes) the directory, but only if it's empty.
rm -r dir
recursively removes (deletes) the directory tree. Caution! This command is even more dangerous than "rm" itself. Not for the faint of heart.
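
The difference between rmdir and rm -r is easiest to see by trying both, as in this sketch. It works in a throwaway directory with made-up names, and it uses one option not described above: mkdir -p, which also creates any missing parent directories. The commands are typed without the prompt.

```shell
# Throwaway directory, so the dangerous commands can't hurt anything.
work=$(mktemp -d)
cd "$work"

mkdir scratch            # create a directory
touch scratch/a.text     # create an empty file inside it
rmdir scratch || echo "rmdir refused: scratch is not empty"
rm scratch/a.text
rmdir scratch            # succeeds now that the directory is empty

mkdir -p tree/sub        # -p also creates the missing parent, tree
touch tree/sub/b.text
rm -r tree               # removes tree and everything beneath it
ls                       # nothing is left
```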

zip

If someone wants to give you a bunch of files and directories, they could attach each of them to a mail message to you, or put them all on a web server where you could download them, but what if there were hundreds or thousands of files and directories? Handling them all one-at-a-time would be tedious at best.

One option is to use Zip. (You probably downloaded a zip file containing VSCode.) Indeed, Gradescope allows you to upload a collection of files as a zip file, so you will probably use zip in this course to upload assignments.

Zip is one of several ways to create a single file that contains a copy of a directory tree. (Note that this is a copy or snapshot of the files; if the files subsequently change, the contents of the zip file do not.) To zip up a folder:

zip -r foo.zip foo

The file foo.zip is created by that command, while foo should be an existing subfolder of the current directory.

Permissions

Tempest is a multi-user machine. There are faculty accounts, course accounts, project accounts, and student accounts, including yours. Naturally, on a multi-user machine, we have to worry about security and privacy in a way that you can mostly ignore on your laptop.

The way this is done in Unix is called permissions. You can decide whether a file or folder can be read by others and whether it can be written by others. By default, your folders and files are private, meaning they can only be read/written by you. This is a good default and you should not change it.

However, you will sometimes have to change the permissions of a file or folder. In particular, in my courses, you will have to allow the web server (Apache) to read your web pages. The raw Unix command to do this is called chmod. Using chmod is a bit complicated. There are lots of tutorials out there; use web search or AI for help.
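
To give a flavor of chmod, here is a minimal sketch that uses its symbolic notation: u, g, and o stand for you (the owning user), your group, and others; + and - add and remove; r, w, and x stand for read, write, and execute. The filenames are made up, the files live in a throwaway directory, and the commands are typed without the prompt. This is only an illustration of the notation, not the exact recipe for your web pages.

```shell
# Throwaway directory with three made-up web pages.
work=$(mktemp -d)
cd "$work"
touch page1.html page2.html page3.html

chmod 600 *.html     # start private: read/write for you, nothing for anyone else
ls -l                # each line begins with -rw-------

chmod o+r *.html     # add (+) read (r) permission for others (o)
ls -l                # each line now begins with -rw----r--

chmod go-rwx *.html  # take it all back: private again
ls -l
```

Notice that the wildcard lets one command change all three files at once. For a folder, others also need execute (x) permission in order to reach the files inside it.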

As a shortcut, I have created a Wellesley-only command called opendir. It takes one argument, which is the name of the folder that you want to allow others to read. Use it like this:

opendir foo

where you are in the folder that contains the subfolder foo. We will use that command many times this semester.

ssh and scp

Often, the computer we are physically touching, using its keyboard and mouse, and looking at its screen, is not the one we want to be working with. For example, you log in to your own laptop, but to do your work, you have to log in to Tempest and modify your files there.

Visual Studio Code's remote development environment, which we will be using in this class, uses SSH and SCP behind the scenes. So, while you might not be explicitly using these commands, you will be implicitly using them, so it's a good idea to understand some of the concepts and pitfalls.

The following commands enable this remote work across the network:

ssh user@host
Remotely log in to the given host computer as the given user account. ssh will prompt you for the password for the account and relay it to the host. If the password is accepted, ssh will start a remote shell for you. A host is the name of a computer, such as tempest or, more precisely, tempest.wellesley.edu, which is the same as cs.wellesley.edu.
scp local-file user@host:path/to/remote/file
This command copies a local file to a remote file. (Notice the user@host on the destination.) This command is a lot like cp except that you can precede the filenames with user@host: to have them copied across the network to the destination host. You can use this command to copy a file from your local machine, say your laptop, to Tempest, or from your C9 workspace to Tempest.
scp user@host:path/to/remote/file path/to/local/file
scp can also go the other way, copying a file from the remote host to your local machine.

As an example, I logged into my Mac (station #01 in H304 where I logged in as "sanderso") to do the following. Notice the different prompt on the Mac versus Tempest.

 
sci-h304-01:~ sanderso$ cd Desktop/ 
sci-h304-01:Desktop sanderso$ ls -l mypage.html  
-rw-r--r--  1 sanderso  WELLESLEY\Domain Users  0 Jan 26 12:29 mypage.html 
sci-h304-01:Desktop sanderso$ scp mypage.html wwellesl@tempest:public_html/ 
The authenticity of host 'tempest (149.130.15.5)' can't be established. 
RSA key fingerprint is ae:53:ce:76:03:10:a9:23:ee:89:14:5a:23:3f:fb:32. 
Are you sure you want to continue connecting (yes/no)? yes 
Warning: Permanently added 'tempest,149.130.15.5' (RSA) to the list of known hosts. 
wwellesl@tempest's password:  
mypage.html                                    100%    0     0.0KB/s   00:00     

Let's take a moment to look at that scary message from scp (which you will probably get from VSCode when you first log in). The ssh and scp programs are secure: they protect against eavesdropping by encrypting all traffic to and fro, and they protect against machine "spoofing" by checking the identity of the remote host. If you've never previously connected to that remote host from this local host, scp can't check the identity, so it asks whether you are sure. On campus, you can comfortably say "yes," since LTS has good control of the hostnames. Across the wilds of the internet, say in a random airport wifi network, spoofing can arise, so you have to be more thoughtful. We don't have time to get into that here, though, so let's continue with our example.

 
sci-h304-01:Desktop sanderso$ ssh wwellesl@tempest 
wwellesl@tempest's password:  
Last login: Wed Feb 20 15:14:31 2023 from 149.130.206.217 
[wwellesl@tempest ~] cd public_html/ 
[wwellesl@tempest public_html] ls -l mypage.html  
-rw-r--r--. 1 wwellesl wwellesl 0 Jan 26 12:33 mypage.html 
[wwellesl@tempest public_html] logout 
Connection to tempest closed. 

Did you notice that we didn't get the scary message from ssh the second time? Did you also notice the different prompt, so that we know where we are? This is important; it's easy to get confused when you have different shells, all on the same screen, but logged into different machines. (It's not uncommon for me to be logged into 3 or 4 machines from my laptop.)

Oh, and there's the logout command. I didn't teach you that; it's pretty easy to guess what it does. You should always log out of a machine when you're done, because connections do use up resources and a host can't support an infinite number of them.

drop

Now that you know about permission bits and such, you understand a bit more deeply what prevents you from copying one of your files into a directory that I own: the permission bits on that directory don't allow you ("others") to write to that directory.

But what if I wanted to allow you to write to one of my directories in a controlled way, say as a way of submitting an assignment? An analogy would be sliding your printout under my door: you can put something of yours into something I own, but it then becomes mine and you can't pull it back out again, though you might be able to look at it.

There is no standard, built-in, Unix command to do what I've described, but I have written one for us at Wellesley.

The drop command only works on Tempest, so you need to make sure the file is there, first.

drop account file
Copy the given file to the "drop" subdirectory of the given account. Actually, copy it to a special sub-directory for all your submissions, named for your account.

Here's the "drop" command in action, dropping to the cs204 course. Make the obvious substitution if you are dropping to cs304flask or cs304node:

[wwellesl@tempest public_html] ls -l wendy.html 
-rw-rw----. 1 wwellesl wwellesl 152 Jan 15  2020 wendy.html 
[wwellesl@tempest public_html] drop cs204 wendy.html 
Copying wendy.html (from wwellesl) to /home/cs204/drop/ (uid 7003) 
/home/cs204/drop/wwellesl doesn't exist, making it. 
Successful drop. 
[wwellesl@tempest public_html] ls -l /home/cs204/drop/wwellesl/ 
total 4 
-r--r-----. 1 cs204 wwellesl 152 Jan 25 18:50 wendy.html 

Notice that the drop created the wwellesl subfolder of the /home/cs204/drop folder, just for us, since this is our first drop.

By the way, if we needed to drop a whole bunch of files, we could zip them up and drop the zip file.

Command Summary

There are, of course, many other useful commands, but these should get you started. Here they all are, with links to man pages, thanks to tutorialspoint.com:

  1. man
  2. ls
  3. cd
  4. pwd
  5. cp
  6. mv
  7. rm
  8. mkdir
  9. rmdir
  10. zip
  11. chmod
  12. ssh
  13. scp