Web Hosting

At the end of CS 204, students often ask how they can create their own personal website (perhaps a portfolio of accomplishments to show to potential employers). In short, you need a place for the website to live, which is called a web host.

You have many options:

  • Host on cs.wellesley.edu. That account lasts until you graduate, so at some point you'll have to find your own web host.
  • Create a GitHub account and host on github.com. Many employers expect something like this, but other possibilities exist, such as bitbucket.org.
  • A private web hosting company. I personally use dreamhost.com but that company is just one of many.

Getting your Own Website

This guide to Best Web Hosting Services in 2020 seems useful.

Hosting on cs.wellesley.edu

Your account on cs.wellesley.edu (AKA tempest) is a general-purpose Unix account.

In it is a directory called public_html. Any file in that directory is on the world-wide web. That is, Apache is willing to respond to a request for that file by delivering it to the requesting browser.

How do you put something on the web? Copy it to that directory.

How do you remove it from the web? Move it out of that directory (or change the permissions so that Apache can't read it).
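Here is a minimal sketch of those two operations. It runs in a scratch directory that stands in for public_html so you can try it anywhere; the file name resume.html is hypothetical, and the chmod line assumes Apache reads your files through the group or "other" permission bits.

```shell
# Scratch area so the sketch can be tried anywhere; on tempest you would
# use your home directory and the real ~/public_html instead.
work=$(mktemp -d)
site="$work/public_html"
mkdir "$site"

# A hypothetical page to publish.
printf '<h1>Resume</h1>\n' > "$work/resume.html"

# Put it on the web: copy it into public_html.
cp "$work/resume.html" "$site/"

# Take it off the web without moving it: drop read permission for
# everyone but the owner (assumes Apache is not running as you).
chmod 600 "$site/resume.html"

# Or take it off the web by moving it out of public_html.
mv "$site/resume.html" "$work/resume-offline.html"

ls -A "$site"   # prints nothing: the directory is empty again
```

Either the chmod or the mv is enough on its own; the sketch shows both only to illustrate the two options.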

Privacy Barrier

The previous section wasn't exactly right. Because we know that students are still learning about web pages and often have pages that are "under construction," we have implemented a default "privacy barrier."

That privacy barrier is simple: access to a file is granted if either:

  1. the requesting browser is on campus (defined as having a wellesley.edu IP address, so "on campus" also includes anyone using the VPN), or
  2. the requesting browser supplies the guest credentials (a username and password).

I gave you these credentials in class. You can email me if you forget.

You can give out the guest credentials to friends and family. They don't grant any power to modify anything on the server; they only allow reading pages. Still, you need to be careful with them, since anyone who has them can read every student's web pages, not just yours. So don't post them for the world to read.

.htaccess

The privacy barrier is implemented by a magic file in your public_html folder called .htaccess. Yes, the filename starts with a dot. That makes it invisible in most directory listings. However, you can see "dot files" by giving the -A option (for "almost all" files: everything except the . and .. entries) to the ls command:

[wendy@tempest public_html]$ ls
cs204 cs204-assignments
[wendy@tempest public_html]$ ls -A
.htaccess cs204 cs204-assignments
[wendy@tempest public_html]$ ls -lA
total 4
-r--r-----. 1 wendy apache 515 Feb  4  2016 .htaccess
-r--r-----. 1 wendy apache 1024 Feb  21  2016 cs204
-r--r-----. 1 wendy apache 1024 Feb  21  2016 cs204-assignments 

Apache looks for a file of that name in your public_html for additional configuration information, and the contents of that file specify the privacy barrier.

Because the name of that file is special, all you need to do to remove the privacy barrier is to rename that file. You can rename it back if you want to restore the barrier:

[wendy@tempest public_html]$ ls -A
.htaccess cs204 cs204-assignments
[wendy@tempest public_html]$ mv .htaccess dot.htaccess
[wendy@tempest public_html]$ ls 
cs204 cs204-assignments dot.htaccess
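To restore the barrier, reverse the rename. The sketch below shows the round trip in a scratch directory standing in for public_html; the empty .htaccess it creates is only a placeholder, since the real file holds the Apache directives.

```shell
# Scratch stand-in for ~/public_html (assumption, so this is safe to try).
site=$(mktemp -d)
touch "$site/.htaccess"    # empty placeholder for the real .htaccess

# Remove the barrier: Apache no longer finds a file with the special name.
mv "$site/.htaccess" "$site/dot.htaccess"

# Restore the barrier: give the file its special name back.
mv "$site/dot.htaccess" "$site/.htaccess"

ls -A "$site"   # lists .htaccess
```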

index.html

Another special filename is index.html. If the requested URL ends with a directory, omitting the filename, Apache looks for a file named index.html in that directory. If it exists, that's the file that is delivered. Thus, it's the default page for that directory.

Consequently: https://cs.wellesley.edu/~wendy/ gets the index.html file from Wendy's public_html.

Note that public_html is not in the URL. That's because it's implicit.
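The mapping from URL to file can be sketched as a small shell function. This is a simplification, not how Apache is actually implemented: the function name url_to_file is made up, the /home/USER location of home directories is an assumption, and it treats any path ending in a slash as a directory rather than checking the filesystem.

```shell
# url_to_file USER PATH
# Print the file that would be served for https://cs.wellesley.edu/~USER/PATH,
# assuming home directories live under /home (an assumption).
url_to_file() {
  user=$1
  path=$2
  # public_html is implicit in the URL, so add it here.
  file="/home/$user/public_html/$path"
  # A request that ends in a directory gets that directory's index.html.
  case "$file" in
    */) file="${file}index.html" ;;
  esac
  printf '%s\n' "$file"
}

url_to_file wendy ""                 # /home/wendy/public_html/index.html
url_to_file wendy "cs204/"           # /home/wendy/public_html/cs204/index.html
url_to_file wendy "cs204/hw1.html"   # /home/wendy/public_html/cs204/hw1.html
```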