Building Web Pages and using HTML Forms

Many of you, but not all of you, said you knew HTML and CSS, at least a little. Some of these sections are intended to get those who don't know HTML and CSS started. They are at the bottom, since most of you won't need them. If you do, follow these links to Learning HTML and Learning CSS.

The outline for this reading is

  • HTML basics: if you know all this, you'll be fine
  • CSS basics: a few useful parts of CSS
  • Validity: checking that you've followed the "rules" as laid down by the World Wide Web Consortium (W3C)
  • Accessibility: an important aspect of web pages, so that they are accessible to all

Basic HTML Page

Here is a basic page. Here's another page with some hyperlinks to other pages, including the first page and itself. You can see the HTML code by doing a "view source".

Here is the source code for the first example:

<!doctype html>
<html lang='en'>
<head>
    <meta charset='utf-8'>
    <link rel='stylesheet' href='sda-style.css'>
    <meta name=author content="Scott D. Anderson">
    <title>Sample Home Page</title>
</head>
<body>

<h1>Sample Home Page</h1>

<p>Here is a list of some fruits and vegetables:</p>

<ul>
  <li class="fruit">apple</li>
  <li class="fruit">banana</li>
  <li class="veggie">broccoli</li>
  <li class="veggie">carrot</li>
  <li class="fruit">date</li>
  <li class="veggie">peas</li>
</ul>

<script src="//ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<h2>Accessibility</h2>

<p>You can <a href="https://wave.webaim.org/">check this page with the WAVE tool</a></p>

<p>&copy; Scott D. Anderson<BR>
This work is licensed under a <a rel="license"
href="https://creativecommons.org/licenses/by-nc-sa/1.0/">Creative Commons License</a> </p>

<!-- Creative Commons License -->
<ul id="iconlist">
<li>
<a rel="license"
href="https://creativecommons.org/licenses/by-nc-sa/1.0/"><img
alt="Creative Commons License" style="border: 0"
src="somerights.gif"></a> 
<!-- /Creative Commons License -->

<li><a href="https://www.anybrowser.org/campaign/"><img
style="border:0"
src="enhanced.gif" width="96" height="32" alt="Viewable With Any
Browser"></a> 

<li><a href="https://validator.w3.org/"><img
style="border:0;width:88px;height:31px"
src="valid-html5v1.png"
alt="Valid HTML 5" height="31" width="88"></a> 

<li><a href="https://jigsaw.w3.org/css-validator/"><img
style="border:0;width:88px;height:31px"
src="vcss.gif" alt="Valid CSS!"
height="31" width="88"></a>
</ul>

</body>
</html>

Your web applications will comprise lots of such pages and links, so everyone should feel comfortable with doing these. If there's anything you don't understand, brush up or ask. Some observations:

  • many basic structural tags: HTML, HEAD, BODY, META, LINK
  • tags like TITLE for meta information
  • page tags like H1, H2, OL, UL, LI, P
  • images via IMG
  • hyperlinks using the A tag
  • adding JavaScript with the SCRIPT tag

If you're comfortable with all of those, you're in good shape. That's not a comprehensive list (table tags and generic containers like SPAN and DIV would also be helpful), but it's a good start.

If not, you should read some of the material in Learning HTML. However, I'll give very brief introductions to a few popular elements.

Images

Images can be put on the page using the img tag, like this:

<img src="url-of-the-picture" alt="textual alternative">

The URL is almost always a relative URL to a file in the same folder as the HTML file (or maybe a subfolder, but nearby). For example, with this set of files and folders:

top/
   page.html
   connie.jpg
   sub/
      jamie.jpg

We can put the following elements in page.html to display those two images:

<img src="connie.jpg" alt="my sister">
<img src="sub/jamie.jpg" alt="my nephew">

All the things you learned about Unix pathnames can be brought to bear here, even .. for a parent folder.

Note that the alt attribute should be considered required, for accessibility.
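
As a small illustration of using .. in a relative URL, here is a sketch of a page stored in the sub folder that displays both images. The file name gallery.html is hypothetical; it is not part of the folder listing above.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Relative URLs</title>
</head>
<body>
    <!-- Assumes this file is saved as top/sub/gallery.html (a made-up name). -->

    <!-- jamie.jpg is in the same folder as this page. -->
    <img src="jamie.jpg" alt="my nephew">

    <!-- connie.jpg is one level up, so .. climbs from sub/ to top/. -->
    <img src="../connie.jpg" alt="my sister">
</body>
</html>
```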

Hyperlinks

A hyperlink is a "container" in the sense that it turns a span of text into a link. Thus, it has a beginning and an end. The destination is given in the opening tag, like this:

Here is a link to <a href="https://www.nytimes.com/">The New York Times</a>

which looks like:

Here is a link to The New York Times

Basic CSS

The page above doesn't look great, but any cosmetics it has are due to CSS. View the CSS file to take a look. Here's a brief list of the basic CSS skills that you should know.

  • box properties such as margin, border, padding and width
  • fonts: family, size, weight, style
  • colors and background colors
  • display:block and such
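
To make that list concrete, here is a minimal sketch that uses each kind of property once. The class name note and all of the specific values are made up for illustration; they are not taken from the course style sheet.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Basic CSS Properties</title>
    <style>
        /* Box properties: space outside, the edge, space inside, and width. */
        .note {
            margin: 1em;
            border: 2px solid navy;
            padding: 0.5em;
            width: 20em;
        }
        /* Fonts: family, size, weight, and style. */
        .note {
            font-family: Verdana, sans-serif;
            font-size: 110%;
            font-weight: bold;
            font-style: italic;
        }
        /* Colors: text color and background color. */
        .note {
            color: navy;
            background-color: lightyellow;
        }
        /* display: a SPAN is normally inline; this makes it behave like a block. */
        span.note {
            display: block;
        }
    </style>
</head>
<body>
    <p>This paragraph is followed by a styled span.</p>
    <span class="note">I am a SPAN displayed as a block.</span>
</body>
</html>
```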

There are also techniques for applying CSS, so you should know about these kinds of selectors:

  • TAG, such as BODY or H1
  • ID, such as #iconlist
  • CLASS, such as .fruit and .veggie
  • descendant selectors, such as #iconlist LI
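
Here is a small sketch showing one rule for each kind of selector, reusing the fruit, veggie and iconlist names from the sample page. The property values are arbitrary choices for illustration.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>CSS Selectors</title>
    <style>
        /* TAG selector: applies to every H1 element. */
        h1 { font-family: sans-serif; }

        /* ID selector: applies to the one element with id="iconlist". */
        #iconlist { padding: 0; }

        /* CLASS selectors: apply to every element with that class. */
        .fruit  { color: maroon; }
        .veggie { color: darkgreen; }

        /* Descendant selector: LI elements anywhere inside #iconlist. */
        #iconlist li { display: inline; }
    </style>
</head>
<body>
    <h1>Selectors</h1>
    <ul>
        <li class="fruit">apple</li>
        <li class="veggie">broccoli</li>
    </ul>
    <ul id="iconlist">
        <li>first icon</li>
        <li>second icon</li>
    </ul>
</body>
</html>
```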

If you understand all those, you're in excellent shape.

If not, you should read some of the material in Learning CSS

URL Anatomy

Let's take a moment to talk a bit about URLs, starting with a longish example:

https://cs.wellesley.edu:1942/journals/mining/search?text=gold+silver

breaks down into:

protocol://host.domain:port/path/to/endpoint?querystring

Here's some more detail:

  • protocol: Almost always https, but you occasionally see others such as git or sftp or ssh. We'll use http: because encryption will get in our way. If your browser enforces https: like mine does, I'll show you a work-around.
  • host: the name of the server. Ours is cs.
  • domain: you buy these from a domain registrar. The college owns wellesley.edu
  • port: the particular "door" on the server that is listening for web requests.
  • path: the sequence of folders to locate the file or endpoint
  • endpoint: the application or program that receives your web request
  • query string: the data that a form using GET sends to the endpoint
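
To connect the query string to a form, here is a minimal sketch of a form that uses GET. The action URL reuses the example above; the label text and the id search-text are made up. If the user types gold silver and submits, the browser requests a URL ending in search?text=gold+silver, because each named input becomes a name=value pair and spaces are encoded as +.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>GET Form</title>
</head>
<body>
    <!-- method="get" puts the form data in the URL's query string. -->
    <form method="get" action="https://cs.wellesley.edu:1942/journals/mining/search">
        <p>
            <label for="search-text">Search for:</label>
            <!-- The name attribute (text) becomes the key in the query string. -->
            <input type="text" id="search-text" name="text">
        </p>
        <p>
            <!-- This button has no name, so it adds nothing to the query string. -->
            <input type="submit" value="Search">
        </p>
    </form>
</body>
</html>
```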

Here's another, similar but slightly different at the end:

https://cs.wellesley.edu:1942/journals/mining/gold.html#fools

breaks down into:

protocol://host.domain:port/path/to/endpoint#fragment

The gold.html is a file of HTML code. The #fragment is the id of some element inside the HTML. It might be something like this:

<h2 id="fools">All about Fool's Gold</h2>

The id is an attribute that is part of the HTML language that the browser recognizes. Any HTML element can have an id attribute added to it. The example above happened to put the id on an H2, but that was just the example.

All the H2 headers on this page have an ID, which is how the table of contents (TOC) is able to link to them. Go to the top of this page and click on the "table of contents" button to see a list of links to sections.

Note that the id must be unique. If you had two different elements with the same ID, which one would you want the browser to take you to?
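
As a sketch of how a fragment and an id work together, here is a small page with a link that jumps to one of its own sections. The id real and the placeholder paragraphs are made up; only the fools heading comes from the example above.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Gold</title>
</head>
<body>
    <h1>Gold</h1>

    <!-- A link whose href is only a fragment jumps within the current page. -->
    <p>Skip ahead to <a href="#fools">the section on fool's gold</a>.</p>

    <!-- Each id is used exactly once on the page. -->
    <h2 id="real">All about Real Gold</h2>
    <p>...</p>

    <!-- This id matches the fragment, so the browser scrolls to this element. -->
    <h2 id="fools">All about Fool's Gold</h2>
    <p>...</p>
</body>
</html>
```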

We'll learn a lot more later.

Testing Accessibility

Accessibility is a big complex subject, and professional websites have trained developers, automated tools, and human testers to ensure accessibility. All that is outside the scope of this course, but I will introduce three important tools and require you to use them.

All HTML must be valid, which means it satisfies the structural rules set out by the World Wide Web Consortium (W3C). Valid HTML is important because screen readers and similar tools can do a better job understanding the structure of the page if it follows the rules. Most browsers are much more forgiving, so don't assume that a page is valid just because it looks good in a browser.

You can validate your HTML using this website from the W3C: https://validator.nu/. That site works in three modes. In the first, you give it a URL, and it retrieves the page at that URL and validates it. That should work for pages on the CS server, but it will not work for pages on your personal laptop, since the validator can't reach them.

The other modes are file upload and direct input. The last mode allows you to just copy/paste your HTML from your browser to the validator, which only takes a minute and is very easy.
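
As an illustration of the kind of rule the validator checks, here is a sketch of a small page that should validate, with comments describing mistakes that a browser would typically still display but that the validator reports. The page content is made up.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <!-- Leaving out the TITLE element is an error: every page needs one. -->
    <title>A Valid Page</title>
</head>
<body>
    <h1>A Valid Page</h1>

    <!-- List items go inside LI elements. Text or a P placed directly
         inside the UL, outside any LI, is an error. -->
    <ul>
        <li>apple</li>
        <li>banana</li>
    </ul>

    <!-- An IMG without an alt attribute is normally reported as an error. -->
    <img src="connie.jpg" alt="my sister">
</body>
</html>
```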

All CSS (see below) must be valid, for similar reasons as the HTML.

The W3C also provides a CSS validator that works the same way as the HTML validator: https://jigsaw.w3.org/css-validator

Check the page for common accessibility issues with WAVE, the Web Accessibility Evaluation tool, https://wave.webaim.org/. It's important to note that getting no errors from the WAVE tool doesn't mean your site is accessible (only a person can decide that), but it's a useful tool nevertheless. Not passing the WAVE test is certainly undesirable, so I will require you to pass the WAVE test, meaning no errors (but warnings are okay).

Like the earlier validators, WAVE can retrieve a publicly hosted page given its URL and evaluate it. It doesn't have a mode where you can copy/paste your code, but there are browser plug-ins that will evaluate the page in your browser. In less than 1 minute, I installed the browser plug-in, viewed my page, and evaluated it. There are plug-ins for at least Chrome and Firefox.
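
Two of the problems that tools like WAVE commonly report as errors are images with no alternative text and form inputs with no label. Here is a minimal sketch of markup that avoids both. The id, the field name and the action are made up for illustration.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Accessible Basics</title>
</head>
<body>
    <h1>Accessible Basics</h1>

    <!-- The alt text is what a screen reader reads in place of the image. -->
    <img src="connie.jpg" alt="my sister">

    <form method="get" action="search">
        <p>
            <!-- for="fav-fruit" ties this label to the input with that id. -->
            <label for="fav-fruit">Favorite fruit:</label>
            <input type="text" id="fav-fruit" name="fruit">
        </p>
        <p><input type="submit" value="Send"></p>
    </form>
</body>
</html>
```

Passing a check like this is a starting point, not proof that the page is accessible.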

Caches

Caches are an important part of how web browsers work, and therefore they matter to web developers. In the examples above, we often made use of external CSS files and external JavaScript files. That allows two pages to share those common files. But one of the great advantages of external style sheets and external JavaScript files is also, for a web developer, a slight bother. To understand that, you have to understand caches.

The word “cache” (pronounced “cash”) is an ordinary, but uncommon, English word (more of an SAT word for most people). However, it's used all the time by computer scientists because caches are used all the time by computers, in all kinds of ways, because caching is a general technique for speeding things up. In particular, a cache speeds things up by storing to avoid re-doing.

Specifically, your web browser will cache (keep a copy of) files you visit or reference (images, external CSS, JS libraries, etc) in a folder on your local machine. (That folder is called, of course, the cache.) If the web browser needs that file again, say on another page of your site that uses the same external style sheet, the browser doesn't have to re-download the file; it just grabs a copy from the cache. This makes the web browser faster.
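
To picture which files end up in the cache, here is a sketch of a page that references an external style sheet and an external script. The style sheet name comes from the sample page earlier; site.js and page-two.html are hypothetical names.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <!-- External style sheet: downloaded once, then kept in the cache.
         Another page with this same LINK can reuse the cached copy. -->
    <link rel="stylesheet" href="sda-style.css">
    <title>Page One</title>
</head>
<body>
    <h1>Page One</h1>
    <p>Go to <a href="page-two.html">page two</a>, which shares the same files.</p>

    <!-- External script (hypothetical file name), cached the same way. -->
    <script src="site.js"></script>
</body>
</html>
```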

So, why is the browser cache a problem for a web designer? Because if you make a change to the external style sheet, the web browser may continue to use the old cached copy, instead of getting the new improved copy from the server. This means that when you view your page, you won't see your changes — very frustrating.

The solution is to tell the web browser to ignore the cache when you re-load the page. In most web browsers, this is done by holding down the shift key when you click on the reload icon.

So, just remember:

When in doubt, use shift+reload

Conclusion and Summary

This reading has had many facets because HTML, CSS, and Forms are so important to this course, while also being topics that many (but not all) of you know something about, so they don't deserve a lot of our time. Nevertheless, you should know:

  • the basics of using HTML to create the structure of a web page
  • the basics of CSS for styling a page
  • how to use validators to check that you've used HTML and CSS correctly
  • how to make your page, particularly your forms, reasonably accessible
  • how caches affect our debugging of web apps

If you're feeling good about all of those, you can stop here. If you need more introduction to HTML or CSS, keep reading.

Learning HTML

HTML is a relatively simple language, compared to programming languages like Python or JavaScript. It is a markup language, which means it's just about structure: this is part of that and so forth.

The people in this class are, I sincerely believe, capable of teaching themselves HTML in short order. There are many online tutorials, of course, which you can find with a quick web search. Our own CS 110 web site has a lot of information on writing web pages in HTML. If you're starting from scratch, I recommend reading the following pages:

  • CS 204 HTML is what I teach in CS 204
  • HTML. 21 pages. This talks generally about syntax of tags and URLs.
  • Here is the MDN (Mozilla Developer Network) HTML Tutorial, with links to beginning HTML and also to forms
  • Here is the W3Schools HTML tutorial
  • Tables. This teaches you about tables in HTML, which is quite useful for formatting, um, tables. Since many of our query results are tabular, this can sometimes be useful for us.
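
Since tables come up often with query results, here is a minimal sketch of an HTML table. The caption, column names and rows are invented sample data, not results from any real query.

```html
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Sample Table</title>
</head>
<body>
    <table>
        <caption>Produce by kind</caption>
        <!-- TR is a table row; TH is a header cell; TD is a data cell. -->
        <tr>
            <th scope="col">Name</th>
            <th scope="col">Kind</th>
        </tr>
        <tr>
            <td>apple</td>
            <td>fruit</td>
        </tr>
        <tr>
            <td>carrot</td>
            <td>veggie</td>
        </tr>
    </table>
</body>
</html>
```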

Learning CSS

Making your website pretty is useful as well as nice, because something that looks good and is well laid out is often easier to use. Still, the amount of CSS you could learn is certainly greater than the amount you need to learn. Here are links to the first few CSS readings in CS 110:

Some reading for later, if ever.

Apache and public_html

So far in this course, we've been working in your ~/cs304 folder, which works fine for Node and will continue to work well. However, if we want our web pages delivered by Apache rather than Flask, we need to put them in our ~/public_html folder. That folder, and only that folder, is read by Apache and consists of our globally accessible web pages.

You can download all these examples to the public_html of your Tempest account like this:

cd ~/public_html/ 
cp -r ~cs304/pub/downloads/web-pages web-pages 

You are welcome to learn from my example, but don't use it lock, stock and barrel for your assignment. Prepare yourself for the project by creating an interesting and relevant form.