
Building Web Pages and using HTML Forms

In Fall 2020, many of you, but not all of you, said you knew HTML and CSS, at least a little. Some of these sections are intended to get those who don't know HTML and CSS started. They are at the bottom, since most of you won't need them. If you do, follow these links to Learning HTML and Learning CSS.

The outline for this reading is

  • HTML basics: if you know all this, you'll be fine
  • CSS basics: a few useful parts of CSS
  • FORM basics: you should know all this
  • Validity: checking that you've followed the "rules" as laid down by the World Wide Web Consortium (W3C)
  • Accessibility: an important aspect of web pages, so that they are accessible to all

Basic HTML Page

Here is a basic page with two hyperlinks to other pages. Your web applications will comprise lots of such pages and links, so everyone should feel comfortable with doing these. Do a "view source" to see the actual HTML. If there's anything you don't understand, brush up or ask. Some observations:

  • many basic structural tags: HTML, HEAD, BODY, META, LINK
  • tags like TITLE for meta information
  • page tags like H1, H2, OL, UL, LI, P
  • images via IMG
  • hyperlinks using the A tag
  • adding JavaScript with the SCRIPT tag

If you're comfortable with all of those, you're in good shape. That's not a comprehensive list (table tags and generic containers like SPAN and DIV would also be helpful), but it's a good start.

If not, you should read some of the material in Learning HTML.

Basic CSS

The page above doesn't look great, but any cosmetics it has are due to CSS. View the CSS file to take a look. Here's a brief list of the basic CSS skills that you should know.

  • box properties such as margin, border, padding and width
  • fonts: family, size, weight, style
  • colors and background colors
  • display:block and such

There are also techniques for applying CSS, so you should know about these kinds of selectors:

  • TAG, such as BODY or H1
  • ID, such as #iconlist
  • CLASS, such as .fruit and .veggie
  • descendant selectors, such as #iconlist LI

If you understand all those, you're in excellent shape.

If not, you should read some of the material in Learning CSS.

Basic Forms

Web applications don't just deliver information to the user. They also get information from the user, even if it's just guidelines for a search. Of course, your projects will do more than that, including accepting information from the user that will be stored in a database. The fundamental way web applications get information from the user is through forms, so it's important to be comfortable with those, too.

Here is a sample form. Again, you can do view source to see the underlying HTML. Feel free to test it out! This form just reflects your input back to you. A real web application might squirrel it away in a database.

Here are some of the things you should know:

  • the FORM tag and its attributes, like method and action
  • the input tag and its attributes, like type and name
  • the select tag and its child tag, option
  • the label tag

If you know those, you've got a good start on knowing forms.

If not, you should read some of the material in Learning Forms.

GET vs POST

The form above used the POST method. What does that mean, and what other choices are there?

When a web browser sends a request to a web server, it does the request via one of two main ways (methods): GET and POST. (There are other methods like HEAD, DELETE, and PUT, but they are relatively rare.)

You may find it helpful to think of GET versus POST as the difference between sending information by either a postcard or a letter in an envelope.

  • Both are ways to send information. GET and POST both send information from the browser to the server.
  • The general term is a request. A request can use either GET or POST.
  • They require slightly different handling by the sender and the receiver.

Ordinary web requests, like when you click on a hyperlink or load an image using the IMG tag, are all GET requests. (Not surprisingly, since the web browser is trying to GET some data.)

Either GET or POST can be used when submitting form data (you choose which using the METHOD attribute), but there are some important differences.

  • GET: all the form data is in the URL
  • POST: all the form data is in the body of the request
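As a rough illustration of where the form data travels, here is a minimal Python sketch using only the standard library. The host example.com, the path /search, and the field names q and lang are made up for this example, and the requests are only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form data; the field names are illustrative only.
form_data = {"q": "web forms", "lang": "en"}
encoded = urlencode(form_data)  # 'q=web+forms&lang=en'

# GET: the encoded form data is appended to the URL after a question mark.
get_request = Request("http://example.com/search?" + encoded)

# POST: the URL stays bare and the encoded form data becomes the request body.
post_request = Request("http://example.com/search", data=encoded.encode("ascii"))


def describe(request):
    """Summarize where a request carries its data (nothing is sent)."""
    return {
        "method": request.get_method(),
        "url": request.full_url,
        "body": request.data,
    }


print(describe(get_request))
print(describe(post_request))
```

The GET request has no body at all, while the POST request has a short URL and carries the same encoded data as bytes in its body.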

Let's explore the many differences between GET and POST.

Usage

  • GET is supposed to be used when reading information from the web server
  • POST is supposed to be used when updating information on the web server

(If the distinction between reading data versus updating data reminds you of the difference between SQL queries using the SELECT statement and modifying the state of the database using INSERT/UPDATE/DELETE, that's exactly right. We'll talk about that as part of this course.)

These previous facts have some consequences:

With GET, since all the data is in the URL, and URLs aren't infinite in length, GET is unsuitable for long things, like submitting a blog post or uploading a file.

Because GET is for reading data, browsers and other computers can cache the results, rather than bothering the web server again. See below for more on caches.

Since all the data is in the URL, a GET request that retrieves something useful (helpful search results) can be bookmarked, saved in a browser history, emailed to a friend, etc. Google maps used to do this, and it was useful for sending someone a particular map or set of directions.

If sensitive information is being submitted (e.g. SSN or credit card number), having it end up in the URL by using the GET method should probably be avoided.

Because POST is for updating information, the browser will typically prevent you (with a warning) from re-submitting a form. This is a good thing. For example, you wouldn't want to accidentally resubmit an order to Amazon.com. But you should think about this. If re-submitting is harmless, don't use POST.

Here is a screenshot of an unnecessary warning I get when retrieving a classlist, because they used POST instead of GET:

confirm reposting of form

All I did was try to "refresh" the page, to see if there had been any change in registrations. Had they used GET, that would have been fine, but because they used POST, I have to confirm this. This confirmation would be good if I was doing something like ordering a book; I don't want to pay for two copies just because I refreshed a page.
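To see why repeating a POST can matter while repeating a GET does not, consider this toy sketch in Python. The handle_request function and the list standing in for a database are hypothetical, invented only for this illustration; a real web application would look quite different.

```python
def handle_request(orders, method, params):
    """Toy request handler: GET only reads `orders`, POST appends to it."""
    if method == "GET":
        # Reading: return matching orders without changing anything.
        return [o for o in orders if o["customer"] == params["customer"]]
    if method == "POST":
        # Updating: every submission adds a new order.
        order = {"customer": params["customer"], "item": params["item"]}
        orders.append(order)
        return order
    raise ValueError("unsupported method: " + method)


def demo():
    orders = []  # stands in for a database table
    book = {"customer": "ada", "item": "The Hobbit"}
    handle_request(orders, "POST", book)
    handle_request(orders, "POST", book)                 # accidental re-submit
    handle_request(orders, "GET", {"customer": "ada"})
    handle_request(orders, "GET", {"customer": "ada"})   # harmless refresh
    return len(orders)


print(demo())  # 2: the book was ordered twice, but the two GETs changed nothing
```

Refreshing the GET is free of consequences, which is why the browser doesn't ask for confirmation; refreshing the POST buys a second book.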

W3Schools has a very nice, concise summary of GET vs POST

More on GET versus POST

Let's go back to the metaphor of a request as a piece of snail mail.

  • Postcards and Envelopes both have the address on the outside, where it's visible. In GET and POST, the address is the URL.
  • Postcards have all the other information on the outside as well. There is no "inside". Consequently, a postcard is limited in length. You can send "having a great time, wish you were here" using a postcard, but you can't send the long story about bumping into an old friend while touring the Uffizi Gallery in Florence, Italy.
  • Envelopes have the address on the outside and as much content as you want (and can pay for) inside the envelope. That's where you can put your long story, along with the pictures you took.
  • Similarly, we can use GET for short requests, and POST for long requests, especially those with payloads like pictures.
  • Furthermore, if you're sending personal or private information (your SSN, your credit card number), you'd use an envelope (POST) not a postcard (GET). This isn't perfect security, but it's better than a postcard!

(For best security, we'd use HTTPS; we'll talk about that later in the course.)

One last bit of information that may help. The metaphor breaks down a bit when we talk about how the URL contains the information. In a postcard, there's a vertical line between the "address" half of the card, and the "message" part of the card. But in a GET request, all the info is in the URL.

Suppose our web app allows us to request info about a book given its title and author. The URL for this request is getbook. The form the user fills out has, say, two fields, title and author. A request might be:


| name | value |
| title | The Hobbit |
| author | Tolkien |


If the web app uses a GET request, all this info has to end up in the URL. This requires an encoding step that your browser knows how to do, and your app knows how to reverse. The encoded url might be:

/getbook?title=The+Hobbit&author=Tolkien

We do not have to learn these encoding/decoding rules, just know that they exist so that the browser can send a modest amount of not-too-complex information in a single URL.
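If you are curious what that encoding step looks like, Python's standard urllib.parse module performs the same kind of encoding and decoding. This is only a sketch of the idea, not the code your browser or web framework actually runs.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Encoding: what the browser does with the form's name/value pairs.
form_data = {"title": "The Hobbit", "author": "Tolkien"}
url = "/getbook?" + urlencode(form_data)
print(url)  # /getbook?title=The+Hobbit&author=Tolkien

# Decoding: what the web app does to recover the original values.
query_string = urlsplit(url).query
decoded = parse_qs(query_string)
print(decoded)  # {'title': ['The Hobbit'], 'author': ['Tolkien']}
```

Notice that parse_qs gives each value as a list, because a form is allowed to send the same name more than once.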

Accessibility

The sample form above seems good and works fine, but it has a serious flaw: it is not accessible, which means that some users (potential customers!) will have difficulty understanding it, possibly because they are visually impaired and using a screen reader or other assistive technology. These people matter, and it's our moral/ethical responsibility (and in some cases legal responsibility) to ensure that they have equal access.

What is missing is labels, which associate an input (technically called a control) such as the name field or the SELECT menu, with a bit of text that explains it. This is done with the label tag. Here's more on label.

This is important because when the user is filling out the form and wants to know what an input is asking, the screen reader can read the associated label text.

There are two ways to use label, the simple, structural way, and the flexible, id-based way. Let's look at both:

LABEL using Structure

The simple structural way is to put the label text next to the input and wrap both with the label:

<label>zip code: <input name="zip"></label>

LABEL using ID

If we use ID instead of structure, we can put the label and the control in different places in the page, connecting them using an ID. This is more flexible, though a bit more complex. It's done by giving the control an ID and using the for attribute of the LABEL to specify the ID of the element it labels. Here's an example where we wanted to put the form inputs in a table column, with the text labels in another column:

<table>
<tr><td><label for="state-elt">state</label></td>
        <td><input id="state-elt" name="state"/></td></tr>
<tr><td><label for="zip-elt">zip</label></td>
        <td><input id="zip-elt" name="zip"/></td></tr>
</table>

Notice that because the inputs are in different cells of a table, the label can't "wrap" (surround) the input as in the structural technique.
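The two labeling techniques can also be checked mechanically. Below is a rough Python sketch, using the standard html.parser module, that reports controls that are neither wrapped in a label nor referenced by a label's for attribute. The LabelChecker class and unlabeled_controls function are my own illustrative names, and the check is far cruder than a real accessibility tool.

```python
from html.parser import HTMLParser


class LabelChecker(HTMLParser):
    """Record each form control and whether a LABEL wraps it or points at it."""

    CONTROLS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.label_depth = 0        # > 0 while we are inside a <label>
        self.label_targets = set()  # ids named by <label for="...">
        self.controls = []          # (name, id, wrapped_in_label)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self.label_depth += 1
            if "for" in attrs:
                self.label_targets.add(attrs["for"])
        elif tag in self.CONTROLS:
            # Simplifying assumption: submit and hidden inputs are skipped.
            if attrs.get("type") in ("submit", "hidden"):
                return
            wrapped = self.label_depth > 0
            self.controls.append((attrs.get("name"), attrs.get("id"), wrapped))

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth > 0:
            self.label_depth -= 1


def unlabeled_controls(html):
    """Return the names of controls with no structural or id-based label."""
    checker = LabelChecker()
    checker.feed(html)
    checker.close()
    return [
        name
        for name, control_id, wrapped in checker.controls
        if not wrapped and control_id not in checker.label_targets
    ]


print(unlabeled_controls('<label>zip code: <input name="zip"></label>'))  # []
print(unlabeled_controls('<input name="zip">'))  # ['zip']
```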

Adding these necessary improvements to the sample form yields: sample form improved.

Testing Accessibility

Accessibility is a big complex subject, and professional websites have trained developers, automated tools, and human testers to ensure accessibility. All that is outside the scope of this course, but I will introduce three important tools and require you to use them.

All HTML must be valid, which means it satisfies the structural rules set out by the WWW consortium (W3C). Valid HTML is important because screen readers and such can do a better job understanding the structure of the page if it follows the rules. Most browsers are much more forgiving, so don't assume that a page is valid just because it looks good in a browser.

You can validate your HTML using this website from the W3C: https://validator.nu/. That site works in three modes. In the first, you give it a URL, and it retrieves the page and validates it. The .htaccess barrier that we set up (see the aside about access, above) allows access for the validator, so it will be able to validate your pages. The other modes are file upload and direct input. The last mode allows you to just copy/paste your HTML from your browser to the validator, which only takes a minute and is very easy. That will be necessary when you are developing with Flask, since the ports that we are using aren't accessible outside the campus firewall.

An additional way to test the accessibility is to add a link (usually wrapped around a "badge" that the page is accessible) such that the validator checks the validity of the page that was linked from (the referrer). You can test that with the sample forms linked from this page; you'll see the badge icon at the bottom. However, we won't be able to use that technique in our Flask applications, due to the firewall issues.

All CSS (see below) must be valid, for similar reasons as the HTML.

The W3C also provides a CSS validator that works the same way as the HTML validator: https://jigsaw.w3.org/css-validator

Check the page for common accessibility issues with WAVE, the Web Accessibility Evaluation tool, https://wave.webaim.org/. It's important to note that getting no errors from the WAVE tool doesn't mean your site is accessible (only a person can decide that), but it's a useful tool nevertheless. Not passing the WAVE test is certainly undesirable.

Like the earlier validators, WAVE can retrieve a publicly hosted page given its URL and evaluate it. It doesn't have a mode where you can copy/paste your code, but there are two browser plugins that will evaluate the page in your browser. In less than 1 minute, I installed the Chrome plug-in, viewed my page, and evaluated it. See screenshot below, showing four errors, all of which are fixed in the improved form.

Running the WAVE tool on the sample form shows four errors, flagged in red

I recommend that you:

  • install the WAVE plugin to your browser (Chrome or Firefox)
  • run it on all three of the sample web pages (home.html, form.html and form-improved.html).
  • practice using the HTML and CSS validators, too.

Requirements

In this course, I require that

  • All HTML pages pass the HTML validator
  • All CSS pass the CSS validator
  • All pages have no errors in the WAVE validator

Caches

Caches are an important part of how web browsers work, and therefore matter to web developers. In the examples above, we often made use of external CSS files and external JavaScript files. That allows two pages to share those common files. But one of the great advantages of external style sheets and external JavaScript files is also, for a web developer, a slight bother. To understand that, you have to understand caches.

The word “cache” (pronounced “cash”) is an ordinary, but uncommon, English word (more of an SAT word for most people). However, it's used all the time by computer scientists because caches are used all the time by computers, in all kinds of ways, because caching is a general technique for speeding things up. In particular, a cache speeds things up by storing to avoid re-doing.

Specifically, your web browser will cache (keep a copy of) files you visit or reference (images, external CSS, JS libraries, etc) in a folder on your local machine. (That folder is called, of course, the cache.) If the web browser needs that file again, say on another page of your site that uses the same external style sheet, the browser doesn't have to re-download the file; it just grabs a copy from the cache. This makes the web browser faster.

So, why is the browser cache a problem for a web designer? Because if you make a change to the external style sheet, the web browser may continue to use the old cached copy, instead of getting the new improved copy from the server. This means that when you view your page, you won't see your changes — very frustrating.

The solution is to tell the web browser to ignore the cache when you re-load the page. In most web browsers, this is done by holding down the shift key when you click on the reload icon.

So, just remember:

When in doubt, use shift+reload
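The stale-copy problem and the shift+reload fix can be modeled in a few lines of Python. This is only a toy sketch: the ToyBrowserCache class, its ignore_cache parameter, and the dictionary standing in for the web server are all invented for illustration, and real browser caches are considerably more sophisticated.

```python
class ToyBrowserCache:
    """A toy model of a browser cache: store files to avoid re-downloading."""

    def __init__(self, server_files):
        self.server_files = server_files  # dict standing in for the web server
        self.cache = {}                   # url -> locally stored copy
        self.downloads = 0                # how often we bothered the server

    def fetch(self, url, ignore_cache=False):
        """Return the file, downloading only if needed or if told to."""
        if url in self.cache and not ignore_cache:
            return self.cache[url]        # reuse the stored copy
        self.downloads += 1
        self.cache[url] = self.server_files[url]
        return self.cache[url]


server = {"style.css": "h1 { color: red; }"}
browser = ToyBrowserCache(server)
browser.fetch("style.css")                        # first visit: downloaded
server["style.css"] = "h1 { color: blue; }"       # the developer edits the file
print(browser.fetch("style.css"))                 # still the old red rule
print(browser.fetch("style.css", ignore_cache=True))  # now the new blue rule
```

The plain fetch after the edit returns the old copy, which is exactly the frustration described above; passing ignore_cache=True plays the role of shift+reload.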

Conclusion and Summary

This reading has had many facets because HTML, CSS, and Forms are so important to this course, while also being topics that many (but not all) of you know something about, so they don't deserve a lot of our time. Nevertheless, you should know:

  • the basics of using HTML to create the structure of a web page
  • the basics of CSS for styling a page
  • the basics of forms, so that you can collect information from the user
  • how to use validators to check that you've used HTML and CSS correctly
  • how to make your page, particularly your forms, reasonably accessible
  • how caches affect our debugging of web apps
  • the distinction between GET and POST

If you're feeling good about all of those, you can stop here. If you need more introduction to HTML, CSS or Forms, keep reading.

Learning HTML

HTML is a relatively simple language, compared to programming languages like Python or JavaScript, or even SQL. It is a markup language, which means it's just about structure: this is part of that and so forth.

The people in this class are, I sincerely believe, capable of teaching themselves HTML in short order. There are many online tutorials, of course, which you can find with a quick web search. Our own CS110 web site has a lot of information on writing web pages in HTML. If you're starting from scratch, I recommend reading the following pages:

  • HTML. 21 pages. This talks generally about syntax of tags and URLs.
  • Here is the MDN (Mozilla Developer Network) HTML Tutorial, with links to beginning HTML and also to forms
  • Here is the W3Schools HTML tutorial
  • Tables. This teaches you about tables in HTML, which is quite useful for formatting, um, tables. Since many of our query results are tabular, this can sometimes be useful for us.

Note that you must know HTML, not just how to build a web page with some nice software like Dreamweaver that writes the HTML for you. Dreamweaver and that ilk are terrific, but they're no good for our purposes, because our scripts will be writing the HTML for the results of queries. That means we need to understand HTML. (To be fair, we'll be writing templates, so you can work with something like Dreamweaver to create the overall HTML, but then you'll need to be able to edit the HTML to turn it into a template.)

Learning Forms

The following reading introduces FORMs to those who know some HTML but not forms.

  • Forms. 9 pages. This talks about forms, which are crucial for web applications.
  • MDN on Forms. This is the Mozilla Developer Network's introduction to forms. If you master all of that, you'll know more than I do, but the first few parts should be sufficient.

The following is an alternative introduction. You might start with this and then go back to the links in the list above if you want more information:

Basic Page with a Form

Here is a page with a form, where the form just "echoes" the form data back to you, the user. Keep this "echoing" script in mind, because while it is useless in deployed systems, it can be very useful in debugging your web application, by clarifying whether the problem is in processing the form data or in getting the form data to the server.

Here is the code for just the form:

  <form action="/~cs304/php/form-echo-html.php"
        method="get">
    <!-- modern browsers will insist that
         "required" elements are non-empty -->
    <p><label>stimulus:
            <input required
                   type="text" name="stimulus">
    </label></p>
    <p><label>response:
            <input required
                   type="text" name="response">
    </label></p>
    <p><label>reason:
      <select required name="reason">
        <!-- invalid option must have an empty
             value for "required" to work -->
        <option value="">choose reason
        <option>just 'cuz
        <option>none o' your bizness
        <option>I dunno
      </select></label></p>
    <p><label>Why: <br>
            <textarea required name="why" rows="5" cols="30"></textarea>
    </label></p>
    <p><input type="submit">
  </form>

I have omitted the CSS to style the form; you can look in the source code for the form-echo.html page for that info.

Form Structure

The form tag is a container, meaning that one important aspect of its function is to enclose a set of inputs. These inputs are, essentially, name/value pairs. Thus, we can think of the form above as producing something like:

| name | value |
| stimulus | strange |
| response | charm |
| reason | I dunno |
| why | It could be for lots of reasons, but maybe it's my deep and lasting regard for particle physics |

In the table above, the name column is defined by the form author, and the values in the value column are typed or chosen by the user.

The form tag also specifies two other important things:

  • the action attribute specifies where the data is to go when the form is submitted. That is, this is the server-side script that will process the data.
  • the method attribute specifies how the data will be sent:
    • the GET method (the default) encodes the data and tacks it onto the URL following a question mark. The server-side script then parses this URL and decodes the form input.
    • the POST method encodes the data and supplies it in the body of the request, which follows the URL and headers. The server-side script can then read the data from standard input (much as a Java program reads from System.in).
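Here is a minimal Python sketch of what the receiving script does in each case. It assumes the classic CGI conventions, in which the web server passes REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH as environment variables and the request body on standard input. The read_form_data function is a hypothetical name of my own, and a web framework normally does this work for you.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the decoded form data for a CGI-style request.

    `environ` holds the CGI variables; `stdin` is a binary stream
    carrying the request body.
    """
    if environ["REQUEST_METHOD"] == "GET":
        # GET: the encoded data arrived in the URL, after the question mark.
        encoded = environ.get("QUERY_STRING", "")
    else:
        # POST: the encoded data arrived in the request body.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        encoded = stdin.read(length).decode("ascii")
    return parse_qs(encoded)


# The same form data, delivered both ways (BytesIO stands in for stdin).
body = b"stimulus=strange&response=charm"
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": body.decode("ascii")}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}

via_get = read_form_data(get_env, io.BytesIO(b""))
via_post = read_form_data(post_env, io.BytesIO(body))
print(via_get == via_post)  # True
```

Either way, the script ends up with the same name/value pairs; only the route they took differs.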

A form should also have a submit button, or the user won't be able to send the data to the server. (It's possible, of course, to have other mechanisms to trigger form submission, say by processing the enter key or some JavaScript form submission, but you will usually have a submit button.)

One important thing to note about forms: a page can have as many forms as it needs, but forms don't nest (unlike some, but not all, HTML elements). The reason is that the collection of name/value pairs within the form element is sent to the target of the form element's action attribute, and nested forms would make that confusing.

Learning CSS

Making your website pretty is useful as well as nice, because something that looks good and is well laid out is often easier to use. Still, the amount of CSS you could learn is certainly greater than the amount you need to learn. Here are links to the first few CSS readings in CS 110:

Some reading for later, if ever.

Apache and public_html

In Fall 2020, we'll be combining all our pages with Flask, so you can skip reading this section. But you're welcome to read this if you're curious.

So far in this course, we've been working in your ~/cs304 folder, which works fine for Flask and will continue to work well. However, if we want our web pages delivered by Apache rather than Flask, we need to put them in our ~/public_html folder. That folder, and only that folder, is read by Apache and consists of our globally accessible web pages.

You can download all these examples to the public_html of your Tempest account like this:

cd ~/public_html/ 
cp -r ~cs304/pub/downloads/forms . 

You are welcome to learn from my example, but don't use it lock, stock and barrel for your assignment. Prepare yourself for the project by creating an interesting and relevant form.