(Reading: We strongly suggest you read chapters 1, 3, and 6 of Head First HTML with these notes.)
Today, we'll dig into how to use HTML to structure your web page and talk about some of the abstract concepts behind this language.
An HTML document is composed out of elements that begin and end with tags. For example:
<title>CS110 HTML Coding</title>
Here <title> is the start tag, CS110 HTML Coding is the contents, and </title> is the end tag.
The following are some HTML tags that you have seen. If you forget what a tag does or are looking for a new tag, you can look up tags in an HTML Reference, like the ones you find on the CS110 documentation page.
<html>
<head>
<title>
<body>
<h1>–<h6>
<p>
<br>
<strong>
<em>
<code>
<q>
<ul>
<ol>
<li>
There are several other tags that you have seen, but still need to learn more about.
<a>
<img>
<link> and <meta>
We'll learn more about these today.
Most elements have an end tag that matches the start tag.
In a few special cases, particularly </p>
and </li>, the end tag is optional and
may be omitted because the browser can determine it from context.
Some elements (such as <br>,
<hr>,
and <img>) consist only of a start tag
and do not have corresponding end tags or contents.
These are called empty elements.
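As a small illustration of the rules above, the following sketch shows the same list and paragraph written with and without the optional end tags, followed by two empty elements. The text inside the elements is made up for the example.

```html
<!-- End tags written out explicitly -->
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
<p>A paragraph with its end tag.</p>

<!-- The same content with the optional </li> and </p> end tags omitted -->
<ul>
  <li>First item
  <li>Second item
</ul>
<p>A paragraph without its end tag.

<!-- Empty elements: a start tag only, with no contents and no end tag -->
<p>First line<br>Second line</p>
<hr>
```

Both versions of the list should render the same way; writing the end tags out simply makes the structure easier to see.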
Tags serve as instructions telling the browser how to display the contents of elements. The browser is an interpreter for HTML code; it reads the HTML code and renders its elements based on rules for each kind of tag.
Multiple tags can be nested: one
fits inside the other like measuring cups or Russian dolls. If we
have two tags, fred and barney, they
can be nested like this:
<fred>
Region A
<barney>
Region B
</barney>
Region C
</fred>
The fred tag applies to all three regions (Region A,
Region B, and Region C), while barney applies only to
Region B, which it surrounds. Only the fred tag applies
to Region A and Region C.
When nesting two tags, the inner tag must be closed before the outer tag is closed. Your browser may not enforce this; it may be forgiving of errors, but you can't be sure that every browser will be so forgiving, so always follow this syntactic rule.
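Here is a short sketch of the nesting rule using two real tags, strong and em, in place of fred and barney. The sentence itself is invented for the example.

```html
<!-- Correct: <em> is opened inside <strong> and closed before </strong> -->
<p><strong>Very <em>important</em> news</strong></p>

<!-- Incorrect: </strong> appears before the inner <em> has been closed -->
<p><strong>Very <em>important</strong> news</em></p>
```

A forgiving browser may display both lines the same way, but only the first follows the rule.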
A nice way to see the structure of your document is using Firebug, a plug-in for the Firefox browser. We'll use Firebug a lot more later in the semester, but for now, we might take a minute in lecture to show the structure of a document.
From the very first computer program, programmers have needed to
leave notes in the code to help themselves and others understand
what's going on or what the code's purpose is. These notes are called
comments. Comments are a part of the program text (they're
not written separately, because then, well, they'd get separated),
but they are ignored by the computer. Comments aren't about what someone
can discover by reading the code, but should cover the background context
of the code, or its goal.
Because it's important to get in the habit of putting comments in your HTML code, we will require comments in this course. At this point, you won't have a lot to say, and that's fine. You will start by labeling each file with its name, your name, the date, and any sources you consulted (such as the source code of other web pages). Think of this as signing your work. Later, when you're designing a website with many coordinated pages, you can use comments on a page to talk about how it fits into the overall plan.
The HTML comment syntax is a little odd-looking. Here's an example:
<!-- I can say anything I want in a comment. -->
The syntax starts with a left angle bracket < then an
exclamation point and two hyphens, then the comment (anything you
want) and ends with two hyphens and a right angle bracket >.
You can see other examples of comments by doing View
Source on this web page.
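The labeling comment described earlier (file name, your name, the date, and sources) might look something like the sketch below. The layout and the placeholder wording are only one possibility, not a required format.

```html
<!--
  File:    little.html
  Author:  Your Name
  Date:    the date you wrote the file
  Sources: any pages whose source code you consulted
-->
```

A comment can span several lines, as it does here, as long as it starts with the opening marker and ends with the closing one.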
We learned that tags always begin with a left angle
bracket < and close with a right angle bracket
>. (Remember that the browser doesn't care whether the
tag's name is upper or lower case.) You can provide additional information
within the tag to further specify what it does, using attributes.
A good example of the use of attributes is seen with the anchor
tag, <a>. To use it as a hyperlink, you have to
specify where the link takes us. For example, to have a link that
says Yahoo!
and takes us to www.yahoo.com, we say:
<a href="http://www.yahoo.com">Yahoo!</a>
The href part is the attribute. For this tag,
since the attribute is almost always required, we can almost think of it
as an a href tag, but that way of thinking will confuse us
later, so it's best to think of it as two separate ideas. Later, we'll
see other attributes for the <a> tag.
Create a file named little.html with the following contents:
<html>
<head>
<title>My Little Page</title>
</head>
<body>
<p>This is the text on my little page.
</body>
</html>
Then add an H1
header and an H2 header
to the page.
Now put in a link to your favorite website. Here are some of ours. (You can use View Source to see the URL where they go.)
Remember that (1) on a Mac, you can edit a file using TextWrangler, (2)
you should put your HTML code in the BODY element, (3) be sure to end
your filename with .html, and (4) you can view a page
locally in your browser, using File / Open File.
In the head element of each CS110 lecture page, you'll see
some HTML that looks something like:
<link rel="stylesheet"
type="text/css"
href="http://cs.wellesley.edu/~cs110/cs110-main-style.css">
Like the anchor tag, the <link> tag also has an
href attribute, and it also links one web page with another.
However, it has a different purpose. Instead of producing a clickable
link, the <link> tag tells the browser that there is
some additional information about this page located in a different file.
The href attribute of the <link> tag tells
the browser where to find the other file. The href attribute
contains a URL, which we'll learn about later.
The <link> tag contains other attributes depending
on the purpose of the connection. In our case, it contains:
A rel attribute that says what RELationship
the other file has to this one. We use it to specify a
style sheet, which says how tags should be formatted.
A type attribute that says what kind of content the
other file contains. In our case, we say that the file contains
text/css.
The <link> tag can be used for a variety of
purposes, but most current browsers only use it for style sheets.
At this point, the <link> tag is still a mystery.
We've told you what it's for, and something about its syntax, but not what
goes in the other file. We'll learn more about the <link>
tag, style sheets and CSS in later lectures.
Modify little.html from the previous exercise
by adding the following link element inside
the head element:
<link rel="stylesheet"
type="text/css"
href="http://cs.wellesley.edu/~cs110/cs110-main-style.css">
How does this affect the appearance of the document?
In the modified document, how can you change the appearance of an unvisited hyperlink from blue text to text that is large and green? (Hint: use a header tag.)
We can now generalize start tag syntax to include any number of attributes, as follows:
<tag attr1 = "value1" attr2 = "value2" ... attrN = "valueN"> contents </tag>
Browsers differ on how nit-picky they are about attributes. Many will let you get away with omitting the quotation marks when the value is a single solid word. Others will complain if you have line-breaks in your attributes. In general, it's best to comply with the strictest syntax rules, so that your site will work on the most browsers.
(Note that XHTML, a stricter XML-based reformulation of HTML, requires attribute values to be in quotation marks, so those who think they may someday switch to XHTML should use quotation marks. For the purposes of this course, you may use either syntax. You'll find that your instructors aren't always consistent.)
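To make the general syntax concrete, here is a sketch of a start tag with two attributes, written first in the strictest form and then with the quotation marks left off a single-word value. The title attribute and its values are chosen only for illustration.

```html
<!-- Strictest form: every value in quotation marks, start tag on one line -->
<a href="http://www.yahoo.com" title="Visit Yahoo!">Yahoo!</a>

<!-- Often tolerated: quotation marks omitted around a single solid word -->
<a href="http://www.yahoo.com" title=Yahoo>Yahoo!</a>
```

The first form is the safer habit, since a value containing a space or punctuation needs the quotation marks anyway.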
One thing that we all want to do with our web pages is add pictures.
Because the picture file is a separate file, we have to link to it, just
like the href attribute of the anchor
(<a>) tag.
Once you have an image file, say small_weasel.jpg you can
use it on your web page like this:
<img src = "small_weasel.jpg" alt = "a small weasel">
with the following result:
Of course, this only works if the server can find the image file.
The src attribute must be
the URL of the image file. It can be an
absolute URL or a relative URL. What did we use here?
When the browser asks for this page, the server sends it back; the browser then requests any image files that the page references, and the server finds and sends those back too. If the server doesn't find the file (or the file is corrupted in some way), the browser will show this:
Depending on your browser, you may see a broken-image icon above, the
alt text, or possibly nothing at all.
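The two kinds of URL that src can hold might be sketched as follows. The server name www.example.com and its images folder are hypothetical, used only to show the shape of an absolute URL.

```html
<!-- Relative URL: the file is looked up relative to the page's own folder -->
<img src="small_weasel.jpg" alt="a small weasel">

<!-- Absolute URL: the full address, including the server name.
     www.example.com and the images folder are hypothetical. -->
<img src="http://www.example.com/images/small_weasel.jpg" alt="a small weasel">
```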
Copy the weasel image to a file named small_weasel.jpg
in the same folder as little.html.
Modify little.html to display an image of the
weasel. (Of course, you could replace this image by any other
image you'd like!)
Add a title attribute with value "small weasel image"
to the image. What does this do?
You noticed that we added an ALT attribute to the IMG tag that is a
small piece of text that can be used in place of the image in certain
circumstances. The ALT attribute is an important part of the HTML
standard. Perhaps its most important use supports
accessibility. Unfortunately, not everyone has good enough
vision to see the images that we use in our websites, but that
doesn't mean they can't and don't use the Web. Instead, they
(typically) have software that reads a web page to them, including
links. When the software gets to an IMG tag, it reads the
ALT text. If there is no ALT text, it may read the SRC attribute,
hoping there's a hint there, but all too often the SRC attribute is
something like "../images/DCN87372.jpg" and the visually
impaired web user is left to guess.
Therefore, you should always include a brief, useful value for the ALT attribute. If your page is an image gallery, then your ALT text could be a description of the image. However, describing the image is not, in general, the idea. For example, if the image is a link whose target is made clear by the image, then the ALT text should say something like, "Link to ..." so the user will know what to do with it. The sole exception is for images that are just used for formatting, such as blank pictures that fill areas or colorful bullets for bullet lists. In those cases, in fact, it's better to include an ALT attribute that is empty, so that the user doesn't have to listen to the SRC attribute being read. In both cases, the text should be useful for someone who wants to use your site but isn't sighted. It helps to turn off images and view your site to check.
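The three cases above might look like the following sketch. The link target, the ALT wording, and the file name bullet.gif are assumptions made up for the example.

```html
<!-- Gallery image: the ALT text describes the picture -->
<img src="eiffel-tower.jpeg" alt="The Eiffel Tower">

<!-- Image used as a link: the ALT text says where the link goes -->
<a href="http://www.wellesley.edu"><img src="logo.jpg"
   alt="Link to the Wellesley College home page"></a>

<!-- Formatting-only image (hypothetical bullet.gif): an empty ALT,
     so the user doesn't have to listen to the SRC being read -->
<img src="bullet.gif" alt="">
```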
Furthermore, you should avoid having critical information on your website conveyed only in images. There may be times when it is unavoidable, but to the extent that it is possible, we want our websites to be easily usable by all people, including the blind and visually impaired.
Accessibility is important in modern society. We build ramps as well as stairs, we put cutouts in curbs, and we allocate parking spaces for the handicapped. Indeed, most federal and state government websites are legally required to be accessible, and ALT attributes are just one part of that.
In this class, we expect you to always use the ALT attribute. If you find an image or an example where we've forgotten to use one, please bring it to our attention.
For more information, you can read the following
If you want to display a bunch of pictures, the web page appears neater if the pictures align well. You can align them vertically if they all have the same width, or horizontally if they all have the same height. For example, the following three pictures are all 150 pixels high.
Regardless of the actual dimensions of an image, the browser will squeeze it into a set size if requested. You can do this with two new attributes, namely HEIGHT and WIDTH:
<img src="..." alt="..." height="height-goes-here" width="width-goes-here">
Replace the "height-goes-here" and "width-goes-here" with integers specified in pixels, which we'll discuss later. If both width and height are specified, both will be obeyed, but you have to be careful with that. Suppose the original image is 160x240: taller than it is wide. Technically, the ratio of the width to the height is called the aspect ratio. The Eiffel Tower picture has an aspect ratio of 160:240 or 2:3. If you set the height and width so that they don't have the same aspect ratio, the picture will look distorted. Here is the Eiffel Tower with the wrong aspect ratio:
If you use either the HEIGHT or the WIDTH attributes, but not both, the browser will usually calculate the other attribute so that the aspect ratio is preserved. Thus, the picture will have either the width or height you want, but will not be distorted. That's how we did that row of pictures above.
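As a worked sketch, assume eiffel-tower.jpeg is the 160x240 image discussed above. The three tags below show the original size, a distorted version, and a version scaled by giving the height alone; the particular numbers 240x240 and 120 are chosen only for illustration.

```html
<!-- Original size: 160 wide by 240 high, aspect ratio 2:3 -->
<img src="eiffel-tower.jpeg" alt="The Eiffel Tower" width="160" height="240">

<!-- Both attributes set with a different ratio (240:240, or 1:1):
     the image is stretched sideways and looks distorted -->
<img src="eiffel-tower.jpeg" alt="The Eiffel Tower" width="240" height="240">

<!-- Only HEIGHT set: the browser usually computes the width itself,
     here 120 * 160 / 240 = 80 pixels, so the 2:3 ratio is preserved -->
<img src="eiffel-tower.jpeg" alt="The Eiffel Tower" height="120">
```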
Copy the eiffel-tower image to a file named eiffel-tower.jpeg
in the same folder as little.html and small_weasel.jpg.
Modify little.html to display an image of the
weasel next to the tower so that both images have the same height.
Take a minute to look at the online reference for the
<center> tag. We would give a link directly to it, but
we want you to learn how to navigate to find the reference material on any
tag. Here's how:
Look up the <center> tag in an HTML reference, such as the ones on the CS110 documentation page.
There you will see that <center> is deprecated in favor of style
sheets.
In this context, deprecated means that browsers will still
support the <center> tag for the foreseeable future
because of the zillions of old web pages that already use it, but that
<center> should not be used in new web pages.
Centering is about presentation of the content, and the modern approach is to reserve presentation issues for CSS (style sheets), which we will be covering later in this class.
You should avoid using deprecated tags in this course. We are teaching you the modern, more powerful approach, and we would like you to adopt that style. If you don't know whether a tag is deprecated, check this reference.
Ah, but you object that your web page looks ugly without centering, font changes, colors, and so forth. That may be; we're not going to try to contradict your aesthetic sense. However, for the first week of this course, we don't want to confuse anyone by introducing style sheets right away. So, we ask you to be patient. We will get to style sheets very soon.
In the meantime, consider the fact that many visitors to your site might not gain any advantage from style sheets anyway, because they are visually impaired, or because they are using an old or alternative browser that doesn't honor style sheets. For such users, the most important thing is the content and having that content well-structured and clearly conveyed. So, until we get to style sheets, consider that you are designing your website with accessibility in mind.
An HTML document may contain many different kinds of errors that prevent it from rendering as you expect. Errors in code are known as bugs, and the process of finding and correcting such errors is called debugging.
Here are some common types of HTML bugs:
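Misspelled or nonexistent tag names, such as <tilte>
or <break>.
A missing slash in an end tag: <em>important<em>
A missing angle bracket: <em important</em>
Mismatched start and end tags: <h1>CS110 HTML Coding</h2>
Improperly nested tags: <h1><em>CS110 HTML Coding</h1><em>
Misspelled attribute names: <img source="logo.jpg" hieght=100>
A missing quotation mark: <img src="logo.jpg height=100>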
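For comparison, corrected versions of the buggy fragments above might look like the sketch below. The alt text on the image is an addition, since this course asks for an ALT attribute on every image; its wording is made up.

```html
<!-- Tag names spelled correctly: title, and br rather than break -->
<title>CS110 HTML Coding</title>
<br>

<!-- End tag has its slash, and every tag has both angle brackets -->
<em>important</em>

<!-- Start and end tags match -->
<h1>CS110 HTML Coding</h1>

<!-- Inner tag closed before the outer tag -->
<h1><em>CS110 HTML Coding</em></h1>

<!-- Attribute names spelled correctly and values fully quoted -->
<img src="logo.jpg" alt="logo" height="100">
```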
Copy the following buggy HTML code into a file named buggy.html
and debug the code until it looks like you think it's intended to look.
<html>
<head>
<title>A Buggy HTML File</title>
</head>
<body>
<h1>A Buggy HTML File</h1>
<p> This HTML file contains several <em>bugs</em> (i.e., errors).
<br>
Can you <em>debug</em> them (i.e., find and fix them)?
<p>Here's an unordered list with three items:
<ul>
<li>A <bold>bold</bold> item.</li>
<li>An ordered sublist with two items:
<ol>
<p> A CAPITALIZED item
<li> An <a href="http://www.wellesley.edu">
<em>italic link</a></em>
<li> A final <code>item<code>, using code font.
</ul>
<p>Here's <a href="http://cs.wellesley.edu another link</a>
</body>
</html>
How can you be sure you've followed every nit-picky rule that the HTML
standards committee devised? (The standards committee is
the World Wide Web Consortium
or
W3C.) Even if you have memorized all the rules, checking a page would be
tedious and error-prone – perfect for a computer! Fortunately, the
W3C created an HTML validator. You
can validate by supplying a URL, by uploading a file, or even
copy/pasting in some HTML. Visit that page and try validating
your little.html file,
debugged buggy.html file,
or even the file for this lecture.
An HTML validator is an excellent tool to help you debug your HTML code.
Validation also helps with accessibility. One important aspect of accessibility is having the proper HTML syntax for each page in your site. Visitors with accessibility needs may use alternative browsers and screen readers, and that software works better with syntactically correct HTML. Read the following for a longer discussion of why to validate your HTML pages.
Throughout the semester, if you need to validate a web page, you can find the HTML validator and others in the validators section of the CS110 documentation page.
If you haven't already, try validating little.html.
You'll see that it doesn't validate. The reasons are slightly technical,
so bear with us, but you'll see how to make your own documents valid.
There are several different, incompatible HTML versions, and so a
document that is valid HTML 3.2 may be invalid HTML 4.01 and vice versa.
Neither would be valid XHTML 1.0. So, the validator needs to know what
version of HTML you're using. More importantly, the browser or screen
reader or other software needs to know what syntax to expect.
Therefore, the first thing any web page needs to do is announce the
syntax it's using. This is done with a special DOCTYPE
(for "document type") tag.
In this course, we're using HTML 4.01. The
correct DOCTYPE for this is at the top of pretty much every
web page in the CS110 site. It looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
We urge you just to copy/paste that tag into your own pages and not worry about it further.
A browser needs to know what characters (letters, numbers,
and punctuation) are in the HTML file. Are they western European
characters, or Russian, Greek, Sanskrit, Korean, Japanese,
Chinese, or !Kung, or something else? This information is called
the character set, or charset for short. For our purposes, we can
use something called UTF-8.
To do so, we put a <meta> tag in
the <head> of our document. It looks like this:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Again, we urge you just to copy/paste that code and not worry about it.
Copy the DOCTYPE and CHARSET code into little.html
and/or the debugged buggy.html
and check that the result is now valid.
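Put together, a version of little.html with both additions might look like the sketch below. It assumes the page still has only the single paragraph from the first exercise; your own file will also contain whatever headers, links, and images you have added since.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>My Little Page</title>
</head>
<body>
<p>This is the text on my little page.
</body>
</html>
```

The DOCTYPE goes on the very first lines of the file, before the html start tag, and the meta tag goes inside the head element.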
Once you get your page to validate, the validator will present a bit of
HTML code that you can copy/paste onto your page to give it a "seal of
approval," declaring that it is valid (and what standard it meets).
You can see an example on the bottom of the CS110 pages, including this
one.
The very cool thing about this icon is that it is clickable, and clicking it will cause the validator to process your page again. Thus, you can modify your page, upload the changes, and click the icon to re-validate it, making validation very easy. In fact, we suggest that you put the icon on your page before it's valid, and use it during your debugging process.
The snippet of code is just the following, so go ahead and copy/paste it into your pages. The code doesn't use anything we don't know, so read it!
<p>
<a href="http://validator.w3.org/check?uri=referer"><img
src="http://www.w3.org/Icons/valid-html401"
alt="Valid HTML 4.01 Strict"
title="Valid HTML 4.01 Strict"
height="31" width="88">
</a>
</p>
Here is a link to a new version of the
web page about fun CS activities with the additional HTML code needed for validation.
The page does not yet validate successfully, because the <img> tag
for the image of the frisbee game is not placed within a block tag such
as a <p> tag. Put the image inside a <p> element to see that the
page now validates for strict HTML 4.01.
We'd like you to start thinking about web design: how the pages of a site are laid out, the color and font choices, how visitors navigate around the site, and other issues that influence how you like a website. Some of this is pretty intuitive, and you probably already have some good ideas about this, so we're interested in sites that you think are well designed.
Please email your instructor and send him/her the URL of a website that you think is well designed. If you'd like, you can say why, but that's not necessary. We'll collect these and discuss them in class.