(Reading: We strongly suggest you read chapters 1, 3, and 6 of Head First HTML and CSS with these notes.)

HTML Elements and Tags

An HTML document is composed of elements that begin and end with tags. For example, the H2 tag was used to create the header element you see above:

<h2> HTML Elements and Tags </h2>
Here <h2> is the start tag, "HTML Elements and Tags" is the contents, and </h2> is the end tag.

Here's an example of a simple page; it's an excerpt of the

<!doctype html>
<!-- created by Ellen -->
<html>

    <!-- A simple web page illustrating some basic HTML tags -->

    <head>
        <title>Fun CS events</title>
    </head>

    <body>
        <h1>Join us for some fun CS department events!</h1>
        <ul>
            <li>Spring Cirque du CS 
            <li>Holiday cookie party
            <li>Faculty-student frisbee game
        </ul>

        <h2>Spring Cirque du CS</h2>
        <p>A celebration of student accomplishments in Computer Science! 
           <strong>Demonstrate your CS110 project!</strong>
        </p>
        <p>
            <img src="cirque1.jpg" alt="circus treats" height="200">
        </p>
  
        <h2>Outdoor Fun</h2>

        <p>All levels of skill and experience are welcome!</p>
        <blockquote>
              <p>The faculty have always beaten us in the past, but never again! We have 
              some Wellesley Whiptails <br> on our team and we've been practicing hard, 
              so we're gonna kick the faculty's butt this year! <br>
              <em>-- anonymous CS student</em>
              </p>
         </blockquote>
    </body>
</html>

The following are some HTML tags that you can see above. If you forget what a tag does or are looking for a new tag, you can look up tags in an HTML Reference. (Note that we will not be learning all the tags in that reference; we'll learn a useful subset.) Some of the tags we've seen are:

There are several other tags that you have seen, but still need to learn more about.

We'll learn more about these in this reading.

Most elements have an end tag that matches the start tag. In a few special cases, particularly </p> and </li>, the end tag is optional and may be omitted because the browser can determine it from context.

Some elements (such as <br>, <hr>, and <img>) consist only of a start tag and do not have corresponding end tags or contents. These are called empty elements.
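As a small sketch of these two rules (the list items and text are made up for illustration), the first group below omits the optional end tags, the second writes them all out, and the last shows empty elements. A browser should render the first two groups the same way:

```html
<!-- End tags omitted: the browser infers where the paragraph and each item end -->
<p>A paragraph with no end tag.
<ul>
    <li>First item
    <li>Second item
</ul>

<!-- The same content with every end tag written out -->
<p>A paragraph with an explicit end tag.</p>
<ul>
    <li>First item</li>
    <li>Second item</li>
</ul>

<!-- Empty elements: a start tag only, with no contents and no end tag -->
<p>Line one<br>Line two</p>
<hr>
```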

Tags serve as instructions telling the browser how to display the contents of elements. The browser is an interpreter for HTML code; it reads the HTML code and renders its elements based on rules for each kind of tag.

Nesting

Multiple tags can be nested: one fits inside the other like measuring cups or Russian dolls. If we have two tags, fred and barney, they can be nested like this:

<fred>
   Region A
      <barney>
         Region B
      </barney>
   Region C
</fred>

The fred tag applies to all three regions (Region A, Region B, and Region C), while barney applies only to Region B, which it surrounds. Region A and Region C are affected only by the fred tag.

When nesting two tags, the inner tag must be closed before the outer tag is closed. Your browser may not enforce this; it may be forgiving of errors, but you can't be sure that every browser will be so forgiving, so always follow this syntactic rule.
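Here is the same rule with real tags, as a short illustration (the sentence itself is invented). The first paragraph is nested properly; the second closes the outer tag while the inner one is still open:

```html
<!-- Correct: <em> opens inside <strong> and closes before </strong> -->
<p><strong>Very <em>important</em> news</strong></p>

<!-- Incorrect: </strong> arrives while <em> is still open -->
<p><strong>Very <em>important</strong> news</em></p>
```

A forgiving browser may display both lines the same way, but only the first follows the rule.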

Tag Syntax

We learned that tags always begin with a left angle bracket < and close with a right angle bracket >. (Remember that the browser doesn't care whether the tag's name is upper or lower case.) You can provide additional information within the tag to further specify what it does, using attributes.

A good example of the use of attributes is seen with the anchor tag, <a>. To use it as a hyperlink, you have to specify where the link takes us. For example, to have a link that says Google and takes us to www.google.com, we say:

<a href="http://www.google.com">Google</a>

The href part is the attribute. For this tag, since the attribute is almost always required, we can almost think of it as an a href tag, but that way of thinking will confuse us later, so it's best to think of it as two separate ideas. Later, we'll see other attributes for the <a> tag.

Exercise 1

(Note: we will intersperse exercises like this in the reading. They are solely for your benefit. They are typically short and may help with understanding the concepts and techniques. They are not graded. We highly recommend that you make time for them.)

To start this exercise, click on this link to JSFiddle. JSFiddle is an extremely popular sandbox that allows you to play with HTML, CSS and JS right in your browser, without having to launch an editor (such as TextWrangler), save files to your server, or any other such infrastructure. We'll use it a lot in this course.

Copy/paste the following code into the HTML box of the JSFiddle. Then click the "Run" menu item to see the result.

  <p>This is the text on my little page.

  <p>This is some additional, very boring, text on my page.

Then add an h1 header and an h2 header to the page.

Now put in a link to your favorite website. Here are some of ours. (To see the URL where they go hover over a link and look at the left-bottom bar of your browser.)

The title tag

The contents of the title tag don't appear in the body of the web page (though they often appear at the top of the window in the title bar). The title is, however, very important for two reasons:

For both those reasons, a page title like about us or contact is often not helpful. It's a good idea to put something more descriptive in the title, such as About CS 110 or CS 110 staff contact information.
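For instance, a head element along these lines uses the more descriptive wording suggested above. This is only a sketch; the vague version is left in a comment for comparison:

```html
<head>
    <!-- Vague: gives no hint which site or course the page belongs to -->
    <!-- <title>contact</title> -->

    <!-- Descriptive: names the course as well as the page -->
    <title>CS 110 staff contact information</title>
</head>
```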

Once, it was very popular on the web to have links like this:

It seemed so clever and intuitive, making the clickable text be the word "here." There are two big problems with this, though:

So what do you do instead? Just wrap the link tags around important words:
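As an illustration only (the URL and wording below are placeholders, not the original examples from this page), compare a "click here" link with one wrapped around the important words:

```html
<!-- Avoid: the clickable text says nothing about the destination -->
<p>To read the course syllabus, click <a href="http://example.com/syllabus.html">here</a>.</p>

<!-- Prefer: the clickable text describes the destination -->
<p>Please read the <a href="http://example.com/syllabus.html">course syllabus</a>.</p>
```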

Accessibility is very important in CS 110, so keep that in mind.

In the head element of each CS110 lecture page, you'll see some HTML that looks something like:

   <link href="http://cs.wellesley.edu/~cs110/cs110-main-style.css" rel="stylesheet" type="text/css">

Like the anchor tag, the <link> tag also has an href attribute, and it also links one web page with another. However, it has a different purpose. Instead of producing a clickable link, the <link> tag tells the browser that there is some additional information about this page located in a different file. The href attribute of the <link> tag tells the browser where to find the other file. The href attribute contains a URL, which we'll learn about later.

The <link> tag contains other attributes depending on the purpose of the connection. In our case, it contains:

In HTML5, the attribute type is not required anymore. The reason is that CSS is declared the default style for HTML5. However, you might see type in our older examples and in many pages on the Web. It is not an error to use it, but it's not required anymore.

The <link> tag can be used for a variety of purposes, but most current browsers only use it for style sheets.

At this point, the <link> tag is still a mystery. We've told you what it's for, and something about its syntax, but not what goes in the other file. We'll learn more about the <link> tag, style sheets and CSS in later lectures.

Exercise 2

Modify the JSFiddle from the previous exercise by adding the following link element near the top. (You'll get a minor complaint from JSFiddle, but you can ignore it.)

<link href="http://cs.wellesley.edu/~cs110/cs110-main-style.css" rel="stylesheet">

How does this affect the appearance of the document?

In the modified document, how can you change the appearance of an unvisited hyperlink from blue text to text that is large and green? (Hint: use a header tag.)

Tag and Attribute Syntax

We can now generalize start tag syntax to include any number of attributes, as follows:

<tag attr1 = "value1" attr2 = "value2" ... attrN = "valueN"> 
   contents
</tag>

Browsers differ on how nit-picky they are about attributes. Many will let you get away with omitting the quotation marks when the value is a single solid word (no spaces in the value). Others will complain if you have line-breaks in your attributes. In general, it's best to comply with the strictest syntax rules, so that your site will work on the most browsers.

Note that some versions of HTML, namely XHTML, require attributes to be in quotation marks. We're using HTML5, which is much more liberal, and will let you omit quotation marks unless the value contains a space, a line break, grave accent (`), equals sign (=), less than sign (<), greater than sign (>), quote (") or apostrophe ('). In short, if there's anything confusing in the value, use quotation marks.
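The following sketch shows the quoting rule in action, reusing the weasel image that appears later in these notes. The third tag is deliberately wrong, to show what happens when a value with spaces is left unquoted:

```html
<!-- Allowed in HTML5: neither value contains spaces or special characters -->
<img src=small_weasel.jpg alt=weasel>

<!-- Quotation marks needed: the alt value contains spaces -->
<img src="small_weasel.jpg" alt="a small weasel">

<!-- Broken: the unquoted alt value ends at the first space, so alt is just "a",
     and "small" and "weasel" are read as two extra, meaningless attributes -->
<img src=small_weasel.jpg alt=a small weasel>
```

Quoting every value, as in the second tag, avoids having to remember the exceptions.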

Using Images

One thing that we all want to do with our web pages is add pictures. Because the picture is stored in a separate file, we have to refer to that file, much as the href attribute of the anchor (<a>) tag refers to another page.

Once you have an image file, say small_weasel.jpg you can use it on your web page like this:

<img src = "small_weasel.jpg" alt = "a small weasel">

with the following result:

[Image: a small weasel]

Of course, this only works if the server can find the image file. The src attribute must be the URL of the image file. It can be an absolute URL or a relative URL. What did we use here?
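To make the two kinds of URL concrete, here is a sketch of the same image referenced both ways. The absolute URL is the one used for the weasel image in these notes; the relative form only works if the image file sits in the same folder as the page:

```html
<!-- Relative URL: the file is looked up relative to the page's own location -->
<img src="small_weasel.jpg" alt="a small weasel">

<!-- Absolute URL: the full address, usable from a page hosted anywhere -->
<img src="http://cs.wellesley.edu/~cs110/lectures/L01-html/small_weasel.jpg" alt="a small weasel">
```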

When the browser asks for this page, the server sends it back and it also finds and sends back any image files that the page references. If the server doesn't find the file (or the file is corrupted in some way), the browser will show this:

a small weasel

Depending on your browser, you may see a broken-image icon above, the alt text, or possibly nothing at all.

Exercise 3

Modify the JSFiddle to display an image of the weasel. (Of course, you could replace this image with any other image you'd like!) Note that you must use an absolute URL for the IMG, namely:

http://cs.wellesley.edu/~cs110/lectures/L01-html/small_weasel.jpg

Add a title attribute with value "small weasel image" to the image. What does this do?

The ALT Attribute

You noticed that we added an ALT attribute to the IMG tag that is a small piece of text that can be used in place of the image in certain circumstances. The ALT attribute is an important part of the HTML standard. Perhaps its most important use is to support accessibility. Unfortunately, not everyone has good enough vision to see the images that we use in our websites, but that doesn't mean they can't and don't use the Web. Instead, they (typically) have software that reads a web page to them, including links. When the software gets to an IMG tag, it reads the ALT text. If there is no ALT text, it may read the SRC attribute, hoping there's a hint there, but all too often the SRC attribute is something like "../images/DCN87372.jpg" and the visually impaired web user is left to guess.

Therefore, you should always include a brief, useful value for the ALT attribute. If your page is an image gallery, then your ALT text could be a description of the image. However, describing the image is not, in general, the idea. For example, if the image is a link whose target is made clear by the image, then the ALT text should say something like, "Link to ..." so the user will know what to do with it. The sole exception is for images that are just used for formatting, such as blank pictures that fill areas or colorful bullets for bullet lists. In those cases, in fact, it's better to include an ALT attribute that is empty, so that the user doesn't have to listen to the SRC attribute being read. In both cases, the text should be useful for someone who wants to use your site but isn't sighted. It helps to turn off images and view your site to check.
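The three situations just described might look something like this. It is a sketch only: the file names home-icon.png, index.html, and spacer.png are hypothetical and not part of the course materials:

```html
<!-- Gallery image: the alt text describes the picture -->
<img src="eiffel-tower.jpeg" alt="the Eiffel Tower">

<!-- Image used as a link: the alt text says where the link goes -->
<a href="index.html"><img src="home-icon.png" alt="Link to the home page"></a>

<!-- Purely decorative image: an empty alt lets screen readers skip it -->
<img src="spacer.png" alt="">
```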

Furthermore, you should avoid having critical information on your website conveyed only in images. There may be times when it is unavoidable, but to the extent that it is possible, we want our websites to be easily usable by all people, including the blind and visually impaired.

Accessibility is important in modern society. We build ramps as well as stairs, we put cutouts in curbs, and we allocate parking spaces for the handicapped. Indeed, most federal and state government websites are legally required to be accessible, and ALT attributes are just one part of that.

In this class, we expect you to always use the ALT attribute. If you find an image or an example where we've forgotten to use one, please bring it to our attention.

For more information, you can read the following

Resizing and Aspect Ratio

If you want to display a bunch of pictures, the web page appears neater if the pictures align well. You can align them vertically if they all have the same width, or horizontally if they all have the same height. For example, the following three pictures are all 150 pixels high.

[Images: a molehill, the Eiffel Tower, the Matterhorn]

Regardless of the actual dimensions of an image, the browser will squeeze it into a set size if requested. You can do this with two new attributes, namely HEIGHT and WIDTH:

<img src="..." alt="..."  height="height-goes-here"   width="width-goes-here">

Replace the "height-goes-here" and "width-goes-here" with integers specified in pixels, which we'll discuss later. If both width and height are specified, both will be obeyed, but you have to be careful with that. Suppose the original image is 160x240: taller than it is wide. Technically, the ratio of the width to the height is called the aspect ratio. The Eiffel Tower picture has an aspect ratio of 160:240 or 2:3. If you set the height and width so that they don't have the same aspect ratio, the picture will look distorted. Here is the Eiffel Tower with the wrong aspect ratio:

<img src="eiffel-tower.jpeg" alt="the Eiffel Tower" width="300" height="150">

[Image: the Eiffel Tower, distorted]

If you use either the HEIGHT or the WIDTH attributes, but not both, the browser will usually calculate the other attribute so that the aspect ratio is preserved. Thus, the picture will have either the width or height you want, but will not be distorted. That's how we did that row of pictures above.
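As a worked example, assume the Eiffel Tower file really is 160x240 pixels, as supposed above. Giving only a height of 150 should lead the browser to choose a width of 150 × 160 / 240 = 100 pixels, which keeps the 2:3 aspect ratio:

```html
<!-- Only height is given; for a 160x240 original the browser computes
     width = 150 * 160 / 240 = 100, so the 2:3 aspect ratio is preserved -->
<img src="eiffel-tower.jpeg" alt="the Eiffel Tower" height="150">

<!-- Equivalent, with both dimensions written out in the same 2:3 ratio -->
<img src="eiffel-tower.jpeg" alt="the Eiffel Tower" width="100" height="150">
```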

Exercise 4

Modify the JSFiddle to display an image of the Eiffel Tower next to the weasel so that both images have the same height. You'll have to use an absolute URL for the IMG:

http://cs.wellesley.edu/~cs110/lectures/L01-html/small_weasel.jpg

Comments

From the very first computer program, programmers have needed to leave notes in the code to help themselves and others understand what's going on or what the code's purpose is. These notes are called comments. Comments are a part of the program text (they're not written separately, because then, well, they'd get separated), but they are ignored by the computer. Comments aren't about what someone can discover by reading the code, but should cover the background context of the code, or its goal.

Because it's important to get in the habit of putting comments in your HTML code, we will require comments in this course. At this point, you won't have a lot to say, and that's fine. You will start by labeling each file with its name, your name, the date, and any sources you consulted (such as the source code of other web pages). Think of this as signing your work. Later, when you're designing a website with many coordinated pages, you can use comments on a page to talk about how it fits into the overall plan.

HTML Comment Syntax

The HTML comment syntax is a little odd-looking. Here's an example:

<!-- I can say anything I want in a comment.  -->

The syntax starts with a left angle bracket < then an exclamation point and two hyphens, then the comment (anything you want) and ends with two hyphens and a right angle bracket >.
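A comment can span several lines, so the labeling information we ask for (file name, your name, the date, and sources) fits in a single comment at the top of the file. The layout below is just one possible sketch, and the file name events.html is made up:

```html
<!--
    File:    events.html
    Author:  Your name here
    Date:    the date you wrote or last changed the file
    Sources: any pages or references you consulted
-->
```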

Beauty in Websites

Ah, but you object that your web page looks ugly without centering, font changes, colors, and so forth. That may be; we're not going to try to contradict your aesthetic sense. However, for the first week of this course, we don't want to confuse anyone by introducing style sheets right away. So, we ask you to be patient. We will get to style sheets very soon.

In the meantime, consider the fact that many visitors to your site might not gain any advantage from style sheets anyway, because they are visually impaired, or because they are using an old or alternative browser that doesn't honor style sheets. For such users, the most important thing is the content and having that content well-structured and clearly conveyed. So, until we get to style sheets, consider that you are designing your website with accessibility in mind.

Bugs and Debugging

An HTML document may contain many different kinds of errors that prevent it from rendering as you expect. Errors in code are known as bugs, and the process of finding and correcting such errors is called debugging.

Here are some common types of HTML bugs:

Exercise 5

Copy the following buggy HTML code into a JSFiddle and debug the code until it looks the way you think it's intended to look.

<h1>A Buggy HTML File</h1>

<p> This HTML file contains several <em>bugs</em> (i.e., errors).
<br>
Can you <em>debug</em> them (i.e., find and fix them)?

<p>Here's an unordered list with three items: 

<ul>

    <li>A <bold>bold</bold> item.</li>

    <li>An ordered sublist with two items: 

        <ol> 

          <p> A CAPITALIZED item 

          <li> An <a href="http://www.wellesley.edu">
                  <em>italic link</a></em>
   
    <li> A final <code>item<code>, using code font.

</ul>

<p>Here's <a href="http://cs.wellesley.edu another link</a>

Validation of HTML Code

How can you be sure you've followed every nit-picky rule that the HTML standards committee devised? (The standards committee is the World Wide Web Consortium or W3C.) Even if you have memorized all the rules, checking a page would be tedious and error-prone – perfect for a computer! Fortunately, the W3C created an HTML validator. You can validate by supplying a URL, by uploading a file, or even copy/pasting in some HTML. Visit that page and try validating your little.html file, debugged buggy.html file, or even the file for this lecture. An HTML validator is an excellent tool to help you debug your HTML code.

Validation also helps with accessibility. One important aspect of accessibility is having the proper HTML syntax for each page in your site. Visitors with accessibility needs may use alternative browsers and screen readers, and that software is aided by syntactically correct HTML. Read the following for a longer discussion of why to validate your HTML pages.

Throughout the semester, if you need to validate a web page, you can find the HTML validator and others in the validators section of the CS110 documentation page.

If you haven't already, try validating little.html. You'll see that it doesn't validate. The reasons are slightly technical, so bear with us, but you'll see how to make your own documents valid.

Document Type

There are several different, incompatible HTML versions, and so a document that is valid HTML 3.2 is invalid HTML 4.01 and vice versa. Neither would be valid XHTML 1.0. So, the validator needs to know what version of HTML you're using. More importantly, the browser or screen reader or other software needs to know what syntax to expect. Therefore, the first thing any web page needs to do is announce the syntax it's using. This is done with a special DOCTYPE (for document type) tag.

In this course, we're using HTML5. (Your book describes HTML 4.01 (and also XHTML), but HTML 4.01 is essentially a subset of HTML5, so nothing they tell you will be incorrect.) Fortunately, the doctype for HTML5 is very simple and easy, certainly compared to HTML 4.01. It looks like this:

      <!doctype html>
    

We urge you just to copy/paste that tag into your own pages and not worry about it further.

Note: because HTML5 is very new, the validator at W3C will give you a warning that the result is "experimental."

Charset

A browser needs to know what characters (letters, numbers, and punctuation) are in the HTML file. Are they western European characters, or Russian, Greek, Sanskrit, Korean, Japanese, Chinese, or !Kung or something else? This information is called the character set or charset for short. For our purposes, we can use something called UTF-8.

To do so, we put a <meta> tag in the <head> of our document. It looks like this:

 <meta charset="utf-8">

Again, we urge you just to copy/paste that code and not worry about it.

Icon Declaring Validation

Once you get your page to validate, you can put some HTML code on your page to give it a seal of approval, declaring that it is valid (and what standard it meets). You will see examples of this strategy in lab.

The very cool thing about this icon is that it is clickable, and clicking it will cause the validator to process your page again. Thus, you can modify your page, upload the changes, and click the icon to re-validate it, making validation very easy. In fact, we suggest that you put the icon on your page before it's valid, and use it during your debugging process.

The snippet of code is just the following, so go ahead and copy/paste it into your pages. The code doesn't use anything we don't know, so read it!

<p>
  <a href="http://validator.w3.org/check?uri=referer">
     <img 
       src="http://cs.wellesley.edu/~cs110/Icons/valid-html5v2.png"
       alt="Valid HTML 5"
       title="Valid HTML 5"  
       height="31" width="88">
  </a> 
</p>

An HTML Template

The preceding requirements result in a kind of boilerplate that you'll need for all your web pages. Feel free to copy/paste the following to begin all your HTML pages, or use this template.html.

<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <link rel="stylesheet" href="style.css">
    <meta name="author" content="Your name here">
    <title>title here</title>
</head>
<body>

<p>
  <a href="http://validator.w3.org/check?uri=referer">
     <img 
       src="http://cs.wellesley.edu/~cs110/Icons/valid-html5v2.png"
       alt="Valid HTML 5"
       title="Valid HTML 5"  
       height="31" width="88">
  </a> 
</p>
</body>
</html>

Solutions to Exercises