Class 2 – XHTML Syntax

To recapitulate the main concepts of XHTML, remember that XHTML is the latest incarnation of HTML.  The X stands for “Extensible” and signals that XHTML is HTML rewritten to follow the rules of XML, so every valid XHTML document is also a well-formed XML document.  Whereas the syntax for HTML was often sloppy and inconsistent, XHTML syntax adheres to the rules of XML, which makes it cleaner and more consistent.  In other words, XHTML is held to a stricter standard.

XML Element Syntax

All XML documents (and thereby all XHTML documents) consist of elements and their attributes.  Elements are described by tags, which are just words surrounded by less-than and greater-than signs (“<” and “>”).  The first word inside the tag is always the tag name.  For example, this is an opening tag describing the start of a “span” element:

<span>

In XML (and thereby in all XHTML), elements must always be closed.  A closing tag for a “span” element would look like this:

</span>

The content in between the opening and closing tags of any element can consist of other elements (i.e. “child elements”), and/or plain text.  Here is an example of a “div” element containing one child element and some plain text.  (A “div” is used here rather than a “span” because XHTML does not allow a “p” element inside a “span”.)

<div>
  <p>this is a paragraph</p>
  this is plain text
</div>

So you see that XML elements can be nested: one inside the other.  The “p” element in this example is a child element of the “div” element.  And conversely, the “div” element is called the parent element of the “p” element.

A note on formatting: If one tag is nested inside another, it should be indented further to the right so you can easily see the nesting visually.  Proper indentation of nested code is a requirement of this course, so please get used to using the tab key.
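
As a quick illustration of the indentation rule, here is a sketch with three levels of nesting.  The choice of elements and the text are made up for this example; each child sits one indentation step (shown here as two spaces) to the right of its parent.

```html
<div>
  <p>
    this paragraph contains <span>a span of text</span>
  </p>
</div>
```

Reading down the left edge, you can tell at a glance that the “span” belongs to the “p”, and the “p” belongs to the “div”.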

Opening and closing tags

As mentioned, all elements must be closed.  So, for example, the following code has an opening tag, <h1>, some text inside of it, and then a closing tag, </h1>.

<h1>This is a heading</h1>

Empty elements

Some elements never have anything nested inside of them.  For those elements, the opening and closing of the element can be indicated using a single tag.  For example, “img” elements, which are used to show images, are always empty:

<img src="images/donkey.gif" />

The opening of the tag is indicated by the less-than sign, “<”, while the closing of the element is indicated by the characters at the end of the tag, “ />”.  So any element that has nothing nested inside of it can be closed in this way with a single tag.

But, if there is something nested inside of an element, it must be opened and closed with separate opening and closing tags.  So, for example, the following code is not valid:

<div />
  Some text inside of the div tag
</div>

The first div tag in this bad example has both an opening and closing: <div />. This is incorrect since the div element is not empty – there is text inside of it.
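
For comparison, a corrected version of the bad example might look like the sketch below.  The “div” is opened with an ordinary opening tag and closed only once, after its content.

```html
<div>
  Some text inside of the div tag
</div>
```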

XML Attribute Syntax

XML elements may have one or more attributes.  An attribute is a name/value pair that goes inside the element’s opening tag.  For example, an “img” tag, which represents an image in XHTML, always has a “src” attribute, which indicates the source file of the image to show.  The following code shows an “img” tag that has a “src” attribute with the value “monkey.gif”:

<img src="monkey.gif" />

The first thing to remember is that the element name, in this case “img”, is always the first thing inside the tag.  Next, notice that the attribute value is surrounded by quotes.  Either double or single quotes are acceptable, so long as the starting quote matches the ending quote.

If you surround an attribute value in double quotes, then it’s okay to use single quotes inside the value.  For example, the following “a” element has two attributes, a “title” attribute and an “href” attribute.  The “title” attribute has a single quote (the apostrophe) inside its value.  This is okay because the entire attribute value is surrounded by double quotes:

<a title="Montezuma's Revenge!" href="revenge.html">dare to click me!</a>
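
The reverse arrangement works the same way.  In this sketch (the title and link text are invented for illustration), the “title” value is surrounded by single quotes so that it can contain double quotes:

```html
<a title='The "Revenge" page' href="revenge.html">click me too</a>
```

What does not work is using the same kind of quote both around the value and inside it, since the parser would treat the inner quote as the end of the value.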

The “img” tag example above also reiterates the point mentioned previously that some XML elements use a single tag to both open and close the element.  The “ />” at the end of the tag indicates the closing of the element.  As a result, nothing can be nested inside of it.  Elements with nothing inside of them are called “empty elements”.  Remember to always put at least one space in front of the “/>” of an empty element, for compatibility with older browsers.
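
Here is a slightly fuller sketch of an empty element with more than one attribute.  The “alt” text is invented for illustration; the XHTML 1.0 Strict DTD lists “alt” alongside “src” as a required attribute of “img”, so a page checked against that DTD would need it.

```html
<img src="monkey.gif" alt="A monkey eating a banana" />
```

Note the space between the last attribute value and the closing “/>”.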

XHTML Document Structure

The basic structure of a standard XHTML document is as follows:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Page Skeleton</title>
    <meta http-equiv="content-type" content="text/html;charset=utf-8" />
    <!-- put any CSS style sheets or JavaScript code here -->
  </head>
  <body>
    <!-- put the visible contents of the page here -->
  </body>
</html>

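The “?xml” line at the top of the document is the XML declaration.  It indicates that this is an XML version 1.0 document using the UTF-8 character encoding.  XML does not strictly require the declaration when the encoding is UTF-8 or UTF-16, but XHTML authors are strongly encouraged to include it, and it becomes necessary if you use another character encoding that is not announced in some other way (for example, by the web server).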

The !DOCTYPE line near the top of the code is a document type declaration.  It names the Document Type Definition (DTD) that the page follows; a DTD defines the document structure with a list of legal elements.  In this case, the declaration includes the URL of a DTD file (which you can download and open in a text editor) that defines all the tags and all the attributes that are allowed in “Strict” XHTML.  “Strict” XHTML forbids many of the bad coding techniques of old-fashioned HTML.  If you wanted to be looser and allow some of the deprecated techniques of HTML, you could use the “Transitional” DTD instead:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

In this class, we will always use the “Strict” DTD. It’s not a bad idea to save a copy of the template above and use it for every page you ever make in this class.

The “html” element is the root element of the XHTML document, meaning it surrounds all the other elements.  All XML documents must have exactly one root element.  By definition in the XHTML DTD, the “html” element must have two children, “head” and “body”.  The “head” element is required to have a “title” child element.

In general, the “head” element contains information that is invisible to the end-user, such as metadata, style sheets, JavaScript code, and other instructions to the browser that do not actually show up visibly on the page.

The “head” element must always have a child “title” element which indicates the title of the page.  While this title does not show up in the main area of the visible page, it does show up in the title bar or tab at the top of the browser window.

The “body” element contains all the data that is visible in the main area of the web page.  If you were to type some text inside the “body” element, it would show up when you load that page in the browser.
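
Putting these pieces together, a minimal filled-in page might look like the sketch below.  The title and paragraph text are invented for illustration.  Two details are assumptions drawn from the XHTML 1.0 specification rather than from the discussion above: the “xmlns” attribute on “html”, which names the XHTML namespace, and the “p” element around the body text, which is there because the Strict DTD expects block-level elements such as “p” or “div” directly inside “body” rather than bare text.

```html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- shown by the browser, but not in the main area of the page -->
    <title>My First Page</title>
    <meta http-equiv="content-type" content="text/html;charset=utf-8" />
  </head>
  <body>
    <!-- everything in here is visible in the main area of the page -->
    <p>This sentence shows up in the main area of the page.</p>
  </body>
</html>
```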

Comments in XHTML are data that you can put in the code that is not displayed on the screen.  They are useful for leaving notes for yourself so you can more easily understand your own code.  A comment starts with <!-- and ends with -->.  For example:

<!-- put the visible contents of the page here -->
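
A comment can also span several lines, as in this sketch (the wording is invented for illustration).  One XML rule to keep in mind is that two hyphens in a row may not appear inside the body of a comment.

```html
<!--
  This comment spans several lines.
  None of it is displayed on the page.
-->
<p>This paragraph is displayed.</p>
```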

Differences between old HTML and XHTML

A few important notes about XHTML for anyone who already knows some HTML:

  • all XHTML element and attribute names must be lowercase
  • all attributes must have values
  • all attribute values must be surrounded by quotes
  • all empty elements must end with the closing “ />” mark
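
To make these rules concrete, here is a sketch of the same markup written first in the loose style that old HTML tolerated and then in XHTML.  The “input” and “br” elements and the text are chosen only for illustration and have not been covered yet.

```html
<!-- old HTML style: uppercase names, unquoted value, attribute with no value, unclosed tags -->
<P>Subscribe?
<INPUT TYPE=checkbox CHECKED>
<BR>

<!-- XHTML style: lowercase names, quoted values, every attribute has a value, every element is closed -->
<p>
  Subscribe?
  <input type="checkbox" checked="checked" />
  <br />
</p>
```

An attribute that old HTML allowed without a value, such as “checked”, is conventionally written in XHTML with its own name as the value.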
