
Converting from HTML to XHTML

If you already have a web page written in HTML and you would like to convert it into XHTML, then follow the simple steps in this tutorial. If you are planning on starting from scratch, then you should start at our XHTML Tutorials instead.

Step 1: Convert elements to lowercase

In XHTML, all element and attribute names must be in lowercase. If you have written your code in uppercase, this step may take a while, so better put the kettle on.

For example, <EM>emphasised text</EM> is not valid XHTML. It should be changed to read <em>emphasised text</em>. The same applies to attribute names within an element, e.g. <a HREF="http://...">link text</a> is not valid XHTML, whereas <a href="http://...">link text</a> is. Only the names change; attribute values such as URLs should be left exactly as they are.
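If you have a lot of uppercase markup, a script can do most of the work. The following is a minimal sketch in Python, not a full HTML parser: the function name lowercase_names is made up for this example, and it assumes that no attribute value contains a ">" character and that the page has no tags hidden inside comments or scripts.

```python
import re

# A start or end tag: "<", optional "/", the element name, then everything up to ">".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")

# One attribute: leading whitespace, the name, and an optional "=value" part.
ATTR = re.compile(
    r"""(\s+)([A-Za-z_:][-A-Za-z0-9_:.]*)(\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]+))?"""
)


def lowercase_names(markup):
    """Lowercase element and attribute names, leaving values and text alone."""

    def fix_attr(match):
        space, name, value = match.group(1), match.group(2), match.group(3)
        return space + name.lower() + (value or "")

    def fix_tag(match):
        slash, name, rest = match.group(1), match.group(2), match.group(3)
        return "<" + slash + name.lower() + ATTR.sub(fix_attr, rest) + ">"

    return TAG.sub(fix_tag, markup)


print(lowercase_names('<A HREF="http://EXAMPLE.com/Page">Link TEXT</A>'))
```

Check the result by eye afterwards, as a regular expression cannot cope with every unusual page.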

Step 2: Replace the !DOCTYPE declaration

The second step is to make sure that you have the correct !DOCTYPE declaration at the start of your document. This helps when it comes to checking your code with online validators such as W3C's Validator. Replace your existing declaration with the following code, or, if you don't have one, put this code at the very beginning of your document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

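You've probably noticed the XML declaration on the first line. It says that your document is written in XML which, along with HTML, is one of the parents of XHTML. The declaration is optional when the page is encoded as UTF-8, and some older browsers (Internet Explorer 6, for instance) switch to "quirks" rendering when anything appears before the !DOCTYPE, so you may prefer to leave it out.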
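If you have many pages to convert, swapping the declaration can be scripted. This is a rough sketch in Python under a few assumptions: the names XHTML_PROLOG and replace_doctype are invented for the example, and any existing XML declaration or !DOCTYPE is assumed to sit at the very top of the page.

```python
import re

# The XML declaration and XHTML 1.0 Strict DOCTYPE from this step.
XHTML_PROLOG = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
)

# An optional XML declaration and an optional DOCTYPE at the start of the page.
OLD_PROLOG = re.compile(
    r"\A\s*(?:<\?xml[^>]*\?>\s*)?(?:<!DOCTYPE[^>]*>\s*)?", re.IGNORECASE
)


def replace_doctype(markup):
    """Strip any existing prolog and put the XHTML one in its place."""
    return XHTML_PROLOG + OLD_PROLOG.sub("", markup, count=1)


old_page = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">\n<html></html>'
print(replace_doctype(old_page))
```

A page with no declaration at all simply gets the new one added at the top.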

Step 3: Change the <html> tag

This step involves replacing your existing <html> tag with:

<html xmlns="http://www.w3.org/1999/xhtml"
  xml:lang="en" lang="en">

Note that the two letters "en" refer to the language in which the page is written, so replace them with the code for whatever language your page uses, e.g. French is "fr". Take a look at the language codes page for more details.
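The replacement can also be scripted. Here is a small Python sketch; the helper names xhtml_root_tag and replace_html_tag are hypothetical, and it assumes the page contains a single <html> tag whose attributes contain no ">" character.

```python
import re

# The opening <html> tag, in any case, with whatever attributes it has.
HTML_TAG = re.compile(r"<html\b[^>]*>", re.IGNORECASE)


def xhtml_root_tag(lang="en"):
    """Build the XHTML <html> tag for the given two-letter language code."""
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml"\n'
        f'  xml:lang="{lang}" lang="{lang}">'
    )


def replace_html_tag(markup, lang="en"):
    """Swap the first <html> tag for the XHTML version."""
    # A function is used as the replacement so the new tag is inserted literally.
    return HTML_TAG.sub(lambda match: xhtml_root_tag(lang), markup, count=1)


print(replace_html_tag("<HTML LANG=fr><body></body></html>", lang="fr"))
```

Only the opening tag is touched, so the closing </html> tag still needs to be in lowercase.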

Step 4: Make sure all elements are closed

You should now ensure that all your elements are properly closed. For example, every <p> tag must now be accompanied by a </p> tag. This applies to all elements whose closing tags were optional in HTML, including <li> elements.

Elements that are normally considered "empty" (single elements that exist alone) must also be closed, for example <br> must now be written as <br />. (XML also allows <br/> without the space, but some older browsers do not handle that form correctly, so the extra space before the slash is recommended.)
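A quick way to spot unclosed elements is to hand a piece of markup to an XML parser, which rejects anything that is not well formed. The sketch below uses Python's standard xml.etree.ElementTree module; the function name is_well_formed is made up for the example. It is meant for small fragments with a single outer element, and named entities such as &nbsp; will be rejected because the parser does not read the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# An unclosed <br> and unclosed <li> elements are rejected...
print(is_well_formed("<p>one<br>two</p>"))
print(is_well_formed("<ul><li>first<li>second</ul>"))

# ...while the closed versions are accepted.
print(is_well_formed("<p>one<br />two</p>"))
print(is_well_formed("<ul><li>first</li><li>second</li></ul>"))
```

The same check also catches unquoted attribute values, minimized attributes and overlapping elements, so it is worth running again after the later steps.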

Step 5: Ensure all attributes are quoted

XHTML requires all attribute values to be surrounded by quotes, so for example <table width=100%> is not "well formed", whereas <table width="100%"> is. The next step is to go through your code and check that all the attribute values are correctly quoted.

This may take a long time if you are not used to quoting all your attribute values, so once again... get some coffee...
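If the coffee runs out, a script can add most of the missing quotes. This Python sketch is only a starting point: the name quote_attributes is invented here, and it assumes that no attribute value contains a ">" character and that unquoted values contain no spaces.

```python
import re

# Any start tag, from "<name" up to the closing ">".
TAG = re.compile(r"<[A-Za-z][^<>]*>")

# An attribute with a value: group 1 is ' name=', group 2 is the value,
# which may be double-quoted, single-quoted or not quoted at all.
ATTR = re.compile(r"""(\s+[^\s"'<>/=]+\s*=\s*)("[^"]*"|'[^']*'|[^\s"'<>]+)""")


def quote_attributes(markup):
    """Wrap unquoted attribute values in double quotes."""

    def fix_attr(match):
        value = match.group(2)
        if value[0] in "\"'":
            return match.group(0)  # already quoted, leave it alone
        return match.group(1) + '"' + value + '"'

    return TAG.sub(lambda tag: ATTR.sub(fix_attr, tag.group(0)), markup)


print(quote_attributes("<table width=100% border=1>"))
```

Values that are already quoted are matched as a whole and passed through untouched, so text inside them is not changed.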

Step 6: 'Un-minimize' minimized attributes

Although this section has an odd title, it is a fairly simple stage. In HTML, certain attributes can be minimized (written without a value) because they have only one possible value. For example, <hr noshade> is a valid HTML element, as the "noshade" attribute has only one possible value, which is "noshade". In XHTML attributes cannot be minimized, so you would write <hr noshade="noshade" />. (Note that "noshade" is a presentational attribute, so it validates against the Transitional DTD only, not the Strict one given in Step 2.)

This rule also applies to 'checked' in the <input type="radio" /> and <input type="checkbox" /> tags and to 'selected' in the <option></option> tag, as well as any other lesser used minimized attributes.
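Expanding minimized attributes is mechanical enough to script. The following Python sketch uses a hand-picked list of common minimized attributes; the names BOOLEAN_ATTRS and expand_minimized are made up for this example, the list is not guaranteed to be complete, and attribute values are assumed not to contain a ">" character.

```python
import re

# Attributes that HTML allows to be written without a value (not exhaustive).
BOOLEAN_ATTRS = {
    "checked", "selected", "disabled", "readonly", "multiple",
    "noshade", "nowrap", "compact", "ismap", "defer", "noresize",
}

# A start tag: group 1 is the element name, group 2 is everything after it.
TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)([^<>]*)>")

# One attribute: whitespace, the name, and an optional "=value" part.
ATTR = re.compile(
    r"""(\s+)([A-Za-z_:][-A-Za-z0-9_:.]*)(\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'<>]+))?"""
)


def expand_minimized(markup):
    """Rewrite minimized attributes such as 'checked' as checked="checked"."""

    def fix_attr(match):
        space, name, value = match.group(1), match.group(2), match.group(3)
        if value is None and name.lower() in BOOLEAN_ATTRS:
            return f'{space}{name.lower()}="{name.lower()}"'
        return match.group(0)  # has a value already, or is not in the list

    def fix_tag(match):
        return "<" + match.group(1) + ATTR.sub(fix_attr, match.group(2)) + ">"

    return TAG.sub(fix_tag, markup)


print(expand_minimized('<input type="checkbox" checked />'))
print(expand_minimized("<option selected>"))
```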

Step 7: Make sure that every image has an 'alt' attribute

The "alt" attribute used in <img /> elements is now a required part of your code. This attribute was always a very useful one, but now you don't have the choice! The "alt" attribute supplies the alternate text to be used if the image cannot be seen (for example in a text-only browser, or by a blind person using a screen reader). It is used in the following way: <img src="..." alt="alternate text" />.

Go through your code and make sure that every image element contains this attribute. If you have an image that carries no information, for example a clear spacer GIF, then use <img src="..." alt="" />. This ensures that no "tooltips" are shown when the user moves the mouse over the image, and lets text-only browsers and screen readers skip it.
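Rather than hunting for images by eye, you can list the ones with no "alt" attribute. This is a minimal Python sketch built on the standard html.parser module; the class and function names (MissingAltFinder, images_missing_alt) are invented for the example. An empty alt="" counts as present, as described above.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the line number and src of every <img> without an alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            line_number = self.getpos()[0]
            self.missing.append((line_number, attributes.get("src")))


def images_missing_alt(markup):
    """Return a list of (line number, src) pairs for images lacking alt."""
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


page = '<p><img src="a.gif" alt="" />\n<img src="b.gif" /></p>'
for line_number, src in images_missing_alt(page):
    print(f"line {line_number}: {src} has no alt attribute")
```

The script only finds the images; choosing sensible alternate text for each one is still up to you.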

Step 8: Check for 'overlapping' elements

In the past browsers have tolerated "overlapping" elements, for example <b><i>text</b></i>. This is not allowed in XHTML: you must ensure that your elements are closed in the reverse of the order in which they were opened, i.e. "first opened" = "last closed". For example, <b><i>text</i></b> is the correct way of nesting your elements.
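The "first opened" = "last closed" rule is the same as keeping a stack of open elements, which makes overlaps easy to detect. Here is a small Python sketch using the standard html.parser module; the names NestingChecker and nesting_errors are made up for this example, and it assumes you have already closed every element as described in Step 4.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Track open elements on a stack and record closing tags that overlap."""

    def __init__(self):
        super().__init__()
        self.stack = []   # elements opened but not yet closed
        self.errors = []  # (closing tag found, element that should close first)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()  # closes the most recently opened element
        else:
            expected = self.stack[-1] if self.stack else None
            self.errors.append((tag, expected))


def nesting_errors(markup):
    """Return a list of (closing tag, expected tag) pairs for bad nesting."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


print(nesting_errors("<b><i>text</b></i>"))
print(nesting_errors("<b><i>text</i></b>"))
```

Each reported pair names the closing tag that was found and the element that should have been closed before it.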

Step 9: Check 'type' attributes for script and style elements

The "type" attribute is now required on every <script> and <style> element. With the <style> element, the most common implementation will be:

<style type="text/css">
  stylesheet contents...
</style>

With the <script> element, the most common use is:

<script type="text/javascript">
 script contents...
</script>
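To find any <script> or <style> elements that still lack a "type" attribute, something like the following Python sketch may help. It relies on the standard html.parser module; the names MissingTypeFinder and elements_missing_type are hypothetical.

```python
from html.parser import HTMLParser


class MissingTypeFinder(HTMLParser):
    """Collect <script> and <style> elements that have no type attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style") and "type" not in dict(attrs):
            line_number = self.getpos()[0]
            self.missing.append((tag, line_number))


def elements_missing_type(markup):
    """Return a list of (element name, line number) pairs."""
    finder = MissingTypeFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


page = (
    '<style type="text/css">p { color: red; }</style>\n'
    "<script>var x = 1;</script>"
)
for tag, line_number in elements_missing_type(page):
    print(f"line {line_number}: <{tag}> has no type attribute")
```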

Step 10: Validate your code!

This isn't really a step in the conversion process, but it is good practice! Go to a validation service such as validator.w3.org and get your code checked... it's free so you have absolutely no excuses for making sloppy markup!

You may find some errors if you had a particularly complicated page to start with, but most of your pages should now be valid XHTML! If you did find some errors, check that you have followed all the steps completely. If you are sure you have, consult the differences page to find out whether you have missed anything!
