RISC World

NET World

Richard Goodwin on checking your web pages

Checking your HTML

It is all very well writing your HTML code, but the final stage of constructing any website needs to be testing. It is just the same as beta testing a piece of software: you need to check that all the graphics appear as they should, that all the links work and, of course, that your text is spelt correctly.

Web Browsers

HTML should be a universal standard. The theory is that your code will appear the same in all browsers: if your page works in one browser, it will work in all. Unfortunately this is just not the case. As a simple example, suppose you have a website that uses tables. Part of your HTML might look like this:

 <HTML>
 <HEAD>
 <TITLE>WML</TITLE>
 </HEAD>
 <body bgcolor="#FFFFFF" text="#000000" link="#0000DD" vlink="#666666">
 <TABLE BORDER="0" CELLSPACING="5" CELLPADDING="5">
 <P>Some text goes here</P>
 </body>
 </HTML>

This code will work fine in a great many browsers, so if you were using Internet Explorer or Browse the page would display correctly. However, if you were using Netscape the page would not display. This is because the closing tag for the table is missing! The code should be:

 <HTML>
 <HEAD>
 <TITLE>WML</TITLE>
 </HEAD>
 <body bgcolor="#FFFFFF" text="#000000" link="#0000DD" vlink="#666666">
 <TABLE BORDER="0" CELLSPACING="5" CELLPADDING="5">
 <P>Some text goes here</P>
 </TABLE>
 </body>
 </HTML>

Without the closing </TABLE> tag Netscape will not render the page and you end up with a blank screen.

Frames are another example that can cause problems. Suppose you want a frame that appears right at the top of the browser window. Your code fragment might look like this:

 <HTML>
 <TITLE>Welcome to a page.</TITLE>
 <frameset rows="42,*,80" FRAMEBORDER="0" FRAMESPACING="0" BORDER="0">
 <frameset cols="120,*" FRAMEBORDER="0" FRAMESPACING="0" BORDER="0">
 <FRAME NAME="Sidetop" SRC="sidebar/sidetop.htm" RESIZE="no" SCROLLING="no">
 <FRAME NAME="Top" SRC="topbar/topbar.htm" RESIZE="no" SCROLLING="no">
 </FRAMESET>

This would work in some browsers, but in Internet Explorer the frames would not be in the correct place. To get the page to render correctly in Internet Explorer it would have to be amended to include marginwidth and marginheight attributes, as shown below.

 <HTML>
 <TITLE>Welcome to a page.</TITLE>
 <frameset rows="42,*,80" FRAMEBORDER="0" FRAMESPACING="0" BORDER="0">
 <frameset cols="120,*" FRAMEBORDER="0" FRAMESPACING="0" BORDER="0">
 <FRAME NAME="Sidetop" SRC="sidebar/sidetop.htm" RESIZE="no" SCROLLING="no"
 marginwidth="0" marginheight="0" >
 <FRAME NAME="Top" SRC="topbar/topbar.htm" RESIZE="no" SCROLLING="no" marginwidth="0"
 marginheight="0">
 </FRAMESET>

This shows us one very important thing: browsers interpret HTML in different ways. If you want to be sure your site works in all browsers, you will need to check it in as many browsers as possible.

Validation

Just as with a programming language it is possible, in fact very easy, to write faulty HTML. The page may well work (see the table example) but it cannot be trusted. However, there is a simple solution: get the page validated. There are a number of on-line HTML validators available. Just enter the URL of your site and see the number of errors in your code.

However, a word of caution: if you correct the mistakes in your HTML and do it "by the book", you may find that the page no longer works as you intended! So I suggest you correct the mistakes one at a time, checking the page by hand in your browser after each one. If a problem suddenly occurs you will know which change caused it and be able to fix it. If you correct all the "mistakes" in one go it can sometimes be very difficult to get a page working again!
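Before turning to an on-line validator, a rough local check for one kind of fault, the missing closing tag from the table example, can be sketched in a few lines of Python. This is only an illustration and not a substitute for a real validator: the names UnclosedTagFinder and find_unclosed_tags, and the two tag lists, are my own assumptions, and the check knows nothing about which tags are allowed inside which.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so they are not tracked (assumed list).
VOID_TAGS = {"area", "base", "br", "col", "frame", "hr", "img",
             "input", "link", "meta", "param"}
# Tags whose closing tag may legally be left out (assumed, partial list).
OPTIONAL_CLOSE = {"p", "li", "dt", "dd", "tr", "td", "th", "option"}


class UnclosedTagFinder(HTMLParser):
    """Keeps a stack of open tags and notes any that are never closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # tags opened and not yet closed
        self.unclosed = []    # tags found to be missing their closing tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag; ignored in this sketch
        # Anything opened after `tag` and still open was never closed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            if top not in OPTIONAL_CLOSE:
                self.unclosed.append(top)


def find_unclosed_tags(html_text):
    """Return the names of tags that were opened but never closed."""
    finder = UnclosedTagFinder()
    finder.feed(html_text)
    finder.close()
    leftovers = [t for t in finder.open_tags if t not in OPTIONAL_CLOSE]
    return finder.unclosed + leftovers


faulty_page = """<HTML><HEAD><TITLE>WML</TITLE></HEAD>
<body><TABLE BORDER="0"><P>Some text goes here</P></body></HTML>"""
print(find_unclosed_tags(faulty_page))  # prints ['table']
```

Run against the corrected version of the page, with the closing table tag in place, the same function returns an empty list.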

Link and File checking

Many websites are hosted on UNIX or Linux machines. These differ from RISC OS or Windows machines in a great many ways, but for the purposes of constructing a website the one really important difference is the way file names are handled. On a RISC OS or Windows machine the following would all point to the same file:

 <a href="index2.htm">An index</a>
 <a href="Index2.htm">An index</a>
 <a href="INDEX2.HTM">An index</a>
 <a href="index2.HTM">An index</a>

However, on a UNIX or Linux filing system the above links would point to four different files. This is because file names are case sensitive: in the same directory you could have a file called Index.htm and one called index.htm. So you might find that your site works on your computer when the link is to index.htm and the file is called Index.htm, but once the site is uploaded to a UNIX server the link will no longer work. The easiest way around this problem is to ensure that all your file names and links are always in lower case.

Don't forget that this applies not only to hyperlinks but also to links to graphics: fig1.gif and Fig1.gif are not the same file!
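Case mismatches of this kind are easy to miss by eye, so here is a minimal Python sketch of a check you could run before uploading. It collects the href and src values from a page and compares them with a list of the file names that really exist. The names LinkCollector and find_case_mismatches are hypothetical, and the sketch assumes all the files sit in one directory alongside the page.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Gathers the values of href and src attributes from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Attribute names arrive in lower case; values keep their case.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_case_mismatches(html_text, filenames):
    """Return (link, actual name) pairs that differ only in letter case."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    # If two files differ only in case, the later one wins here.
    by_lower = {name.lower(): name for name in filenames}
    mismatches = []
    for link in collector.links:
        parts = urlsplit(link)
        if parts.scheme or parts.netloc or not parts.path:
            continue  # skip absolute URLs and bare "#fragment" links
        actual = by_lower.get(parts.path.lower())
        if actual is not None and actual != parts.path:
            mismatches.append((parts.path, actual))
    return mismatches


page = '<a href="Index.htm">An index</a> <img src="fig1.gif">'
files = ["index.htm", "Fig1.gif"]
print(find_case_mismatches(page, files))
# prints [('Index.htm', 'index.htm'), ('fig1.gif', 'Fig1.gif')]
```

In practice the list of file names could come from os.listdir() on the directory that holds the site. Links to files that do not exist at all are not reported by this sketch, only those that would work on a case-insensitive filing system and fail on a case-sensitive one.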

More next time.

Richard Goodwin
