Frequently Encountered Problems: HTML

The Frequently Encountered Problems list, hereafter referred to as the FEP-list, is a 'reversed FAQ'. It is meant to work as a check-list for those writing HTML documents - perhaps with emphasis on all those doing it for the first time.

We, the authors, would like to state quite clearly from the start that this document does not concern itself with philosophy. The topics discussed here are directly related to quite real problems. We are not out to convince you of "our philosophy", nor to voice our personal opinions. This list deals with how to insure your documents against failure.

This document is maintained by the WDG, and can be found at http://www.htmlhelp.com/faq/. The maintainer is tina@htmlhelp.com, to whom all contributions can be sent. All contributors will be listed.

History and Contributors

First version: Tina Marie Holmboe, 17th of December, 1996
Last update: Tina Marie Holmboe, 23rd of February, 1997
Current version is 1.005


Index

  1. Does your document include a DOCTYPE declaration?
  2. Did you remember the TITLE element?
  3. Have you defined a background color, or background image?
  4. How about those attribute values - did you remember to quote the ones that need to be quoted?
  5. You didn't by any chance try to indent text by using the BLOCKQUOTE, DD or UL elements, or with an 'invisible' GIF?
  6. Did you use the FONT element?
  7. Have you validated your documents?
  8. Are you using frames?
  9. Have you remembered to include ALT attributes for your IMG's?
  10. Did you also remember a useful ALT text for the counter?
  11. You've included a mailto: link, and attempted to include the Subject: with it?
  12. Did you try to resize an image by specifying its HEIGHT and WIDTH in percentages ?
  13. You wanted to use empty table-cells ?

i. Does your document include a DOCTYPE declaration?

[tina 17/12/1996 tina 23/02/1997]

If not, you should be aware that such a declaration, typically of the form:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

is mandatory for both HTML 2.0 (http://www.w3.org/pub/WWW/MarkUp/html-spec/) and HTML 3.2 (Wilbur) (http://www.w3.org/pub/WWW/TR/WD-html32) documents. In most situations a missing DOCTYPE isn't fatal in any way, but a document without one could, legally, be interpreted by a browser as conforming to HTML 2.0 (http://www.w3.org/pub/WWW/MarkUp/html-spec/) - which may not produce the desired effect. In such a case, the browser could ignore all the bits of your document which contain HTML not found in 2.0.

Since validation of your documents is a highly recommended practice, a correct DOCTYPE is essential. Additionally, since such a declaration is never wrong, it is a good idea to add one. The following is a list of the standard HTML DTDs. Add the proper one, exactly as it occurs here, as the first line of your documents.

HTML 2.0

HINT: The HTML 2 spec does not include tables. If you have included tables in your document, you should use the HTML 3.2 DTD.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
This refers to the IETF's HTML 2.0 DTD.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Level 2//EN">
Same as above.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Level 1//EN">
This refers to the HTML 2.0 DTD, level 1 - no FORM elements are permitted in an HTML document of this type.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN">
This refers to the strict HTML 2.0 DTD, a more structurally rigid version of HTML 2.0.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict Level 1//EN">
As above, this refers to a stricter HTML, this time with the same restrictions on forms as the HTML 2.0 Level 1 doctype.

HTML 3.2

For HTML 3.2, the declaration shown at the start of this section is acceptable:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">


ii. Did you remember the TITLE element?

[tina 17/12/1996 alan 14/01/1997]

TITLE specifies text for use in external references to your page. It should make sense when taken out of context, e.g. when seen in a bookmark list or a search engine response. When potential readers search for "Foo Corporation", which page are they going to read first: the one whose title clearly shows that it's relevant to their enquiry, or the one that says "No Title"?

It's not without good reason that the HTML specifications make this element mandatory. Because of limitations in some browsers and operating systems, it's best if you don't exceed 63 characters in length.
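
As a rough sketch of how items i and ii fit together, a minimal HTML 3.2 document might start like this. The title text and the body content are invented for illustration; only the DOCTYPE line is taken from the list above.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
 <HEAD>
  <!-- A title that still makes sense in a bookmark list or a search result -->
  <TITLE>Foo Corporation: Product Catalogue</TITLE>
 </HEAD>
 <BODY>
  <H1>Product Catalogue</H1>
  <P>...content...</P>
 </BODY>
</HTML>
```

The title here is well under the 63 characters suggested above.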


iii. Have you defined a background color, or background image?

[tina 17/12/1996 tina 23/02/1997]

If you have, ask yourself these questions:

Have I defined a TEXT (http://www.htmlhelp.com/reference/wilbur/body/body.html) color which makes the text easily readable on that background ?

If a background color is defined on the BODY element, it will override the default background in most browsers; the other default colors, however, will remain unaffected. Therefore, if one were to set the background color to black but did not set the text color to white, the text might by default appear as black on a black background, rendering it unreadable.

Remember that choosing a color from a scheme that utilizes more than 8 bits (256 colors) may increase the probability that it is unreadable. Your best bet is to use the colors found in the HTML 3.2 spec (http://www.htmlhelp.com/reference/wilbur/body/body.html).

There is also a so-called "Web Palette" which is used by Microsoft Internet Explorer, Netscape and Mosaic. You can find more information about this at the colorguide (http://www.oit.itd.umich.edu/projects/DMS/answers/colorguide/)

Did I set the LINK (http://www.htmlhelp.com/reference/wilbur/body/body.html), VLINK (http://www.htmlhelp.com/reference/wilbur/body/body.html), and ALINK (http://www.htmlhelp.com/reference/wilbur/body/body.html) colors as well? Even if you have set the TEXT color, if you fail to set these as well, links on your pages may be invisible on your chosen background.

What happens if the users have image loading disabled? In such a case it is very important that you have set the BGCOLOR attribute, and with it a suitable set of TEXT, LINK, VLINK and ALINK colors. If not, the background of the user's browser may be the same as the text color you have set.

Summary: There are three viable combinations of attributes to the BODY element.

  1. No colors, no background image (leaving it to the user's choice only)
  2. Colors for all of BGCOLOR, TEXT, LINK, ALINK and VLINK; no image;
  3. As 2, but with background image. Choose a BGCOLOR similar to the predominant color of the background image.
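
Combinations 2 and 3 could be written roughly as follows. This is only a sketch: the two BODY tags are alternatives, not meant to appear in the same document, the particular colors are merely one readable choice, and paper.gif is a hypothetical file name.

```html
<!-- Combination 2: every color set, no image -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000"
      LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">

<!-- Combination 3: as above, plus an image; BGCOLOR approximates
     the predominant color of the (hypothetical) paper.gif -->
<BODY BACKGROUND="paper.gif" BGCOLOR="#FFFFFF" TEXT="#000000"
      LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
```

Note that every value is quoted, since '#' is not among the characters that may appear unquoted (see item iv).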

iv. How about those attribute values - did you remember to quote the ones that need to be quoted?

[tina 17/12/1996 tina 14/01/1997]

One of the more common constructions in HTML code these days is

  align=center

In that piece of code, ALIGN is a so-called "attribute" to some HTML element or other. center is the attribute's "value". In this case, the code can be written exactly as above.

Another favorite is

  width=100%

In this case the percentage character is wreaking havoc. Some characters are not allowed in unquoted attribute values, so a conforming browser could quite legitimately ignore the value altogether, which might not produce the effect you were after.

In short, some characters need to be quoted when used in such a fashion. To put it simply: if a string used as a value for an attribute contains any character not among the following, it must be quoted.

   A-Z a-z 0-9 . -

When in doubt, quote. It never hurts, and can improve the readability and maintenance of your pages.
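
A few illustrative start tags, applying the rule above. The elements chosen are arbitrary examples; the URL is the one given for this document.

```html
<!-- Only letters, so the quotes are optional (but harmless) -->
<P ALIGN=center>
<P ALIGN="center">

<!-- '%' is outside the permitted set: the value must be quoted -->
<TABLE WIDTH="100%">

<!-- ':' and '/' are outside the set too, so URLs always need quotes -->
<A HREF="http://www.htmlhelp.com/faq/">
```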


v. You didn't by any chance try to indent text by using the BLOCKQUOTE, DD or UL elements, or with an 'invisible' GIF?

[tina 17/12/1996 tina 23/02/1997]

When indenting text, there are mainly two issues to consider. Either you want to indent the first line of a paragraph, or you wish to do so with an entire block of text.

The former is fairly easy - you can do so by prefixing the line you wish to indent with N so-called 'non-breaking spaces', or &#160;, which will push the text-line in by N characters. This is a safe method; on browsers that don't support &#160;, the only negative effect is a missing indent.
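
For instance, with N = 4 the markup might look like this (the paragraph text is, of course, only a placeholder):

```html
<!-- Four non-breaking spaces push the first line in by four characters -->
<P>&#160;&#160;&#160;&#160;The first line of this paragraph starts
with an indent; the lines that follow it do not.</P>
```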

Indenting a whole block of text, however, is another matter. Before you do, ask yourself why you would want to. Is there a really good reason why you want that block indented ?

If you find that you have a good reason, there are two methods which in themselves will not cause unpredictable results.

1: Put up a table that looks like this:

 <div align="center">
  <table width="80%">
   <tr>
    <td> ...stuff... </td>
   </tr>
  </table>
 </div>

Use <div align="right"> if you only want to achieve indenting on the left side, and remember to end the table with <br clear="all"> in such a case.

2: Use a style sheet (http://www.htmlhelp.com/stylesheets/). This will allow you to specify layout in addition to structure for your HTML documents. Sadly, CSS1 (Cascading Style Sheets, level 1) is only partially implemented in the Amaya (http://www.w3.org/pub/WWW/Amaya/), Arena (http://www.w3.org/pub/WWW/Arena/), and Microsoft Explorer (http://www.microsoft.com/ie/default.asp) browsers. It is also implemented, after a fashion, in Netscape's Communicator 4.0b2 (http://www.netscape.com/) browser.
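
A minimal sketch of the style sheet approach, assuming a browser with at least partial CSS1 support. The 10% margins are an arbitrary choice, and the rule as written indents every paragraph in the document; browsers without style sheet support should simply show the text unindented.

```html
<HEAD>
 <TITLE>Indented paragraphs</TITLE>
 <STYLE TYPE="text/css">
 <!--
  /* Indent all paragraphs on both sides; the SGML comment around
     this rule hides it from browsers that do not know STYLE */
  P { margin-left: 10%; margin-right: 10% }
 -->
 </STYLE>
</HEAD>
<BODY>
 <P> ...stuff... </P>
</BODY>
```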

Many tricks have been employed to achieve indenting - and none of them work as they are intended to. The GIF trick will only work on graphical browsers, and then only when the user has not turned image loading off. In most other cases, an annoying image placeholder will be inserted where the GIF would have been.

DD is only valid inside a definition list. If you use a DD outside of such a list, a browser might - worst case - simply not display the text that follows it at all.

UL is an unordered list, and can only contain the element LI. Anything else that is inside UL, but outside of an LI, may be ignored by the browser.

BLOCKQUOTE is used to enclose block quotations from other works - and is not guaranteed to be shown indented by any browser.


vi. Did you use the FONT element?

[tina 17/12/1996 tina 23/02/1997]

When debating the FONT element, which is a part of the HTML 3.2 definition, there are three issues to deal with. The first concerns the SIZE attribute which is part of the standard, the second deals with the COLOR attribute which is also included in the standard, and the third is about the FACE one, which is not.

There has been some debate over the mere inclusion of the FONT element in the HTML standard. Adjusting the font is somewhat contrary to the point of HTML as a mark-up language. Since it is there, however, there are some things to remember. Let us, again, illustrate with an example.

In this case we've got a person, gender and age unknown, who is the lucky owner of a 21" color monitor. The lucky person is keen to preserve her, or his, eyesight, and so stays a proper distance away from the screen. The font size of the user's web browser has been adjusted to fit - 16 points. Along comes our happy HTML coder, and sets <font size="+7">. Our lucky user now has a point size of 23 - that is a very large font...

The problem works the same the other way around. Let us say the user has a very small font, 8pt, and you set <font size="-7">. You have now created a slight problem for the user's software: it should render the text in a font 1 point high.

Setting the font size, therefore, can be a very tricky business indeed. It is advised that you use the <BIG> and <SMALL> elements instead. These have the advantage that a browser is simply told to render some parts of the text BIGer or SMALLer than the rest, and can do so in relation to the user's chosen font size.
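
As a small illustration (the sentence itself is just filler), the relative elements could be used like this:

```html
<P>Normal text, <BIG>somewhat larger text</BIG> and
<SMALL>somewhat smaller text</SMALL>, each sized relative to
whatever font size the reader has chosen.</P>
```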

Specifying color for a specific bit of text is seen by many as a Good Thing, but do remember that the <FONT COLOR...> construct overrides both the document's and the user's choice of colors.

The FACE attribute to the FONT element is quite another matter. This is not a part of any HTML standard before 'Cougar', and has several problems associated with it. Firstly: there is no standard for font names across different platforms and operating systems. Secondly: even if there were, there is no guarantee that each user has the fonts you wish to use. Thirdly: only two browsers, Microsoft Internet Explorer and Netscape, support the FACE attribute, and then only on the DOS, Macintosh and Windows NT platforms.

Add these three details together, and it should leave an impression that using the FACE attribute is a Bad Idea. It is possible to do font adjustment in a much more platform independent and structurally sound way with style sheets (http://www.htmlhelp.com/stylesheets/).

You can read more about the Hazards of Fonts in Warren Steel's What's wrong with the <FONT> element? (http://www.mcsr.olemiss.edu/%7Emudws/font.html)


vii. Have you validated your documents?

[tina 17/12/1996 tina 23/02/1997]

Validation is always a Good Idea when it comes to HTML documents, even those written by the most experienced authors. For the most part this is an automatic process, and several tools exist on the Web to help you.

But first: there are generally two types of "validators", the SGML parsers, and the "fluff" checkers.

An SGML parser will take an HTML DTD - which can be described as the definition of HTML - and check whether your document conforms to this. SGML parsers are very picky, but immensely useful if you wish to make sure that your document will not break in any way. The importance of SGML parsers lies in the fact that HTML is a language for describing the structure of a document, and not its layout. When the HTML code is correct as specified by the DTD version you have used, you are assured that any correct and conforming HTML browser will know what to do with your document, and also that the same browser will retain the structure as you have marked it up.

"Fluff" checkers are more lax in their approach, but still immensely useful. Some check whether you've followed common 'style' rules, some do spell-checking, others verify links for their correctness, and yet others will flag constructs that might cause problems for disabled users of the web.

Some HTML validators and fluff checkers...

A Kinder, Gentler HTML Validator (http://ugweb.cs.ualberta.ca/~gerald/validate/)
This is a very nice Validator service, based on an SGML parser. Note that this service is dependent on the proper DOCTYPE declarations being present in your document.
Bobby (http://www.cast.org/bobby/)
Created to validate, and check, HTML pages with special references to disabilities, 'Bobby' also delivers a measure of load-time, as well as the possibility to validate against many different browsers. A very nice service. Not an SGML parser.
Doctor HTML v4 (http://imagiware.com/RxHTML/)
By far the most extensive, the 'doc will check your spelling, image syntax, table syntax, etc., etc., etc. Not an SGML parser.

The WDG List of Validators (http://www.htmlhelp.com/links/validators.htm)
A list of further reading on the field of validators.

viii. Are you using frames?

[tina 17/12/1996 tina 23/02/1997]

Then there are several details you really ought to know - not all browsers today support them, and FRAMES are not a part of HTML 2.0 (http://www.w3.org/pub/WWW/MarkUp/html-spec/), HTML 3.2 (Wilbur) (http://www.w3.org/pub/WWW/TR/WD-html32) or Cougar, nor were they a part of the now obsolete HTML 3.0 standard.

This means that there are a lot of people out there who won't be able to take advantage of your pages, unless you provide them with a frame-less alternative. This, to many, seems like a lot of work. Fear not, as it isn't. Here is the 'skeleton' of a typical FRAMES-based page.

  <FRAMESET ROWS="10%,*">
   <FRAME SRC="topframe.html">
   <FRAME SRC="bottomframe.html">
  </FRAMESET>

  <!-- place the content of the most important frame here, -->
  <!-- just as if it were a normal document.               -->
  <!-- Suggestion: create a text-only navigation page for  -->
  <!-- this section.                                       -->

There is, contrary to popular belief, no reason to include the NOFRAMES element. When a conforming browser sees <FRAMESET>, it will ignore everything between that and the </FRAMESET> tag. Everything after that will be displayed as usual. All you have to do is fill out that latter part with useful information - and don't insist that your users update to a so-called "frames-capable" browser. Many will find that frightfully rude.

It is also important to remember that, when using frames, the bookmark function at the user's end might only register the top page. When a user then returns, he or she will have to search through the entire site again. This might cause a lot of people to never visit your page more than once.

Finally, be aware that there are today browsers that allow users to turn off frames, and some users do so because they simply don't like them. So before you go ahead with your frames-based design, ask yourself whether those frames really are the best way to present your pages.


ix. Have you remembered to include ALT attributes for your IMG's?

[tina 17/12/1996 tina 23/02/1997]

Conservative estimates put the number of users of the text-only browser Lynx (http://lynx.browser.org/) at 7%. If 10 million users are, at the moment, using the Internet and thereby have the possibility of access to the World Wide Web, around 700,000 people are using Lynx. Add to that the estimate that 85% are using Netscape or Explorer, and that 30% of those 'surfers' are doing so with image loading turned off, and we end up with another 2.55 million users.

That leaves us with roughly 3.2 million users today. If the predictions come true, and 35 million users have access to the WWW at the turn of the century, the same percentages would give more than 11 million. None of these users will be able to see the images you've added to your code.

That is where the ALT-texts come into play. With good, proper ALT-texts present in your code, those users will also be able to take advantage of your work.

If you don't have access to a text-only browser, take a look at the See Lynx for Yourself! page.

It is worth noting that the ALT attribute to the IMG element is strongly suggested by the W3C in both the HTML 2.0 and 3.2 standards. More information concerning the ALT attribute can be found in WDG's Feature Article No. 3, "Use of ALT texts in IMGs" (http://htmlhelp.com/feature/art3.htm).
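
A few hedged examples of what such ALT-texts might look like. All file names are hypothetical, and the empty ALT for purely decorative images is a common convention rather than something taken from this list.

```html
<!-- The ALT text stands in for the image when it cannot be shown -->
<IMG SRC="logo.gif" ALT="Foo Corporation">

<!-- An image used as a link: the ALT text says where the link goes -->
<A HREF="products.html"><IMG SRC="products.gif" ALT="Our products"></A>

<!-- Purely decorative image: an empty ALT keeps it out of the text -->
<IMG SRC="divider.gif" ALT="">
```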


x. Did you also remember a useful ALT text for the counter?

[tina 17/12/1996 tina 19/12/1996]

If you don't have a counter, then this bit is of no concern. However, if you do have one, consider this: many people actually add an ALT-text to theirs, for instance

<img src="mycounter.cgi" alt="my counter">

which is a Good Thing, but it is also usually wrapped into something like:

<img src="mycounter.cgi" alt="my counter"> have visited this page !

which, on a text-only browser, or one with image loading turned off, will display as:

my counter have visited this page !

Try using a more suitable ALT-text in this case, such as alt="[IMG: Counter]" or alt="X people" etc.


xi. You've included a mailto: link, and attempted to include the Subject: with it?

[tina 21/12/1996 tina 23/02/1997]

That is not possible - although a popular "solution" is circulating in various newsgroups on the 'net. It goes like this:

  <a href="mailto:someone@somesite.domain?Subject=your subject">...</a>

This method has only one drawback: IT DOESN'T WORK. This syntactically invalid URL was introduced by Netscape, and is supported mainly by their own recent browsers, though not by their versions before 2.0. On those and on most other browsers IT WILL ENSURE THAT YOUR LINK DOES NOT WORK AT ALL. If you value your mail, don't ever use this method. It is illegal, and it doesn't work.

If you don't want to use a CGI script, there is only one alternative that will ensure that your e-mail will get through. This method will work with any browser available, and any future compliant browser.

  <a href="mailto:someone@somesite.domain" title="your subject">...</a>

Note that at the moment, only Lynx (http://www.lynx.org/) and NCSA Mosaic (http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/) for Windows honour the TITLE attribute to the A element when it comes to mailto: links. This means that with other browsers there will be no subject added, but the mail will get through nevertheless.


xii. Did you try to resize an image by specifying its HEIGHT and WIDTH in percentages ?

[john 15/01/1997 tina 23/02/1997]
Suggested by Kivi Shapiro

To begin with, in some browsers this defeats the purpose of specifying height and width. Specifying the HEIGHT and WIDTH attributes allows the browser to lay out all the text around the space reserved for the images before the images themselves have loaded. This eliminates the repagination effect caused when images begin to load and their sizes have not been specified. However, when sizes other than the original ones are specified, some browsers will still cause a delay in loading while they attempt to resize the image.

Additionally, when an image is created for use on the Web, the artist takes great care to specify the resolution and color depth in order to achieve a good balance of file size and clarity. When the image size is artificially increased in a browser window, it can appear grainy or distorted. When it is decreased, it is wasting bandwidth as a smaller version could have been used.

The bottom line is that if the image needs to be a size other than the native one, a graphics program should be used to resize the image to ensure quality, display speed and bandwidth conservation. Image programs are listed at major sites such as Windows95.com (http://www.windows95.com/), TuCows (http://www.tucows.com/) and shareware.com (http://www.shareware.com/).
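
In practice that means resizing the file first and then stating its real dimensions in pixels. In this sketch, photo.gif, its 320 by 240 size and the ALT text are all made up for illustration.

```html
<!-- WIDTH and HEIGHT match the file's real size in pixels, so the
     browser can reserve the space without rescaling anything -->
<IMG SRC="photo.gif" WIDTH="320" HEIGHT="240" ALT="Photo of the office">
```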


xiii. You wanted to use empty table-cells ?

[tina 23/02/1997 tina 23/02/1997]
Suggested by Brian Orpin

Not all browsers, even among those that support tables, take too kindly to empty cells. Most will handle this quite well, but some have been known to crash.

Therefore: put something in those empty cells; for instance a so-called 'non-breaking space', or &#160;.
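
A small example of the idea, with invented table contents:

```html
<TABLE BORDER="1">
 <TR>
  <TD>Name</TD>
  <TD>Phone</TD>
 </TR>
 <TR>
  <TD>Foo</TD>
  <!-- Nothing to show here, so the cell holds a non-breaking space -->
  <TD>&#160;</TD>
 </TR>
</TABLE>
```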


© Tina Marie Holmboe, February 1997