The Web Design Group

Document Types - Using a Custom DTD

Many documents on the Web use non-standard elements such as EMBED and non-standard attributes such as the LEFTMARGIN attribute on the BODY element. These documents will not validate against standards like HTML 3.2 or HTML 4.0, but validation is too important a tool to be ignored for this reason.

To validate documents with non-standard extensions, one can write a custom document type definition (DTD) that declares the extra elements and attributes. Extending an existing DTD in this way can be relatively easy. As an example, we will extend the HTML 4.0 Transitional DTD to support the proprietary BGSOUND element and the proprietary LEFTMARGIN, TOPMARGIN, MARGINWIDTH, and MARGINHEIGHT attributes of the BODY element.

  1. Start by downloading the HTML 4.0 Transitional DTD and saving it to your computer.

  2. Open the file in a text editor. If you're using Windows and the whole file runs together on one line, your editor is not recognizing the line endings; try WordPad instead.

  3. Adding attributes is easiest, so we'll start by adding the margin attributes to BODY. Search for "BODY" in the DTD and you'll find the following definition:

    <!ELEMENT BODY O O (%flow;)* +(INS|DEL) -- document body -->
    <!ATTLIST BODY
      %attrs;                              -- %coreattrs, %i18n, %events --
      onload          %Script;   #IMPLIED  -- the document has been loaded --
      onunload        %Script;   #IMPLIED  -- the document has been removed --
      background      %URI;      #IMPLIED  -- texture tile for document
                                              background --
      %bodycolors;                         -- bgcolor, text, link, vlink, alink --
      >

    We want the value of each new attribute to be a number of pixels, so we'll use NUMBER as the value type. Each attribute is optional and takes no default value, so we use #IMPLIED as is done with the BACKGROUND attribute. The preceding definition then becomes

    <!ELEMENT BODY O O (%flow;)* +(INS|DEL) -- document body -->
    <!ATTLIST BODY
      %attrs;                              -- %coreattrs, %i18n, %events --
      onload          %Script;   #IMPLIED  -- the document has been loaded --
      onunload        %Script;   #IMPLIED  -- the document has been removed --
      background      %URI;      #IMPLIED  -- texture tile for document
                                              background --
      %bodycolors;                         -- bgcolor, text, link, vlink, alink --
      leftmargin      NUMBER     #IMPLIED  -- left margin --
      topmargin       NUMBER     #IMPLIED  -- top margin --
      marginwidth     NUMBER     #IMPLIED  -- left and right margins --
      marginheight    NUMBER     #IMPLIED  -- top and bottom margins --
      >
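
    With these declarations in place, a BODY start tag such as the following sketch (the values are only illustrative) should validate:

    <BODY LEFTMARGIN="0" TOPMARGIN="0" MARGINWIDTH="0" MARGINHEIGHT="0">

    Keep in mind that NUMBER accepts digits only, so a value carrying a unit or a sign would still be reported as an error:

    <BODY LEFTMARGIN="10px">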

  4. Now let's add our BGSOUND element. The BGSOUND element is comparable in many ways to IMG or BASEFONT, so we'll add it alongside these elements as "special" text markup by changing

    <!ENTITY % special
       "A | IMG | APPLET | OBJECT | FONT | BASEFONT | BR | SCRIPT |
        MAP | Q | SUB | SUP | SPAN | BDO | IFRAME">

    to

    <!ENTITY % special
       "A | IMG | APPLET | OBJECT | FONT | BASEFONT | BR | SCRIPT |
        MAP | Q | SUB | SUP | SPAN | BDO | IFRAME | BGSOUND">

    The HTML 4.0 Transitional DTD later defines these "special" elements as inline elements, which seems right for BGSOUND.
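
    For reference, the definition in question should look like the following in the DTD you downloaded (check your own copy, since this line is quoted from memory of the published DTD). Because it pulls in %special; by reference, no further change is needed for BGSOUND to be accepted wherever inline elements are allowed:

    <!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">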

  5. We'll start a new heading for proprietary elements and define our new element by adding the following:

    <!--================== Proprietary Elements ==============================-->
    
    <!ELEMENT BGSOUND - O EMPTY            -- background sound -->

    We could add this pretty much anywhere, but we'll put it just before the "HTML content models" heading, since that is near BASEFONT, the element we compared it to. Like BASEFONT, BGSOUND is an empty element, which means that it doesn't take an end tag. The hyphen just after BGSOUND in our declaration indicates that the start tag is required. The O indicates that the end tag may be omitted. EMPTY says that BGSOUND has no content at all, neither elements nor text, which means that the end tag is in fact forbidden.
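
    In a document, this declaration plays out roughly as in the sketch below. The file name is hypothetical, and the SRC attribute used here is only declared in the next step:

    <BGSOUND SRC="theme.mid">             <!-- valid: start tag only -->
    <BGSOUND SRC="theme.mid"></BGSOUND>   <!-- error: an EMPTY element cannot have an end tag -->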

  6. We now need to define the attributes that our new element will take. For simplicity, we'll assume that we only want to use the SRC and LOOP attributes.

    To decide how to define the SRC attribute, we'll look at the definition of the IMG element's SRC attribute:

    <!ELEMENT IMG - O EMPTY                -- Embedded image -->
    <!ATTLIST IMG
      ...
      src         %URI;          #REQUIRED -- URI of image to embed --
      ...
      >

    BGSOUND's SRC attribute also takes a URI as its value, and it should likewise be required, since the element is useless without it.

    No other element in HTML 4.0 Transitional has a LOOP attribute that we could copy, but we know that we want to accept values such as INFINITE, -1, and 4. The value type CDATA (character data) covers all of these. The LOOP attribute is optional and defaults to a value of 1.

    So we have the following attribute definitions:

    <!ATTLIST BGSOUND
      src         %URI;          #REQUIRED -- URI of sound to play --
      loop        CDATA          1         -- number of times to play the sound --
      >
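
    Under these declarations, tags like the following (with a hypothetical sound file) should all validate. Quoting every attribute value is the safest habit, since unquoted values are restricted to a small set of characters:

    <BGSOUND SRC="theme.mid">                  <!-- LOOP defaults to 1 -->
    <BGSOUND SRC="theme.mid" LOOP="4">
    <BGSOUND SRC="theme.mid" LOOP="-1">
    <BGSOUND SRC="theme.mid" LOOP="INFINITE">

    Omitting SRC, as in <BGSOUND LOOP="4">, would be reported as an error because SRC is #REQUIRED. On the other hand, because CDATA accepts any string, a mistyped value such as LOOP="INFINTE" would not be caught.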

  7. Now save the DTD as "HTML4plus.dtd" (or anything else you like) and upload it to a Web server. You can then tell the validator to use your custom DTD by using the following DOCTYPE:

    <!DOCTYPE HTML SYSTEM "http://www.htmlhelp.com/tools/validator/HTML4plus.dtd">

    (assuming that "http://www.htmlhelp.com/tools/validator/HTML4plus.dtd" is the location of the DTD on the Web)
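
    Putting the pieces together, a minimal test page might look like the sketch below. The title, text, and sound file name are made up for illustration; only the DOCTYPE and the proprietary markup come from the steps above.

    <!DOCTYPE HTML SYSTEM "http://www.htmlhelp.com/tools/validator/HTML4plus.dtd">
    <HTML>
    <HEAD>
    <TITLE>Custom DTD test page</TITLE>
    </HEAD>
    <BODY LEFTMARGIN="0" TOPMARGIN="0" MARGINWIDTH="0" MARGINHEIGHT="0">
    <P>Welcome to my page. <BGSOUND SRC="theme.mid" LOOP="INFINITE"></P>
    </BODY>
    </HTML>

    If the validator still complains about BGSOUND or the margin attributes, check that the DOCTYPE points at the uploaded copy of the DTD and not at the original.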