HTML 4 Explained
Every once in a while, a new version of the HTML standard is released. In the late 1990s, it was time once again to embrace a new generation of HTML elements and attributes, and say goodbye to a few old friends. HTML 4.01 was a grand step towards an accessible, international web. Webmasters were granted many new opportunities with this round of HTML specifications, so let’s dive in and see what we can do with them.
This page was last updated on 2012-08-21
What is HTML 4.01?
HTML 4 was a large shake-up to the HTML standards. HTML 4.0 was published in December 1997 and revised in April 1998, and HTML 4.01 followed in December 1999. The HTML language you have learnt is constantly evolving to meet the needs of a growing Internet. Things get added, some things get taken away and still more elements are asked to fade out gracefully. These changes ensure that designers have the freedom and power available to create increasingly complex websites and are able to achieve this efficiently.
It only happens every few years, and the changes are made by the World Wide Web Consortium (W3C), who are HTML’s governing body, as it were. They convene and design the specifications that we all work with when creating websites (CSS was designed by the W3C too). They look for weaknesses in HTML that are holding the web back, and sort them out, which makes creating compelling websites easier for everybody.
The standard we were all working with before this was HTML 3.2. That was used for a while before the W3C decided to step it up another notch a few years ago. They released HTML 4. Some time later, when some minor errors in the specification were uncovered, they fixed these and called the final specification HTML 4.01. As of now, HTML 4.01 is the accepted standard, and the majority of web users do have browsers that support it fully. Some of the more peripheral new elements have yet to gain full support in the latest round of browsers, but they’re on the way. Modern browsers will generally have no problem with anything in these specs.
Versions
If you have used any software you will have undoubtedly noticed how every few months it advances its version number. A program at version 2 might, after some improvements, become version 2.1. Adding a decimal to the version number signifies a minor change to the original. When major changes are made to a software project, it will move up a whole number to version 3. HTML works the same way: HTML 3.2 was followed by the big change to 4, and then a minor change to 4.01.
There was some confusion when HTML 4 started being discussed, as at the time version 4 browsers like Internet Explorer 4 were making their appearance and people thought there was some connection. In reality, the two separate things had just reached those versions simultaneously, not because of each other. As you know, browser technology has advanced to version 7 stages and beyond by now, and HTML is still at level 4. So there’s no real connection; though, that said, it was in the version 4 browsers that HTML 4 started being incorporated properly. Glad that’s cleared up.
A few months after HTML 4.0 was released, its documentation was revised to correct some minor problems. A further round of corrections in December 1999 bumped the version number up slightly. So the final final version of this standard is HTML 4.01.
DOCTYPEs Ahoy
Nowadays the document type declaration (DOCTYPE) at the top of your document is very important if you want the browsers to render your page correctly. It names the Document Type Definition (DTD) that your page follows. Without it, browsers might interpret your code more loosely, and you may have display errors. The HTML 4.01 DOCTYPEs are below. Take your pick.
Use the strict DTD if you’re using pure, structural code with no hacks:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
The transitional DOCTYPE, below, is the most commonly used, and still permits you to use certain old elements that we will eventually stop using altogether. It is probably the best choice until you’ve gotten to know HTML really well. Once you’re ready, you can start using the stricter DOCTYPE above.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
Finally, for frameset pages, use the frameset DTD:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">
Simply add this line of code to the very top of your HTML pages (before the opening <html> tag), and you’re away. You will also need to specify the character encoding of your page. The best encoding to use is UTF-8, an encoding of Unicode, which allows you to type almost any character you want (like punctuation, letters with accents etc.) directly into your content. Add this element in between your page’s <head> and </head>:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Once you have added your DOCTYPE and encoding, run your page through the HTML validator to see if you’re obeying the rules.
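To show how the DOCTYPE and the encoding fit together, here is a minimal sketch of a complete page using the strict DOCTYPE. The title and paragraph text are placeholders I have made up for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<!-- Declare the encoding early, before any text content -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>My first HTML 4.01 page</title>
</head>
<body>
<p>Hello, world.</p>
</body>
</html>
```

Remember that the file itself must also be saved as UTF-8 in your editor, or the declaration will not match the actual content.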
The New Elements in HTML 4
There are 22 elements new to HTML in the 4 specifications, and they cover all the areas, from text-formatting to tables to frames and the rest. Most of these elements have been covered in tutorials elsewhere here in HTML Source, and so where appropriate there are links to these in-depth pages. Most of the text formatting elements will make the text they mark up look a little different. You can see the effects of these elements in the text formatting list.
<abbr>
This is used to show an ABBReviated version of a word, and to offer the full version. When a reader leaves their mouse on the word, the full version pops up.
The code would be <abbr title="abbreviation">abbr.</abbr>
<acronym>
Similar to the one above, this is used for special abbreviations called acronyms (initialled abbreviations that can be spoken as a word themselves). It works the same way.
<acronym title="North Atlantic Treaty Organisation">NATO</acronym>
<bdo>
Most text is read from left to right, but some languages are the opposite, like Hebrew for example. This new tag allows the browser to display your page correctly if you use one of these languages, and allows you to pull off cool effects like this: stceffe looc. That’s typed in normally in the source code, but the tag changes its direction. RTL means ‘Right To Left’.
Code: <bdo dir="rtl">crazy text</bdo>
<button>
This is a simple way of adding form buttons to your pages. What’s more, you can now format the text and put images and other elements on the button.
<button><b>click</b> me</button>
<colgroup>
A table tag that allows you to affect the attributes of an entire column with one line of code. More info in Tables Accessibility.
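As a rough sketch of how this might look, the table below uses two colgroup elements: one sets a width for the first two columns, the other right-aligns the third. The cell contents are invented for the example.

```html
<table border="1">
<!-- The first group covers two columns, each 100 pixels wide -->
<colgroup span="2" width="100"></colgroup>
<!-- The second group covers the remaining column -->
<colgroup align="right"></colgroup>
<tr><td>Apples</td><td>Green</td><td>0.40</td></tr>
<tr><td>Plums</td><td>Purple</td><td>0.25</td></tr>
</table>
```

Browser support for colgroup attributes such as align has been uneven, so test this before relying on it.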
<del>
Wrapping this around some text creates a strike-through effect to signify DELeted text, so you can show readers what once existed without cutting it out altogether.
<del>waffle</del>
<fieldset>
This allows you to group buttons and things together, giving you a framed container to hold them in. It works together with the <legend> tag below.
<frame>
Strangely, this has only been made part of the official specifications now, despite it having been in use for years. It has been given new style attributes but overall it’s the same as the frame tag before. Read our <frame> tutorials.
<frameset>
This tag is in the same boat as its friend above. Nothing’s really new.
<iframe>
Once a proprietary Internet Explorer tag, this was such a smart idea that it has been assimilated into HTML proper. I have an <iframe> tutorial for you, too.
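A minimal sketch, assuming a hypothetical page called news.html sits in the same folder. Note that iframe is only allowed under the transitional and frameset DOCTYPEs, not the strict one.

```html
<iframe src="news.html" width="300" height="200" frameborder="0">
<!-- Shown only by browsers that do not support iframes -->
<a href="news.html">Read the news page</a>
</iframe>
```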
<ins>
This stands for INSerted text. It works in conjunction with the del tag, and the inserted text appears with an underline.
<del>waffle</del><ins>quality literature</ins>
<label>
This allows you to give form elements a label. Clicking the label functions like clicking the element (a radio button, for example) itself. This improves Forms Accessibility.
<label for="choice1">Choice 1</label>
<input type="radio" id="choice1">
<legend>
When using a fieldset element, this element must come first before any other content inside the fieldset. It gives the title of the group. See it in action in an advanced forms tutorial.
<fieldset><legend>Contact Info</legend>
Email:<input type="text">
Address:<input type="text"></fieldset>
<noframes>
Another part of the frames umbrella that is being formally added to HTML 4. You can learn more about it at the basic frames page.
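Here is a small sketch of noframes inside a frameset page. The file names menu.html and content.html are made up for the example, and the page would use the frameset DOCTYPE.

```html
<frameset cols="200,*">
<frame src="menu.html" name="menu">
<frame src="content.html" name="content">
<noframes>
<body>
<!-- Shown only by browsers that cannot display frames -->
<p>This site uses frames. <a href="content.html">Go to the content</a>.</p>
</body>
</noframes>
</frameset>
```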
<noscript>
Along the same lines as the one above, this holds alternative content for readers whose browsers can’t run JavaScript, or have it turned off.
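Something like the following sketch would show the date to readers with JavaScript and a fallback message to everyone else. The wording of the message is just an example.

```html
<script type="text/javascript">
// Write today's date into the page
document.write("Today is " + new Date().toDateString());
</script>
<noscript>
<p>Your browser does not run JavaScript, so today's date cannot be shown.</p>
</noscript>
```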
<object>
This is set to become the do-it-all tag for inserting multimedia into your page, and is supposed to take over from img, applet and other embedding elements.
<object data="picture.gif" type="image/gif"></object>
<optgroup>
With this tag you can group together many option elements which are part of a select field, and give the groups titles.
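A quick sketch of how the grouping could look, with made-up option names. The label attribute supplies the title of each group.

```html
<select name="food">
<optgroup label="Fruit">
<option value="apple">Apple</option>
<option value="pear">Pear</option>
</optgroup>
<optgroup label="Vegetables">
<option value="carrot">Carrot</option>
<option value="leek">Leek</option>
</optgroup>
</select>
```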
<param>
This tag is used to set PARAMeters for ActiveX, Applets and objects. It existed before HTML 4.01, but now is official code.
<span>
This tag was brought in specifically to work with stylesheets in applying classes and ids. It does nothing on its own, but it is great for applying your styles to text.
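For instance, a span can pick out a few words for styling. This is only a sketch: the class name warning and its style rule are invented for the example.

```html
<!-- In the head of the page -->
<style type="text/css">
.warning { color: red; font-weight: bold; }
</style>

<!-- In the body of the page -->
<p>Please <span class="warning">back up your files</span> before continuing.</p>
```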
<tbody>
A new table tag that allows you to give attributes to a block of cells with this one tag. Read more in Tables Accessibility.
<tfoot>
Allows you to add a footer to your tables, to go with the tbody part.
<thead>
This allows you to add a header to a table, to go with the tbody part. In HTML 4.01, both thead and tfoot come before tbody in the code, so that the browser can draw the footer before all the rows of data have arrived. Both of these elements are also found in HTML4 Tables.
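Here is a short sketch of the three row groups in one table, with invented contents. Note that HTML 4.01 expects tfoot to be written before tbody in the code, even though the browser draws it at the bottom.

```html
<table border="1">
<thead>
<tr><th>Item</th><th>Price</th></tr>
</thead>
<!-- tfoot is written before tbody but displayed after it -->
<tfoot>
<tr><td>Total</td><td>3.50</td></tr>
</tfoot>
<tbody>
<tr><td>Coffee</td><td>2.00</td></tr>
<tr><td>Muffin</td><td>1.50</td></tr>
</tbody>
</table>
```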
<q>
If you’ve ever used the blockquote tag, you’ll know it’s a big tag. How many letters are in that, ten?! This is much more like it, and is suitable for shorter quotations. Plus, it adds in the quotation marks for you. It will not add in the line breaks you get with blockquote.
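To compare the two, here is a sketch with made-up quotations. The q element sits inside a sentence, while blockquote makes a block of its own.

```html
<!-- Short, inline quotation: the browser supplies the quotation marks -->
<p>As the saying goes, <q>less is more</q>.</p>

<!-- Longer quotation set apart from the surrounding text -->
<blockquote>
<p>This is a longer passage quoted from somewhere else, set off on its own lines.</p>
</blockquote>
```

Some older browsers do not add the quotation marks around q, so check how it looks before you rely on them.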
The New Attributes
These new attributes are here to allow stylesheet implementation, with two more reflecting the new international concerns that the W3C have taken onboard in this new draft. They can be applied to almost any element.
class
This is how you give your page elements and text their classes from your stylesheet. Read all about it in advanced CSS.
dir
This is the attribute that is used mostly with the new <bdo> tag above. Your possible values are rtl (right-to-left) or the default ltr (left-to-right).
id
ids are just like classes, except that each id can only be used once on a page. They can also be used with JavaScript and DHTML.
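As a small sketch of the JavaScript side, the script below finds one paragraph by its id and changes its colour. The id name intro is made up for the example.

```html
<p id="intro">Welcome to the site.</p>
<script type="text/javascript">
// Look up the element by its id and restyle it
document.getElementById("intro").style.color = "green";
</script>
```

The script comes after the paragraph so that the element already exists when the lookup runs.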
lang
This attribute sets off a block of text as text typed in a foreign LANGuage, so that search engines and browsers know, and don’t just take it as badly spelled English. It will not translate anything for you, it’s just some behind-the-scenes help for things other than readers.
You can denote the text using the span tag, like
<span lang="fr">Bonjour!</span>.
If you’re going to use it, have a look at the common language codes.
title
This is one of my favourite things that came with HTML 4.01. It allows you to add in tooltip text, like the alt attribute; but now you can add it to almost anything. You can give table cells titles, add in extra information to your links, and even hide jokes in your code that will only appear when a reader is on a specific word or sentence in your text.
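A couple of quick sketches, with invented text and a hypothetical contact.html page:

```html
<!-- Extra information on a link -->
<a href="contact.html" title="Opens the contact form">Get in touch</a>

<!-- A tooltip on a single table cell -->
<table>
<tr><td title="Prices include tax">4.99</td></tr>
</table>
```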
Deprecated Elements
A deprecated element is one that is on the way out, but one which has been given a few more months to live before its life fully ends. There are much better elements than these available now, so your usage of them should be downscaled as much as possible.
<applet> - Used to add Java applets. Use the new <object> element instead.
<basefont> - Used to affect text on the whole page. Use stylesheets instead.
<center> - Used to center elements. Use <div align="center"> or stylesheets.
<dir> - Used to make lists. Use uls instead. Lists tutorial.
<font> - Ah, the classic font element. Still good for small things, but stylesheets have taken over. This is one element you should really try to avoid using.
<isindex> - Just use the input tag.
<menu> - Another type of list that is redundant thanks to the ul element.
<s> - Creates strike-through effects. Use the stylesheets again, or the new del element.
<strike> - Same as above, use style.
<u> - The underlining element, use stylesheets or ins instead.
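To give an idea of the switch, here is a sketch of the same piece of text marked up the old way and then with a stylesheet rule instead. The text and the exact styles are only an illustration, and the two will look roughly but not exactly alike.

```html
<!-- Old way: deprecated presentational elements -->
<center><font color="red" size="4">Sale!</font></center>

<!-- Preferred: a plain paragraph styled with CSS -->
<p style="text-align: center; color: red; font-size: large;">Sale!</p>
```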
Dead Elements
These are the elements that were so useless that they’re out on their asses for good. Never use these, you can’t guarantee the browsers will continue to support them. All three of these elements have been replaced by one new element — so you can see how they were useless.
<listing>, <plaintext>, and <xmp>.
Use pre instead. This creates PREformatted text (text which follows its layout in your code).
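For example, the spacing and line breaks in this made-up price list are kept exactly as typed:

```html
<pre>
Item      Price
Coffee     2.00
Muffin     1.50
</pre>
```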
And that’s the lot. By this stage, you are encouraged to use all of the new elements. Most people have either upgraded their browsers or bought computers with the newest software on them, so the majority of the web audience are on the same level. That said, there are some elements here that still aren’t supported in even the newest browsers, so test your pages out before you go using any of these elements.