HTML 5 | WebReference


By Arpan Dhandhania

Before HTML 5

Going back to 1989, when the Internet was in its infancy and the Web had yet to make its entry, Tim Berners-Lee (an independent contractor at CERN, the European Organization for Nuclear Research) and Robert Cailliau (a data systems engineer at CERN) came up with independent proposals for an Internet-based hypertext system. Within a few months, they collaborated on a joint proposal for the World Wide Web (W3) project. The project was accepted by CERN.

In late 1991, Tim Berners-Lee published 'HTML Tags', the first publicly available document about HTML, which described 20 HTML elements. Thirteen of these elements still exist in HTML 4.0.

In 1993, Berners-Lee published his first proposal for the HTML specification, which used an SGML Document Type Definition to define its grammar. The proposal went through several drafts as more people, including the IETF (Internet Engineering Task Force), got involved with the W3 project. In 1994, the IETF created the HTML Working Group, which slowly took over responsibility for the project from CERN. In 1995, the group published "HTML 2.0", the first HTML specification that was considered a standard on which future implementations would be based. It is interesting to note that HTML 1.0 was never published; instead, HTML 2.0 was treated as the new edition that superseded the previous drafts.

Two years later, in January 1997, HTML 3.2 was published as a W3C Recommendation, because the IETF had closed down its HTML Working Group. Later that year, HTML 4.0 was released. By this time, many felt that HTML markup was very loose and needed to be made stricter. As a result, HTML 4.0 offered three flavors:

  • Strict: in which deprecated elements are forbidden.
  • Transitional: in which deprecated elements are permitted.
  • Frameset: in which mostly only frame-related elements are allowed.

From the start, HTML never really enforced strict rules about its markup. For developers this was a feature, but it made the task of the HTML interpreter increasingly complicated. Therefore, towards the end of the 1990s, W3C started working on XHTML, a reformulation of HTML 4.01 based on XML, which is much stricter about markup. Thereafter, W3C maintained and updated two parallel languages, HTML 4.x and XHTML 1.x. XHTML 5 is now being defined alongside HTML 5 in the HTML 5 spec draft.


By mid-2004, people had started to sense lethargy in W3C's development of web standards. As a result, a group called WHATWG (Web Hypertext Application Technology Working Group) was formed in June 2004. WHATWG is a small, invitation-only group founded by individuals from Apple, the Mozilla Foundation and Opera Software. They started working on the specification in July 2004 under the name Web Applications 1.0. The specification was submitted to W3C, and by 2007 W3C had adopted it as the starting point for the new HTML, called HTML 5.

By the time the first public draft of HTML 5 was published, the word was that HTML 5 would redefine the web, making the likes of Adobe Flash, Microsoft Silverlight and JavaFX obsolete. The promise was that all browsers would support a standard video codec based on an open standard. However, reality could not live up to this common dream. Because of strong opposition from companies such as Apple and Nokia, HTML 5 does not specify a standard video codec.

The First Public Working Draft of the specification was published on January 22, 2008. The specification will remain a work in progress for many years, but there is good news for us. The WHATWG has said that parts of HTML 5 will be incorporated into browsers as and when they are finalized. We won't need to wait until the whole specification is completed and approved to start using some of the features of HTML 5.

But what are the features of HTML 5?

New Doctype, Charset and Page structure

As HTML 5 is no longer based on SGML, the doctype line does not need to reference a DTD and can be made much simpler. The line that declares the character set in the head section is much simpler too.

<!doctype html>
<meta charset="UTF-8">

Page Structure

In HTML 3, we used tables to specify the structure of the page. In HTML 4, we evolved to using <div>s. HTML 5 introduces a completely new set of elements to define the page structure.

Here is the markup of a page in HTML 4:


Now, in HTML 5, this is what the markup would look like:


At first, when I read about the new markup, I thought it was awesome. But then I started to worry about how this markup would behave in a browser that only understands HTML 4. Anyway, I don't want to digress from my main topic here, so I will leave you with the following link:

It discusses some of the 'HTML 5 Hiccups' that have been compiled by a group of HTML 5 enthusiasts. You can also search for the topic online, where you will find ample information on it.

Other New Elements

Aside from the elements mentioned above, several new elements have been introduced to HTML 5.

  • <canvas>
    gives you a drawing surface that you control from JavaScript. Your script can draw shapes, text and images on it, and by handling mouse events you can also let the user draw on the canvas and track the drawing.
  • <video>
    adds video to your web pages with this simple element.
  • <audio>
    adds an audio clip to your web pages with this simple element.
  • <progress>
    adds a progress bar on the page. You can use it while uploading or downloading something from your site.
  • <meter>
    represents a measurement such as disk usage.
  • The <input> element already exists, but new types have been introduced:
    • tel
    • search
    • url
    • email
    • datetime
    • date
    • month
    • week
    • time
    • datetime-local
    • number
    • range
    • color
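
To make the <canvas> entry in the list above a little more concrete, here is a minimal sketch of drawing from JavaScript. The function name drawBars, the sample values and the colour are illustrative assumptions, not part of any specification; the only canvas features relied on are getContext("2d"), fillStyle and fillRect.

```javascript
// Draws a simple bar chart onto a 2D drawing context.
// `ctx` is expected to offer fillStyle and fillRect, as a canvas 2D context does.
function drawBars(ctx, values, barWidth, height) {
  ctx.fillStyle = "steelblue";
  values.forEach(function (value, i) {
    // Canvas y coordinates grow downward, so each bar starts at height - value.
    ctx.fillRect(i * barWidth, height - value, barWidth - 2, value);
  });
}

// Browser-only wiring; skipped where no document exists.
// Assumes the script runs after <body> has been parsed.
if (typeof document !== "undefined") {
  const canvas = document.createElement("canvas");
  canvas.width = 120;
  canvas.height = 100;
  document.body.appendChild(canvas);
  drawBars(canvas.getContext("2d"), [30, 80, 55], 40, canvas.height);
}
```

Keeping the drawing logic in a function that receives the context makes it easy to reuse the same code for several canvases on one page.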

Obsolete Elements

The following elements have been removed from HTML as of version 5, either because they are rarely used or because their effect can be achieved with CSS.

  • acronym
  • applet
  • basefont
  • big
  • center
  • dir
  • font
  • frame
  • frameset
  • isindex
  • noframes
  • noscript (only in the XML syntax; it remains valid in the HTML syntax)
  • s (dropped in early drafts; later kept with a revised meaning)
  • strike
  • tt
  • u (dropped in early drafts; later kept with a revised meaning)

New Features in HTML 5

Besides new elements, HTML 5 also adds capabilities to the browser, such as working in offline mode and running JavaScript in multiple threads. Let's go through some of these features.

Offline Mode
With HTML 5, you can specify which resources your page requires, and the browser will cache them so that the user can continue to use the page even if she gets disconnected from the Internet. This wasn't a problem before AJAX came into existence, as a page could not request resources after it had loaded. However, today's web pages are designed to be sleek so that they load fast, and additional resources are then fetched asynchronously.

Local Database
HTML 5 includes a local database that persists through your session. The advantage is that you can fetch the required data once and store it in the local database. Thereafter, the page does not need to query the server each time it reads or updates data; it uses the local database instead. Every now and then, the data in the local database is synced with the server. This reduces the load on the server and improves the responsiveness of the application.
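
The fetch-once, read-locally pattern described above might be sketched as follows. This assumes a Web Storage style interface (getItem and setItem, as offered by localStorage), which may not be the exact storage API the article has in mind. The names createMemoryStorage and createLocalCache and the sample data are hypothetical.

```javascript
// Minimal in-memory stand-in for the getItem/setItem part of the
// Web Storage interface, used when no browser storage is available.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: function (key) { return items.has(key) ? items.get(key) : null; },
    setItem: function (key, value) { items.set(key, String(value)); },
  };
}

// Wraps a storage object so structured data can be cached as JSON text.
function createLocalCache(storage) {
  return {
    save: function (key, value) { storage.setItem(key, JSON.stringify(value)); },
    load: function (key) {
      const text = storage.getItem(key);
      return text === null ? null : JSON.parse(text);
    },
  };
}

// In a browser, use window.localStorage; elsewhere fall back to memory.
const cache = createLocalCache(
  typeof window !== "undefined" && window.localStorage
    ? window.localStorage
    : createMemoryStorage()
);

cache.save("posts", [{ id: 1, title: "Hello" }]); // data fetched from the server earlier
const posts = cache.load("posts");                // later reads need no server round trip
console.log(posts[0].title);
```

Syncing changes back to the server is left out here; in practice the page would send any locally modified entries at intervals or when the connection returns.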

Native JSON
JSON, or JavaScript Object Notation, is a popular alternative to XML, which was almost the de facto standard for data exchange before JSON existed. Until recently, you needed to include a library to encode and decode JSON. Now, the JavaScript engines in HTML 5 era browsers have built-in support for encoding and decoding JSON through the JSON object, which is standardized in ECMAScript 5 rather than in the HTML 5 specification itself.
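
As a brief illustration, the built-in JSON object offers JSON.stringify for encoding and JSON.parse for decoding. The post object and the safeParse helper below are hypothetical and exist only for this sketch.

```javascript
// A hypothetical blog post object, used only for illustration.
const post = { title: "HTML 5 overview", tags: ["html5", "json"], draft: true };

// Encode: JavaScript value -> JSON text.
const encoded = JSON.stringify(post);

// Decode: JSON text -> JavaScript value.
const decoded = JSON.parse(encoded);

console.log(encoded);
console.log(decoded.tags[1]);

// Malformed input makes JSON.parse throw, so text from outside
// sources is usually parsed inside try/catch.
function safeParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}
```

Unlike passing the text to eval, JSON.parse never executes code contained in the input, which is one reason to prefer it for data received from a server.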

Cross Document Messaging
Another interesting addition to HTML 5 is the ability to pass messages between documents, for example between two windows or between a page and a frame, including documents that come from different sites. A good use of this would be in a blogging tool. In one window, you create your post, and in another window, you can see what the post would look like without having to refresh the page. When you save the draft of your post, the view window updates immediately.
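
The blogging example could be sketched roughly as follows, using window.postMessage and the "message" event. The origin https://blog.example.com, the makeDraftListener helper and the shape of the draft object are assumptions made for illustration only.

```javascript
// Hypothetical origin of the editor window; replace with your own site's origin.
const TRUSTED_ORIGIN = "https://blog.example.com";

// Builds a "message" event handler that only accepts drafts from the
// trusted origin. `render` is a callback that updates the preview.
function makeDraftListener(render) {
  return function onMessage(event) {
    if (event.origin !== TRUSTED_ORIGIN) {
      return false; // ignore messages from unknown senders
    }
    render(event.data);
    return true;
  };
}

// Browser-only wiring; skipped where no window object exists.
if (typeof window !== "undefined") {
  // In the preview window: listen for drafts sent by the editor.
  window.addEventListener("message", makeDraftListener(function (draft) {
    document.title = "Preview: " + draft.title;
  }));
  // In the editor window, `previewWindow` would be the handle returned by window.open():
  // previewWindow.postMessage({ title: "My post", body: "..." }, TRUSTED_ORIGIN);
}
```

Checking event.origin before using the data matters because any window that holds a reference to the preview can send it a message.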

Cross Site XHR
AJAX made it possible to fetch data from the server asynchronously using XMLHttpRequest, but browsers restrict such requests to the site that served the page. Getting resources from other websites therefore required workarounds, which were often wrapped in a library. With the updated XMLHttpRequest that accompanies HTML 5, the browser can make cross-site requests directly, provided the other site's server explicitly permits them, so you won't need a workaround.
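
A cross-site request looks the same as an ordinary XMLHttpRequest call; whether it succeeds depends on the other server allowing it. The sketch below is an assumption-laden illustration: getCrossSite is a hypothetical helper, the URL is made up, and the optional third parameter exists only so the function can be exercised outside a browser.

```javascript
// Requests `url` and reports the response text, or an error, to `callback`.
// `XHR` defaults to the browser's XMLHttpRequest constructor.
function getCrossSite(url, callback, XHR) {
  const Ctor = XHR || (typeof XMLHttpRequest !== "undefined" ? XMLHttpRequest : null);
  if (!Ctor) {
    callback(new Error("XMLHttpRequest is not available"), null);
    return;
  }
  const xhr = new Ctor();
  xhr.open("GET", url, true); // true = asynchronous
  xhr.onload = function () {
    if (xhr.status >= 200 && xhr.status < 300) {
      callback(null, xhr.responseText);
    } else {
      callback(new Error("HTTP " + xhr.status), null);
    }
  };
  // Fires on network failure, and also when the other site does not permit the request.
  xhr.onerror = function () {
    callback(new Error("Request failed or was blocked"), null);
  };
  xhr.send();
}

// Browser usage (hypothetical URL):
// getCrossSite("https://api.example.org/posts.json", function (err, text) {
//   if (!err) console.log(JSON.parse(text));
// });
```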

Multi-threaded JavaScript
A large portion of most web apps is written in JavaScript, as it is the only client-side programming language widely available. One of the promises of HTML 5 is that scripts will be able to run in background threads (Web Workers), so that heavy processing no longer blocks the page. However, that solves only one part of the problem. Background threads speed up processing once the JavaScript has loaded, but as the amount of JavaScript grows, pages take longer to load. To address that, HTML 5 introduces an attribute called async for the <script> element. It tells the browser that the script need not block the page while it downloads: the script is fetched in parallel with the rest of the page and runs as soon as it is available. The syntax for this is:

<script async src="jquery.js"></script> 
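
For the multi-threading half of this section, background threads are exposed to scripts through the Web Workers API. The following is a minimal sketch, assuming a browser that supports Worker, Blob and URL.createObjectURL; sumOfSquares and runInWorker are hypothetical names, and the worker is built from a Blob only to keep the example in one file (a separate worker script file is the more common setup).

```javascript
// CPU-heavy work we would rather not run on the page's main thread.
function sumOfSquares(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i * i;
  return total;
}

// Source text for the worker: it receives a number and posts back the result.
const workerSource =
  sumOfSquares.toString() +
  "\nonmessage = function (e) { postMessage(sumOfSquares(e.data)); };";

// Browser-only: starts a worker from a Blob URL and reports its answer.
function runInWorker(n, onResult) {
  const url = URL.createObjectURL(new Blob([workerSource], { type: "text/javascript" }));
  const worker = new Worker(url);
  worker.onmessage = function (e) {
    onResult(e.data);
    worker.terminate();
    URL.revokeObjectURL(url);
  };
  worker.postMessage(n);
}

// Skipped where no window or Worker exists.
if (typeof window !== "undefined" && typeof Worker !== "undefined") {
  runInWorker(100000, function (result) {
    console.log("sum of squares:", result);
  });
}
```

The page and the worker share no variables; they communicate only by posting messages, which is what keeps the two threads from interfering with each other.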

Until Next Time

This completes a brief overview of what HTML 5 has to offer. Some of its features are already available in some browsers, but it will take some time until all browsers, or at least all the major ones, include them. In forthcoming articles, we will look at each of the features in greater detail.

Original: September 14, 2009