The Document Object Model (DOM), Part I: Analyzing a Simple Document - | WebReference

The Document Object Model (DOM), Part I (4)

Analyzing a Simple Document

Let's analyze a simple document and build its Document Object Model. The example document contains three paragraphs:

<TITLE> Simple DOM Demo </TITLE>
<BODY ID="bodyNode"><P ID = "p1Node">This is paragraph 1.</P>
This is the document body
<P ID = "p2Node">     </P>
<P ID = "p3Node"></P>
</BODY>

You have to start thinking in terms of a family tree and family relationships. The <BODY> tag is the root of the tree. It has four children: three <P> tags, which divide the document into three paragraphs, and a textual entry between the first and second paragraphs. Each HTML tag pair includes an opening tag and a closing tag. The paragraph element, for example, includes the <P> tag and the </P> tag. Each child is either an HTML tag or a text node. An HTML-based child begins at its opening tag and ends at its closing tag. Everything between the opening tag and the closing tag makes up the children of that tag. In the example HTML document above, there are three <P> tags, which map to three child nodes. The textual line between the first and second paragraphs ("This is the document body") constitutes another child node of the <BODY> tag.
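The family tree described above can be sketched as plain JavaScript objects. This is only an illustrative model, not the browser's own DOM: the helpers element(), text(), and describe() are names invented for this sketch, and the tree is written out by hand to match the four children listed in the text.

```javascript
// Plain-object model of the example document (not the browser's DOM API).
// element(), text(), and describe() are hypothetical helpers for this sketch.
function element(tagName, id, childNodes = []) {
  return { nodeType: "element", tagName, id, childNodes };
}

function text(data) {
  return { nodeType: "text", data, childNodes: [] };
}

// The <BODY> tag is the root; its four children appear in document order.
const bodyNode = element("BODY", "bodyNode", [
  element("P", "p1Node", [text("This is paragraph 1.")]),
  text("This is the document body"),
  element("P", "p2Node"), // blanks only: modeled without a child text node
  element("P", "p3Node"), // empty: no child text node
]);

// Build one line per node, indented by its depth in the tree.
function describe(node, depth = 0) {
  const label =
    node.nodeType === "text"
      ? `#text "${node.data}"`
      : `<${node.tagName}> id=${node.id}`;
  const lines = ["  ".repeat(depth) + label];
  for (const child of node.childNodes) {
    lines.push(...describe(child, depth + 1));
  }
  return lines;
}

console.log(describe(bodyNode).join("\n"));
console.log("children of BODY:", bodyNode.childNodes.length);
```

Printing the tree this way makes the parent-child relationships visible: the three <P> entries and the text line sit one level below <BODY>, and the only grandchild is the text inside the first paragraph.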

Notice the difference in content between the three <P> tags. The first one contains text, the second one contains only blanks, and the third one is empty. In the browser behavior described here, a text node is created only when the content has at least one non-blank character. Thus, only the first paragraph above has a child text node. (Other browsers keep whitespace-only text as text nodes as well, so node counts can differ between implementations.) The childNodes collection is still defined for the second and third paragraphs, but it is empty, so trying to read a child from it yields null or an undefined value, depending on the exact property used to access the collection.
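The text-node rule stated above can be written as a small test on the paragraph content. This is a sketch of the rule as the article describes it, not of any browser's internals; textChildrenFor() is a hypothetical helper, and the three strings are copied from the example document.

```javascript
// Sketch of the rule described above: content produces a text node only
// when it holds at least one non-blank character.
// textChildrenFor() is a hypothetical helper, not a browser API.
function textChildrenFor(content) {
  const hasNonBlank = /\S/.test(content);
  return hasNonBlank ? [{ nodeType: "text", data: content }] : [];
}

// Content of the three paragraphs in the example document.
const paragraphs = {
  p1Node: "This is paragraph 1.",
  p2Node: "     ",
  p3Node: "",
};

// Count the child text nodes each paragraph would get under this rule.
const childCounts = {};
for (const [id, content] of Object.entries(paragraphs)) {
  childCounts[id] = textChildrenFor(content).length;
}

console.log(childCounts);
```

Under this rule only p1Node ends up with a child; the other two paragraphs get an empty list, which is why reading their first child finds nothing.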

We have sketched the tree structure of the above example. You can also pop up the page itself to see the outcome of the HTML document above. See the four child nodes (three HTML tags and one text node) of the top level <BODY> tag. Then, notice the only child of the first of the <BODY> children. We explain the relationships you see on the next page.

Some HTML tags do not require a closing tag. For some of these, the closing tag is inferred from the tag that follows. The <LI> tag is one example: an <LI> element is closed by the following <LI> tag or by the closing </UL> tag. In these cases, the inferred closing tag defines the end of the element's content, and should be treated as a regular closing tag by the child definition algorithm. Other tags, such as the <IMG> tag, do not have content and thus do not expect a closing tag. Tags without content do not have children.
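To illustrate how inferred closing tags affect the child definition, here is a deliberately tiny stack-based parser. It is a sketch under narrow assumptions, not how any browser parses HTML: parse() and EMPTY_TAGS are names invented here, only <LI> gets an inferred close, the list of content-free tags is a small illustrative subset, and whitespace-only text is skipped.

```javascript
// Tags with no content: they never take children (illustrative subset).
const EMPTY_TAGS = new Set(["IMG", "BR", "HR"]);

// Hypothetical mini-parser: builds a tree of { tagName, childNodes } objects.
function parse(html) {
  const root = { tagName: "#root", childNodes: [] };
  const stack = [root]; // currently open elements, innermost last
  const top = () => stack[stack.length - 1];
  const tokens = html.match(/<\/?[A-Za-z][^>]*>|[^<]+/g) || [];

  for (const token of tokens) {
    if (token.startsWith("</")) {
      // Closing tag: close the matching element and everything opened
      // inside it, so </UL> also closes a still-open <LI>.
      const name = token.slice(2, -1).trim().toUpperCase();
      const index = stack.map((node) => node.tagName).lastIndexOf(name);
      if (index > 0) stack.length = index;
    } else if (token.startsWith("<")) {
      const name = token.slice(1, -1).trim().split(/\s+/)[0].toUpperCase();
      // A new <LI> infers the closing tag of the previous, still-open <LI>.
      if (name === "LI" && top().tagName === "LI") stack.pop();
      const node = { tagName: name, childNodes: [] };
      top().childNodes.push(node);
      // Content-free tags are never left open, so they get no children.
      if (!EMPTY_TAGS.has(name)) stack.push(node);
    } else if (/\S/.test(token)) {
      top().childNodes.push({ tagName: "#text", data: token, childNodes: [] });
    }
  }
  return root;
}

// Two list items without closing tags; the second also holds an image.
const listTree = parse('<UL><LI>one<LI>two<IMG SRC="x.gif"></UL>');
const ulNode = listTree.childNodes[0];
console.log(ulNode.childNodes.map((node) => node.tagName));
console.log(ulNode.childNodes[1].childNodes.map((node) => node.tagName));
```

In this sketch the second <LI> becomes a sibling of the first rather than its child, because the inferred closing tag ends the first item's content before the second one opens. The <IMG> entry is added as a child of the second item but never receives children of its own.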

Produced by Yehuda Shiran and Tomer Shiran

Created: May 31, 1999
Revised: May 31, 1999