The Doc Dialer: The Trie Data Structure | WebReference

The Doc Dialer: The Trie Data Structure

The Doc Dialer

The Trie Data Structure

The trie data structure is included in the curriculum of most computer science programs. The origin of the name is from the middle section of the word "retrieval", and this origin hints on its usage: information retrieval systems. Program designers use this data structure to build systems that can extract information in a computational complexity order of one, which is, of course, the best that one can have.

The trie data structure is based on two principles: a fixed set of indices and hierarchical indexing. The first requirement is usually met when you can index the data by some or all of the digits 0 to 9, or by the alphabet. For example, let's take the ten digits 0 to 9. At the top level you have a 10-element array. Each of these array's elements may point to another 10-element array, and so on. If you index the data by the alphabet, there will be 26 elements at the top level array, and each element will point to another 26-element array, and so on.

One of the advantages of the trie data structure is that its hierarchy depth depends on the amount of data stored in it. Each element of data is stored at the highest level of the hierarchy that still allows a unique retrieval. As the data structure getting loaded with data, former data elements lose their uniqueness and need to be pushed down the hierarchy. Thus, the more data being loaded the deeper the hierarchy will be.

Let's take an example. Suppose we have an element that can be identified by the sequence 8376. Assume also that the trie data structure consists of the trie[] array at the top level. If 8376 is the only data element to be stored, it can be stored as trie[8]. Assume now that another element labeled as 8453 is added. To make room for the second element, we allocate another 10-element array and have trie[8] point to it. We push 8376 down and now is indexed as trie[8][3], while the new element, 8453 will be stored under trie[8][4]. Storing all eight presidents is explained on the next page.

Produced by Yehuda Shiran and Tomer Shiran

Created: February 14, 2000
Revised: February 14, 2000