Nested Data Structures--A Perl Primer (1/2) | WebReference

Nested Data Structures--A Perl Primer

By Dan Ragle

On a recent project, I was asked to provide a basic numerical analysis, displayed in monthly breakdowns. The number of months to be displayed would be variable, and each day within each month might or might not have a quantity and a dollar amount. In other words, the number of days I would actually be reporting on within each month would also be variable.

My instinctive reaction was to create an array of arrays: the top-level array would hold the selected months (one entry per month), and each month's entry would contain an array of the days to be reported within that month. Finally, each entry in the days array would contain a small hash, consisting of the calendar day being reported and the required quantity and dollar amount.

In other words, I needed a nested data structure: an array of hashes, itself embedded in an outer array (an array of arrays of hashes). In Perl, such a structure is easily created and accessed, though to the novice the process may not be intuitive. In this article we will learn the basic terminology, concepts and tactics used to create nested data structures in Perl. My goal here is not to bog you down with intricate details, but simply to give you a taste of nested data structures and introduce the basic techniques involved in creating and accessing them.

No Such Thing

When attempting to create the structure I described above in Perl, the first and perhaps most important point that must be recognized by beginning coders is that technically there is no such thing as an array of arrays in Perl. That is to say, attempting to assign a full array to a single (outer) array value like this:

my @fruits = ("Apples","Oranges","Bananas");
my @vegetables = ("Spinach","Broccoli","Green Beans");
my @food;
$food[0] = @fruits;
$food[1] = @vegetables;

simply won't work the way you expect. Why? Because in Perl, each array element must be a scalar value; that is, a single value such as a string or a number. In the above assignments, Perl sees a scalar on the left of the assignment (namely, $food[0] and $food[1]) and therefore evaluates the arrays @fruits and @vegetables in scalar context. And in Perl, an array evaluated in scalar context returns only its length (the number of elements), not the actual array values. Here, both $food[0] and $food[1] end up holding the number 3.
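To see this for yourself, here is a minimal runnable sketch of the assignments above with the results printed. The print statements and the `use strict` / `use warnings` lines are additions for illustration and are not part of the original example.

```perl
use strict;
use warnings;

my @fruits     = ("Apples", "Oranges", "Bananas");
my @vegetables = ("Spinach", "Broccoli", "Green Beans");
my @food;

# A scalar on the left puts each array in scalar context,
# so only the element count is stored.
$food[0] = @fruits;
$food[1] = @vegetables;

print "food[0] = $food[0]\n";   # prints "food[0] = 3"
print "food[1] = $food[1]\n";   # prints "food[1] = 3"
```

Note that Perl raises no error or warning here; the assignment is perfectly legal, it just doesn't store what a newcomer might expect.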

So how does one create an array of arrays in Perl? The answer lies in the creation of hard references.

Hard References

A hard reference in Perl is similar to a pointer in other computer languages; it's a means by which we can refer to any type of data (a scalar, array, hash, subroutine, object, etc.) using a single scalar value, as opposed to using a named variable that stands for the full data structure itself. The scalar value itself is meaningless to us; it's an internal address used by the Perl interpreter to locate the data it refers to. Think of a reference as similar to the address of a house. The house number can locate any individual house on a particular street; but the number itself is just a simple number, and could therefore be stored anywhere you can store a single number.
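As a rough sketch of the "house number" idea, the snippet below stores references to two arrays in an outer array. It uses the backslash operator, which is explained in the examples that follow, and Perl's built-in ref function to report what a reference points to. The variable names $count and $kind are made up for this illustration.

```perl
use strict;
use warnings;

my @fruits     = ("Apples", "Oranges", "Bananas");
my @vegetables = ("Spinach", "Broccoli", "Green Beans");

# Each backslash expression yields one scalar: a reference.
my @food = (\@fruits, \@vegetables);

my $count = scalar(@food);    # 2: two references, not six strings
my $kind  = ref($food[0]);    # "ARRAY": the type of data referred to

print "entries: $count\n";    # prints "entries: 2"
print "kind:    $kind\n";     # prints "kind:    ARRAY"
print "raw:     $food[0]\n";  # something like ARRAY(0x...); the address varies
```

The raw value printed on the last line is the "house number": not useful to read, but small enough to fit in any scalar slot.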

You can create hard references to data values in Perl in a number of ways:

# assign a reference to the data in "foo" to "bar":
my @foo = ("Apples","Oranges","Bananas");
my $bar = \@foo;
# assign a reference to an anonymous array to "bar"
$bar = ["Apples","Oranges","Bananas"];
# assign a reference value to an undefined entry (Perl 
# automatically creates the reference for us)
$foo[3][0] = "Pears";

In the first example, we simply create a reference to an existing named variable; specifically, the @foo array:

my $bar = \@foo;

In this example, $bar now contains a reference to the data structure represented by the array named @foo. We can (and will, in the next section) refer to the individual entries in the @foo array either directly through @foo (for example, $foo[0]) or indirectly through a special usage of the $bar variable. It is the backslash before @foo in the assignment statement that creates this reference.
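A short sketch can confirm that $bar and @foo really do lead to the same data. It borrows the @$bar dereferencing form, which is covered properly on the next page; $seen_via_ref and $same are hypothetical names used only here.

```perl
use strict;
use warnings;

my @foo = ("Apples", "Oranges", "Bananas");
my $bar = \@foo;

# Change the array through its name...
push @foo, "Pears";

# ...and the change is visible through the reference.
my $seen_via_ref = scalar(@$bar);

# Two references to the same array compare equal numerically.
my $same = ($bar == \@foo) ? "yes" : "no";

print "elements seen through \$bar: $seen_via_ref\n";  # prints 4
print "same array: $same\n";                           # prints "yes"
```

No data is copied when the reference is taken, which is why the fourth element shows up on both sides.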

In the second example, we don't use an existing named variable to create a reference, but simply assign an anonymous array to $bar. An anonymous array is so named because it doesn't have a specific named variable associated with it; indeed, the only way to refer to anonymous arrays (and hashes, and subroutines, etc.) is via reference variables. When we execute this statement:

my $bar = ["Apples","Oranges","Bananas"];

we tell Perl to create an anonymous array with the entries "Apples", "Oranges", and "Bananas" in the first three slots; and to assign a reference to that array to $bar. Such an array is accessible only through references; there is no named variable that can access it.
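One consequence worth sketching: because the array has no name, copying $bar copies only the reference, while a second pair of square brackets builds an entirely separate array. The names $alias and $other are invented for this example, and the @$alias form is again borrowed from the next page.

```perl
use strict;
use warnings;

my $bar   = ["Apples", "Oranges", "Bananas"];
my $alias = $bar;                               # copies the reference only
my $other = ["Apples", "Oranges", "Bananas"];   # a second, separate array

push @$alias, "Pears";    # modifies the one array that $bar and $alias share

my $bar_count   = scalar(@$bar);     # 4
my $other_count = scalar(@$other);   # 3

print "bar: $bar_count, other: $other_count\n";  # prints "bar: 4, other: 3"
```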

Note that arrays are not the only data structures in Perl that can be defined anonymously; hashes and even subroutines can be created in this way. An anonymous hash definition looks similar to the definition above, except that curly brackets (braces) are used:

my $bar = {"fruit"      => "Apples",
           "vegetables" => "Carrots"};

And an anonymous subroutine is defined in a similar manner as a named subroutine; you just leave off the subroutine name:

my $bar = sub {my $inparam=shift; print $inparam;};
&$bar("Hello, World!\n");  # prints "Hello, World!"
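If you ever lose track of what a reference points to, Perl's built-in ref function will tell you. The sketch below creates one of each anonymous type shown above; the variable names and the "Got:" string are made up for illustration.

```perl
use strict;
use warnings;

my $list = ["Apples", "Oranges", "Bananas"];
my $map  = { "fruit" => "Apples", "vegetables" => "Carrots" };
my $code = sub { my $inparam = shift; return "Got: $inparam"; };

# ref returns a string naming the kind of thing referred to.
my @kinds = (ref($list), ref($map), ref($code));
print "@kinds\n";               # prints "ARRAY HASH CODE"

# Calling through the code reference, as in the example above.
my $result = &$code("Hello");
print "$result\n";              # prints "Got: Hello"
```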

In the final example from above, we simply assign a value to an undefined array entry, and Perl assumes that it must be a reference value and creates one on the fly to accommodate the statement. Remember that in Perl, there is no such thing as an array of arrays; so when it encounters a statement like this:

$foo[3][0] = "Pears";

Perl simply assumes that it needs a reference to an anonymous array (for the fourth slot of the @foo array) and loads the first element of that array with the value "Pears". Don't worry about the syntax here just yet (i.e., the multiple square brackets); we'll revisit that on the next page. For now, just remember that if you assign a value through an undefined entry where only a reference will do the job, Perl will silently create that reference for you. This can be handy, since it means you don't have to explicitly create the reference beforehand, like this:

$foo[3]=[];   # create an empty anonymous array 
$foo[3][0] = "Pears";  # assign a value to the first array entry
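This behavior is commonly called autovivification. Here is a small sketch that checks the fourth slot before and after the assignment; the $before, $kind, $first and $size variables are hypothetical names added for the demonstration.

```perl
use strict;
use warnings;

my @foo = ("Apples", "Oranges", "Bananas");

# The fourth slot does not exist yet.
my $before = defined($foo[3]) ? "defined" : "undefined";

# Perl creates the anonymous array for us on assignment.
$foo[3][0] = "Pears";

my $kind  = ref($foo[3]);    # "ARRAY"
my $first = $foo[3][0];      # "Pears"
my $size  = scalar(@foo);    # 4

print "before: $before\n";                          # prints "before: undefined"
print "after:  $kind holding $first, size $size\n"; # prints "after:  ARRAY holding Pears, size 4"
```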

Having created references in scalar values, how do we go about accessing those values? On the next page we examine that topic and then produce a basic representation of the data structure I described at the opening of this article.

Created: September 8, 2005
Revised: September 8, 2005