Sams Teach Yourself XML in 24 Hours, Complete Starter Kit, 3rd Edition. Part 1 | 2 | WebReference


Files and Directories in Perl

Exercise: The Unix grep

As you get further along in this book, the exercises will present you with more and more useful tools. This exercise presents a stripped-down version of the Unix grep utility. The Unix grep—not to be confused with Perl's grep function, introduced in Hour 6, "Pattern Matching"—searches files for patterns. This exercise presents a utility that will prompt for a directory name and a pattern. Every file in that directory will be searched for that pattern, and lines matching that pattern will be printed.
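The search described above can be sketched as a single subroutine. This is only a minimal illustration under stated assumptions, not the book's Listing 10.1: the name grep_dir and its output format are hypothetical, and the prompting is left to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper (not the book's Listing 10.1): search every plain file
# in $dir for $pattern and return the matching lines, each tagged with the
# file name and line number.
sub grep_dir {
    my ($dir, $pattern) = @_;
    my @matches;

    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @names = sort readdir($dh);
    closedir($dh);

    foreach my $name (@names) {
        my $path = "$dir/$name";
        next unless -f $path;                # skip ".", "..", and subdirectories
        open(my $fh, '<', $path) or next;    # skip files that cannot be read
        while (my $line = <$fh>) {
            chomp $line;
            push @matches, "$name:$.: $line" if $line =~ /$pattern/;
        }
        close($fh);
    }
    return @matches;
}
```

A caller would read the directory name and the pattern from STDIN, chomp both, and then print each element returned by grep_dir($dir, $pattern).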

In future exercises, this utility will be modified to search subdirectories (see Hour 15, "Finding Permanence") and to take command-line arguments (see Hour 12, "Using Perl's Command-Line Tools"). Stay tuned for details.

Using your text editor, type the program from Listing 10.1 and save it as mygrep. If possible, be sure to make the program executable according to the instructions you learned in Hour 1, "Getting Started with Perl." Also, make sure that you don't rename this file to grep on a Unix system because it could be mistaken for the real grep utility.

When you're all done, try running the program by typing the following at a command line:

perl -w mygrep

or, if your system enables you to make the file executable,


Listing 10.2 shows a sample of the mygrep program's output.


Thus far in this hour, I've been sort of handwaving over the topic of directory structure. Full pathnames are sometimes needed to open files, and the readdir function can read directories. But actually navigating directories, adding or removing them, and cleaning them out takes a little bit more Perl.

Navigating Directories

When you run software, your operating system keeps track of what directory you're in when you run the software. When you log in to a Unix machine and run a software package, you are usually placed in your home directory. If you type the operating system command pwd, the shell shows you what directory you are in. If you use MS-DOS or Windows and open a command prompt, the prompt reflects what directory you are in at the time—for example, C:\WINDOWS. Alternatively, you can type the operating system command cd at the MS-DOS prompt, and MS-DOS tells you what directory you are in. The directory that you're currently using is called your current directory or your current working directory.

If you do not specify a full pathname when you try to open a file (for example, open(FH, "file") || die), Perl attempts to open the file in your current working directory. To change your current directory, you can use the Perl chdir function, as follows:

chdir newdir;

The chdir function changes the current working directory to newdir. If the newdir directory does not exist, or you don't have permission to access newdir, chdir returns false. The directory change from chdir is temporary; as soon as your Perl program ends, you return to the directory that you were working in before you ran the Perl program.
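Because chdir signals failure only through its return value, it is worth checking every call. The following is a minimal sketch; the helper name try_chdir is hypothetical and simply wraps the check so that the caller can decide what to do next.

```perl
use strict;
use warnings;

# Hypothetical helper: try to change directory and report success or failure
# instead of dying, so the caller can choose how to react.
sub try_chdir {
    my ($dir) = @_;
    if (chdir $dir) {
        return 1;                            # the change worked
    }
    warn "Could not change to $dir: $!\n";   # $! holds the reason
    return 0;
}
```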

Running the chdir function without a directory as an argument causes chdir to change to your home directory. On a Unix system, the home directory is usually the directory that you were placed in when you logged in. On a Windows 95, Windows NT, or MS-DOS machine, chdir takes you to the directory indicated in the HOME environment variable. If HOME isn't set, chdir doesn't change your current directory at all.

Perl doesn't have a built-in function for figuring out what your current directory is; because of the way some operating systems are written, it's not easy to tell. To find the current directory, you must use two statements together. Somewhere in your program, preferably near the beginning, you must use the statement use Cwd and then, when you want to retrieve the current directory, use the cwd function:

use Cwd;

print "Your current directory is: ", cwd, "\n";
chdir '/tmp' or warn "Directory /tmp not accessible: $!";
print "You are now in: ", cwd, "\n";

You have to execute the use Cwd statement only once; afterward, you can use the cwd function as often as necessary.
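One common use of cwd is to remember where the program started, so that it can return there after working somewhere else. The sketch below assumes a hypothetical helper named in_directory that takes a directory and a code reference; it does not try to restore the directory if the code dies.

```perl
use strict;
use warnings;
use Cwd;

# Hypothetical helper: run $code with $dir as the current directory,
# then return to wherever the program was before.
sub in_directory {
    my ($dir, $code) = @_;
    my $start = cwd;                         # remember the starting directory
    chdir $dir or die "Cannot change to $dir: $!";
    my @result = $code->();                  # do the work inside $dir
    chdir $start or die "Cannot return to $start: $!";
    return @result;
}
```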

Creating and Removing Directories

To create a new directory, you can use the Perl mkdir function. The mkdir function's syntax is as follows:

mkdir newdir, permissions;

The mkdir function returns true if the directory newdir can be created. Otherwise, it returns false and sets $! to the reason that mkdir failed. The permissions are really important only on Unix implementations of Perl. Current versions of Perl let you omit them (the default is 0777), but older versions require them on every platform, so the examples here always supply them. Use the value 0755, which is explained in the section "Unix Stuff" later in this hour; MS-DOS and Windows users can use 0755 as well, which is good enough and will spare you a long explanation.

print "Directory to create?";
my $newdir=<STDIN>;
chomp $newdir;
mkdir( $newdir, 0755 ) || die "Failed to create $newdir: $!";
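Since mkdir fails when the directory already exists, a program that may run more than once often checks first. A minimal sketch, assuming a hypothetical helper named ensure_dir and the same 0755 permissions used above:

```perl
use strict;
use warnings;

# Hypothetical helper: create $dir only if it is not already a directory.
sub ensure_dir {
    my ($dir) = @_;
    return 1 if -d $dir;                     # already there, nothing to do
    return mkdir($dir, 0755);                # false on failure, with $! set
}
```

The caller can still write ensure_dir($newdir) || die "Failed to create $newdir: $!"; to stop on a real failure.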

To remove a directory, you use the rmdir function. The syntax for rmdir is as follows:

rmdir pathname;

The rmdir function returns true if the directory pathname can be removed. If pathname cannot be removed, rmdir returns false and sets $! to the reason that rmdir failed, as shown here:

print "Directory to be removed?";
my $baddir=<STDIN>;
chomp $baddir;
rmdir($baddir) || die "Failed to remove $baddir: $!";

The rmdir function removes only directories that are completely empty. This means that before a directory can be removed, all the files and subdirectories must be removed first.
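The emptying step can be sketched as follows, using the unlink function described under "Removing Files" in this hour. The helper name remove_flat_dir is hypothetical, and the sketch assumes the directory holds only plain files; if it contains subdirectories, rmdir still fails and $! says why.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the plain files in $dir, then $dir itself.
# Assumes $dir has no subdirectories.
sub remove_flat_dir {
    my ($dir) = @_;

    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @files = grep { -f $_ } map { "$dir/$_" } readdir($dh);
    closedir($dh);

    unlink @files;                           # delete the plain files first
    return rmdir($dir);                      # true only if $dir is now empty
}
```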

Removing Files

To remove files from a directory, you use the unlink function:

unlink list_of_files;

The unlink function removes all the files in the list_of_files and returns the number of files removed. If list_of_files is omitted, the file named in $_ is removed. Consider these examples:

unlink ;
$erased=unlink 'old.exe', 'a.out', 'personal.txt';
unlink @badfiles;
unlink; #     Removes the filename in $_

To check whether the list of files was removed, you must compare the number of files you tried to remove with the number of files removed, as in this example:

my @files=;
my $erased=unlink @files;

# Compare actual erased number, to original number
if ($erased != @files) {
    print "Files failed to erase: ",
        join(',', grep { -e $_ } @files), "\n";   # files that still exist
}

In the preceding snippet, the number of files actually erased by unlink is stored in $erased. After the unlink, $erased is compared to the number of elements in @files: They should be the same. If they're not, an error message is printed showing the "leftover" files.
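Counting tells you that something failed, but not why. When the reason matters, one option is to remove the files one at a time and record $! for each failure. This is a minimal sketch; the helper name unlink_each is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: delete files one at a time so that each failure
# can be reported with its own reason from $!.
sub unlink_each {
    my @files = @_;
    my %failed;
    foreach my $file (@files) {
        unlink($file) or $failed{$file} = "$!";   # keep the error text
    }
    return %failed;                          # empty if everything was removed
}
```

The returned hash maps each leftover file name to its error message, so the caller can print both.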

Created: March 27, 2003
Revised: February 3, 2006