Wednesday, August 6, 2008

Reading Directories

Globbing
For this exercise, I suggest creating another directory containing at least two text files and two or more binary files. If you need binaries, copy a couple of .dll files from your WINDIR directory; for the text files, any will do. Size doesn't matter in this case.

Then run this, giving the directory as the command line argument:

$dir=shift; # shifts @ARGV, the command line arguments after the script name

chdir $dir or die "Can't chdir to $dir:$!\n" if $dir;

while (<*>) {
    print "Found a file: $_\n" if -T;
}


The chdir function changes perl's working directory. You should, as ever, test whether it worked. In this case we only try to change directory if $dir is true, that is, if an argument was given.

The <*> construct returns the name of each file in the current directory in turn, placing it in $_. The name is printed if it passes the file test -T, which returns true if the file looks like a text (i.e. non-binary) file. The test is an educated guess based on the start of the file's contents. You can be more specific:

$dir =shift;
$type='txt';

chdir $dir or die "Can't chdir to $dir:$!\n" if $dir;

while (<*.$type>) {
    print "Found a file: $_\n";
}

like so. But there is a better way to read from directories, because the method above is rather slow and inflexible.
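The angle-bracket form is shorthand for Perl's built-in glob function. As a minimal sketch (the helper name text_files_in is made up for illustration, and it assumes the directory path contains no spaces, since glob splits its pattern on whitespace), the same search can be written with an explicit glob call and no chdir:

```perl
use strict;
use warnings;

# Hypothetical helper: names of the text files in $dir that match $pattern.
sub text_files_in {
    my ($dir, $pattern) = @_;
    $pattern = '*' unless defined $pattern;
    # glob in list context returns every match at once. The pattern is
    # given relative to $dir, so the results carry the directory prefix
    # and the file tests work without a chdir.
    return grep { -f $_ && -T $_ } glob("$dir/$pattern");
}

my $dir = shift || '.';
print "Found a file: $_\n" for text_files_in($dir, '*.txt');
```

Because the names come back with the directory attached, they can be passed straight to file tests or open.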

readdir : How to read from directories
Instead, there is readdir . Another version of the previous example:
$dir= shift || '.';

opendir DIR, $dir or die "Can't open directory $dir: $!\n";

while ($file= readdir DIR) {
    print "Found a file: $file\n";
}

The first difference is the first line, which essentially says that if shift returns a false value (for example, because no argument was given), then $dir is set to '.', which is of course the current directory. Then the directory is opened, which gives us the chance to trap the error, and it is associated with a directory handle, DIR. The readdir function reads each file name into $file in turn. There is no angle-bracket construct in the while condition this time.

We can also apply the text file test. Run this twice: once without giving a directory, and once giving the path of a directory other than the one you run the script from:
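For comparison, here is a minimal sketch of the same loop written for use strict, with a lexical directory handle instead of the bareword DIR. The variable names $dh and @found are my own choices, not part of the original example:

```perl
use strict;
use warnings;

my $dir = shift || '.';

# A lexical handle ($dh) is an ordinary variable, so it cannot clash
# with a handle of the same name elsewhere in the program.
opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";

# The explicit "defined" keeps the loop going even if an entry is
# literally named "0", which would otherwise count as false.
my @found;
while (defined(my $file = readdir $dh)) {
    push @found, $file;
    print "Found a file: $file\n";
}
closedir $dh;
```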

$dir= shift || '.';

opendir DIR, $dir or die "Can't open directory $dir: $!\n";

while ($file= readdir DIR) {
    print "Found a file: $file\n" if -T $file ;
}

Firstly, because the filename is now not in $_ we have to explicitly apply the -T test to it with -T $file.

Why did this not work the second time? Look at the code carefully. You are testing $file, and readdir returns bare file names without the directory. If perl doesn't get a fully qualified pathname, it looks in the current working directory: the one the script was run from, or that of the last successful chdir. That is not necessarily the directory you are readdir'ing from. So, to fix it:


print "Found a file: $dir/$file\n" if -T "$dir/$file" ;


where we now specify the pathname, both in the printout and in the file test itself. The "" are used because otherwise perl tries to divide $dir by $file.
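Joining the two parts with a literal / works on most systems, but the core File::Spec module can build the path with the right separator for the platform. The following is a sketch only; the helper name text_paths is hypothetical:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: full paths of the text files directly inside $dir.
sub text_paths {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    my @paths;
    while (defined(my $file = readdir $dh)) {
        # Join directory and entry name with this OS's separator.
        my $path = File::Spec->catfile($dir, $file);
        # -f skips directories (including . and ..) before the -T guess.
        push @paths, $path if -f $path && -T $path;
    }
    closedir $dh;
    @paths = sort @paths;
    return @paths;
}

print "Found a file: $_\n" for text_paths(shift || '.');
```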

Try running this on a directory with only a few files in it:

$dir= shift || '.';

opendir DIR, $dir or die "Can't open directory $dir: $!\n";

while ($file= readdir DIR) {
    print "Found a file: '$file'\n";
}

Notice that two files are found which have interesting names, namely . and .. . These two entries are the current directory and its parent directory respectively. Nothing new, they have always been there -- run the DOS command dir if you don't believe me. You don't usually want to know about them, so:
while ($file= readdir DIR) {
    next if $file=~/^\./;
    print "Found a file: '$file'\n";
}

is the usual workaround. You can use list context to dump everything into an array in one go:
$dir= shift || '.';

opendir DIR, $dir or die "Can't open directory $dir: $!\n";

@files=readdir(DIR);

print "@files";

but that includes the . and .. entries, so it is best to ensure they aren't included:
@files=grep !/^\./, readdir(DIR);

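We haven't met grep yet, but for the moment just remember it searches a list and, if its test returns true for an element, lets that element pass. In this case, if the name doesn't begin with . then the test is true, so it goes into @files. Be aware that this pattern also filters out any other file whose name begins with a dot.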
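If you want to keep hidden files and drop only the two special entries, compare the names exactly rather than matching a leading dot. A minimal sketch, with a hypothetical helper called entries:

```perl
use strict;
use warnings;

# Hypothetical helper: every entry in $dir except . and ..
sub entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    # grep keeps each element for which the block returns true;
    # readdir in list context supplies all the remaining entries.
    my @files = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @files;
}

my @files = entries(shift || '.');
print "@files\n";
```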

There are other commands associated with reading directories: telldir tells you where in a directory you are, seekdir returns you to such a position, and rewinddir takes you back to the start. You should be aware of their existence, because you never know when you might need them. The one other command of use is closedir, which closes a directory. Optional, but recommended for clarity.
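As a rough illustration of those functions working together (the variable names are mine, and it assumes nothing else changes the directory while the script runs), this sketch reads the same directory three times through one handle:

```perl
use strict;
use warnings;

my $dir = shift || '.';
opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";

my $start = telldir $dh;         # remember the current position
my @first_pass = readdir $dh;    # list context: read to the end

seekdir $dh, $start;             # go back to the remembered position
my @second_pass = readdir $dh;

rewinddir $dh;                   # or simply jump back to the beginning
my @third_pass = readdir $dh;

closedir $dh;
printf "Each pass saw %d entries\n", scalar @first_pass;
```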
