Wednesday, August 6, 2008

Sorting

A Simple Sort
If I was reading this I'd be wondering about sorting. Wonder no more, and behold:

foreach (sort keys %countries) {
print "The key $_ contains $countries{$_}\n";
}

Spot the difference. Yes, sort crept in there. If you want the list sorted backwards, some cunning is called for. This is suitably foxy:
foreach (reverse sort keys %countries) {
print "The key $_ contains $countries{$_}\n";
}

Perl is just so difficult at times, don't you think ? This works because:
keys returns a list
sort expects a list -- and gets one from keys , and sorts it
reverse also expects a list, so it gets one and returns it
then the whole list is foreach 'd over.
This is a quick example to make sure the meaning of reverse is clear:
print "Enter string to be reversed: ";
$input=;

@letters=split //,$input; # splits on the 'nothings' in between each character of $input

print join ":", @letters; # joins all elements of @letters with \n, prints it
print reverse @letters; # prints all of @letters, but sdrawkcab )-:

Perl's list operators can just feed directly to each other, saving many lines of code but also decreasing readability to those that aren't Perl-literate:
print "Enter string to be reversed: ";
print join ":",reverse split //,$_=;

This section is about sorting, so enough of reverse . Time to go forwards instead.



Numeric Sorting -- How Sort Really Works
That's easy alphabetical sorting by the keys. If you had a hash of international access numbers like this one:

%countries=('976','Mongolia','52','Mexico','212','Morocco','64','New Zealand','33','France');

foreach (sort keys %countries) {
print "The key $_ contains $countries{$_}\n";
}

You might want to sort numerically. In that case, you need to understand how Perl's sort function works.

The sort function compares two variables, $a and $b . They must be called $a and $b otherwise it won't work. One chap published a book with stolen code, and he changed $a and $b to $x and $y. He obviously didn't test the program because it would have failed and he would have noticed. And this book was really published ! Don't believe everything you read in books -- but web tutorials are always 100% truthful :-)

Back to sorting. $a and $b are compared, and the result is:

1 if $a is greater than $b
-1 if $b is greater than $a
0 if $a and $b are equal
So as long as the sort function gets one of those three values back it is happy. This means we can write our own sort routines, and feed them to sort. For example, we know the default sort is alphabetical. But if we write this:
%countries=('976','Mongolia','52','Mexico','212','Morocco','64','New Zealand','33','France');

foreach (sort supersort keys %countries) {
print "$_ $countries{$_}\n";
}

sub supersort {
if ($a > $b) {
return 1;
} elsif ($a < $b) {
return -1;
} else {
return 0;
}
}

then it works correctly. Of course, there is an easier way. The 'spaceship' operator <=> . It does exactly what the supersort subroutine does, namely return 1, -1 or 0 depending on the comparison of two given values.
So we can write the above much more easily as:

%countries=('976','Mongolia','52','Mexico','212','Morocco','64','New Zealand','33','France');

foreach (sort { $a <=> $b } keys %countries) {
print "$_ $countries{$_}\n";
}

Notice the { } braces, which define the contents as the subroutine sort must use. Pretty short subroutine. There is a companion operator to <=> , namely cmp which does exactly the same thing but of course compares the values as strings, not numbers.Remember, if you are comparing numbers, your comparison operator should contain non-alphas, if you are comparing strings the operator should contains alphas only. And don't talk to strangers.

Anyway, you now have enough knowledge to sort a hash by value instead of keys. Suppose your pointy haired manager bounced up to you and demanded a hash sorted by value ? What would you do ? OK, what should you do ?

Well, we could just sort the values.

foreach (sort values %countries) {

But Pointy Hair wants the keys too. And if you have a value you can't find the key.
So we have to iterate over the keys. But just because we are iterating over the keys doesn't mean to say we have to hand the keys over to sort . What about:

%countries=('976','Mongolia','52','Mexico','212','Morocco','64','New Zealand','33','France');

foreach (sort { $countries{$a} cmp $countries{$b} } keys %countries) {
print "$_ $countries{$_}\n";
}

beautifully simple. If you want a reverse sort transpose $a and $b .



Sorting Multiple Lists
You can sort several lists at the same time:

%countries=('976','Mongolia','52','Mexico','212','Morocco','64','New Zealand','33','France');
@nations=qw(China Hungary Japan Canada Fiji);

@sorted= sort values %countries, @nations;

foreach (@nations, values %countries) {
print "$_\n";
}

print "#----\n";

foreach (@sorted) {
print "$_\n";
}

This sorts @nations and the values from %countries into a new array.

The example also demonstrates that you can foreach over more than one list value -- each list is processed in turn. How I discovered that particular trick with Perl is instructive. I just tried it. If you think you should be able to do something with Perl, try it. Adhere to the syntax and conventions you will be familiar with from experience, in this case delimiting a list with commas, and try it. I'm always finding new shortcuts just by experimentation.

No comments: