Perl Sorting with a bit more efficiency

I’ve modified the script from the last post to illustrate how many times the get_value function is called.

#! /usr/bin/perl -w

use strict;

my %letter_vals;
my $counter = 1;
my $function_calls = 0;

# function to calculate the 'value' of the name
sub get_value{
	my $data = shift;
	my $running_total;
	my @letters = split(//, $data);
	for (@letters){
		$running_total += $letter_vals{$_};
	}
	$function_calls ++;
	return $running_total;
}

# load up the 'lookuphash'
for ("a" .. "z"){
	$letter_vals{$_} = $counter;
	$counter ++;
}

# define the initail list of names
my @unsorted= qw /aitch babs dan cath nads japes max bill lotty ellie izzy/;

# sort based on the name 'value'
my @sorted = sort {&get_value($a) <=> &get_value($b)} @unsorted;

#  add the score next to the name
@sorted = map {$_, "(" . &get_value($_) .")"} @sorted;

# proudy display the results
print "\nResult: @sorted\nCalled function $function_calls times\n";

The output is as follows:

Result: dan (19) babs (24) cath (32) bill (35) nads (38) max (38) aitch (41) ellie (43) japes (51) izzy (86) lotty (92)
Called function 67 times

So I’m using the ‘Schwartzian Transform’ from the ‘Alpaca’, the transform is named after Randal L. Schwartz who wrote the Alpaca book -hmmmmm, coincidence? This next script is a further modification showing the steps involved, we’ll do the ‘all-in-one’ method at the end.

#! /usr/bin/perl -w

use strict;
use Data::Dumper;

my %letter_vals;
my $counter = 1;
my $function_calls = 0;

# function to calculate the 'value' of the name
sub get_value{
	my $data = shift;
	my $running_total;
	my @letters = split(//, $data);
	for (@letters){
		$running_total += $letter_vals{$_};
	}
	$function_calls ++;
	return $running_total;
}

# load up the 'lookuphash'
for ("a" .. "z"){
	$letter_vals{$_} = $counter;
	$counter ++;
}

# define the initail list of names
my @unsorted= qw /aitch babs dan cath nads japes max bill lotty ellie izzy/;

my @mapped = map {[$_, &get_value($_)]} @unsorted;

# curiosity satisfaction!
print Dumper(@mapped);

# sort based on the name 'value'
my @sorted = sort { $a->[1]  $b->[1] } @mapped;

#  add the score next to the name
@sorted = map {$_->[0], "(" . $_->[1] . ")"} @sorted;

# proudy display the results
print "\nResult: @sorted\nCalled function $function_calls times\n";

running the code produces:

$VAR1 = [
          'aitch',
          41
        ];
$VAR2 = [
          'babs',
          24
        ];
$VAR3 = [
          'dan',
          19
        ];
$VAR4 = [
          'cath',
          32
        ];
$VAR5 = [
          'nads',
          38
        ];
$VAR6 = [
          'japes',
          51
        ];
$VAR7 = [
          'max',
          38
        ];
$VAR8 = [
          'bill',
          35
        ];
$VAR9 = [
          'lotty',
          92
        ];
$VAR10 = [
           'ellie',
           43
         ];
$VAR11 = [
           'izzy',
           86
         ];

Result: dan (19) babs (24) cath (32) bill (35) nads (38) max (38) aitch (41) ellie (43) japes (51) izzy (86) lotty (92)
Called function 11 times

Wow! Notice how many times the function is called now? It was 67 times, now its 11 times (which is one lookup per item in the list). On my earlier project of sorting the sendmail virtusertable this will reduce the lookups from circa 25000 to 1300 so a big reduction and hopeful it will pickup speed when we test it against the ‘shell’ script from “El PerlHater”.

How it works

my @mapped = map {[$_, &get_value($_)]} @unsorted;

This is the key to whole operation, the map command takes each item in the array and makes  a new array of arrays (there are n number of 2 element arrays) which are displayed in the output from the Dump command.

my @sorted = sort { $a->[1]  $b->[1] } @mapped;

Now the array is sorted using a reference to the second item in the the newly created 2 item array which is the number returned from the get_value()  function. Look at the output from Dump in the results section for the structure.

@sorted = map {$_->[0], "(" . $_->[1] . ")"} @sorted;

Lastly the map function is used again to slot the the references back into an array with the scores next to them.

The All-In-One Method

#! /usr/bin/perl -w

use strict;

my %letter_vals;
my $counter = 1;
my $function_calls = 0;

sub get_value{
	my $data = shift;
	my $running_total;
	my @letters = split(//, $data);
	for (@letters){
		$running_total += $letter_vals{$_};
	}
	$function_calls ++;
	return $running_total;
}

for ("a" .. "z"){
	$letter_vals{$_} = $counter;
	$counter ++;
}

my @unsorted= qw /aitch babs dan cath nads japes max bill lotty ellie izzy/;

my @sorted = map {$_->[0], "(" . $_->[1] . ")"} sort { $a->[1]  $b->[1] } map {[$_, &get_value($_)]} @unsorted;

print "\nResult: @sorted\nCalled function $function_calls times\n";

So this outputs the same data but is more compact as the steps of the transformation are all rolled together, while shorter I’m not sure that it enhances the maintainability but it will certainly reduce wear on the ‘@’ key.

This entry was posted in Perl and tagged , , . Bookmark the permalink.

Leave a Reply