Sorting
Sorting in Perl is weird. And wonderful. And occasionally baffling.The sort function gives you two magic variables: $a and $b. They're the items being compared. Your job is to say which one comes first.
That's it. The spaceship operator says "compare these numerically." Perl handles the rest.sort { $a <=> $b } @numbers
Part 1: THE TWO OPERATORS
Perl has two comparison operators for sorting:The return value tells sort which element should come first:OPERATOR TYPE RETURNS -------- -------- -------------------------------- <=> Numeric -1 if $a < $b, 0 if equal, 1 if $a > $b cmp String Same, but compares as strings
You don't have to return exactly -1, 0, 1. Any negative, zero, or positive value works. But these operators give you exactly what sort needs.-1 $a comes before $b 0 They're equal (order doesn't matter) 1 $b comes before $a
Part 2: NUMERIC SORTING
Default sort is alphabetical. Numbers get mangled:Output:my @nums = (1, 20, 3, 100, 15); print join(", ", sort @nums);
That's ASCII order, not numeric. The string "100" comes before "15" because '1' < '1' is false but '0' < '5' is true... wait, that's not right either. It's '1' == '1', then '0' < '5' is true... actually '10' comes before '15' because we compare character by character.1, 100, 15, 20, 3
The point: it's not what you want.
Fix it with the spaceship:
Output:my @nums = (1, 20, 3, 100, 15); print join(", ", sort { $a <=> $b } @nums);
Now we're talking.1, 3, 15, 20, 100
.--. |o_o | |:_/ | // \ \ (| | ) /'\_ _/`\ \___)=(___/
Part 3: DESCENDING ORDER
Swap $a and $b:Or use reverse (less efficient but clearer):sort { $b <=> $a } @numbers # Largest first sort { $b cmp $a } @strings # Z before A
The swap trick is faster. No extra pass through the list.reverse sort { $a <=> $b } @numbers
Part 4: CASE-INSENSITIVE SORTING
Strings sort case-sensitively by default:Output:my @words = qw(apple Banana cherry APRICOT); print join(", ", sort @words);
Uppercase letters have lower ASCII values than lowercase. 'A' is 65, 'a' is 97. So capitals sort first.APRICOT, Banana, apple, cherry
Fix it with lc():
Output:sort { lc($a) cmp lc($b) } @words
We're comparing lowercase versions, but keeping the original case in the output.apple, APRICOT, Banana, cherry
Part 5: THE $a AND $b MYSTERY
Where do $a and $b come from? They're package variables that sort creates for you. No declaration needed.But beware: they're package-scoped, not lexical. This matters if you use strict:
$a and $b are special-cased. Perl doesn't complain about them even under strict.use strict; my @sorted = sort { $a <=> $b } @nums; # Works fine
But don't do this:
Your lexical $a hides the magic one. Chaos ensues.my $a = 5; # Shadows the sort variables! sort { $a <=> $b } @nums; # Broken
Part 6: SORTING BY EXTRACTED KEYS
Here's where it gets interesting. You can sort by anything you can compute from the elements.Sort URLs by their title:
The regex \| (.+)$ captures everything after the pipe. We compare those extracted titles.my @urls = ( 'https://example.com | Example Site', 'https://perl.org | The Perl Foundation', 'https://cpan.org | CPAN', ); my @sorted = sort { my ($title_a) = $a =~ /\| (.+)$/; my ($title_b) = $b =~ /\| (.+)$/; lc($title_a) cmp lc($title_b); } @urls;
Output:
https://cpan.org | CPAN https://example.com | Example Site https://perl.org | The Perl Foundation
Part 7: THE SCHWARTZIAN TRANSFORM
That URL sort has a problem. We're extracting the title twice per comparison. For n elements, sort does O(n log n) comparisons. That's a lot of repeated regex work.Enter the Schwartzian Transform:
Read it bottom to top:my @sorted = map { $_->[0] } sort { $a->[1] cmp $b->[1] } map { [ $_, /\| (.+)$/ ] } @urls;
Each regex runs once per element, not once per comparison. For large lists, this is dramatically faster.1. map: Transform each URL into [$url, $title] 2. sort: Compare by $title (the second element) 3. map: Extract just the URL (first element)
Part 8: BREAKING IT DOWN
Let's trace through the transform:The key insight: compute once, compare many times.INPUT: @urls = ( 'https://example.com | Example Site', 'https://perl.org | The Perl Foundation', 'https://cpan.org | CPAN', ) STEP 1 - First map (bottom): [ ['https://example.com | Example Site', 'Example Site'], ['https://perl.org | The Perl Foundation', 'The Perl Foundation'], ['https://cpan.org | CPAN', 'CPAN'], ] STEP 2 - Sort by index [1]: [ ['https://cpan.org | CPAN', 'CPAN'], ['https://example.com | Example Site', 'Example Site'], ['https://perl.org | The Perl Foundation', 'The Perl Foundation'], ] STEP 3 - Final map (top): ( 'https://cpan.org | CPAN', 'https://example.com | Example Site', 'https://perl.org | The Perl Foundation', )
Part 9: SORTING COMPLEX DATA
Sorting hashes by value:We sort the keys, but compare using the values. $scores{$b} first means descending order (highest score first).my %scores = ( alice => 87, bob => 92, carol => 85, ); my @ranked = sort { $scores{$b} <=> $scores{$a} } keys %scores;
Output:
bob, alice, carol
Part 10: MULTI-KEY SORTING
Sort by multiple criteria with the or operator:First by name, then by age. The or short-circuits: if the name comparison returns non-zero, we're done. Only if names are equal (comparison returns 0) do we check age.my @people = ( { name => 'Alice', age => 30 }, { name => 'Bob', age => 25 }, { name => 'Alice', age => 25 }, ); my @sorted = sort { $a->{name} cmp $b->{name} or $a->{age} <=> $b->{age} } @people;
Part 11: CUSTOM COMPARISON FUNCTIONS
For complex sorts, extract the logic:Note: no block braces when using a named function. And $a/$b are still the magic variables - they just work.sub by_domain { my ($domain_a) = $a =~ m{://([^/]+)}; my ($domain_b) = $b =~ m{://([^/]+)}; return lc($domain_a) cmp lc($domain_b); } my @sorted = sort by_domain @urls;
Part 12: GOTCHAS
Don't modify $a or $b:The comparison function should be pure. No side effects.sort { $a++ <=> $b } # BAD: modifies source array
Don't return non-numbers:
The return value must be numeric. Negative, zero, or positive.sort { 'apple' } # Meaningless
Watch out for undef:
Use the // operator to handle it:sort { $a <=> $b } (1, undef, 3); # Warnings!
sort { ($a // 0) <=> ($b // 0) } @list;
Part 13: THE COMPLETE TOOLKIT
# Numeric ascending sort { $a <=> $b } @nums # Numeric descending sort { $b <=> $a } @nums # String ascending sort { $a cmp $b } @strings # String descending sort { $b cmp $a } @strings # Case-insensitive sort { lc($a) cmp lc($b) } @strings # By hash value sort { $hash{$a} <=> $hash{$b} } keys %hash # Multi-key sort { $a->{x} cmp $b->{x} or $a->{y} <=> $b->{y} } @list # Schwartzian (compute once) map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [$_, expensive_computation($_)] } @list
perl.gg+---------+ | UNSORTED| | 3 1 4 1 | | 5 9 2 6 | +----+----+ | V .-----. / sort \ | <=> | \ / '-----' | V +---------+ | SORTED | | 1 1 2 3 | | 4 5 6 9 | +---------+