Mastering Perl Sorting
In this blog post, we’ll explore a fascinating Perl script that sorts URLs based on their titles. We’ll break down the script, discuss how sorting works in Perl, and provide a simple sorting example for those new to this concept.
The Script
First, let’s look at the full script:
cat URLs.txt | perl -nlE '
$urls->{s~watch/\K[^"]+~$&~r} = $_;
}{
my @sorted = sort {
my ($title_a) = $a =~ m{>([^<]+)</a>};
my ($title_b) = $b =~ m{>([^<]+)</a>};
lc($title_a) cmp lc($title_b);
} values %{$urls};
local $" = qq|\n|;
say qq|@sorted|;
'
Understanding the Script
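Reading the Input: cat URLs.txt pipes the file into perl -nlE. The -n switch wraps the code in a loop that runs once per input line, -l strips the trailing newline from each line (and adds one back on output), and -E runs the code with modern features such as say enabled. The }{ on its own line closes that implicit loop early, so everything after it runs once, after all lines have been read.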
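To make the implicit loop more concrete, here is a rough sketch of what a -nl one-liner containing }{ expands to. The in-memory input string and the @lines_seen array are stand-ins invented for this illustration; they are not part of the original script.

```perl
use strict;
use warnings;
use feature 'say';

# Stand-in for URLs.txt so the sketch is self-contained (assumed demo input).
my $demo_input = "first line\nsecond line\n";
open my $in, '<', \$demo_input or die "cannot open in-memory input: $!";

my @lines_seen;

# -n: loop over every input line.
while (my $line = <$in>) {
    chomp $line;               # -l: drop the trailing newline
    push @lines_seen, $line;   # the code before }{ runs here, once per line
}

# The code after }{ runs here, once, after the loop has finished.
say scalar(@lines_seen), " lines read";
```

In the real one-liner the loop variable is $_ and the input comes from standard input, but the shape is the same: per-line work first, then a single block of work at the end.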
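Storing URLs: The first part of the script populates a hash reference $urls. Each key is a string derived from the current line with a regex substitution, and each value is the original line.
$urls->{s~watch/\K[^"]+~$&~r} = $_;
The /r flag makes the substitution return a modified copy of $_ instead of changing it in place. The pattern matches the part of the URL after watch/ (the \K keeps watch/ itself out of the match), but the replacement is $&, the matched text itself, so the copy comes back identical to the original line. In practice the key is the whole line, which means exact duplicate lines collapse into a single hash entry.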
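A quick way to check what that key expression produces is to run it on one demo line. The sketch below also shows a hypothetical alternative (the $id capture, which is not in the original script) for the case where you want only the ID after watch/ as the key.

```perl
use strict;
use warnings;

my $line = '<li><a href="https://example.com/watch/123">Zebra Link</a></li>';

# Same expression as the script: s///r returns a copy of the whole string,
# and replacing the match with $& (the match itself) changes nothing.
my $key_from_script = $line =~ s~watch/\K[^"]+~$&~r;

# Hypothetical alternative: capture only the part after "watch/".
my ($id) = $line =~ m~watch/([^"]+)~;

print "key equals the whole line\n" if $key_from_script eq $line;
print "captured id: $id\n";   # captured id: 123
```

Keying on the full line removes exact duplicate lines. Keying on the captured ID would instead treat two lines with the same URL but different titles as duplicates.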
Sorting the URLs:
my @sorted = sort { my ($title_a) = $a =~ m{>([^<]+)</a>}; my ($title_b) = $b =~ m{>([^<]+)</a>}; lc($title_a) cmp lc($title_b) } values %{$urls};
This block:
- Extracts titles using regex.
- Converts titles to lowercase to ensure case-insensitive comparison.
- Uses cmp for string comparison.
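The three steps above can also be written as a small standalone script. This is a minimal sketch: the title_of helper and the hard-coded @lines array are names made up for this example, and the demo lines mirror the sample data used in this post.

```perl
use strict;
use warnings;

my @lines = (
    '<li><a href="https://example.com/watch/123">Zebra Link</a></li>',
    '<li><a href="https://example.com/watch/456">Apple Link</a></li>',
    '<li><a href="https://example.com/watch/789">Mango Link</a></li>',
);

# Hypothetical helper: pull the link text out of a line and lowercase it.
# Falls back to an empty string if the line has no <a>...</a> text.
sub title_of {
    my ($line) = @_;
    my ($title) = $line =~ m{>([^<]+)</a>};
    return lc($title // '');
}

# Compare lines by their lowercased titles.
my @sorted = sort { title_of($a) cmp title_of($b) } @lines;

print "$_\n" for @sorted;
```

Moving the extraction into a helper keeps the sort block short and gives one place to handle lines that do not match the pattern.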
Output the Sorted List:
local $" = qq|\n|; say qq|@sorted|;
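The special variable $" is the list separator: the string Perl places between elements when an array is interpolated into a double-quoted string. It defaults to a single space. Setting it to a newline makes say qq|@sorted| print one entry per line.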
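Here is a small sketch of how $" changes array interpolation, next to the equivalent join call. The @items array is demo data chosen for this example.

```perl
use strict;
use warnings;

my @items = ('Apple Link', 'Mango Link', 'Zebra Link');

# Default list separator is a single space.
my $default = "@items";

# Localize $" so the change only applies inside this block.
my $one_per_line;
{
    local $" = "\n";
    $one_per_line = "@items";
}

# join gives the same result without touching a global variable.
my $joined = join "\n", @items;

print "$one_per_line\n";
```

Using local restores the previous separator when the block ends, which matters in larger programs where other code may rely on the default.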
Demo Data
Let’s create a demo URLs.txt:
<li><a href="https://example.com/watch/123">Zebra Link</a></li>
<li><a href="https://example.com/watch/456">Apple Link</a></li>
<li><a href="https://example.com/watch/789">Mango Link</a></li>
When you run the script on this file, it will sort the URLs based on the titles, producing:
<li><a href="https://example.com/watch/456">Apple Link</a></li>
<li><a href="https://example.com/watch/789">Mango Link</a></li>
<li><a href="https://example.com/watch/123">Zebra Link</a></li>
Understanding Perl’s sort
Sorting with Implicit Variables $a and $b
In Perl, the sort function uses two special variables $a and $b for comparison. Here’s a simple example:
my @numbers = (5, 3, 8, 1, 4);
my @sorted_numbers = sort { $a <=> $b } @numbers;
print "@sorted_numbers\n"; # Outputs: 1 3 4 5 8
In this example:
- sort uses a block { $a <=> $b } for numeric comparison.
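- $a and $b hold the two elements being compared at each step; the block must return a negative number, zero, or a positive number depending on whether $a should come before, alongside, or after $b.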
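To show why the comparison block matters, the sketch below sorts the same list three ways. The numbers are demo values picked so that string order and numeric order differ.

```perl
use strict;
use warnings;

my @numbers = (5, 30, 8, 100, 4);

# With no block, sort compares elements as strings.
my @as_strings = sort @numbers;

# <=> compares numerically, smallest first.
my @ascending = sort { $a <=> $b } @numbers;

# Swapping $a and $b reverses the order.
my @descending = sort { $b <=> $a } @numbers;

print "@as_strings\n";   # 100 30 4 5 8
print "@ascending\n";    # 4 5 8 30 100
print "@descending\n";   # 100 30 8 5 4
```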
String Comparison
For string comparison, you use the cmp operator:
my @words = ('apple', 'Mango', 'banana', 'Zebra');
my @sorted_words = sort { lc($a) cmp lc($b) } @words;
print "@sorted_words\n"; # Outputs: apple banana Mango Zebra
Here, lc($a) and lc($b) convert strings to lowercase, ensuring a case-insensitive sort.
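For comparison, here is what happens without lc, along with one optional refinement. The tie-break with or is an addition for illustration, not something the original script does; it gives a stable, predictable order when two strings differ only by case.

```perl
use strict;
use warnings;

my @words = ('apple', 'Mango', 'banana', 'Zebra');

# Plain string sort: uppercase letters sort before lowercase ones.
my @case_sensitive = sort @words;

# Case-insensitive sort, as in the example above.
my @case_insensitive = sort { lc($a) cmp lc($b) } @words;

# Optional tie-break: if the lowercased forms are equal, fall back to cmp.
my @mixed = ('b', 'B', 'a', 'A');
my @with_tiebreak = sort { lc($a) cmp lc($b) or $a cmp $b } @mixed;

print "@case_sensitive\n";     # Mango Zebra apple banana
print "@case_insensitive\n";   # apple banana Mango Zebra
print "@with_tiebreak\n";      # A a B b
```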
Conclusion
Sorting in Perl is powerful and flexible: sort hands each pair of elements to your comparison block through the special variables $a and $b, and the block decides their order. Our example script shows a practical use case, sorting HTML link lines by their link titles. With these concepts in hand, you can sort lists in Perl by whatever criterion you need.
Feel free to experiment with the provided script and demo data. Happy coding!
Copyright ©️ 2024 perl.gg