In this blog post, we'll explore a fascinating Perl script that sorts URLs based on their titles. We'll break down the script, discuss how sorting works in Perl, and provide a simple sorting example for those new to this concept.
First, let's look at the full script:
cat URLs.txt | perl -nlE '
$urls->{s~watch/\K[^"]+~$&~r} = $_;
}{
my @sorted = sort {
my ($title_a) = $a =~ m{>([^<]+)</a>};
my ($title_b) = $b =~ m{>([^<]+)</a>};
lc($title_a) cmp lc($title_b);
} values %{$urls};
local $" = qq|\n|;
say qq|@sorted|;
'
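Reading the Input: The command cat URLs.txt | perl -nlE pipes URLs.txt into Perl. The -n switch loops over the input one line at a time, -l strips the trailing newline from each line, and -E runs the quoted code with features such as say enabled.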
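For readers unfamiliar with -n, here is a rough sketch of what the one-liner expands to when written as an ordinary script. The }{ in the script closes the implicit loop early, so everything after it runs once, after the last line has been read. The two sample lines and the in-memory filehandle are assumptions for illustration; the real command reads from standard input.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sample input, standing in for the contents of URLs.txt.
my $input = join "\n",
    '<li><a href="https://example.com/watch/2">beta</a></li>',
    '<li><a href="https://example.com/watch/1">Alpha</a></li>';

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my $urls = {};
while (my $line = <$fh>) {    # -n: loop over every input line
    chomp $line;              # -l: strip the trailing newline
    $urls->{ $line =~ s~watch/\K[^"]+~$&~r } = $line;    # loop body
}                             # the "}" half of "}{" ends the loop here

# Everything after "}{" runs once, when all input has been consumed.
my @sorted = sort {
    my ($title_a) = $a =~ m{>([^<]+)</a>};
    my ($title_b) = $b =~ m{>([^<]+)</a>};
    lc($title_a) cmp lc($title_b);
} values %{$urls};

say for @sorted;
```

Written this way, the script prints the Alpha line before the beta line, just as the one-liner would.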
Storing URLs: The first part of the script populates a hash reference $urls, where each key is the result of a substitution expression and each value is the original line.
$urls->{s~watch/\K[^"]+~$&~r} = $_;
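This line runs a substitution with the /r modifier, which returns a modified copy of the line instead of changing $_. Because the text matched after watch/ is replaced with itself ($&), that copy is identical to the original line. In effect the key is the whole line and the value is the same line, so storing lines in the hash collapses exact duplicates.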
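A small check makes it easier to see what the key expression evaluates to. The sketch below uses one hypothetical sample line; the $id capture is an alternative added here for comparison and is not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample line in the same shape as the demo data.
my $line = '<li><a href="https://example.com/watch/123">Zebra Link</a></li>';

# Key as computed by the script: /r returns the modified copy, and the
# match is replaced by itself ($&), so the copy equals the original line.
my $key = $line =~ s~watch/\K[^"]+~$&~r;

# Alternative (an assumption, not the original script): capture only the
# part after "watch/" so that the key would be the ID itself.
my ($id) = $line =~ m~watch/([^"]+)~;

print "key is the whole line\n" if $key eq $line;
print "id: $id\n";
```

If the goal were to keep one entry per video ID rather than per identical line, keying the hash on $id would be one way to do it.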
Sorting the URLs:
my @sorted = sort {
my ($title_a) = $a =~ m{>([^<]+)</a>};
my ($title_b) = $b =~ m{>([^<]+)</a>};
lc($title_a) cmp lc($title_b);
} values %{$urls};
This block:
Extracts titles using regex.
Converts titles to lowercase to ensure case-insensitive comparison.
Uses cmp for string comparison.
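The sort block above runs its two regex matches on every comparison. As a possible variation, not taken from the original script, the titles can be extracted once up front and looked up during the sort. The %title_of hash and the sample lines are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample lines, similar to the demo data.
my @lines = (
    '<li><a href="https://example.com/watch/123">Zebra Link</a></li>',
    '<li><a href="https://example.com/watch/456">apple link</a></li>',
    '<li><a href="https://example.com/watch/789">Mango Link</a></li>',
);

# Extract each title once; fall back to an empty string if nothing matches.
my %title_of;
for my $line (@lines) {
    my ($title) = $line =~ m{>([^<]+)</a>};
    $title_of{$line} = lc($title // '');
}

# The comparison is now a plain hash lookup on precomputed lowercase titles.
my @sorted = sort { $title_of{$a} cmp $title_of{$b} } @lines;

print "$_\n" for @sorted;
```

The empty-string fallback also avoids an "uninitialized value" warning for lines that contain no link, which then sort to the front.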
Output the Sorted List:
local $" = qq|\n|;
say qq|@sorted|;
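The special variable $" is the separator Perl places between array elements when an array is interpolated into a double-quoted string. Here it is set to a newline, so the sorted list is printed one entry per line.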
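To illustrate the effect of $" in isolation, here is a minimal sketch with a made-up @items array. It also shows why local is used: the change applies only inside the enclosing block.

```perl
use strict;
use warnings;

my @items = ('one', 'two', 'three');

my $default = "@items";    # default separator is a single space

my $joined;
{
    local $" = "\n";       # change the separator for this block only
    $joined = "@items";
}

my $restored = "@items";   # back to the default outside the block

print "$default\n$joined\n$restored\n";
```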
Let's create a demo URLs.txt:
<li><a href="https://example.com/watch/123">Zebra Link</a></li>
<li><a href="https://example.com/watch/456">Apple Link</a></li>
<li><a href="https://example.com/watch/789">Mango Link</a></li>
When you run the script on this file, it will sort the URLs based on the titles, producing:
<li><a href="https://example.com/watch/456">Apple Link</a></li>
<li><a href="https://example.com/watch/789">Mango Link</a></li>
<li><a href="https://example.com/watch/123">Zebra Link</a></li>
In Perl, the sort function uses two special variables, $a and $b, for comparison. Here's a simple example:
my @numbers = (5, 3, 8, 1, 4);
my @sorted_numbers = sort { $a <=> $b } @numbers;
print "@sorted_numbers\n"; # Outputs: 1 3 4 5 8
In this example:
sort uses a block { $a <=> $b } for numeric comparison.
$a and $b represent the pair of elements currently being compared.
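The explicit block matters because sort without one compares elements as strings. A short sketch with made-up numbers shows the difference, along with a descending sort obtained by swapping $a and $b:

```perl
use strict;
use warnings;

my @numbers = (10, 9, 2, 33, 1);

my @as_strings = sort @numbers;                  # default: string comparison
my @ascending  = sort { $a <=> $b } @numbers;    # numeric, smallest first
my @descending = sort { $b <=> $a } @numbers;    # numeric, largest first

print "@as_strings\n";    # 1 10 2 33 9
print "@ascending\n";     # 1 2 9 10 33
print "@descending\n";    # 33 10 9 2 1
```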
For string comparison, you use the cmp operator:
my @words = ('apple', 'Mango', 'banana', 'Zebra');
my @sorted_words = sort { lc($a) cmp lc($b) } @words;
print "@sorted_words\n"; # Outputs: apple banana Mango Zebra
Here, lc($a) and lc($b) convert the strings to lowercase before comparing them, ensuring a case-insensitive sort.
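When two strings differ only in case, lc makes them compare as equal, so their relative order is left to the sort. One way to make the result predictable, shown here as a sketch with made-up words, is to chain a second, case-sensitive comparison with or:

```perl
use strict;
use warnings;

my @words = ('banana', 'apple', 'Banana', 'Apple');

# Compare case-insensitively first; if that is a tie (cmp returns 0),
# fall back to a plain case-sensitive comparison.
my @sorted_words = sort { lc($a) cmp lc($b) or $a cmp $b } @words;

print "@sorted_words\n";    # Apple apple Banana banana
```

Uppercase letters come before lowercase letters in ASCII, which is why Apple is placed ahead of apple by the tie-breaker.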
Sorting in Perl is powerful and flexible, relying on the special variables $a and $b for comparison. Our example script demonstrates a practical use case: sorting URLs based on their link titles. With these concepts in hand, you can sort lists in Perl by whatever criterion your data requires.
Feel free to experiment with the provided script and demo data. Happy coding!