In this blog post, we'll explore a fascinating Perl script that sorts URLs based on their titles. We'll break down the script, discuss how sorting works in Perl, and provide a simple sorting example for those new to this concept.
First, let's look at the full script:
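    cat URLs.txt | perl -nlE '
        $urls->{s~watch/\K[^"]+~$&~r} = $_;
    }{
        my @sorted = sort {
            my ($title_a) = $a =~ m{>([^<]+)</a>};
            my ($title_b) = $b =~ m{>([^<]+)</a>};
            lc($title_a) cmp lc($title_b);
        } values %{$urls};
        local $" = qq|\n|;
        say qq|@sorted|;'

Reading the Input: The command cat URLs.txt | perl -nlE pipes URLs.txt into Perl. The -n switch wraps the code in a loop that reads one line at a time into $_, -l strips the trailing newline from each line, and -E runs the code with modern features such as say enabled. The }{ in the middle closes that implicit loop, so everything after it runs once, after the last line has been read.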
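For readers who prefer a script file to a one-liner, here is a rough sketch of what the -n switch and the }{ trick amount to: an explicit read loop, followed by code that runs once the input is exhausted. The in-memory sample lines and the %seen hash are assumptions made for illustration; the original one-liner reads URLs.txt from standard input instead.

```perl
use strict;
use warnings;
use feature 'say';

# Sample input held in memory so the sketch runs on its own (assumed data).
my $input = join "\n",
    '<li><a href="https://example.com/watch/123">Zebra Link</a></li>',
    '<li><a href="https://example.com/watch/456">Apple Link</a></li>',
    '';
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my %seen;
# Roughly what -n does: loop over the input one line at a time.
while (my $line = <$fh>) {
    chomp $line;              # -l removes the trailing newline
    $seen{$line} = $line;     # loop body: the code that comes before }{
}
close $fh;

# Roughly what follows }{ : runs once, after the last line has been read.
my $count = keys %seen;
say "Read $count distinct lines";
```

Writing the loop out by hand is longer, but it makes the two phases (collect, then report) easier to see than the }{ idiom does.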
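Storing URLs: The first part of the script populates a hash reference $urls. Each value is the original line, and each key is whatever the substitution expression returns.

    $urls->{s~watch/\K[^"]+~$&~r} = $_;

The s~...~...~r substitution runs against the current line and, because of the r flag, returns the resulting string instead of modifying $_. (Without r it would return the number of substitutions, and every line would land under the same key.) The \K keeps everything up to and including watch/ out of the match, and the replacement $& puts the matched text straight back, so the returned string is the line itself. In practice the key is therefore the whole line, and identical input lines collapse into a single hash entry. The value is the entire line.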
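A quick way to check what key that substitution produces is to run it on a single line. The sketch below assumes one sample line from the demo data; the m~...~ capture at the end is an alternative added for illustration and is not part of the original script.

```perl
use strict;
use warnings;
use feature 'say';

my $line = '<li><a href="https://example.com/watch/123">Zebra Link</a></li>';

# s///r returns a modified copy; replacing the match with $& changes nothing.
my $key = $line =~ s~watch/\K[^"]+~$&~r;
my $verdict = $key eq $line ? 'key is the whole line' : 'key differs';
say $verdict;

# Alternative (an assumption, not the original script): capture just the ID.
my ($id) = $line =~ m~watch/([^"]+)~;
say "video id: $id";
```

Keying on the captured ID instead would de-duplicate by video rather than by exact line, which may or may not be what you want.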
Sorting the URLs:
    my @sorted = sort {
        my ($title_a) = $a =~ m{>([^<]+)</a>};
        my ($title_b) = $b =~ m{>([^<]+)</a>};
        lc($title_a) cmp lc($title_b);
    } values %{$urls};

This block:
Extracts titles using regex.
Converts titles to lowercase to ensure case-insensitive comparison.
Uses cmp for string comparison.
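The sort block above re-runs both regexes on every comparison. As a sketch of one common alternative, the titles can be extracted once per line and sorted on the precomputed key (often called a Schwartzian transform). The title_key helper and the sample lines are hypothetical, introduced only for this example; the helper also falls back to an empty string when a line has no link text, which the original block does not do.

```perl
use strict;
use warnings;
use feature 'say';

my @lines = (
    '<li><a href="https://example.com/watch/123">Zebra Link</a></li>',
    '<li><a href="https://example.com/watch/456">apple Link</a></li>',
    '<li><a href="https://example.com/watch/789">Mango Link</a></li>',
);

# Hypothetical helper: returns the lowercased link text, or '' if none is found.
sub title_key {
    my ($line) = @_;
    my ($title) = $line =~ m{>([^<]+)</a>};
    return lc($title // '');
}

# Decorate each line with its key, sort on the key, then strip the key again.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { [ title_key($_), $_ ] }
    @lines;

say for @sorted;
```

For a handful of links the difference is negligible; the pattern mainly pays off when the list is long or the key is expensive to compute.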
Output the Sorted List:
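    local $" = qq|\n|;
    say qq|@sorted|;

The special variable $" is the separator Perl inserts between elements when an array is interpolated into a double-quoted string. Here it is set to a newline, so the sorted list is printed one entry per line.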
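To make the effect of $" concrete, the sketch below interpolates the same array with the default separator and with a newline, and compares the result with join. The three titles are placeholder data assumed for the example.

```perl
use strict;
use warnings;
use feature 'say';

my @sorted = ('Apple Link', 'Mango Link', 'Zebra Link');

# The default list separator is a single space.
my $spaced = "@sorted";

my $one_per_line;
{
    local $" = "\n";            # the change applies only inside this block
    $one_per_line = "@sorted";
}

# join gives the same result without touching a global variable.
my $joined = join "\n", @sorted;

say $one_per_line;
my $same = $one_per_line eq $joined ? 'same as join' : 'different';
say $same;
```

In a one-liner, local $" is compact; in longer programs, join is usually easier to read because the separator sits right next to the list it applies to.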
Let's create a demo URLs.txt:
    <li><a href="https://example.com/watch/123">Zebra Link</a></li>
    <li><a href="https://example.com/watch/456">Apple Link</a></li>
    <li><a href="https://example.com/watch/789">Mango Link</a></li>
When you run the script on this file, it will sort the URLs based on the titles, producing:
    <li><a href="https://example.com/watch/456">Apple Link</a></li>
    <li><a href="https://example.com/watch/789">Mango Link</a></li>
    <li><a href="https://example.com/watch/123">Zebra Link</a></li>
In Perl, the sort function uses two special variables $a and $b for comparison. Here's a simple example:
    my @numbers = (5, 3, 8, 1, 4);
    my @sorted_numbers = sort { $a <=> $b } @numbers;
    print "@sorted_numbers\n"; # Outputs: 1 3 4 5 8

In this example:
sort uses a block { $a <=> $b } for numeric comparison.
$a and $b hold the two elements being compared at each step of the sort.
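The comparison block matters more than it may seem: without one, sort compares elements as strings. The sketch below, using an assumed list of numbers, shows the default behaviour next to ascending and descending numeric sorts; swapping $a and $b is all it takes to reverse the order.

```perl
use strict;
use warnings;

my @numbers = (10, 9, 2, 33);

my @as_strings = sort @numbers;                 # default: string comparison
my @ascending  = sort { $a <=> $b } @numbers;   # numeric, smallest first
my @descending = sort { $b <=> $a } @numbers;   # numeric, largest first

print "@as_strings\n";   # 10 2 33 9
print "@ascending\n";    # 2 9 10 33
print "@descending\n";   # 33 10 9 2
```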
For string comparison, you use the cmp operator:
    my @words = ('apple', 'Mango', 'banana', 'Zebra');
    my @sorted_words = sort { lc($a) cmp lc($b) } @words;
    print "@sorted_words\n"; # Outputs: apple banana Mango Zebra

Here, lc($a) and lc($b) convert strings to lowercase, ensuring a case-insensitive sort.
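A case-insensitive comparison treats 'Apple' and 'apple' as equal, so it says nothing about which of the two should come first. One way to pin that down, sketched here with an assumed word list, is to chain a second comparison with or: it is only consulted when the first one returns 0.

```perl
use strict;
use warnings;

my @words = ('banana', 'Apple', 'apple', 'Banana');

# Primary key: case-insensitive order. Tie-breaker: plain string order,
# which places uppercase ASCII letters before lowercase ones.
my @sorted_words = sort { lc($a) cmp lc($b) or $a cmp $b } @words;

print "@sorted_words\n";   # Apple apple Banana banana
```

The same chaining works for any number of keys, for example sorting link lines by title first and by URL second.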
Sorting in Perl is powerful and flexible: the sort function hands each pair of elements to your comparison block through the special variables $a and $b, and the block decides their order. Our example script puts this to practical use by sorting URLs according to their link titles. With these concepts in hand, you can sort most lists in Perl.
Feel free to experiment with the provided script and demo data. Happy coding!