perl.gg / hidden-gems

$& $` $' - The Match Variable Performance Curse

2026-05-04

Three variables. If you use any one of them, anywhere in your entire program, every single regex in your entire program gets slower.

Not just the regex near the variable. Every regex. In every module. In every library you loaded. All of them. Slower. Because you typed $& one time.

    use Some::Module;       # has 200 regexes internally
    use Another::Module;    # has 150 more

    # somewhere in YOUR code:
    my $match = $&;         # congratulations, you just slowed down
                            # 350 regexes that aren't yours

This is the match variable performance curse. It is one of the most infamous gotchas in Perl's history. It affected real production systems for decades. And it is still lurking in codebases that have not been updated.

Part 1: WHAT THE VARIABLES CONTAIN

The three match variables capture parts of the string involved in the most recent successful regex match:

    my $string = "Hello, World!";
    $string =~ m~World~;

    say $`;    # "Hello, " (prematch - everything before the match)
    say $&;    # "World"   (match - the matched substring)
    say $';    # "!"       (postmatch - everything after the match)

     H e l l o ,   W o r l d !
    |_____________|_________|_|
          $`          $&     $'
       prematch      match     postmatch

Together, they let you reconstruct the full original string: $` . $& . $' equals the original. They give you complete context about where in the string the match occurred.
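
You can check the round trip yourself. This is a minimal sketch that uses the old variables on purpose, so treat it as a throwaway script rather than something to paste into code that runs on an older Perl:

```perl
use strict;
use warnings;

my $string  = "Hello, World!";
my $rebuilt = '';

# Only read the match variables after a successful match;
# after a failed match they still describe the previous success.
if ($string =~ m~World~) {
    $rebuilt = $` . $& . $';
}

print $rebuilt eq $string ? "round trip ok\n" : "mismatch\n";
```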

Handy, right? Sure. If you do not care about performance.

Part 2: THE GLOBAL PERFORMANCE PENALTY

Here is the horrifying part. Perl's regex engine is optimized to avoid unnecessary work. Normally, when a regex matches, Perl only needs to know where it matched and what the capture groups contain. It does not need to copy the parts of the string before and after the match.

But if $&, $`, or $' might be accessed, Perl has to compute all three for every match. That means copying substrings of the target string on every successful regex. For long strings, that is a lot of copying.

The key word is "might." Perl checks at compile time whether any code in the entire program uses these variables. If any code does, the engine activates the expensive path for every regex operation. Not just the regexes near the variable usage. Every regex. Globally.

    Program WITHOUT $& / $` / $':
        regex match --> record position --> done (fast)

    Program WITH $& / $` / $' anywhere:
        regex match --> record position
                    --> copy prematch substring
                    --> copy match substring
                    --> copy postmatch substring
                    --> done (slow)

One variable used once. Every regex pays the tax forever.
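
If you want to see the tax for yourself, here is a rough timing sketch. The helper name time_matches, the --curse switch, and the string sizes are all made up for illustration. The string eval is there so the compiler only "sees" the match variables when you ask for it. Expect a visible gap only on Perl before 5.20; on newer perls the two runs should look about the same.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: time $rounds successful matches against a ~100 KB string.
sub time_matches {
    my ($rounds) = @_;
    my $text  = ('x' x 50_000) . 'needle' . ('y' x 50_000);
    my $start = time;
    for (1 .. $rounds) {
        $text =~ m~needle~ or die "expected a match";
    }
    return time - $start;
}

# With --curse, compile all three match variables via string eval
# BEFORE timing, so the interpreter has seen them.
if (@ARGV && $ARGV[0] eq '--curse') {
    my $seen = eval q{ 'a' =~ m~a~; $` . $& . $' };
}

printf "2000 matches: %.4f seconds\n", time_matches(2_000);
```

Run it twice, once plain and once with --curse, and compare the two numbers. It has to be two separate processes: once the variables have been seen, there is no way to un-see them.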

Part 3: WHY PERL DOES THIS

Perl cannot know at compile time which regexes will execute before the code that reads $&. So it has to prepare all of them.

Consider:

    sub process {
        my ($text) = @_;
        $text =~ m~important~;    # does this need to set $& ?
        return $text;
    }

    # ... 5000 lines later ...

    $x =~ m~pattern~;
    print $&;    # reads $& from the LAST successful match

Which regex sets $&? It depends on execution order, which is unknowable at compile time. So Perl takes the conservative approach: if $& appears anywhere, compute it everywhere.

This is a global property of the program. The regex engine has one flag: "are match variables in use?" If yes, every regex pays. There is no way to say "only compute $& for this one regex."

At least, not until Perl 5.10.

Part 4: THE ENGLISH MODULE TRAP

The English module gives human-readable names to special variables. It also activates the performance curse.

    use English;    # DANGER

    # These are now available:
    #   $PREMATCH   (alias for $`)
    #   $MATCH      (alias for $&)
    #   $POSTMATCH  (alias for $')

Just loading the English module without the right flag imports the match variable aliases. And importing the aliases counts as "using" them. Your program gets slower just from the use statement, even if you never reference $MATCH in your code.

The fix:

    use English qw(-no_match_vars);    # safe!

The -no_match_vars flag tells English to skip the three dangerous aliases. You still get $INPUT_RECORD_SEPARATOR, $EVAL_ERROR, and all the other friendly names. You just do not get the three that destroy performance.
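
To show that nothing else goes missing, here is a small sketch that leans on a few of the remaining English names. The sample data and variable names are invented for the example:

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# The readable names still work with the flag.
my $ok    = eval { die "boom\n"; 1 };
my $error = $EVAL_ERROR;                    # same variable as $@

my $data = "first;second;";
my $line;
{
    local $INPUT_RECORD_SEPARATOR = ';';    # same variable as $/
    open my $fh, '<', \$data or die "open failed: $OS_ERROR";
    $line = <$fh>;                          # reads up to the first ';'
    close $fh;
}

print "error: $error";
print "first record: $line\n";
```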

If you see use English; without -no_match_vars in a codebase that may run on Perl before 5.20, fix it immediately. On those versions, that bare use English is silently slowing down every regex in the program.

Part 5: THE SAFE ALTERNATIVES (5.10+)

Perl 5.10 introduced three new variables that do the same thing as $&, $`, and $' without the global penalty:

    my $string = "Hello, World!";
    $string =~ m~World~p;    # note the /p flag

    say ${^PREMATCH};     # "Hello, "
    say ${^MATCH};        # "World"
    say ${^POSTMATCH};    # "!"

The /p flag tells the regex engine to compute the match variables for this specific match only. No global flag. No penalty on other regexes. Just this one, right here, pays the cost.

    OLD (global penalty):
        $string =~ m~pattern~;
        say $&;           # every regex in the program is slower

    NEW (per-regex cost):
        $string =~ m~pattern~p;
        say ${^MATCH};    # only THIS regex pays the cost

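On Perl 5.10 through 5.18, ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} are only guaranteed to hold a value after a successful match that used the /p flag. Without /p, do not rely on them. From 5.20 on, /p is ignored and they behave just like $`, $&, and $'. Either way, writing /p is the opt-in behavior that should have existed from the start.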

Part 6: PERL 5.20 FIXED THE CURSE (MOSTLY)

Starting with Perl 5.20, the global penalty for $&, $`, and $' was removed. Perl 5.20 enabled a copy-on-write mechanism by default, so a successful match no longer has to make its own copy of the target string just in case one of the variables gets read later. No global flag. No penalty on other regexes.

    # Perl 5.20+: this is fine now
    $string =~ m~pattern~;
    say $&;    # no global performance penalty

So is the curse dead? Almost. Here is why you should still care:
  1. Legacy code might still run on Perl before 5.20. Many production
     systems are on 5.16 or 5.18.
  2. The /p flag and ${^MATCH} variables are still the explicit,
     intentional way to access match data. They signal "I know what I am doing."
  3. If you write a CPAN module, your code might run on any Perl
     version. Using $& in a CPAN module penalizes every user on Perl before 5.20.
  4. Habits matter. Understanding why $& was dangerous teaches you
     about Perl's internals and the cost of global state.

Part 7: HOW TO CHECK IF YOUR CODE IS AFFECTED

Wondering if your codebase uses the dangerous variables? Grep for them:

    $ grep -rn '\$&\|\$`\|\$'"'" lib/ bin/

That grep is ugly because $' conflicts with shell quoting. An easier approach:

    $ perl -ne 'print "$ARGV:$.: $_" if /\$[&`'"'"']/; close ARGV if eof' lib/*.pm

Or use Perl::Critic, which has a policy specifically for this:

    $ perlcritic --single-policy ProhibitMatchVars lib/

The Perl::Critic policy Variables::ProhibitMatchVars flags any use of $&, $`, $', their English equivalents, or use English without -no_match_vars. It ships with the core Perl::Critic distribution.
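
If Perl::Critic is not installed, the one-liner above can be grown into a small recursive scanner using only core modules. This is a sketch, not a parser: the helper name find_match_vars and the list of file extensions are assumptions. It is a plain text search, so it will also flag comments and strings, and it does not look for the English aliases.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return "file:line: source" for every line under
# @dirs that mentions $&, $` or $'.
sub find_match_vars {
    my (@dirs) = @_;
    my @hits;
    return @hits unless @dirs;

    find({
        no_chdir => 1,    # keep $_ as the full path
        wanted   => sub {
            my $path = $_;
            return unless -f $path && $path =~ m~\.(?:pl|pm|t)\z~;
            open my $fh, '<', $path or return;
            while (my $line = <$fh>) {
                push @hits, sprintf('%s:%d: %s', $path, $., $line)
                    if $line =~ m~\$[&`']~;
            }
            close $fh;
        },
    }, @dirs);

    return @hits;
}

# Scan lib/ and bin/ if they exist in the current directory.
print for find_match_vars(grep { -d } qw(lib bin));
```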

Also check the modules you depend on. On Perl before 5.20, if any CPAN module you load uses $& internally, your program pays the price. One way to check is the CPAN module Devel::SawAmpersand, which reports whether the interpreter has seen any of the three variables:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use feature 'say';

    # load your modules
    use Some::Module;
    use Another::Module;

    # Assumes Devel::SawAmpersand is installed from CPAN. It reads the
    # interpreter's internal flag, which only matters before Perl 5.20.
    use Devel::SawAmpersand qw(sawampersand);

    say sawampersand()
        ? "Match vars penalty active"
        : "No match vars seen";

In practice, just run Perl::Critic. It catches the common cases.

Part 8: THE HISTORICAL PAIN

This is not a theoretical problem. Real CPAN modules shipped with $& in their code. Every program that used those modules got slower. The module authors often had no idea.

The most infamous case was English.pm itself. For years, the standard way to get readable variable names imported the match variable aliases by default. The -no_match_vars fix was added later, but by then, countless tutorials and books had taught use English; without the flag.

Thousands of Perl programs ran slower than they needed to because of a convenience module that was supposed to make code more readable. The irony is exquisite.

The Perl porters eventually fixed the problem at the engine level in 5.20. But the 15-year gap between "we know this is a problem" and "we fixed it in the engine" left a mark on Perl culture. It made the community acutely aware of global side effects and the hidden costs of convenience.

Part 9: WHAT TO USE INSTEAD

If you need the matched text, use a capture group:

    # instead of $&
    $string =~ m~(pattern)~;
    my $matched = $1;

If you need prematch and postmatch, use @- and @+ (match position arrays) with substr:

    $string =~ m~pattern~;
    my $prematch  = substr($string, 0, $-[0]);
    my $match     = substr($string, $-[0], $+[0] - $-[0]);
    my $postmatch = substr($string, $+[0]);

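
One caveat: @- and @+ only describe the last successful match, so check that the match actually succeeded before reading them. A minimal sketch of a reusable wrapper (the name split_around_match is made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: split $string around the first match of $regex.
# Returns (prematch, match, postmatch), or an empty list if nothing matched.
sub split_around_match {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;

    # $-[0] and $+[0] are the start and end offsets of the whole match.
    my ($start, $end) = ($-[0], $+[0]);
    return (
        substr($string, 0, $start),
        substr($string, $start, $end - $start),
        substr($string, $end),
    );
}

my ($pre, $match, $post) = split_around_match("Hello, World!", qr~World~);
print "[$pre] [$match] [$post]\n";    # prints "[Hello, ] [World] [!]"
```

Because it never mentions the three cursed variables, this works the same way on every Perl that has @- and @+.
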
Or on Perl 5.10+, use the safe variables:

    $string =~ m~pattern~p;
    my $prematch  = ${^PREMATCH};
    my $match     = ${^MATCH};
    my $postmatch = ${^POSTMATCH};

Or on Perl 5.20+, just use $& without guilt. The penalty is gone. But add a comment so future maintainers do not panic:

    $string =~ m~pattern~;
    my $match = $&;    # safe on 5.20+, no global penalty
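
A comment is easy to ignore. One way to make the assumption enforceable, sketched here with a made-up sample string, is to declare the minimum version in the same file, so an older perl refuses to compile it instead of quietly paying the penalty:

```perl
use v5.20;       # compilation stops here on older perls; also enables strict and say
use warnings;

my $string = "Hello, World!";
my $match;
if ($string =~ m~World~) {
    $match = $&;    # no global penalty on 5.20+
}
say $match;         # prints "World"
```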

Part 10: LESSONS FROM THE CURSE

        .--.
       |o_o |     "Three little variables.
       |:_/ |      Fifteen years of pain.
      //   \ \     One -no_match_vars flag."
     (|     | )
    /'\_   _/`\
    \___)=(___/

The match variable curse is a story about the cost of global state. Three variables that look harmless. A design decision made in Perl's early days. A performance cliff that was invisible unless you knew to look for it.

The technical problem is mostly solved in modern Perl. Perl 5.20 removed the global penalty. The /p flag exists for explicit opt-in. Perl::Critic catches the old patterns.

But the lesson is timeless. Global side effects are dangerous. Convenience features can have hidden costs. And sometimes the most harmful code in your program is a single variable you used once, in one place, that nobody thought to question.

Know the history. Use the safe alternatives. And if you see use English; without -no_match_vars, add the flag. Your regex engine will thank you.
