<!-- category: hidden-gems -->
$|++ Autoflush via Increment
You are printing progress dots to the terminal. One dot per processed record. Thousands of records. But the dots do not appear one at a time. They appear in bursts. Fifty at once. Then a pause. Then fifty more.
Your code is fine. Your output buffering is not.
```perl
$|++;
```
One line. Three characters. Every print to STDOUT now appears
immediately. No buffering. No batching. Each dot hits the terminal
the instant you print it.
But wait. $| is a variable. You just incremented it. What kind
of variable fixes buffering when you add 1 to it? Welcome to the
weirdest boolean toggle in Perl.
Part 1: WHAT $| DOES
The special variable $| controls output autoflush for the
currently selected filehandle (usually STDOUT). When it is 0
(the default), output is buffered. When it is nonzero, output is
flushed after every write operation.
```perl
$| = 0;   # buffered (default)
$| = 1;   # autoflush on
```
With buffering on, Perl collects your print output in an internal
buffer and writes it to the operating system in efficient chunks.
Usually 4K or 8K at a time. This is faster for bulk output because
fewer system calls happen.
With autoflush on, every print triggers an immediate write. Slower
for bulk output. But essential when you need output to appear in
real time.
```
Buffered ($| = 0):
  print "."  -->  [buffer: .]
  print "."  -->  [buffer: ..]
  print "."  -->  [buffer: ...]
  ...
  print "."  -->  [buffer: .....(4096 chars)]  -->  WRITE to terminal

Autoflush ($| = 1):
  print "."  -->  WRITE to terminal
  print "."  -->  WRITE to terminal
  print "."  -->  WRITE to terminal
```
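To make the buffer visible, here is a minimal sketch that writes to a scratch file and compares the size the operating system reports before and after an explicit flush. The file name demo.txt and the helper size_on_disk are made up for this illustration, and it assumes the handle comes from a plain open, so it starts out buffered.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Handle;

# Scratch file in a temporary directory (the name demo.txt is arbitrary).
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.txt";
open my $fh, '>', $path or die "Cannot open $path: $!";

# Hypothetical helper: size of a file as the operating system sees it now.
sub size_on_disk {
    my ($file) = @_;
    return (stat $file)[7];
}

print {$fh} "hello";
my $before = size_on_disk($path);   # the bytes are still in Perl's buffer

$fh->flush;                         # hand the buffer to the OS
my $after = size_on_disk($path);

print "before flush: $before bytes, after flush: $after bytes\n";
close $fh;
```

Five bytes is far below the buffer size, so nothing reaches the file until the flush. With autoflush on, the first size check would already see them.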
Part 2: WHY $|++ WORKS
Here is the trick. $| is a boolean magic variable. It can only
be 0 or 1. Perl enforces this.
When you do $|++, you are incrementing it from 0 to 1. That turns
autoflush on. Simple enough.
But what if you do $|++ again? Normal math says 1 + 1 = 2. But
$| is magic. It coerces any nonzero value to 1. So $|++ on a
value of 1 produces... 1.
```perl
use feature 'say';

say $|;   # 0 (default)
$|++;
say $|;   # 1 (incremented)
$|++;
say $|;   # 1 (still 1, not 2)
$|++;
say $|;   # 1 (always 1)
```
This means $|++ is idempotent. Call it once, call it ten times,
the result is the same. Autoflush is on. It is a toggle that only
goes one way.
The ++ is not really "increment." It is "turn on, and stay on
no matter how many times you call it." A latch, not a counter.
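The clamping is not specific to ++. As a small sketch, the snippet below assigns an arbitrary nonzero value and reads it back. It uses local so the experiment does not leak into the rest of the program, and the array name @seen is only for illustration.

```perl
use strict;
use warnings;

# Record what $| reports after each change.
my @seen;
{
    local $| = 0;                # known starting point, restored on scope exit
    $| = 42;  push @seen, $|;    # any nonzero value reads back as 1
    $|++;     push @seen, $|;    # incrementing again changes nothing
    $| = 0;   push @seen, $|;    # only an explicit 0 switches it off
}
print "@seen\n";   # prints "1 1 0"
```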
Part 3: $|++ VS $| = 1
They do the same thing. The difference is cultural.
```perl
$| = 1;   # explicit: "set autoflush to true"
$|++;     # idiomatic: "turn on autoflush"
```
$| = 1 is clearer to beginners. It says exactly what it means.
$|++ is a Perl idiom. You see it in one-liners, in old-school
scripts, in code written by people who have been writing Perl since
the early 1990s.
Neither is wrong. But $|++ has a certain charm. It looks like you
are doing something sneaky, but you are really just flipping a
switch. Peak Perl.
Some people prefer $|-- as a joke. Decrementing from 0 gives -1, and
$| treats -1 as truthy, so it also turns autoflush on. But
$|-- on 1 gives 0 and turns it off. So $|-- flips the state on every
call, which makes it unreliable as an on switch. Stick with $|++ or $| = 1.
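A quick way to watch $|-- misbehave is to apply it several times in a row and record the state after each call. This is only a sketch. It starts from 0 under local so the result does not depend on what the surrounding program did earlier.

```perl
use strict;
use warnings;

my @states;
{
    local $| = 0;            # known starting point, restored afterwards
    for (1 .. 4) {
        $|--;                # 0 becomes -1 (clamped to 1), 1 becomes 0
        push @states, $|;
    }
}
print "@states\n";   # prints "1 0 1 0"
```

Whether autoflush ends up on depends on how many times the line ran, which is exactly what you do not want from a switch.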
Part 4: WHY YOU NEED AUTOFLUSH
Buffering is invisible until it bites you. Here are the situations where it hurts:
Progress indicators:
```perl
#!/usr/bin/env perl
use strict;
use warnings;

$|++;  # without this, dots appear in bursts

for my $i (1 .. 1000) {
    print ".";
    do_slow_thing($i);
}
print "\n";

# placeholder for the real per-record work
sub do_slow_thing { sleep 1 }
```
Piped output:
```
$ perl generate.pl | tee output.log
```
Without autoflush, the pipe receives data in big chunks. If
generate.pl is killed or crashes hard halfway through, the last
buffer of output may be lost. It was sitting in Perl's buffer,
never flushed to the pipe.
CGI scripts:
Old-school CGI sends HTTP headers followed by content. If the
headers are buffered and the content arrives in the same buffer
flush, the web server might get confused about where headers end.
$|++ at the top of a CGI script is classic defensive coding.
Real-time logging:
```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX qw(strftime);

$|++;  # flush every log line immediately

# wait_for_event() stands in for whatever produces your events
while (my $event = wait_for_event()) {
    my $ts = strftime("%Y-%m-%d %H:%M:%S", localtime);
    print "[$ts] $event\n";
}
```
If the process dies, you want every log line that was printed to actually be in the log file. Not sitting in a buffer that died with the process.
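The lost-buffer scenario can be reproduced in a few lines. The following is a sketch, assuming a Unix-like system with fork and signals. It starts a child that prints one line into a pipe and is then killed with SIGKILL, once with autoflush on and once with it off. The helper name output_before_kill is invented for the demo.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical demo helper: fork a child that prints one line into a pipe
# and is then killed before it can exit cleanly. Returns whatever the
# parent managed to read from the pipe.
sub output_before_kill {
    my ($autoflush) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                  # child process
        close $reader;
        select $writer;               # $| now refers to the pipe
        $| = $autoflush;
        print "record 1 done\n";      # a pipe is not line-buffered
        kill 'KILL', $$;              # simulate a hard crash
        POSIX::_exit(1);              # fallback; also skips buffer flushing
    }

    close $writer;                    # parent keeps only the read end
    my $got = do { local $/; <$reader> };
    close $reader;
    waitpid $pid, 0;
    return defined $got ? $got : '';
}

my $with    = output_before_kill(1);
my $without = output_before_kill(0);
printf "autoflush on:  %d bytes reached the pipe\n", length $with;
printf "autoflush off: %d bytes reached the pipe\n", length $without;
```

With autoflush on, the line should arrive intact. With it off, the parent should read nothing, because the text never left the child's buffer.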
Part 5: STDOUT VS OTHER FILEHANDLES
Here is the gotcha that catches people. $| applies to the
currently selected filehandle. Not to all filehandles. Not
even necessarily to STDOUT.
```perl
$|++;  # sets autoflush on the currently selected filehandle
       # (usually STDOUT)
```
If someone called select() earlier to change the default output
filehandle, $|++ affects that filehandle instead:
```perl
open my $log, '>', 'app.log' or die $!;
select $log;       # $log is now the default output filehandle
$|++;              # this sets autoflush on $log, NOT STDOUT

print "goes to log\n";          # autoflushed
print STDOUT "goes to term\n";  # NOT autoflushed!
```
This is a real trap. The select function changes which filehandle
gets the $| treatment. If you are not sure what select is set
to, be explicit:
```perl
# explicit: set STDOUT autoflush regardless of select()
my $old = select STDOUT;
$| = 1;
select $old;
```
Or use the old-school one-liner idiom:
```perl
select((select(STDOUT), $| = 1)[0]);
```
That selects STDOUT, sets $|, and re-selects whatever was selected
before. All in one expression. Beautiful and completely unreadable.
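If the save-and-restore dance shows up more than once, it can be wrapped in a pair of small subs. This is a sketch, not an established API: the names enable_autoflush and autoflush_state are invented here, and the in-memory filehandle is only a convenient stand-in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: turn autoflush on for one handle without
# disturbing whichever handle is currently selected.
sub enable_autoflush {
    my ($fh) = @_;
    my $previous = select $fh;   # select returns the previously selected handle
    $| = 1;
    select $previous;
    return;
}

# Hypothetical helper: report a handle's autoflush state (0 or 1).
sub autoflush_state {
    my ($fh) = @_;
    my $previous = select $fh;
    my $state = $|;
    select $previous;
    return $state;
}

# In-memory handle used as a stand-in for a log file.
open my $log, '>', \my $sink or die "Cannot open in-memory handle: $!";

my $state_before = autoflush_state($log);
enable_autoflush($log);
my $state_after = autoflush_state($log);

print "log handle autoflush: $state_before -> $state_after\n";
```

Both subs leave the selected handle as they found it, so a later plain print still goes wherever it went before.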
Part 6: IO::HANDLE TO THE RESCUE
Modern Perl offers a cleaner way. The IO::Handle module adds
an autoflush method to filehandles:
```perl
use IO::Handle;

STDOUT->autoflush(1);
STDERR->autoflush(1);

open my $log, '>', 'app.log' or die $!;
$log->autoflush(1);
```
No select shenanigans. No magic variables. Just call a method
on the filehandle you want. Each filehandle is independent.
IO::Handle ships with core Perl. No CPAN install needed. This is
the recommended approach in modern code. It is explicit, readable,
and works on any filehandle without side effects.
```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use IO::Handle;

# autoflush both standard streams
STDOUT->autoflush(1);
STDERR->autoflush(1);

# now both streams flush immediately
say "this appears right away";
warn "so does this";
```
Part 7: REAL-TIME LOGGING EXAMPLE
A practical logger that writes to both terminal and file, with autoflush on both:
```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use IO::Handle;

open my $logfile, '>>', 'process.log' or die "Cannot open log: $!";
$logfile->autoflush(1);
STDOUT->autoflush(1);

sub log_msg {
    my ($msg) = @_;
    my $ts = localtime();
    my $line = "[$ts] $msg";
    say $line;              # terminal
    say $logfile $line;     # file
}

log_msg("Starting process");

for my $i (1 .. 100) {
    log_msg("Processing item $i");
    sleep 1;  # simulate work
}

log_msg("Done");
close $logfile;
```
Both outputs are autoflushed. If the process is killed at item 50, the log file contains every line written up to that point. Nothing lost in a buffer. The terminal shows each line as it happens. No burst behavior.
Part 8: THE PER-FILEHANDLE GOTCHA
Every filehandle has its own buffering state. Setting autoflush on STDOUT does not affect STDERR, and vice versa. Each one is independent.
```perl
$|++;  # STDOUT autoflush only
       # STDERR is not touched (though in practice,
       # STDERR is usually unbuffered by default on most systems)
```
STDERR is typically unbuffered by default on Unix systems. That is why error messages usually appear immediately even without autoflush. But STDOUT is line-buffered when connected to a terminal and fully-buffered when connected to a pipe or file.
```
STDOUT to terminal:  line-buffered   (flush on \n)
STDOUT to pipe:      fully-buffered  (flush on buffer full)
STDOUT to file:      fully-buffered  (flush on buffer full)
STDERR:              unbuffered      (flush on every write)
```
This is why progress dots without newlines get stuck in a terminal. STDOUT is line-buffered, so it only flushes on \n. A dot has no
newline, so it sits in the buffer.
$|++ switches STDOUT from line-buffered (or fully-buffered) to
unbuffered. Every write, no matter how small, gets flushed
immediately.
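Since the default depends on what the handle is attached to, one possible approach is to switch autoflush on only when a terminal is on the other end and to leave pipes and files fully buffered. The sketch below assumes that policy suits your script. The helper name autoflush_if_interactive is invented for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: enable autoflush only when the handle is a terminal.
# Returns 1 if autoflush was switched on, 0 otherwise.
sub autoflush_if_interactive {
    my ($fh) = @_;
    return 0 unless -t $fh;    # -t tests whether the handle is attached to a tty
    $fh->autoflush(1);
    return 1;
}

my $enabled = autoflush_if_interactive(\*STDOUT);
print $enabled
    ? "terminal: autoflush on\n"
    : "pipe or file: buffering left alone\n";
```

Run it directly and then through a pipe (for example into cat) to see both branches.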
Part 9: PERFORMANCE IMPLICATIONS
Autoflush is not free. Every print becomes a system call. If you
are writing millions of lines to a file, autoflush will be
dramatically slower than buffered I/O:
```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Handle;   # makes sure autoflush() is available on older perls
use Time::HiRes qw(gettimeofday tv_interval);

open my $fh, '>', '/dev/null' or die $!;

# buffered
my $t0 = [gettimeofday];
for (1 .. 100_000) {
    print $fh "line of data\n";
}
my $buffered = tv_interval($t0);

# autoflushed
$fh->autoflush(1);
$t0 = [gettimeofday];
for (1 .. 100_000) {
    print $fh "line of data\n";
}
my $flushed = tv_interval($t0);

printf "Buffered: %.3fs\n", $buffered;
printf "Flushed:  %.3fs\n", $flushed;
```
The flushed version will typically be 5x to 20x slower. Do not slap
$|++ on every script just because you can. Use it when
you need real-time output. Leave buffering on when throughput
matters.
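There is also a middle ground: keep buffering on and call flush yourself at the points where visibility matters. The sketch below prints a dot per record but flushes only once per batch. The helper print_progress and its batch-size parameter are assumptions made for this example.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: print one dot per record to $fh, flushing by hand
# after every $every records instead of after every single print.
# Returns the number of explicit flushes performed.
sub print_progress {
    my ($fh, $total, $every) = @_;
    my $flushes = 0;
    for my $i (1 .. $total) {
        print {$fh} ".";
        next if $i % $every;      # not at a batch boundary yet
        $fh->flush;
        $flushes++;
    }
    print {$fh} "\n";
    $fh->flush;                   # final flush covers any partial batch
    return $flushes + 1;
}

my $count = print_progress(\*STDOUT, 50, 10);
print "flushed $count times for 50 dots\n";
```

The user still sees steady progress, but the number of write calls is bounded by the batch size you pick rather than by the number of records.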
Part 10: THE FULL PICTURE
```
    .--.
   |o_o |     "$|++
   |:_/ |      Three characters.
  //   \ \     Zero buffering."
 (|     | )
/'\_   _/`\
\___)=(___/
```
$|++ is one of those Perl idioms that looks like line noise to
outsiders and poetry to Perl programmers. It exploits the fact
that a magic boolean variable silently clamps itself to 0 or 1,
so incrementing it is the same as setting it to true.
For new code, use the autoflush(1) method from IO::Handle. It is clearer, it
works on any filehandle, and it does not depend on select.
For one-liners and quick scripts, $|++ is still unbeatable.
Three characters that solve the "why is my output stuck in a
buffer" problem. No import. No method call. Just a variable
increment that happens to do exactly what you need.
Buffering exists for performance. Autoflush exists for sanity.
Know when to use each one. And when your progress dots come out
in bursts, $|++ is the answer.
perl.gg