<!-- category: hidden-gems -->
eof vs eof() - Parentheses Change Everything
Two function calls. Same name. One pair of parentheses apart. Completely different semantics.

    eof      # is the CURRENT file done?
    eof()    # is ALL input done?

Without parentheses, eof tests whether the most recently read filehandle
has been exhausted. With empty parentheses, eof() tests the
pseudo-file that <> builds from the files in @ARGV, which in
practice means "is everything done, including all remaining
files in @ARGV?"
Two characters of punctuation. Total semantic flip. Welcome to Perl.
Part 1: WHAT EOF (NO PARENS) TESTS
The bare eof without parentheses tests the most recently read
filehandle:

    open my $fh, '<', 'data.txt' or die "Can't open: $!\n";
    while (<$fh>) {
        chomp;
        say $_;
        if (eof) {
            say "--- end of data.txt ---";
        }
    }

After the last line is read from $fh, eof returns true. It
checks the filehandle that <$fh> just read from. Simple enough.
When used inside a while (<ARGV>) or while (<>) loop, eof
without parens tests whether the current file in the argument
list is done. Not all files. Just the current one.

    # called as: perl script.pl file1.txt file2.txt file3.txt
    while (<>) {
        chomp;
        say $_;
        if (eof) {
            say "--- end of $ARGV ---";
        }
    }

Output:

    line1 from file1
    line2 from file1
    --- end of file1.txt ---
    line1 from file2
    --- end of file2.txt ---
    line1 from file3
    line2 from file3
    line3 from file3
    --- end of file3.txt ---

The eof fires at the end of each file, then <> automatically
opens the next one and keeps going. The $ARGV variable holds
the name of the current file being processed.
Part 2: WHAT EOF() (WITH PARENS) TESTS
Empty parentheses change the question entirely. eof() asks:
"Has the last filehandle read reached the end AND are there no
more files to process?"

    # called as: perl script.pl file1.txt file2.txt file3.txt
    while (<>) {
        chomp;
        say $_;
        if (eof()) {
            say "=== ALL INPUT EXHAUSTED ===";
        }
    }

Output:

    line1 from file1
    line2 from file1
    line1 from file2
    line1 from file3
    line2 from file3
    line3 from file3
    === ALL INPUT EXHAUSTED ===

The message only prints once, after the very last line of the very last file. The intermediate file boundaries are invisible.
Here is the distinction side by side:
                        file1.txt     file2.txt     file3.txt
                        ---------     ---------     ---------
    eof   (no parens)   fires HERE    fires HERE    fires HERE
    eof() (parens)                                  fires HERE
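
To try the distinction without preparing files by hand, here is a minimal sketch that writes three throwaway files and counts how often each form fires. The helpers make_files and count_eof_events, and the file names and contents, are invented for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write sample files into a temp directory
# and return their paths in name order.
sub make_files {
    my %content = @_;
    my $dir = tempdir(CLEANUP => 1);
    my @paths;
    for my $name (sort keys %content) {
        my $path = "$dir/$name";
        open my $out, '>', $path or die "Can't write $path: $!\n";
        print {$out} $content{$name};
        close $out;
        push @paths, $path;
    }
    return @paths;
}

# Hypothetical helper: count how often each form of eof is true
# while <> reads the given files.
sub count_eof_events {
    local @ARGV = @_;
    my ($per_file, $all_input) = (0, 0);
    while (<>) {
        $per_file++  if eof;      # end of the current file
        $all_input++ if eof();    # end of the very last file only
    }
    return ($per_file, $all_input);
}

my @files = make_files(
    'file1.txt' => "a\nb\n",
    'file2.txt' => "c\n",
    'file3.txt' => "d\ne\nf\n",
);
my ($per_file, $all_input) = count_eof_events(@files);
print "eof fired $per_file times, eof() fired $all_input time(s)\n";
```

With three files, eof should fire three times and eof() once. Pass a single file and both counts come out as one.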
Part 3: THE DIAMOND OPERATOR AND MULTIPLE FILES
The diamond operator <> reads from files listed in @ARGV, or
from STDIN if @ARGV is empty. It is the backbone of Unix-style
filter programs.

    # process any number of input files
    while (<>) {
        # $_    has the current line
        # $ARGV has the current filename
        # $.    has the line number
    }

Here is where the eof distinction matters most. The line counter
$. does NOT reset between files unless you tell it to:

    while (<>) {
        # $. counts continuously across files
        # file1 line 1: $. = 1
        # file1 line 2: $. = 2
        # file2 line 1: $. = 3  (not 1!)
    }

To reset it, you close ARGV at the end of each file:

    while (<>) {
        chomp;
        printf "%s:%d: %s\n", $ARGV, $., $_;
        close ARGV if eof;    # reset $. for next file
    }

Notice: eof (no parens). We want to detect the end of each
individual file so we can reset the counter. If we used eof()
instead, the counter would only reset after the very last file,
which is useless.
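
The effect on $. is easy to check for yourself. This is a small sketch, assuming two made-up two-line files; line_numbers is a hypothetical helper that records the value of $. for every line, with and without the close ARGV reset.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Two throwaway files; names and contents are invented for this demo.
my $dir = tempdir(CLEANUP => 1);
my %content = ('one.txt' => "a\nb\n", 'two.txt' => "c\nd\n");
for my $name (keys %content) {
    open my $out, '>', "$dir/$name" or die "Can't write $name: $!\n";
    print {$out} $content{$name};
    close $out;
}

# Return the value of $. seen for each line read through <>.
sub line_numbers {
    my ($reset_per_file, @files) = @_;
    local @ARGV = @files;
    my @seen;
    while (<>) {
        push @seen, $.;
        close ARGV if $reset_per_file && eof;   # eof, not eof()
    }
    return @seen;
}

my @files = ("$dir/one.txt", "$dir/two.txt");
print "continuous: @{[ line_numbers(0, @files) ]}\n";   # 1 2 3 4
print "per file:   @{[ line_numbers(1, @files) ]}\n";   # 1 2 1 2
```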
Part 4: PRINTING SEPARATORS BETWEEN FILES
The classic use case. Print a separator between files but not at the end:

    while (<>) {
        print;
        if (eof) {
            print "---\n" unless eof();
        }
    }

Read that carefully. eof (no parens) fires at the end of
each file. eof() (with parens) fires only at the end of all
input. So the separator prints between files but not after the
last one.

    Content of file1.txt
    ---
    Content of file2.txt
    ---
    Content of file3.txt

No trailing separator. Clean output. Two eof calls, different meanings, working together.
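
The same pattern can be wrapped in a function that returns a string, which makes it easy to check. A minimal sketch: join_files is a hypothetical name, and the three one-line sample files are created on the fly.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Join the given files into one string, with "---" between files only.
sub join_files {
    local @ARGV = @_;
    my $out = '';
    while (<>) {
        $out .= $_;
        if (eof) {                         # end of the current file...
            $out .= "---\n" unless eof();  # ...but not of the last one
        }
    }
    return $out;
}

# Throwaway sample files; names and contents are invented for the demo.
my $dir = tempdir(CLEANUP => 1);
my @files;
for my $n (1 .. 3) {
    my $path = "$dir/file$n.txt";
    open my $fh, '>', $path or die "Can't write $path: $!\n";
    print {$fh} "Content of file$n.txt\n";
    close $fh;
    push @files, $path;
}

print join_files(@files);
```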
Here is a more elaborate version that adds file headers:

    my $first_file = 1;
    while (<>) {
        if ($. == 1 || $first_file) {
            print "\n" unless $first_file;
            print "=== $ARGV ===\n";
            $first_file = 0;
        }
        print " $_";
        close ARGV if eof;    # reset $. for next file
    }

The close ARGV if eof resets $. to 0 at the end of each file,
so $. == 1 catches the first line of every file.
Part 5: EOF(FILEHANDLE)
You can also pass a specific filehandle to eof:

    open my $fh, '<', 'data.txt' or die "Can't open: $!\n";
    until (eof($fh)) {
        my $line = <$fh>;
        chomp $line;
        process($line);
    }
    close $fh;

This is the most explicit form. No ambiguity about which handle is being tested. Use this when you have multiple filehandles open and need to check a specific one:

    open my $in,  '<', 'input.txt'     or die $!;
    open my $ref, '<', 'reference.txt' or die $!;
    while (!eof($in) && !eof($ref)) {
        my $line_in  = <$in>;
        my $line_ref = <$ref>;
        chomp($line_in, $line_ref);
        if ($line_in ne $line_ref) {
            say "DIFF at line $.: '$line_in' vs '$line_ref'";
        }
    }
    say "input.txt has extra lines"     unless eof($in);
    say "reference.txt has extra lines" unless eof($ref);

A quick diff tool in 15 lines. The eof($handle) checks let
you handle files of different lengths gracefully.
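
For experimenting without files on disk, the same lockstep comparison works on in-memory filehandles. This is only a sketch: compare_handles is a hypothetical helper, and the two strings stand in for input.txt and reference.txt.

```perl
use strict;
use warnings;

# Compare two line sources in lockstep. Returns the line numbers that
# differ and which side (if any) still has lines left afterwards.
sub compare_handles {
    my ($left, $right) = @_;
    my @diff;
    my $n = 0;
    while (!eof($left) && !eof($right)) {
        my $l = <$left>;
        my $r = <$right>;
        $n++;
        chomp($l, $r);
        push @diff, $n if $l ne $r;
    }
    my $extra = !eof($left)  ? 'left'
              : !eof($right) ? 'right'
              :                'none';
    return (\@diff, $extra);
}

# In-memory filehandles stand in for the two files.
my $input     = "alpha\nbeta\ngamma\n";
my $reference = "alpha\nBETA\n";
open my $in,  '<', \$input     or die "Can't open in-memory input: $!\n";
open my $ref, '<', \$reference or die "Can't open in-memory reference: $!\n";

my ($diff, $extra) = compare_handles($in, $ref);
print "differing lines: @$diff; extra lines on: $extra\n";
```

Here line 2 differs and the left side has one line left over, so the loop stops as soon as either handle runs dry and the trailing eof checks report which one did not.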
Part 6: THE SUBTLE BUG
Here is the bug that gets people. You are processing multiple files and want to print a summary after each one:

    # BUGGY: uses eof() instead of eof
    my $count = 0;
    while (<>) {
        $count++;
        if (eof()) {    # WRONG - this means "all input done"
            say "$ARGV: $count lines";
            $count = 0;
        }
    }

With eof(), the summary only prints after the very last file.
Every other file's count gets rolled into the next. You end up
with one summary line instead of one per file.

The fix is obvious once you know it:

    # CORRECT: uses eof (no parens)
    my $count = 0;
    while (<>) {
        $count++;
        if (eof) {    # current file done
            say "$ARGV: $count lines";
            $count = 0;
        }
    }

Now each file gets its own count. The output looks right:

    file1.txt: 42 lines
    file2.txt: 17 lines
    file3.txt: 103 lines

The difference between "right" and "wrong" is two characters of punctuation. Perl loves this kind of joke.
Part 7: ARGV PROCESSING PATTERNS
Several common multi-file processing patterns rely on the eof distinction.

Numbering lines per file:

    while (<>) {
        printf "%4d: %s", $., $_;
        close ARGV if eof;    # reset $. per file
    }

Adding a filename header only when switching files:

    my $current_file = '';
    while (<>) {
        if ($ARGV ne $current_file) {
            say "\n--- $ARGV ---" if $current_file;
            say "--- $ARGV ---" unless $current_file;
            $current_file = $ARGV;
        }
        print;
    }

Collecting all content per file into a hash:

    my %files;
    my $current = '';
    while (<>) {
        $current = $ARGV;
        $files{$current} .= $_;
    }
    for my $name (sort keys %files) {
        my $lines = () = $files{$name} =~ m~\n~g;
        say "$name: $lines lines, " . length($files{$name}) . " bytes";
    }
Part 8: STDIN AND EOF
When reading from STDIN (no files in @ARGV), both eof and
eof() mean the same thing, because there is only one input
stream:

    # echo "hello" | perl -E 'while(<>){ print; say "eof" if eof }'
    # echo "hello" | perl -E 'while(<>){ print; say "eof" if eof() }'
    # both produce the same output (-E rather than -e, so that say is enabled)

The distinction only matters when @ARGV has multiple files. With
a single input stream, "end of current file" and "end of all
input" are identical.
You can also test STDIN explicitly:

    if (eof(STDIN)) {
        say "No input on STDIN";
    } else {
        while (<STDIN>) {
            process($_);
        }
    }

This is useful for programs that accept optional piped input:

    if (-t STDIN) {
        # STDIN is a terminal, not a pipe
        say "Usage: cat data.txt | $0";
        exit 1;
    }
    while (<STDIN>) {
        chomp;
        say uc($_);
    }

The -t test checks if STDIN is a terminal (interactive).
Combine it with eof(STDIN) for belt-and-suspenders input
validation. Run the -t check first: eof(STDIN) has to attempt a
read, so on an interactive terminal it blocks until input arrives.
Part 9: A COMPLETE MULTI-FILE TOOL
Pulling it all together. A file concatenation tool with headers, line numbers, and a summary:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use feature 'say';

    die "Usage: $0 file1 [file2 ...]\n" unless @ARGV;

    my $total_lines = 0;
    my $total_files = 0;
    my $file_lines  = 0;

    while (<>) {
        # print header at start of each file
        if ($file_lines == 0) {
            $total_files++;
            say "" if $total_files > 1;    # blank line between files
            say "=== $ARGV ===";
        }
        $file_lines++;
        $total_lines++;
        printf "%4d | %s", $file_lines, $_;

        if (eof) {
            say "--- $file_lines lines ---";
            $file_lines = 0;               # the next line starts a new file

            # print final summary only after the last file
            if (eof()) {
                say "";
                say "Total: $total_files files, $total_lines lines";
            }
        }
    }

Both forms of eof in one program, each doing exactly what it
is supposed to. eof detects the end of each file for per-file
summaries. eof() detects the end of all input for the grand
total.

The loop keeps its own per-file counter instead of closing ARGV
to reset $. here. At a file boundary, eof() has already opened
the next file to see whether more input exists, so a close ARGV
placed after it would throw that file away.
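
One side effect is worth seeing in isolation. To answer "is all input done?", eof() has to look past the current file, so at a file boundary it opens the next one and $ARGV moves on. The sketch below records $ARGV before and after the call; the helper argv_around_eof and the two sample files are invented for this illustration.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Throwaway one-line sample files; the names are invented for this demo.
my $dir   = tempdir(CLEANUP => 1);
my @files = map { "$dir/$_" } qw(first.txt second.txt);
for my $path (@files) {
    open my $out, '>', $path or die "Can't write $path: $!\n";
    print {$out} "only line\n";
    close $out;
}

# At every end of file, record $ARGV before and after asking eof().
sub argv_around_eof {
    local @ARGV = @_;
    my @report;
    while (<>) {
        next unless eof;                  # only file boundaries matter here
        my $before   = basename($ARGV);
        my $all_done = eof() ? 1 : 0;     # may open the next file
        push @report, [$before, basename($ARGV), $all_done];
    }
    return @report;
}

for my $row (argv_around_eof(@files)) {
    printf "before: %-10s  after: %-10s  all done: %d\n", @$row;
}
```

The first row should show first.txt before the call and second.txt after it. This is why mixing eof() with close ARGV needs care: a close ARGV issued right after eof() returns false closes the file that was just opened, and anything that prints $ARGV after the call names the next file, not the one that just ended.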
Part 10: THE DECISION TABLE
When in doubt, use this:

    WHAT YOU WANT TO KNOW         USE THIS
    --------------------------    ----------
    Current file exhausted?       eof
    All input exhausted?          eof()
    Specific handle exhausted?    eof($fh)

               eof vs eof()

      no parens: "this file done?"
      parens:    "everything done?"

          eof          eof()
           |             |
        current         ALL
         file          input
           |             |
         fires         fires
         often          once

        .--.
       |o_o |    "Two parens.
       |:_/ |     Total semantic flip."
      //   \ \
     (|     | )
    /'\_   _/`\
    \___)=(___/

It is one of those Perl things that looks like a bug in the language until you understand why it exists. The Unix filter model processes multiple files as a single stream. You need a way to ask both "is this file done?" and "is the stream done?"
Perl gives you both. Same function name, different calling convention. The parentheses are the difference between a per-file event and a global event.
Confusing? Sure, for about ten minutes. After that, you will never mix them up again. And the first time you write a clean multi-file processor that handles separators, resets line numbers, and prints a summary at the end, you will appreciate having both forms available.
Two bytes of punctuation. Two different questions. Perl does not waste syntax.
perl.gg