<!-- category: snippets -->
Tie::File - When Your Array IS the File
Open a file. Read it into an array. Modify the array. Write it back. Close the file. Four steps of boilerplate for something that should be one thought: "change line 5."

Tie::File makes the file an array. Literally.

    use Tie::File;

    tie my @lines, 'Tie::File', 'config.txt' or die $!;

    $lines[4] = 'new_setting = true';   # change line 5 on disk
    print $lines[0];                    # read line 1 from disk

    untie @lines;

No slurping the whole file into memory. No writing it back out. You index into the array, and Tie::File reads from (or writes to) the actual file on disk. Line 0 is line 1 of the file. Line 99 is line 100. Array operations become file operations.
It ships with Perl. Core module since 5.7.3. Nothing to install.
Part 1: THE BASICS
Tying is straightforward. You give it an array, a filename, and you are done:

    use Tie::File;
    use Fcntl qw(O_RDONLY);

    # /etc/hosts is usually not writable, so tie it read-only;
    # the default mode opens the file for reading and writing
    tie my @lines, 'Tie::File', '/etc/hosts', mode => O_RDONLY
        or die "Cannot tie: $!";

    # read the first line
    print "First line: $lines[0]\n";

    # how many lines?
    print "Total lines: ", scalar @lines, "\n";

    # read the last line
    print "Last line: $lines[-1]\n";

    untie @lines;

Every array access goes through Tie::File's magic. When you read $lines[0], it seeks to the beginning of the file and reads the first line. When you read $lines[-1], it finds the last line.
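No array of the whole file sits in memory. Tie::File keeps a bounded read cache (about 2 MB by default) and seeks to the records you ask for. For a 10 GB log file, the cached line data stays at a few megabytes. The table of line offsets is separate and grows with the number of lines Tie::File has scanned, and asking for scalar @lines or $lines[-1] makes it scan to the end of the file once.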
Part 2: MODIFYING LINES
Assignment to an array element rewrites that line in the file:

    use Tie::File;

    tie my @lines, 'Tie::File', 'app.conf' or die $!;

    # change line 3 (index 2)
    $lines[2] = 'debug = 1';

    # append to the end of the file
    push @lines, 'new_option = value';

    # delete the last line
    pop @lines;

    untie @lines;

The file is modified immediately. No separate write step. When you assign to $lines[2], Tie::File rewrites line 3 on disk right then and there. (Technically it may buffer a bit, but the point is you do not have to flush or save manually.)
The file grows and shrinks as you push, pop, splice, and delete. Just like a regular array, except the storage is a file.
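To see that the storage really is the file, here is a minimal sketch that modifies a tied array and then reads the file back with a plain open. The scratch file comes from File::Temp and exists only for this demonstration; the sample line contents are made up.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical scratch file, created only for this demonstration.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\ngamma\n";
close $fh or die "Cannot close $path: $!";

tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";

$lines[1] = 'BETA';      # rewrite line 2
push @lines, 'delta';    # append a line at the end
shift @lines;            # drop line 1
untie @lines;

# Read the file back the ordinary way to see what is on disk.
open my $in, '<', $path or die "Cannot open $path: $!";
my $on_disk = do { local $/; <$in> };
close $in;

print $on_disk;
```

On a system where the record separator is "\n", the file should now hold BETA, gamma, and delta, one per line.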
Part 3: SPLICE, THE POWER MOVE
splice on a tied array is where things get truly powerful. Insert lines, delete ranges, replace blocks:

    use Tie::File;

    tie my @lines, 'Tie::File', 'data.txt' or die $!;

    # delete lines 5-9 (indices 4..8)
    splice @lines, 4, 5;

    # insert three lines before line 3 (index 2)
    splice @lines, 2, 0,
        'inserted line A',
        'inserted line B',
        'inserted line C';

    # replace lines 10-12 with a single line
    splice @lines, 9, 3, 'consolidated line';

    untie @lines;

Tie::File handles all the byte-level shuffling. Lines after the splice point get shifted forward or backward on disk. You just think in terms of array indices.
Part 4: IN-PLACE EDITING
This is the killer use case. Edit a config file without reading the whole thing, munging it, and writing it back:

    use Tie::File;

    tie my @conf, 'Tie::File', '/etc/myapp/settings.conf'
        or die "Cannot open config: $!";

    for my $line (@conf) {
        # uncomment a setting
        $line =~ s~^#\s*(max_connections)~$1~;

        # change a value
        $line =~ s~^(timeout\s*=\s*)\d+~${1}30~;

        # the file updates on disk as we modify $line
    }

    untie @conf;

The magic here is that $line in the loop is an alias to the tied array element. When you modify it with s~~~, the change propagates through the tie back to the file. This is the same aliasing behavior that regular for loops have with arrays, but now it writes to disk.
Part 5: PRACTICAL RECIPE: LOG ROTATION
Trim a log file to keep only the last N lines:

    use Tie::File;

    my $max_lines = 10_000;

    tie my @log, 'Tie::File', '/var/log/myapp/app.log'
        or die "Cannot tie log: $!";

    if (@log > $max_lines) {
        my $excess = @log - $max_lines;
        splice @log, 0, $excess;    # remove oldest lines
        print "Trimmed $excess lines from log\n";
    }

    untie @log;

No temp files. No copying. Keep in mind that the splice rewrites the file in place, and that rewrite is not atomic: if the script is killed partway through, the log can be left half-shifted. For a log you can afford to lose a few lines from, that is usually an acceptable trade. For anything more precious, copy the file first.
Part 6: PRACTICAL RECIPE: SEARCH AND REPLACE
Find and replace across a file, with line numbers:

    use Tie::File;

    my $file    = 'source.txt';
    my $find    = qr~old_function~;
    my $replace = 'new_function';

    tie my @lines, 'Tie::File', $file or die $!;

    my $count = 0;
    for my $i (0 .. $#lines) {
        if ($lines[$i] =~ s~$find~$replace~g) {
            $count++;
            print "  Line ${\($i+1)}: $lines[$i]\n";
        }
    }

    print "Replaced in $count lines\n";
    untie @lines;

This is the Perl equivalent of sed -i, but with the full power of Perl regex and the ability to add logic around each replacement.
Part 7: OPTIONS AND TUNING
Tie::File accepts several options that control its behavior:

    tie my @lines, 'Tie::File', 'data.txt',
        memory    => 20_000_000,   # cache size in bytes (default ~2MB)
        recsep    => "\n",         # record separator (default: OS native)
        autochomp => 1             # strip recsep from reads (default: 1)
        or die $!;

The memory option controls how much RAM Tie::File uses for its internal cache. Bigger cache means fewer disk seeks. For huge files, bumping this up can make a real difference.
The recsep option is the record separator. Default is \n on Unix, \r\n on Windows. You can set it to anything:

    # file delimited by null bytes
    tie my @records, 'Tie::File', 'data.bin',
        recsep => "\0"
        or die $!;

The autochomp option (on by default) strips the record separator from the end of each line when you read it. Set it to 0 if you want the raw line including the newline.
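A minimal sketch of how recsep and autochomp interact, assuming a scratch file whose records are separated by semicolons. The file contents and variable names are invented for the example.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical scratch file using ';' as the record separator.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "red;green;blue;";
close $fh or die "Cannot close $path: $!";

# autochomp on (the default): records come back without the separator.
tie my @chomped, 'Tie::File', $path, recsep => ';'
    or die "Cannot tie $path: $!";
my @clean = @chomped;
untie @chomped;

# autochomp off: each record keeps its trailing separator.
tie my @raw, 'Tie::File', $path, recsep => ';', autochomp => 0
    or die "Cannot tie $path: $!";
my @with_sep = @raw;
untie @raw;

print "chomped: @clean\n";
print "raw:     @with_sep\n";
```

The first array holds the bare words. The second holds the same words with the semicolon still attached, which is what you want if you plan to write the records somewhere else unchanged.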
Part 8: HOW IT WORKS UNDER THE HOOD
Tie::File does not load the entire file. It builds an index of byte offsets for each line, reading and caching as needed.

    $lines[0]  -->  byte offset 0
    $lines[1]  -->  byte offset 47
    $lines[2]  -->  byte offset 93
    $lines[3]  -->  byte offset 128
       .
       .
    $lines[N]  -->  byte offset ???  (computed on demand)

When you access $lines[500], Tie::File:
- Checks the cache first
- If not cached, seeks to the known offset for line 500
- If the offset is not known, scans forward from the nearest known offset
- Reads the line, caches it, returns it
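The lookup steps above can be sketched as a toy model. This is not Tie::File's actual code, only an illustration of a lazily built offset table; fetch_line and @offsets are hypothetical names, and the read cache is left out to keep it short.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file with ten short lines.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line $_\n" for 0 .. 9;
close $fh or die "Cannot close $path: $!";

open my $in, '<', $path or die "Cannot open $path: $!";
my @offsets = (0);    # record 0 always starts at byte 0

# Return record $n, extending the offset table only as far as needed.
sub fetch_line {
    my ($n) = @_;

    # Offset not known yet: scan forward from the last known offset.
    while ($#offsets < $n) {
        seek $in, $offsets[-1], 0 or die "seek failed: $!";
        defined(my $skipped = <$in>) or return;    # ran off the end
        push @offsets, tell $in;
    }

    # Offset known: one seek, one read.
    seek $in, $offsets[$n], 0 or die "seek failed: $!";
    my $line = <$in>;
    return unless defined $line;
    chomp $line;
    return $line;
}

print fetch_line(7), "\n";                         # indexes records 0..7 on the way
print "known offsets: ", scalar(@offsets), "\n";
print fetch_line(2), "\n";                         # already indexed, no scanning
```

The first call has to walk the file up to record 7. The second call reuses the table and goes straight to the right byte.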
When you modify a line, Tie::File:
- Calculates the difference in length
- If the new line is the same length, overwrites in place (fast)
- If different, shifts all subsequent bytes forward or backward (slower)
This means random reads are fast. Sequential reads are very fast. Modifications near the end of the file are fast. Modifications near the beginning of a huge file are slow, because everything after the changed line has to shift.
    OPERATION                SPEED
    ---------------------    ---------------------------
    Read any line            Fast (seek + read)
    Read sequentially        Very fast (cached)
    Modify last line         Fast (no shift needed)
    Modify first line        Slow for huge files (shift)
    Append (push)            Fast
    Delete from end (pop)    Fast
    Delete from start        Slow for huge files
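When you need to change many lines and their lengths differ, Tie::File's documented deferred-writing feature can help: defer holds element assignments in memory, and flush writes them out together. A minimal sketch, assuming a small scratch file with made-up contents:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical scratch file, created only for this demonstration.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "record $_\n" for 1 .. 5;
close $fh or die "Cannot close $path: $!";

tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";

# Ask Tie::File to hold element assignments in memory for now.
(tied @lines)->defer;

# Every new value is longer than the old one, so without defer each
# assignment would have to shift the rest of the file.
$lines[$_] = 'updated record number ' . ($_ + 1) for 0 .. $#lines;

# Write all pending changes out, then release the file.
(tied @lines)->flush;
untie @lines;

open my $in, '<', $path or die "Cannot open $path: $!";
my @on_disk = <$in>;
close $in;
chomp @on_disk;

print "$_\n" for @on_disk;
```

Deferral applies to plain element assignments. Operations such as push and splice are still carried out right away.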
Part 9: GOTCHAS
A few things that will bite you if you are not careful.

Concurrent access. Tie::File does not lock the file by default. If two processes tie the same file, you get corruption. Lock it yourself:

    use Tie::File;
    use Fcntl qw(:flock);

    tie my @lines, 'Tie::File', 'shared.txt' or die $!;

    # lock right after tying, before the first read or write
    (tied @lines)->flock(LOCK_EX);    # exclusive lock

    # safe to modify now
    $lines[0] = 'updated safely';

    # untie closes the file, which also releases the lock;
    # do not keep a copy of the tied object around, or the
    # file stays open after untie
    untie @lines;

Large insertions at the start. Inserting 10,000 lines at index 0 of a million-line file means shifting a million lines forward on disk. This is slow. If you need to prepend a lot, consider writing to a new file instead.
Not for binary files. Tie::File is line-oriented. It splits on the record separator. Binary files with arbitrary byte sequences will not work as expected.
The untie matters. Always untie when you are done. It flushes
the cache and closes the file. Leaving it tied can cause data loss if
the script exits abruptly.
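One more thing worth knowing in the same defensive spirit: by default Tie::File opens the file read-write and creates it if it is missing. The documented mode option accepts Fcntl flags, so a read-only tie fails instead of leaving an empty file behind. A minimal sketch, with a hypothetical path inside a scratch directory:

```perl
use strict;
use warnings;
use Tie::File;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempdir);

# Hypothetical path inside a scratch directory; the file does not exist yet.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/missing.conf";

# Read-only tie: fails instead of creating an empty file.
my $ok = tie my @ro, 'Tie::File', $path, mode => O_RDONLY;
my $exists_after_ro = -e $path ? 1 : 0;
print "read-only tie ", ($ok ? "succeeded" : "failed: $!"), "\n";

# Default mode (O_CREAT | O_RDWR): the missing file is created on tie.
tie my @rw, 'Tie::File', $path or die "Cannot tie $path: $!";
untie @rw;
my $exists_after_rw = -e $path ? 1 : 0;

print "exists after read-only attempt: $exists_after_ro\n";
print "exists after default tie:       $exists_after_rw\n";
```

A mistyped filename with the default mode gives you a brand-new empty file and no error, so read-only mode is a cheap safeguard for scripts that only inspect a file.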
Part 10: THE TIE PHILOSOPHY
Tie::File is part of a broader Perl concept: tied variables. You can tie scalars, arrays, and hashes to custom implementations. The variable looks and behaves normally, but behind the scenes, your code (or a module's code) handles every access.

    TIED TO            WHAT HAPPENS
    ---------------    ------------------------------------------------
    Tie::File          Array backed by a file on disk
    SDBM_File          Hash backed by a DBM database
    Tie::StdScalar     Base class for scalars with custom get/set logic
    Your own class     Whatever you want

The beauty of tie is transparency. Code that uses @lines does not need to know or care that it is tied. Loops, slices, grep, map, sort, all of it works. The array just happens to be a file.
    # these all work on tied arrays
    my @errors    = grep { m~ERROR~ } @lines;
    my @sorted    = sort @lines;
    my @first_ten = @lines[0..9];
    my $count     = scalar @lines;
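Writing your own tied class takes very little code. Here is a minimal sketch built on Tie::StdScalar from the core Tie::Scalar module; UpperScalar is a hypothetical class invented for this example, and it overrides only the store step.

```perl
use strict;
use warnings;

# Hypothetical class for illustration: a scalar that upper-cases
# whatever is assigned to it.
package UpperScalar;
use Tie::Scalar;                    # provides Tie::StdScalar
our @ISA = ('Tie::StdScalar');

# Called on every assignment to the tied variable.
sub STORE {
    my ($self, $value) = @_;
    $$self = uc $value;             # Tie::StdScalar keeps the value in a scalar ref
}

package main;

tie my $shout, 'UpperScalar';

$shout = 'quiet words';             # goes through UpperScalar::STORE
print "$shout\n";                   # read goes through the inherited FETCH
```

The code that uses $shout never calls a method. It assigns and reads like any scalar, which is the same transparency Tie::File gives you for arrays.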
Tie::File takes a problem that normally requires five steps and collapses it into zero steps. There is no "read the file" step. There is no "write the file" step. There is only "use the array."

        .--.
       |o_o |    "Your array IS the file.
       |:_/ |     Deal with it."
      //   \ \
     (|     | )
    /'\_   _/`\
    \___)=(___/
It has been in core since 2002. It is stable, documented, and does exactly one thing well: making a file behave like an array.
The next time you find yourself writing open, read, modify,
write, close for the thousandth time, remember that the array
already IS the file. You just have to tie them together.
perl.gg