Bug#1040947: perl: readline fails to detect updated file
Eric Wong
e at 80x24.org
Tue Jul 18 02:23:51 BST 2023
Ian Jackson <ijackson at chiark.greenend.org.uk> wrote:
> Hi. I'm sorry that something I had a hand in is causing you an
> inconvenience.
>
> I'm afraid it's not clear to me what "working" and "non-working"
> behaviour from your example program is. I don't feel I can reply
> comprehensively, so I will comment on some of the details in your
> message.
With each iteration of the loop in my example, the file size
increases as shown by `-s', yet readline isn't returning any
data after it sees a transient EOF.
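Roughly, the shape of the loop I'm describing is below (a simplified
sketch, not the exact program attached to the report; the filename is
just a placeholder):

use strict;
use warnings;
use IO::Handle;

my $path = 'growing.log';                # placeholder name
open my $wr, '>>', $path or die "open(>>$path): $!";
open my $rd, '<', $path or die "open(<$path): $!";
$wr->autoflush(1);

for my $i (1 .. 5) {
    print {$wr} "line $i\n";
    printf "size is now %d bytes\n", -s $path;
    while (defined(my $line = readline($rd))) {
        print "read: $line";
    }
    # readline returned undef here (a transient EOF); if that EOF
    # sticks, later iterations read nothing even though -s keeps
    # growing
}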
> Eric Wong writes ("Re: Bug#1040947: perl: readline fails to detect updated file"):
> > Both can be data loss bugs, but checking a log file which is
> > being written to by another process is a much more common
> > occurrence than FS failures[1] (or attempting readline on a
> > directory (as given in the #1016369 example))
>
> AFAICT you are saying that the fix to #1016369 broke a program which
> was tailing a logfile. I agree that one should be able to tail a
> logfile in perl. I don't think I have a complete opinion about
> precisely what set of calls ought to be used to do that, but I would
> expect them to mirror the calls needed in C with stdio.
Right, and C stdio.h is similarly tricky when it comes to
properly dealing with errors, too. I often ended up using
unistd.h read(2) or Perl sysread directly for stuff I really
care about, combined with checking stat(2) st_size to ensure
I've read everything.
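For example, something along these lines (names are illustrative,
not lifted from any particular program of mine):

use strict;
use warnings;

# slurp a file with sysread() and verify the byte count against the
# stat() st_size; assumes the file isn't growing while we read it
sub slurp_checked {
    my ($path) = @_;
    open my $fh, '<', $path or die "open($path): $!";
    binmode $fh;
    my $expect = (stat $fh)[7];          # st_size
    my ($buf, $total) = ('', 0);
    while (1) {
        my $r = sysread($fh, $buf, 65536, $total);
        die "read($path): $!" unless defined $r;   # real I/O error
        last if $r == 0;                           # EOF
        $total += $r;
    }
    die "short read: got $total, expected $expect" if $total != $expect;
    $buf;
}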
Data I care about tends to have checksumming built into its
data format (git objects/packs, gzipped texts/tarballs, FLAC audio).
Uncompressed log files are transient data that ceases to be
relevant after a short time.
> > Since this is Perl and TIMTOWTDI, I've never used IO::Handle->error;
> > instead I always check defined-ness on each critical return value and
> > also enable Perl warnings to catch undefined return values.
> > I've never used `eof' checks, either; checking `chomp' result
> > can ensure proper termination of lines to detect truncated reads.
>
> AFAICT you are saying that you have always treated an undef value
> from line-reading operations as EOF, and never checked for error.
> I think that is erroneous.
Maybe so, though most of the reads I do are less critical than
writes. A failed read means there's *already* lost data and
there's nothing one can do about it.
A failed write can be retried on a different FS or rolled back.
IME, write errors are far more common (but perhaps that's because
I have errors=remount-ro in all my fstabs).
In the case of an application's log file, the application is
already toast if there's any I/O error; thus any monitoring on
application-level log files would cease to be relevant.
> That IO errors are rare doesn't mean they oughtn't to be checked for.
> Reliable software must check for IO errors and not assume that undef
> means EOF.
>
> I believe perl's autodie gets this wrong, which is very unfortunate.
Right, autodie doesn't appear to handle readline at all.
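So any error check around readline has to be explicit, something
like the following (illustrative only; the filename is made up):

use strict;
use warnings;
use IO::Handle;                          # for the ->error method

open my $fh, '<', 'some.log' or die "open: $!";
while (defined(my $line = readline($fh))) {
    print $line;
}
# undef from readline means either EOF or error; tell them apart here
die "read error: $!" if $fh->error;
close $fh or die "close: $!";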
> > [1] yes, my early (by my standards) upgrade to bookworm was triggered
> > by an SSD failure, but SSD failures aren't a common occurrence
> > compared to tailing a log file.
>
> I don't think this is the right tradeoff calculus.
>
> *With* the fix to #1016369 it is *possible* to write a reliable
> program, but some buggy programs lose data more often.
>
> *Without* the fix to #1016369 it is completely impossible to write a
> reliable program.
For reliable programs (e.g. file servers), it's required to
check expected vs actual bytes read; that pattern can be
applied regardless of #1016369.
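i.e. something like this (a sketch only; a real server would also
retry on EINTR/EAGAIN and enforce timeouts):

# read exactly $len bytes from $fh or die trying
sub read_exact {
    my ($fh, $len) = @_;
    my ($buf, $got) = ('', 0);
    while ($got < $len) {
        my $r = sysread($fh, $buf, $len - $got, $got);
        die "read: $!" unless defined $r;               # I/O error
        die "EOF after $got of $len bytes" if $r == 0;  # short read
        $got += $r;
    }
    $buf;
}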
> Having said all that, I don't see why the *eof* indicator ought to
> have to persist. It is only the *errors* that mustn't get lost. So I
> think it might be possible for perl to have behaviour that would
> make it possible to write reliable programs, while still helping buggy
> programs fail less often.
Right; EOF indicators should be transient for regular files.
It's wrong to consider EOF a permanent condition on regular files.
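Fwiw, the usual tail idiom (it's in perlfaq5) clears the EOF state
by hand with a no-op seek; roughly:

use strict;
use warnings;
use Fcntl qw(SEEK_CUR);

open my $fh, '<', 'app.log' or die "open: $!";   # placeholder name
while (1) {
    while (defined(my $line = readline($fh))) {
        print $line;
    }
    sleep 1;
    # a no-op seek clears the handle's EOF state so readline will
    # look at the underlying file again once it has grown
    seek($fh, 0, SEEK_CUR) or die "seek: $!";
}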
> But, even if that's possible, I'm not sure that it's a good idea.
> Buggy programs that lose data only in exceptional error conditions are
> a menace. Much better to make such buggy programs malfunction all the
> time - then they will be found and fixed.
This mentality of breaking imperfect-but-practically-working
code in favor of some perfect ideal is damaging to Perl (based
on my experience with changes to Ruby driving away users).
Fwiw, using `strace -P $PATH -e inject=syscall...' to inject
errors for certain paths, both gawk and mawk fail as expected
when a read from STDIN fails:
echo hello >x # create file named `x'
strace -P x -e inject=read:error=EIO gawk '{ print }' <x
# exits with 2
However, neither Perl 5.36 nor 5.32.1 detects EIO on STDIN:
strace -P x -e inject=read:error=EIO perl -ane '{ print $_ }' <x
# exits with 0 even on Perl 5.36 bookworm
At this point (given Perl's maturity), it's less surprising if
it kept its lack of error detection in all cases.