[Pkg-rust-maintainers] Bug#925544: ripgrep: Exits immediately without warning if it encounters a NUL byte inside the file to be searched, might exit with wrong exit code depending on the position of the match
Axel Beckert
abe at debian.org
Tue Mar 26 17:12:23 GMT 2019
Package: ripgrep
Version: 0.10.0-2
Severity: important
Tags: upstream
Hi,
when fed several GB via STDIN, rg as well as rg -F exited immediately
without any output, while fgrep found many hits until it issued the
warning "Binary file (standard input) matches".
Consider the following example (based on the attached file):
→ cat -v aeh.txt
a
M-CM-$
^@
a
→ cat aeh.txt | fgrep a
Binary file (standard input) matches
→ cat aeh.txt | fgrep -a a
a
a
→ cat aeh.txt | rg a
a
→ cat aeh.txt | rg -a a
a
a
In the third example ("rg a"), rg printed only the match before the NUL
byte and then stopped, without crashing or issuing a warning. fgrep, in
comparison, issued a warning.
While the above example might be close to what fgrep does, just without
the warning, the following example is even worse:
→ cat aeh.txt | fgrep ä
Binary file (standard input) matches
→ echo $?
0
→ cat aeh.txt | fgrep ö
→ echo $?
1
→ cat aeh.txt | rg ä
→ echo $?
1
→ cat aeh.txt | rg ö
→ echo $?
1
→ cat aeh.txt | rg -a ä
ä
→ echo $?
0
So fgrep properly indicates via the exit code whether there was a hit,
even though it didn't output anything besides the warning about binary
junk. But even though the hit would have been before the NUL byte, rg
claims (via exit code) that there is no hit in STDIN, although "rg -a"
says otherwise (via output and exit code).
"cat aeh.txt | strace rg ä" shows that it exits rather quickly after
having read the NUL byte:
read(0, "a\n\303\244\n\0\na\n", 8192) = 9
sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=8192}, NULL) = 0
munmap(0x7fbfac3d0000, 8192) = 0
exit_group(1) = ?
+++ exited with 1 +++
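The nine bytes shown in the read() call above are enough to recreate
the test input without the attachment. A minimal sketch, assuming a
POSIX shell; the file name aeh-repro.txt is made up here so that the
original aeh.txt is not overwritten:

```shell
# Recreate the 9-byte input seen in the strace read() call.
# aeh-repro.txt is a hypothetical name, not part of the original report.
printf 'a\n\303\244\n\000\na\n' > aeh-repro.txt

# Total size in bytes (tr strips the padding some wc implementations add).
size=$(wc -c < aeh-repro.txt | tr -d ' ')

# Count NUL bytes: delete everything that is not NUL, count the rest.
nuls=$(tr -cd '\000' < aeh-repro.txt | wc -c | tr -d ' ')

echo "size=$size nuls=$nuls"
```

The resulting file can be substituted for aeh.txt in the commands
above.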
Constraints to trigger the issue: the data must contain a NUL byte and
neither "-a" nor "--text" may be set. On larger files (gigabytes) it is
obvious that rg exits prematurely if the NUL byte is close to the
beginning, solely because of how quickly the command exits. We actually
discovered the issue that way: rg exited way too quickly and without any
output at all, especially in comparison to fgrep.
Impact: rg does not indicate that there were hits and exits prematurely
without further notice, hence it can yield wrong results (exit code as
well as output) without any indication of there being an issue.
Workaround: always use option -a or --text when contents might contain
binary junk.
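For regular files, a caller could also check for NUL bytes up front and
add -a only when needed. This is a rough sketch, not something rg
provides; the helper names has_nul and text_flag and the sample file
names are invented for illustration:

```shell
# has_nul FILE: exit status 0 if FILE contains at least one NUL byte.
has_nul() {
    # Delete everything except NUL bytes, then count what is left.
    n=$(tr -cd '\000' < "$1" | wc -c | tr -d ' ')
    [ "$n" -gt 0 ]
}

# text_flag FILE: print "-a" if FILE contains a NUL byte, nothing otherwise.
text_flag() {
    if has_nul "$1"; then
        echo "-a"
    fi
}

# Two tiny sample inputs (hypothetical file names).
printf 'a\n\303\244\n\000\na\n' > with-nul.txt
printf 'a\n\303\244\na\n' > without-nul.txt

echo "with-nul.txt:    '$(text_flag with-nul.txt)'"
echo "without-nul.txt: '$(text_flag without-nul.txt)'"

# Intended use (not run here, since it needs ripgrep installed):
#   rg $(text_flag with-nul.txt) ä with-nul.txt
```

This reads the file twice and does not work for piped STDIN, so simply
always passing -a remains the easier workaround there.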
P.S.: Yes, fgrep/grep/egrep also have their issues there, like the
warning being on STDOUT, not STDERR, but they are still much clearer in
indicating the issue than rg.
P.P.S.: I also tried to see if the options -F and --no-encoding make a
difference in this case, but they don't.
P.P.P.S.: This might be related to
https://github.com/BurntSushi/ripgrep/issues/1207
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: aeh.txt
URL: <http://alioth-lists.debian.net/pipermail/pkg-rust-maintainers/attachments/20190326/2edfc49c/attachment.txt>
-------------- next part --------------
-- System Information:
Debian Release: buster/sid
APT prefers unstable
APT policy: (990, 'unstable'), (600, 'testing'), (500, 'unstable-debug'), (500, 'buildd-unstable'), (110, 'experimental'), (1, 'experimental-debug'), (1, 'buildd-experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 4.19.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)
LSM: AppArmor: enabled
Versions of packages ripgrep depends on:
ii libc6 2.28-8
ii libgcc1 1:8.3.0-3
ripgrep recommends no packages.
ripgrep suggests no packages.
-- no debconf information