Bug#863636: diffoscope: usage of FIFOs causes pair-comparisons to not run in parallel, wasting performance by about 1/2
Ximin Luo
infinity0 at debian.org
Tue May 30 11:40:00 UTC 2017
Ximin Luo:
> Package: diffoscope
> Version: 78
> Severity: normal
>
> Dear Maintainer,
>
> diff(1) first reads the contents of one file then the next one:
>
> https://sources.debian.net/src/diffutils/1:3.5-3/src/io.c/#L552
>
> This means that if the "files" are actually FIFOs connected to the output of a
> process, as they are in many cases in diffoscope, the second process has to wait
> for diff(1) to fully read the output of the first process, before it itself can
> run. This prevents both processes from running in parallel.
>
> An appropriate fix would be to store the output of at least one of the commands
> into a temporary file, and have diff(1) read from this instead. This has to be
> done carefully however, to make sure that diff(1) doesn't accidentally read it
> before the process is finished.
>
> [..]
It seems readelf specifically has weird performance behaviours when running in parallel. My initial attempt at fixing this works in the general case, but not for specific cases that are important to diffoscope and this scenario.
https://anonscm.debian.org/cgit/reproducible/diffoscope.git/commit/?h=experimental&id=e28b540b0b289ce9fda70095160382799d7602a6
Taking diffoscope completely out of the equation, we see the following behaviours:
$ f() { sleep 3; }; time ( f 1; f 2; ); time ( f 1 & x=$!; f 2; wait $x; )
real 0m6.010s
user 0m0.000s
sys 0m0.000s
real 0m3.004s
user 0m0.000s
sys 0m0.000s
$ f() { sha256sum /run/shm/elf.$1 >/dev/null; }; time ( f 1; f 2; ); time ( f 1 & x=$!; f 2; wait $x; )
real 0m0.451s
user 0m0.432s
sys 0m0.012s
real 0m0.221s
user 0m0.428s
sys 0m0.004s
(The above two cases confirm that the shell snippets are correct. /run/shm/elf.{1,2} is about 50MB.)
$ f() { taskset $1 readelf --wide --debug-dump=rawline /run/shm/elf.$1 >/dev/null; }; time ( f 1; f 2; ); time ( f 1 & x=$!; f 2; wait $x; )
real 0m17.128s
user 0m17.096s
sys 0m0.008s
real 0m19.126s
user 0m37.256s
sys 0m0.016s
$ f() { taskset $1 readelf --wide --debug-dump=rawline /usr/lib/debug/.build-id/22/46ba050897f1d98034a7ca4b7ec06b594a373d.debug >/dev/null; }; time ( f 1; f 2; ); time ( f 1 & x=$!; f 2; wait $x; )
real 0m3.330s
user 0m3.324s
sys 0m0.000s
real 0m1.674s
user 0m3.324s
sys 0m0.000s
$ f() { taskset $1 readelf --wide --debug-dump=info /usr/lib/debug/.build-id/22/46ba050897f1d98034a7ca4b7ec06b594a373d.debug >/dev/null; }; time ( f 1; f 2; ); time ( f 1 & x=$!; f 2; wait $x; )
real 0m18.437s
user 0m18.384s
sys 0m0.012s
real 0m29.039s
user 0m56.196s
sys 0m0.044s
/usr/lib/debug/.build-id/22/46ba050897f1d98034a7ca4b7ec06b594a373d.debug is about 4MB and from libc6-dbg=2.24-10.
So it looks like readelf has issues when running for a long time in parallel. (--debug-dump=rawline with a 4MB file sped up when running in parallel.)
However other tools don't seem to have this problem:
$ time ( { echo 1; echo 2; } | xargs -n1 -I '{}' taskset '{}' readelf --wide --debug-dump=info /usr/lib/debug/.build-id/22/46ba050897f1d98034a7ca4b7ec06b594a373d.debug >/dev/null )
real 0m18.730s
user 0m18.644s
sys 0m0.032s
$ time ( { echo 1; echo 2; } | parallel --will-cite taskset '{}' readelf --wide --debug-dump=info /usr/lib/debug/.build-id/22/46ba050897f1d98034a7ca4b7ec06b594a373d.debug >/dev/null )
real 0m33.076s
user 1m0.844s
sys 0m0.460s
$ time ( { echo 1; echo 2; } | xargs -n1 -I '{}' taskset '{}' sha256sum /run/shm/test.iso >/dev/null )
real 0m33.302s
user 0m32.332s
sys 0m0.932s
$ time ( { echo 1; echo 2; } | parallel --will-cite taskset '{}' sha256sum /run/shm/test.iso >/dev/null )
real 0m16.843s
user 0m32.224s
sys 0m1.272s
Here, /run/shm/test.iso is about 4GB in size, and it does speed up when being processed by sha256sum in parallel.
Running strace on the "readelf" calls shows that most syscalls are done at the start, then for most of the rest of the execution it's just "write()" syscalls and indeed the "sys" times above are negligible. So it seems unlikely that the processes are blocking on anything.
X
--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
More information about the Reproducible-builds
mailing list