Bug#879003: diffoscope: objdump --line-numbers makes diffoscope *very* slow
Mike Hommey
mh at glandium.org
Wed Oct 18 23:50:19 UTC 2017
On Wed, Oct 18, 2017 at 07:29:52PM +0900, Mike Hommey wrote:
> Package: diffoscope
> Version: 87
> Severity: normal
>
> Dear Maintainer,
>
> Today I was comparing firefox builds from mozilla CI with diffoscope,
> and it was taking an awful lot of time. Quick inspection revealed that
> this is largely due to objdump --line-numbers. Removing --line-numbers
> from the objdump command line created in ObjdumpDisassembleSection makes
> a huge difference:
>
> $ time diffoscope --html diff.html firefox{1,2}.tar.bz2 # unpatched
> real 58m48.870s
> user 62m41.966s
> sys 3m3.710s
>
> $ time diffoscope --html diff.html firefox{1,2}.tar.bz2 # patched
> real 7m19.159s
> user 7m42.558s
> sys 3m2.730s
Some timings, fwiw:
$ time objdump --disassemble --demangle --section=.text libxul.so > /dev/null
real 0m23.431s
user 0m23.366s
sys 0m0.064s
$ time objdump --line-numbers --disassemble --demangle --section=.text libxul.so > /dev/null
real 25m49.537s
user 25m49.191s
sys 0m0.308s
The irony is that the file doesn't even contain line-number debug
info, and all --line-numbers does is add lines with mangled
function names:
--- /dev/fd/63 2017-10-19 08:42:00.866097676 +0900
+++ /dev/fd/62 2017-10-19 08:42:00.866097676 +0900
@@ -5,6 +5,7 @@
Disassembly of section .text:
0000000000966f20 <InvalidArrayIndex_CRASH(unsigned long, unsigned long)>:
+_Z23InvalidArrayIndex_CRASHmm():
966f20: 55 push %rbp
966f21: 48 89 f1 mov %rsi,%rcx
966f24: 48 8d 35 a5 8f 0c 04 lea 0x40c8fa5(%rip),%rsi # 4a2fed0 <mozilla::Dafsa::kKeyNotFound+0xc8>
@@ -15,6 +16,7 @@
966f38: e8 63 a2 ff ff callq 9611a0 <MOZ_CrashPrintf@plt>
0000000000966f3d <mozilla::HangMonitor::Crash() [clone .part.25]>:
+_ZN7mozilla11HangMonitor5CrashEv.part.25():
966f3d: 55 push %rbp
966f3e: 48 89 e5 mov %rsp,%rbp
966f41: 48 83 ec 20 sub $0x20,%rsp
@@ -42,6 +44,7 @@
966fb3: 0f 0b ud2
0000000000966fb5 <isFollowedByCasedLetter(int (*)(void*, signed char), void*, signed char)>:
+_ZL23isFollowedByCasedLetterPFiPvaES_a():
966fb5: 48 85 ff test %rdi,%rdi
966fb8: 74 34 je 966fee <isFollowedByCasedLetter(int (*)(void*, signed char), void*, signed char)+0x39>
966fba: 55 push %rbp
etc.
It's also worth noting that the command is run whether the sections
differ or not. So if, as in my case, the files differ only in their
build-id (not sure why yet), you still spend all that extra processing
time on the .text section even though it is identical.
FWIW:
$ time objdump -s --section=.text libxul.so > /dev/null
real 0m4.848s
user 0m4.784s
sys 0m0.064s
So, a preliminary check of the raw data would make things much faster
overall for files with few differences.
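A rough sketch of what such a preliminary check could look like, in
Python. This is not diffoscope code: the helper names (digest_of_dump,
raw_section_dump, needs_disassembly) are made up for illustration. It
hashes only the data rows of the `objdump -s` output, on the assumption
that those rows start with a space and a hex address, so the header
line carrying the file name doesn't count as a difference.

```python
import hashlib
import os
import re
import subprocess

# Assumption: data rows of `objdump -s` look like " 966f20 554889f1 ...  UH..".
_DATA_ROW = re.compile(r"^ [0-9a-f]+ ")


def digest_of_dump(dump_text):
    """Hash only the hex-dump rows, ignoring headers such as the file name."""
    h = hashlib.sha256()
    for line in dump_text.splitlines():
        if _DATA_ROW.match(line):
            h.update(line.encode())
            h.update(b"\n")
    return h.hexdigest()


def raw_section_dump(path, section=".text"):
    """Run `objdump -s --section=<section>` and return its output as text."""
    env = dict(os.environ, LC_ALL="C")  # keep objdump's output unlocalized
    result = subprocess.run(
        ["objdump", "-s", "--section=" + section, path],
        check=True, capture_output=True, text=True, errors="replace", env=env,
    )
    return result.stdout


def needs_disassembly(path1, path2, section=".text"):
    """True if the raw contents of the section differ between the two files."""
    return (digest_of_dump(raw_section_dump(path1, section))
            != digest_of_dump(raw_section_dump(path2, section)))
```

The slow `objdump --disassemble` run would then only happen when
needs_disassembly() returns True. The rows include addresses, so a
section that merely moved to a different address still counts as
different, which seems fine since the disassembly would differ too.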
Even better, the output from readelf -SW could be parsed to get the
section offsets and sizes, and then the actual raw data read directly
(without having objdump do a dump of it).
because...
$ time cat libxul.so > /dev/null
real 0m0.040s
user 0m0.000s
sys 0m0.041s
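For what it's worth, here is a minimal sketch of the readelf approach
in Python. Again, none of this is existing diffoscope code, and the
function names are hypothetical. It assumes the usual `readelf -SW` row
layout ([Nr] Name Type Address Off Size ...), skips entry 0 (the null
section, which has an empty name), and treats NOBITS sections as having
no file data.

```python
import os
import re
import subprocess

# One row of `readelf -SW` output: [Nr] Name Type Address Off Size ...
_SECTION_RE = re.compile(
    r"^\s*\[\s*(?P<nr>\d+)\]\s+(?P<name>\S+)\s+(?P<type>\S+)\s+"
    r"(?P<addr>[0-9a-fA-F]+)\s+(?P<off>[0-9a-fA-F]+)\s+(?P<size>[0-9a-fA-F]+)\s"
)


def parse_section_table(readelf_output):
    """Map section name -> (type, file offset, size) from `readelf -SW` text."""
    sections = {}
    for line in readelf_output.splitlines():
        m = _SECTION_RE.match(line)
        if m is None or int(m.group("nr")) == 0:
            continue  # not a section row, or the null section
        sections[m.group("name")] = (
            m.group("type"), int(m.group("off"), 16), int(m.group("size"), 16)
        )
    return sections


def read_section(path, sections, name):
    """Read a section's raw bytes straight from the file."""
    sec_type, offset, size = sections[name]
    if sec_type == "NOBITS":  # e.g. .bss occupies no space in the file
        return b""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def section_differs(path1, path2, name=".text"):
    """Compare one section's raw bytes in two ELF files, without objdump."""
    env = dict(os.environ, LC_ALL="C")
    data = []
    for path in (path1, path2):
        out = subprocess.run(
            ["readelf", "-SW", path],
            check=True, capture_output=True, text=True, env=env,
        ).stdout
        data.append(read_section(path, parse_section_table(out), name))
    return data[0] != data[1]
```

section_differs() raises KeyError if the section is missing from either
file; a real implementation would want to handle that case explicitly.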
Mike