Bug#841907: diffoscope: smarter hex dump differ

Daniel Shahaf danielsh at apache.org
Mon Oct 24 10:19:44 UTC 2016


Package: diffoscope
Version: 61
Severity: wishlist
X-Debbugs-Cc: 838569 at bugs.debian.org

Currently, insertion or deletion of a single byte causes every remaining
line of a hex dump to be shown as different.  The stream of byte values
after that point is the same, but the lines of the hex dump (16 byte
values per line) are not: each line shares only 15 of its 16 byte values
with its counterpart, shifted by one position.

Example: see attached monkeystudio diff.  The difference is on the first
line (0xA9 v. 0x5C 0x32 0x35 0x31), but _every single line_ in the
section shows a three-byte diff; the last three bytes of left line N are
equal to the first three bytes of right line N+1, but the diff overlooks
that.  Consequently, the signal/noise ratio of the diff is low.
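
To make the ripple effect concrete, here is a minimal sketch in Python.
It does not call xxd or diffoscope; hexdump_lines() is a hypothetical
helper that only imitates a 16-bytes-per-line dump with an offset column,
and difflib stands in for the real differ.

```python
import difflib

def hexdump_lines(data, width=16):
    """Imitate a hex dump: an offset column, then `width` byte values per line."""
    return [
        "%08x: %s" % (off, data[off:off + width].hex())
        for off in range(0, len(data), width)
    ]

left = bytes(range(64))
right = left[:1] + b"\xff" + left[1:]   # insert a single byte at offset 1

diff = difflib.unified_diff(
    hexdump_lines(left), hexdump_lines(right), lineterm="", n=0
)
# Keep only the added/removed lines, skipping the "---"/"+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
removed = [line for line in changed if line.startswith("-")]
added = [line for line in changed if line.startswith("+")]
print(len(removed), len(added))
```

Every line of both dumps ends up in the diff, although only one byte was
inserted.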

I think the following patch should improve the situation: it causes the
output to omit the offset and ASCII columns and to put each byte value
on its own line, so any insertion/deletion of a single byte would result
in a diff that inserts/deletes a single line, without ripple effects.
The line numbers in the diff would then correspond to byte offsets in
the hex-dumped file.
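
A minimal sketch of the intended effect, again in Python with difflib
standing in for the real differ.  per_byte_lines() is a hypothetical
helper that produces the one-byte-value-per-line format I expect from
«xxd -p -c1»; it does not run xxd itself.

```python
import difflib

def per_byte_lines(data):
    """One two-digit hex byte value per line, with no offset column."""
    return ["%02x" % b for b in data]

left = bytes(range(64))
right = left[:1] + b"\xff" + left[1:]   # insert a single byte at offset 1

diff = difflib.unified_diff(
    per_byte_lines(left), per_byte_lines(right), lineterm="", n=0
)
# Keep only the added/removed lines, skipping the "---"/"+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print(changed)
```

Here the diff consists of the single inserted line, and line N of the
dump holds the byte at offset N-1.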

I originally ran into that issue in .rodata diffs, which don't use the
xxd codepath, but the cases are analogous.  If this idea works out, we
should teach the same trick to the ELF comparator's «readelf --hex-dump»
output.  (This would also fix #838569, about ignoring addresses in
.rodata.)
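
For the readelf case, the same trick could be sketched as a
post-processing step.  This is untested against real readelf output and
assumes each data line looks like "  0x<address> " followed by a
36-column hex field (four space-separated groups of up to eight hex
digits, padded with spaces) and then an ASCII column.  The function name
is hypothetical and not part of diffoscope.

```python
import re

# Assumed line shape: leading spaces, "0x" + address, one space, then a
# fixed 36-column hex field; whatever follows (the ASCII column) is ignored.
_DATA_LINE = re.compile(r"^\s*0x[0-9a-fA-F]+ (.{1,36})")

def per_byte_lines_from_readelf(lines):
    """Drop addresses and the ASCII column; return one hex byte value per line."""
    out = []
    for line in lines:
        m = _DATA_LINE.match(line)
        if m is None:
            continue  # section header, blank line, or anything unrecognised
        digits = m.group(1).replace(" ", "")
        out.extend(digits[i:i + 2] for i in range(0, len(digits), 2))
    return out

sample = [
    "Hex dump of section '.rodata':",
    "  0x004005d0 01000200 48656c6c 6f2c2077 6f726c64 ....Hello, world",
    "  0x004005e0 " + "2100".ljust(36) + "!.",
]
print(per_byte_lines_from_readelf(sample))
```

Since the addresses are dropped, a change in load address alone would no
longer show up in the diff.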

I haven't tested this idea yet; I'm only filing this issue so I don't
forget it.

Cheers,

Daniel

[[[
diff --git a/diffoscope/comparators/utils.py b/diffoscope/comparators/utils.py
index 1529dae..4c8603d 100644
--- a/diffoscope/comparators/utils.py
+++ b/diffoscope/comparators/utils.py
@@ -350,4 +350,4 @@ class NonExistingArchive(Archive):
 class Xxd(Command):
     @tool_required('xxd')
     def cmdline(self):
-        return ['xxd', self.path]
+        return ['xxd', '-p', '-c1', self.path]
]]]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot - 10242016 - 12:06:26 AM.png
Type: image/png
Size: 26015 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/reproducible-builds/attachments/20161024/c5bf0e69/attachment.png>

