Bug#848049: diffoscope: Add detection of order-only differences in plain text formats

Maria Glukhova siamezzze at gmail.com
Thu Jan 12 10:13:14 UTC 2017


On Sun, 25 Dec 2016 15:28:52 +0100 Jérémy Bobbio <lunar at debian.org> wrote:

Hi Lunar!

> You would not have to read the file twice as long as you do the hash
> in the difference module, when each line is actually fed to diff.
> A similar trick is already used to cope with files that are too long,
> see diffoscope.difference.make_feeder_from_raw_reader()
>

I implemented what I believe was your idea in the attached patch. Thank you
for pointing me to it!
Still, I don't think the feature is worth intruding on the
diff.py/diffoscope.py modules. It doesn't speed up the comparison
significantly, because the call to diff still takes most of the time on big
files that differ only in line order. Besides, I can't think of many cases
where the feature would be needed, apart from text files.
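
For readers without the patch at hand, the idea quoted above can be sketched
roughly as follows. This is only an illustration under assumptions, not the
code from the attachment: make_counting_feeder and differ_only_in_order are
hypothetical names, and in-memory buffers stand in for the pipes that would
feed diff. Each line is hashed at the moment it is written out, so the input
never has to be read a second time.

```python
import hashlib
import io
from collections import Counter


def make_counting_feeder(lines, line_counts):
    """Return a feeder that writes every line to out_file and, in the
    same pass, tallies a hash of that line in line_counts.
    (Hypothetical helper, named here for illustration only.)"""
    def feeder(out_file):
        for line in lines:
            # Tally the line's digest; a Counter keeps duplicates distinct.
            line_counts[hashlib.sha1(line).digest()] += 1
            out_file.write(line)
    return feeder


def differ_only_in_order(lines1, lines2):
    """True if both inputs hold the same multiset of lines but are not
    byte-for-byte identical."""
    counts1, counts2 = Counter(), Counter()
    # Assumption: BytesIO buffers stand in for the pipes to diff.
    sink1, sink2 = io.BytesIO(), io.BytesIO()
    make_counting_feeder(lines1, counts1)(sink1)
    make_counting_feeder(lines2, counts2)(sink2)
    identical = sink1.getvalue() == sink2.getvalue()
    return not identical and counts1 == counts2
```

The sketch also shows the limitation mentioned above: the tally costs little,
but it does not replace the diff call itself, which still has to process the
full input.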

In any case, thank you again for taking the time to share that idea with me!


Maria
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Generic-order-line-difference-for-all-kind-of-inputs.patch
Type: text/x-diff
Size: 7592 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/reproducible-builds/attachments/20170112/24ada808/attachment.patch>

