more (late) feedback on Diffoscope

Holger Levsen holger at layer-acht.org
Thu Jul 10 16:27:54 BST 2025


Hi,

someone mailed me some suggestions for diffoscope, and I think some
of them might still be worth turning into bugs, but I'm not fully
sure, hence I haven't done that yet...

(this is from someone who uses diffoscope to compare PDFs or other text
documents to see how their content has changed, e.g. when laws or contracts
are updated)
 
----- Forwarded message -----

Date: Fri, 26 Jan 2024 15:07:38 +0100
From: (in bcc:, please do reply if you want to add anything)
To: Holger Levsen <holger at layer-acht.org>
Subject: More (late) feedback on Diffoscope

Hi Holger,

Let’s see, it’s been a long while since I’ve tried it and made these notes, 
but here goes:

It would be cool if one could tell Diffoscope to show the difference only 
between the content (ignoring the styling and markup tags) – e.g. useful for 
documents like “OOXML”, ODF, PDF, perhaps even HTML.

Same but reversed? It might be useful to ignore the content and see only how 
the markup and styling have changed between two websites or documents.
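
To make the two ideas above a bit more concrete, here is a minimal sketch for 
the HTML case only, using nothing but the Python standard library. It is not 
diffoscope code and does not correspond to any existing diffoscope option; the 
names split_html and diff_streams are hypothetical, chosen for illustration. 
It assumes that "content" means the text between tags and that "markup" means 
the tags plus their attributes, and it makes no attempt to handle OOXML, ODF 
or PDF.

```python
import difflib
from html.parser import HTMLParser


class _Splitter(HTMLParser):
    """Collect text content and markup tokens into two separate lists."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = []    # content only, one entry per text run
        self.markup = []  # tags and attributes only

    def handle_starttag(self, tag, attrs):
        # Attributes without a value (e.g. "disabled") arrive with value None.
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{value}"'
            for name, value in attrs
        )
        self.markup.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.markup.append(f"</{tag}>")

    def handle_data(self, data):
        # Collapse whitespace so re-indented markup does not show up as
        # a content change; skip runs that are whitespace only.
        collapsed = " ".join(data.split())
        if collapsed:
            self.text.append(collapsed)


def split_html(source):
    """Return (text_runs, markup_tokens) for one HTML string."""
    parser = _Splitter()
    parser.feed(source)
    parser.close()
    return parser.text, parser.markup


def diff_streams(old, new):
    """Return (content_diff, markup_diff) as lists of unified-diff lines."""
    old_text, old_markup = split_html(old)
    new_text, new_markup = split_html(new)
    content = list(
        difflib.unified_diff(old_text, new_text, "old", "new", lineterm="")
    )
    markup = list(
        difflib.unified_diff(old_markup, new_markup, "old", "new", lineterm="")
    )
    return content, markup
```

With two inputs that differ only in a class attribute, the content diff comes 
back empty and only the markup diff has entries; with two inputs that differ 
only in wording, it is the other way round. Note that this simple splitter 
also treats the bodies of script and style elements as text, which a real 
implementation would probably want to filter out.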

Since the HTML output when comparing large inputs simply gives up at some 
point, it would be useful to force it to still produce an HTML of the whole 
diff (i.e. yes, I know this will eat up my RAM, I have enough and want you to 
do it!)

I noticed – maybe this has been fixed in the ¾ year since then – that diffing 
large plaintext files sometimes(?) causes Diffoscope to fall back to binary 
comparison instead. Which is not very useful in those cases TBH.

All that said, it is perfectly fine if the limitations I ran into are just out 
of scope of Diffoscope. I know I tried to use it for a niche use case (at 
least compared to its majority user base) ;)


cheers,
---- End forwarded message -----

-- 
cheers,
	Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Money is worth nothing on a dead planet.