[Reproducible-builds] Bug#808121: diffoscope: HTML output is bloated

Jérémy Bobbio lunar at debian.org
Thu Dec 17 13:56:32 UTC 2015


Esa Peuha:
> While we are at it, let's convert HTML character entity references
> (which each use 6-8 characters and as many bytes in the HTML file)
> to actual characters (which UTF-8 encodes as 2-3 bytes). Since all
> diffoscope output files are peppered with abundant amounts of these
> things, this could reduce the file sizes by a few percent at least.
> I used Python string literals instead of the actual characters in
> the Python file, because 1) the non-breaking and zero-width spaces
> would be very hard to distinguish from ordinary space and missing
> string content, respectively, and 2) it is impossible to be sure
> that every piece of software that is ever going to be used to view
> or edit the file would handle non-ASCII characters correctly.
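
For readers of the archive, here is a rough sketch of the idea in Python.
It is not the code from the patch: the entity table, the function names
and the sample string are all assumptions made up for illustration. It
replaces two entity references with the real characters, written as
escaped string literals so they stay visible in the source, and compares
the UTF-8 sizes.

```python
# Hypothetical illustration; not the code from the actual patch.
# The characters are written as escapes so they remain visible in an editor.
NBSP = "\u00a0"  # non-breaking space, 2 bytes in UTF-8
ZWSP = "\u200b"  # zero-width space, 3 bytes in UTF-8

# Assumed mapping from entity references to the characters they stand for.
ENTITY_TO_CHAR = {
    "&nbsp;": NBSP,
    "&#8203;": ZWSP,
}


def replace_entities(html_text):
    """Replace each known entity reference with its actual character."""
    for entity, char in ENTITY_TO_CHAR.items():
        html_text = html_text.replace(entity, char)
    return html_text


def utf8_size(text):
    """Return the number of bytes the text takes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


sample = "a&nbsp;b&#8203;c"
converted = replace_entities(sample)
print(utf8_size(sample), "bytes before,", utf8_size(converted), "bytes after")
```

The output file has to be written and declared as UTF-8 for the browser
to render the replaced characters correctly.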

Thanks for the patch. It's been committed and pushed.

I would be grateful if you could submit ready-to-merge Git changes next
time (see git-format-patch(1)).

-- 
Lunar                                .''`. 
lunar at debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'` 
                                      `-   