[Debian GNUstep maintainers] Please add universal-detector to embedded-code-copies
Yavor Doganov
yavor at gnu.org
Mon Sep 9 06:11:00 BST 2024
The universal-detector package (only in sid/testing) embeds a copy of
uchardet. It is used only by unar.
I tried hard to make it work with the uchardet shared library but
there were several problems:
* The test program sometimes gives wrong results, e.g. it reports a
UTF-8 file as WINDOWS-1258. A pure C test program using only
uchardet gives the same results so this is probably a bug in
uchardet.
* There are discrepancies in the returned strings,
e.g. uchardet_get_charset returns "ASCII" while unar expects
"US-ASCII".
* unar uses the so-called "confidence" feature (a float value
representing the certainty of the guess) which is not
available through public uchardet functions.
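For illustration, the string discrepancy on its own could probably be
handled with a small translation table. This is only a sketch:
normalize_charset_name is a hypothetical helper (it exists in neither
uchardet nor unar), and the only pair known from the observations above
is ASCII/US-ASCII. It does nothing about the other two problems.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of uchardet or unar: translate a charset
 * name as returned by uchardet_get_charset into the spelling unar
 * expects.  Only the ASCII/US-ASCII pair comes from the report above;
 * any further entries would have to be found by testing. */
const char *normalize_charset_name(const char *name)
{
    static const struct {
        const char *from;  /* name reported by uchardet */
        const char *to;    /* name expected by unar */
    } aliases[] = {
        { "ASCII", "US-ASCII" },
    };

    if (name == NULL)
        return NULL;

    for (size_t i = 0; i < sizeof aliases / sizeof aliases[0]; i++) {
        if (strcmp(name, aliases[i].from) == 0)
            return aliases[i].to;
    }
    return name;  /* unknown names pass through unchanged */
}
```

Names without an entry in the table are returned as-is, so a wrapper
like this would be safe to apply to every result.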
By examining the diff between universal-detector's uchardet copy and
the uchardet versions in various Debian releases, I came to the
conclusion that the embedded copy is a fork.