[Debian GNUstep maintainers] Please add universal-detector to embedded-code-copies

Yavor Doganov yavor at gnu.org
Mon Sep 9 06:11:00 BST 2024


The universal-detector package (only in sid/testing) embeds a copy of
uchardet.  It is used only by unar.

I tried hard to make it work with the uchardet shared library but
there were several problems:

  * The test program sometimes gives wrong results, e.g. it reports a
    UTF-8 file as WINDOWS-1258.  A pure C test program using only
    uchardet gives the same results so this is probably a bug in
    uchardet.
  * There are discrepancies in the returned strings,
    e.g. uchardet_get_charset returns "ASCII" while unar expects 
    "US-ASCII".
  * unar uses the so-called "confidence" feature (a float value
    representing the certainty of the guess) which is not
    available through public uchardet functions.
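
To illustrate the second point, here is a minimal sketch of the kind of
shim that would be needed to reconcile the names.  The function
normalize_charset_name is hypothetical (it exists in neither uchardet
nor unar), and it covers only the single mismatch mentioned above;
there may well be others.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of uchardet or unar: translate a
 * charset name as returned by uchardet_get_charset() into the name
 * unar expects.  Only the "ASCII" / "US-ASCII" mismatch described
 * above is handled; any other name is passed through unchanged. */
static const char *normalize_charset_name(const char *name)
{
    /* Treat a missing or empty name as "no result". */
    if (name == NULL || name[0] == '\0')
        return NULL;

    if (strcmp(name, "ASCII") == 0)
        return "US-ASCII";

    return name;
}
```

A shim like this would address only the naming problem, not the wrong
detection results or the missing confidence value.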

By examining the diff between universal-detector's uchardet copy and
the uchardet versions in various Debian releases I came to the
conclusion that the embedded copy is a fork rather than an unmodified
copy of any of those versions.


