Bug#950363: licensecheck reports dubious (may be misleading) information for image files

Jonas Smedegaard jonas at jones.dk
Fri Jan 31 19:28:50 GMT 2020


Hi Dominique,

Quoting Dominique Dumont (2020-01-31 19:34:08)
> When parsing an image file as a binary blob, licensecheck reports that 
> the copyright of the image is owned by HP:
> 
> $ licensecheck --encoding utf8 --copyright --machine --deb-fmt --recursive docs/src/static/diagrams.key//Data/st0-311.jpg
> docs/src/static/diagrams.key//Data/st0-311.jpg  UNKNOWN 1998 Hewlett-Packard CompanydescsRGB IEC61966-2.1sRGB IEC61966-2.1XYZ óQÌXYZ XYZ o¢8õXYZ b· / ±¹ÁÉÑÙáéòú / ®²·¼ÁÆËÐÕÛàåëðöû
> 
> I think licensecheck is misled by the copyright ownership of the color 
> profile of this image:
> 
> $ exiftool docs/src/static/diagrams.key//Data/st0-311.jpg | grep Profile
> Profile CMM Type                : Linotronic
> Profile Version                 : 2.1.0
> Profile Class                   : Display Device Profile
> Profile Connection Space        : XYZ
> Profile Date Time               : 1998:02:09 06:49:00
> Profile File Signature          : acsp
> Profile Creator                 : Hewlett-Packard
> Profile ID                      : 0
> Profile Copyright               : Copyright (c) 1998 Hewlett-Packard Company
> Profile Description             : sRGB IEC61966-2.1
> 
> The image itself has no copyright information:
> 
> $ exiftool docs/src/static/diagrams.key//Data/st0-311.jpg | grep -i copyright
> Profile Copyright               : Copyright (c) 1998 Hewlett-Packard Company

I agree with your analysis but not your conclusion: I will argue that 
the image _does_ have copyright information - just not in the ideal 
place.

Licensecheck checks for copyright and license statements in files.  It 
does not promise to look only in ideal places - but it also does not 
promise to look in all possible ways, only in as many ways as it can.
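As a rough illustration of why a color profile's copyright string surfaces 
when a file is read as a binary blob, here is a minimal Python sketch.  It 
is not licensecheck's actual implementation: the pattern, the function 
name and the synthetic sample bytes are all assumptions made for the 
example.

```python
import re

# Hypothetical pattern: "Copyright (c) <year> <holder>" in printable ASCII.
# Real scanners recognize far more variants than this.
COPYRIGHT_RE = re.compile(rb"Copyright \(c\) (\d{4}) ([\x20-\x7e]+)")


def find_copyright_statements(blob: bytes) -> list[tuple[str, str]]:
    """Return (year, holder) pairs found anywhere in a binary blob."""
    return [
        (m.group(1).decode("ascii"), m.group(2).decode("ascii").strip())
        for m in COPYRIGHT_RE.finditer(blob)
    ]


# Synthetic stand-in for an image with an embedded profile text field;
# these bytes are made up and do not form a valid JPEG or ICC profile.
sample = (
    b"\xff\xd8\xff\xe2\x00\x00cprt"
    b"Copyright (c) 1998 Hewlett-Packard Company\x00desc"
    b"\xff\xd9"
)

print(find_copyright_statements(sample))
```

The scan has no notion of which layer of the file a statement belongs to, 
so a statement about the profile looks the same as one about the image.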

Concretely, I do think that you have spotted an issue with an image 
containing non-free code, and I recommend that you report or fix it.  
When I discover images containing embedded non-free ICC data, I report 
it.  (In my understanding, "non-free" includes data that is explicitly 
copyright protected without explicit licensing, due to the 
[Berne Convention] - which, in my understanding, is the very reason we 
in Debian track not only licensing but also copyright statements.)  So 
far I have been met with understanding and appreciation by upstream 
projects, who have generally (if not in all cases) chosen to replace or 
strip ICC profiles from their graphics files - and several of them 
asked in particular how I discovered it, and were happy to learn about 
the powerful exiftool.
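For readers without exiftool at hand, the presence of an embedded ICC 
profile in a JPEG can be checked by walking the marker segments and 
looking for an APP2 segment that starts with the "ICC_PROFILE" 
identifier.  The following Python sketch does only that; the function and 
helper names are hypothetical, the sample data is synthetic, and it 
ignores corner cases such as fill bytes between segments.

```python
import struct

APP2 = 0xE2                      # marker used for embedded ICC profiles
ICC_IDENT = b"ICC_PROFILE\x00"   # identifier at the start of the payload


def make_segment(marker: int, payload: bytes) -> bytes:
    """Hypothetical helper: build one JPEG marker segment for testing."""
    # The 2-byte big-endian length field counts itself plus the payload.
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


def jpeg_has_icc_profile(data: bytes) -> bool:
    """Return True if a JPEG byte stream carries an ICC profile segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break                        # lost sync; give up in this sketch
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):       # start of scan or end of image
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == APP2 and payload.startswith(ICC_IDENT):
            return True
        pos += 2 + length
    return False


# Synthetic streams: header segments only, no actual image data.
with_icc = (
    b"\xff\xd8"
    + make_segment(0xE0, b"JFIF\x00")
    + make_segment(APP2, ICC_IDENT + b"\x01\x01" + b"fake-profile")
    + b"\xff\xd9"
)
without_icc = b"\xff\xd8" + make_segment(0xE0, b"JFIF\x00") + b"\xff\xd9"

print(jpeg_has_icc_profile(with_icc), jpeg_has_icc_profile(without_icc))
```

This only tells you that a profile is present; reading its copyright tag 
and deciding whether to strip or replace it is still best left to a tool 
like exiftool.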

Just yesterday I wrote down in the TODO file for licensecheck (but have 
not yet added that edit to git) that it would be nice if a set of 
"qualities" were expressed, besides the concrete task of finding 
copyright and licensing statements.  It was inspired by what is 
currently the only "side note" tracked - "(with wrong address)" - which 
is presented only in default output (it really should be added as a 
Comment when generating DEP-5 output), but it fits well with this 
example too.

Here is the full list I wrote down:

 * Quality flagging
   + ambiguous: license ref pointing to multiple license fulltexts
     (e.g. "MIT" or "GNU" or "GPL")
   + unlicensed: copyright holder(s) but no licensing
   + ungranted: license fullref requiring explicit grant,
     but no corresponding license grant
   + incomplete: fractions of license fullref, but no complete fullref
   + alien: license label but no license name
   + unowned: license but no copyright holder
   + uncertain: license ref and more unknown text
     in same sentence/paragraph/section
   + buried: license or copyright not at top of file
   + unstructured: license/copyright not at ideal place of data structure
     (e.g. in comment field of EXIF data, or in content of PDF/HTML)
   + unaligned: license/copyright out of sync between layers of structure
     (e.g. ICC data and EXIF data of PNG, or content and metadata of PDF/HTML)
   + imperfect: license ref not following format documented in license fulltext
   + conflict: incompatible licenses
     (e.g. GPL-3+ and GPL-2-only, or OpenSSL and GPL)

The example you present here would ideally (continue to report HP as 
copyright holder - and more reliably so, but that's a separate issue - 
and) be flagged as "unlicensed", "buried" and "unaligned".
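To make the idea concrete, here is a minimal Python sketch of how a few 
of those qualities could be derived from scan results.  Nothing here 
exists in licensecheck: the `Finding` structure, the layer names, the 
byte offset and the 1024-byte "top of file" threshold are all assumptions 
chosen for illustration, and only four of the flags are modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Quality(Enum):
    UNLICENSED = "unlicensed"   # copyright holder(s) but no licensing
    UNOWNED = "unowned"         # license but no copyright holder
    BURIED = "buried"           # statement not at top of file
    UNALIGNED = "unaligned"     # layers of the file disagree


@dataclass
class Finding:
    """Hypothetical summary of what a scanner found in one file."""
    holders_by_layer: dict[str, set[str]]  # e.g. {"icc": {...}, "exif": set()}
    licenses: set[str]
    first_match_offset: int                # byte offset of earliest statement


def quality_flags(f: Finding, top_of_file: int = 1024) -> set[Quality]:
    """Derive side-note flags from a finding (assumed rules, see lead-in)."""
    flags: set[Quality] = set()
    holders = set().union(*f.holders_by_layer.values())
    if holders and not f.licenses:
        flags.add(Quality.UNLICENSED)
    if f.licenses and not holders:
        flags.add(Quality.UNOWNED)
    if (holders or f.licenses) and f.first_match_offset >= top_of_file:
        flags.add(Quality.BURIED)
    # Layers are "out of sync" when they do not all name the same holders.
    if len({frozenset(h) for h in f.holders_by_layer.values()}) > 1:
        flags.add(Quality.UNALIGNED)
    return flags


# The reported image: holder only in the ICC layer, no license anywhere.
# The offset 4096 is an invented value for the example.
hp_example = Finding(
    holders_by_layer={"icc": {"Hewlett-Packard Company"}, "exif": set()},
    licenses=set(),
    first_match_offset=4096,
)
flags = quality_flags(hp_example)
print(sorted(q.value for q in flags))
```

Under these assumed rules the reported image comes out as "buried", 
"unaligned" and "unlicensed", matching the flags named above.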

Does that make sense?  Would you agree to turn this bug report into a 
wishlist reminder for making that side-note spiffy-ness happen?


Kind regards,

 - Jonas

[Berne Convention]: https://en.wikipedia.org/wiki/Copyright#International_copyright_treaties

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private


More information about the pkg-perl-maintainers mailing list