Bug#788864: python-debian: License field in files paragraph should be required not optional

Stuart Prescott stuart at debian.org
Sun Feb 18 04:59:38 UTC 2018


Control: tags -1 + patch

Dear Orestis, python-debian maintainers & debsources folks,

> Debian standard [1] suggests that the license field in the files
> paragraph is required whereas when you parse it is only optional.
> 
> This is sometimes causing a trouble when using the package since
> the user has to verify that the license object in the files
> paragraph is not None and thus raising AttributeError when
> accessing the synopsis for example.
> 
> I guess the solution must not be to consider the d/copyright file
> as non machine readable but you might want to omit that specific
> paragraph and log this error.

I think this is a good suggestion, and looking at the use of 
`debian.copyright.Copyright` in debsources, I can see that they have had to do 
a similar dance to what you describe.

https://salsa.debian.org/qa/debsources/blob/master/lib/debsources/license_helper.py#L105
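
For anyone who hasn't hit this, the dance looks roughly like the sketch below. 
It uses stand-in namedtuples (`License`, `FilesParagraph`) and a hypothetical 
helper `license_synopses` so that it runs on its own; it is not the debsources 
code and not the python-debian API, only an illustration of the guard that 
callers currently need.

```python
from collections import namedtuple

# Stand-ins for illustration only; these are NOT the python-debian classes.
License = namedtuple("License", ["synopsis", "text"])
FilesParagraph = namedtuple("FilesParagraph", ["files", "license"])


def license_synopses(paragraphs):
    """Collect license synopses, skipping paragraphs with no License field."""
    synopses = []
    for para in paragraphs:
        # Without this guard, para.license.synopsis raises AttributeError
        # whenever the License field was missing and parsed as None.
        if para.license is None:
            continue
        synopses.append(para.license.synopsis)
    return synopses
```

Every caller that touches `synopsis` has to repeat that `None` check, which is 
the burden the bug report describes.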

I've taken a first cut at making the `Copyright` reader more strict with 
regard to required fields, but I would like some feedback from users of the 
`Copyright` class before merging it.

https://salsa.debian.org/python-debian-team/python-debian/merge_requests/1

In particular:

* It introduces a `MachineReadableFormatError` which is used for format 
errors; I think it's worth distinguishing between an error in the format and 
the copyright file not being in the format at all (`NotMachineReadableError`). 
`MachineReadableFormatError` is derived from `ValueError` which I think makes 
sense.

* I've changed other uses of `ValueError` within `Copyright` to use 
`MachineReadableFormatError` to be consistent (this should be backwards 
compatible, since existing `except ValueError` handlers still catch the 
subclass).

* As suggested within comments already in the code, I've allowed a 
`strict=False` mode which continues to use Python warnings rather than raising 
exceptions.

* The comments in the code also talked about treating the http and https 
versions of the copyright spec as being the same; the spec has since been 
changed to explicitly say that both are OK but that the https URL is preferred 
and so the code will silently upgrade from http to https too.
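
As a rough illustration of the points above (not the code in the merge 
request), the sketch below shows an exception derived from `ValueError`, a 
`strict` switch that chooses between raising and warning, and the http to 
https upgrade. The helper names `check_files_paragraph` and 
`upgrade_format_url` are hypothetical, and the base class given to 
`NotMachineReadableError` here is an assumption made for the example.

```python
import warnings

# Preferred form of the format URL from the copyright-format 1.0 spec.
FORMAT_URL = "https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"


class NotMachineReadableError(Exception):
    """The file is not in the machine-readable format at all.

    Base class is an assumption for this sketch.
    """


class MachineReadableFormatError(ValueError):
    """The file claims to use the format but violates it."""


def check_files_paragraph(fields, strict=True):
    """Hypothetical check: return True if all required fields are present."""
    required = ("Files", "Copyright", "License")
    missing = [name for name in required if name not in fields]
    if not missing:
        return True
    msg = "Files paragraph is missing required field(s): " + ", ".join(missing)
    if strict:
        raise MachineReadableFormatError(msg)
    # Non-strict mode: report the problem but let the caller carry on.
    warnings.warn(msg)
    return False


def upgrade_format_url(url):
    """Hypothetical helper: silently rewrite an http:// URL to https://."""
    prefix = "http://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url
```

Because `MachineReadableFormatError` subclasses `ValueError`, a caller with an 
existing `except ValueError:` block would keep working, while new code can 
catch the narrower type.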

By throwing an exception as soon as an error is found, this becomes a bit of 
an all-or-nothing approach. Would a better approach be incremental 
validation, where as much as possible is read and each stanza carries a 
`valid` attribute that propagates to the whole `Copyright` instance? Users 
would then check that the file is valid after reading it in rather than using 
exception handling.
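
To make the alternative concrete, here is a minimal sketch of what 
incremental validation could look like. Everything in it (`Stanza`, 
`ParsedCopyright`, `read_stanzas`, the `problems` list) is hypothetical and 
exists only to show the shape of the idea; none of it is in python-debian.

```python
from dataclasses import dataclass, field

REQUIRED_FILES_FIELDS = ("Files", "Copyright", "License")


@dataclass
class Stanza:
    """Hypothetical stanza that records problems instead of raising."""
    fields: dict
    problems: list = field(default_factory=list)

    @property
    def valid(self):
        return not self.problems


@dataclass
class ParsedCopyright:
    """Hypothetical container; valid only if every stanza is valid."""
    stanzas: list

    @property
    def valid(self):
        return all(stanza.valid for stanza in self.stanzas)


def read_stanzas(raw_stanzas):
    """Read every stanza, noting missing required fields along the way."""
    stanzas = []
    for raw in raw_stanzas:
        stanza = Stanza(fields=dict(raw))
        for name in REQUIRED_FILES_FIELDS:
            if name not in raw:
                stanza.problems.append("missing required field: " + name)
        stanzas.append(stanza)
    return ParsedCopyright(stanzas)
```

A caller would then test `parsed.valid` once after reading, and could still 
use the stanzas that are individually valid when the file as a whole is not.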

Comments, please! (Either to this bug or to the MR on salsa.)

thanks
Stuart


-- 
Stuart Prescott    http://www.nanonanonano.net/   stuart at nanonanonano.net
Debian Developer   http://www.debian.org/         stuart at debian.org
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7
