Bug#989912: libregexp-pattern-license-perl: No deterministic results are provided

Walter Lozano wlozano at collabora.com
Tue Jun 22 17:56:05 BST 2021


Hi Jonas,

On 6/22/21 5:22 AM, Jonas Smedegaard wrote:
> Quoting Walter Lozano (2021-06-16 14:44:01)
>> On 6/16/21 2:50 AM, Jonas Smedegaard wrote:
>>> Quoting Walter Lozano (2021-06-16 04:12:23)
>>>> On 6/15/21 9:17 PM, Jonas Smedegaard wrote:
>>>>> Quoting Walter Lozano (2021-06-15 20:42:53)
>>>>>> As a user of licensecheck I found it does not provide
>>>>>> deterministic results in some circumstances. An example of this
>>>>>> is gnutls28/m4/ax_code_coverage.m4, which is detected as either
>>>>>> UNKNOWN or LGPL.
>>>>>>
>>>>>> After some debugging I found that the root cause could be in
>>>>>> libregexp-pattern-license-perl. I have proposed a fix, which you
>>>>>> can find at
>>>>>>
>>>>>> https://salsa.debian.org/perl-team/modules/packages/libregexp-pattern-license-perl/-/merge_requests/1
>>>>>>
>>>>>> I hope you can help me to clarify this issue.
>>>>> Great - thanks a lot!
>>>>>
>>>>> I suspect that this might be bug#982849.
>>>> Yes, it looks like exactly the same issue I faced. I hope you can
>>>> confirm and fix it.
>>> I will certainly do that.
>> In relation to this, I find that the problem is more evident at least
>> after these commits, which are related to versioning
>>
>>    * eddc64dd1f0e6f9bd1769ef580a217aa4be762b8 (synthesize subject pattern
>>      name: optimize version matching)
>>    * cd75d77da201260bc9deef4631d5c4d3a42fa41d (add license patterns
>>      lgpl_2 lgpl-2_1 lgpl-3)
>>
>> I hope this information is useful.
> Thanks.  You are right that those commits are directly related to the
> issue - but not the cause, it turned out:
>
> At build-time, the library composes regular expressions from metadata
> (what I call "synthesizing").  If done right, the order of stepping
> through and synthesizing objects should not matter - but the
> synthesizing logic was buggy in three places:
>
> a) Synthesizing metadata from a single-version object (e.g. "lgpl_2_1")
> as regex patterns in a versioned object (e.g. "lgpl") cannot happen in
> fully random order, but must wait until after the single-version object
> has been synthesized.
> Now fixed in commit 2ec7af9eb0fdf72711eeb2689a6726b5ff30f82d
>
> b) Only a subset of the metadata from a single-version object was
> synthesized.
> Now fixed in commit bfd071032a88fd2d56e20b3a7ef092524dc3491a
>
> With those two underlying bugs fixed, the library should now build its
> DefHash structure deterministically.
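
A minimal sketch of the ordering problem in (a), written in Python
rather than the library's Perl. Everything in it is an assumption made
for illustration: the OBJECTS table, the "uses" field and both function
names are hypothetical and do not mirror the real DefHash layout. The
naive variant only uses whatever happens to be synthesized already, so
its result depends on iteration order; the ordered variant synthesizes
each dependency first.

```python
import random

# Hypothetical metadata: the versioned object "lgpl" reuses the patterns
# of its single-version objects. Not the library's real data.
OBJECTS = {
    "lgpl": {"base": r"LGPL", "uses": ["lgpl_2_1", "lgpl_3"]},
    "lgpl_2_1": {"base": r"LGPL version 2\.1", "uses": []},
    "lgpl_3": {"base": r"LGPL version 3", "uses": []},
}


def synthesize_naive(order):
    """Build patterns in the given order, using only what is ready so far."""
    done = {}
    for key in order:
        parts = [done[dep] for dep in OBJECTS[key]["uses"] if dep in done]
        done[key] = "|".join([*parts, OBJECTS[key]["base"]])
    return done


def synthesize_ordered(order):
    """Build patterns so that every dependency is synthesized first."""
    done = {}

    def build(key):
        if key not in done:
            parts = [build(dep) for dep in OBJECTS[key]["uses"]]
            done[key] = "|".join([*parts, OBJECTS[key]["base"]])
        return done[key]

    for key in order:
        build(key)
    return done


# Shuffle to imitate an unstable iteration order between builds.
keys = list(OBJECTS)
random.shuffle(keys)
print("naive:  ", synthesize_naive(keys)["lgpl"])
print("ordered:", synthesize_ordered(keys)["lgpl"])
```

Running this a few times should show the naive pattern for "lgpl"
changing with the shuffle while the ordered one stays the same.
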
>
> ...but the structure now has richer versioned objects, which revealed
> another bug in Licensecheck:
>
> Licensecheck looks for more specific objects first - first
> single-version objects with optional trailer (e.g. "lgpl_2_1" +
> "version 2.1" + "or any later"), and then versioned objects with
> optional trailer (e.g. "lgpl" + "version 2.1" + "or any later").
>
> Notice the bug?  For single-version objects it should skip the version
> part of a trailer (i.e. only e.g. "lgpl_2_1" + "or any later").
>
> So Licensecheck would fail to detect "or later" for single-version
> objects because it bogusly looked for a double version, and would then
> succeed in detecting "or later" with the more general versioned object -
> but only as long as that object was crippled to miss the version on its
> own, so that the version was part of the trailer.
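
The double-version lookup described above can be sketched as follows,
again in Python and only as an illustration. The regex fragments, the
detect() function, its parameters and the returned labels are all
hypothetical simplifications, not Licensecheck's real code or patterns,
and the "rich" versioned pattern is my guess at what the fixed
synthesis produces.

```python
import re

# Simplified stand-in fragments, not the library's real patterns.
NAME = r"GNU Lesser General Public License"
VERSION = r",? version 2\.1"
OR_LATER = r",? or any later version"

SINGLE = NAME + VERSION          # stands in for object "lgpl_2_1"
VERSIONED_BARE = NAME            # "lgpl" missing the version on its own
VERSIONED_RICH = NAME + VERSION  # "lgpl" after richer synthesis (assumed)


def detect(text, versioned, skip_version_for_single):
    """Return a license label, trying the most specific pattern first."""
    # The bug: the trailer for a single-version object repeated the version.
    single_trailer = OR_LATER if skip_version_for_single else VERSION + OR_LATER
    if re.search(SINGLE + single_trailer, text):
        return "LGPL-2.1+"
    # Fallback: the more general versioned object plus the full trailer.
    if re.search(versioned + VERSION + OR_LATER, text):
        return "LGPL-2.1+"
    if re.search(SINGLE, text):
        return "LGPL-2.1"
    return "UNKNOWN"


TEXT = "GNU Lesser General Public License, version 2.1 or any later version"

# Buggy lookup, bare versioned object: the fallback hides the bug.
print(detect(TEXT, VERSIONED_BARE, skip_version_for_single=False))
# Buggy lookup, rich versioned object: "or later" is lost.
print(detect(TEXT, VERSIONED_RICH, skip_version_for_single=False))
# Lookup that skips the version part for single-version objects.
print(detect(TEXT, VERSIONED_RICH, skip_version_for_single=True))
```

In this toy version the buggy lookup only finds "or later" while the
versioned object lacks its own version, which matches my reading of
the explanation above.
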
>
> If you are still with me in all this (I am not good at describing this,
> I realize that), you can imagine how frustrated I have been trying to
> figure out what was really failing - until you pointed out the one place
> where I could make the build-time (still wrong but at least)
> deterministic.
>
> Thanks a lot!

Thank you for your detailed explanation. I cannot follow it completely,
but I can follow the high-level idea. I was completely sure that the
issue was related to how license versioning was handled, but my limited
experience with Perl and with these particular modules makes it
impossible for me to go deeper. So I set myself the goal of at least
filing a bug report that would be really useful to you and provide a
basis for your investigation, mainly by pointing to what was most
evident to me: the non-deterministic output.

I'm really happy that this report was helpful.

> New releases went out upstream to CPAN last night, and I expect to
> release packages for Debian today.  Unfortunately too late to be
> included with the upcoming Bullseye release of Debian.
>
Thanks again!

Regards,

Walter


