Bug#867302: licensecheck: incorrectly parses multi-line copyright notices

Jonas Smedegaard jonas at jones.dk
Wed Jul 5 22:13:43 UTC 2017


Quoting Ximin Luo (2017-07-05 21:17:00)
> Jonas Smedegaard:
> > [..]
> > 
> > Have a look (if interested) at /usr/share/perl5/String/Copyright.pm 
> > and in particular the (huge when expanded) $signs_and_more_re at 
> > line 138.
> > 
> > [..]
> 
> Thanks for the tips! I'm not sure if you got my other follow-ups to 
> the bug report - I did in fact find String::Copyright, but I didn't 
> know about the history nor plans for it, so thanks for filling me in 
> on that.
> 
> At any rate, here is an updated version of my patch, along with some 
> test cases for Sage's copyright notices.
> 
> I did try to think of a way to achieve the same logic *inside* the 
> massive $re regexes. However I don't think this is possible, at least 
> with my current approach - which tries to be conservative in order to 
> adapt to humans being annoyingly inconsistent.
> 
> What it does is, it joins subsequent lines only when the indent is 
> greater than the main line (with the "Copyright" part). This means I 
> have to call length() in an expression-replacement, which I don't 
> think is possible to do inside a normal regex...
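
[Editorial sketch, not part of the patch discussed in this thread: the 
indent rule quoted above can be illustrated roughly as follows.  This is 
Python rather than the Perl of String::Copyright, and the names 
COPYRIGHT_RE, indent_of and join_continuations are hypothetical.  The 
pattern used to spot a "main" copyright line is a deliberately crude 
assumption, and indentation is counted in characters, so tabs and spaces 
are not normalised.]

```python
import re

# Assumption: a "main" line is any line mentioning copyright, (c) or the
# copyright sign. The real String::Copyright patterns are far more detailed.
COPYRIGHT_RE = re.compile(r"copyright|\(c\)|©", re.IGNORECASE)


def indent_of(line):
    """Return the number of leading whitespace characters of a line."""
    return len(line) - len(line.lstrip())


def join_continuations(text):
    """Join lines onto a preceding copyright line only when they are
    indented deeper than that copyright line."""
    out = []
    main_indent = None  # indent of the current copyright line, if any
    for line in text.splitlines():
        if not line.strip():
            # A blank line always ends the current notice.
            main_indent = None
            out.append(line)
        elif main_indent is not None and indent_of(line) > main_indent:
            # Deeper indent than the main line: treat as a continuation.
            out[-1] = out[-1].rstrip() + " " + line.strip()
        else:
            # Same or shallower indent: start over, possibly with a new
            # main line if this one looks like a copyright statement.
            main_indent = indent_of(line) if COPYRIGHT_RE.search(line) else None
            out.append(line)
    return "\n".join(out)
```

[The comparison against the stored indent is the length() step mentioned 
in the quoted paragraph; here it is ordinary code rather than part of a 
regex.  Lines at the same indent as the copyright line are left alone, 
which is the conservative behaviour described.]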

I did see your other emails, but only after I posted my initial reply (I 
am slow at writing emails).

I have now published App::Licensecheck 3.0.30 to CPAN, and if it 
survives CPANtesters inspections then I will release that to Debian.  
That release does not fix the topic of this bugreport, but it does fix a 
bug where String::Copyright expects plain text as input but was passed 
text with comment markers by App::Licensecheck.  That seems to be what 
complicates your patch, so I will ask you to please try again with that 
newer App::Licensecheck to see how much you can reduce the patch.

If you want to try with the 3.0.30 release before it gets packaged for 
Debian, you can do it like this:

  sudo apt install cpanminus
  cpanm App::Licensecheck
  export PATH="$HOME/perl5/bin:$PATH"
  export PERL5LIB="$HOME/perl5/lib/perl5"

...and when done exploring (assuming you want _any_ local CPAN gone):

  rm -rf ~/perl5 ~/.cpanm

NB! It is easiest for me if you file a new bugreport for each separate 
issue - e.g. the one of not matching double-dashed year ranges.  Fine if 
you work on a patch that addresses multiple issues, but still safer to 
report the issues separately, so that I don't accidentally miss fixing 
some of it, e.g. if I choose to resolve things differently than with 
your tested patch.


> As for speed:
> 
> # with the patch
> $ time debian/rules debian/licensecheck.copyright
> licensecheck -l250 -i ^sage/build/ -r --deb-machine --merge-licenses sage > "debian/licensecheck.copyright"
> 
> real    0m35.318s
> user    0m35.204s
> sys     0m0.056s
> 
> # without the patch
> $ time debian/rules debian/licensecheck.copyright
> licensecheck -l250 -i ^sage/build/ -r --deb-machine --merge-licenses sage > "debian/licensecheck.copyright"
> 
> real    0m31.168s
> user    0m31.040s
> sys     0m0.076s

Thanks :-)

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private