[Piuparts-devel] Bug#698526: Bug#698526: Sort known issues by reverse dependency count

Thu Feb 21 14:55:41 UTC 2013

On Thu, Feb 21, 2013 at 4:24 AM, Andreas Beckmann <anbe at debian.org> wrote:
>
> this work looks really promising and I'm curious to try it some day on
> my instance.
>
> But as I wrote before there is no need to reimplement the .tpl
> generation in python. Instead these intermediate files should go away
> and the html generation should be moved directly into piuparts-report.
> There will be a package db available.
> I think this "requirement" to generate .tpl externally dates back to the
> time when all logfiles were grepped daily, i.e. before we remembered the
> results in .kpr.
>

I took the least invasive path: first mimicking detect_well_known_errors,
then sorting by rdep, then eliminating linktarget_by_template (rdep
sorting was the only original goal). I agree that the .tpl files are
obsolete, but removing them wasn't an overriding goal for me, and it
isn't necessary to get the issue logic out of piuparts-report. There's
no significant performance issue.

> Even if .kpr generation can be sped up significantly, I don't think I
> want to run this from inside piuparts-report. Just like piuparts-analyze
> (that takes 30-60 minutes for my instance) this is something that will
> continue to be run from the generate-piuparts-report driver script ...
> and having it sped up by an order of magnitude will decrease my hesitation to run
> it with --recheck-all.

OK. A minimally invasive fix would be to add a 'skip kpr creation'
option, used inside piuparts-report, and re-introduce
detect_well_known_errors, which imports known_problems. Interested?

> Also if the .tpl files are gone, we can actually run piuparts-report
> without running piuparts-analyze or detect_well_known_errors directly
> before it.

The above would have the same net effect.

> And about speeding up the "grepping" - wouldn't it be even faster if we
> can run multiple regexes at the same time on the input - either by
> 'ORing' them together or passing a list to re or ... then we would just
> need to figure out which one has matched ... (No, I haven't tried
> anything like this, but I'm considering testing this with the multiple
> grep calls in detect_piuparts_issues.)
>   grep -lE '(foo)|(bar)|(f[o0]{2}bar|baz)'
> should be significantly faster than
>   grep -l foo
>   grep -l bar
>   grep -lE 'f[o0]{2}bar|baz'
> And there we only care about 'any match' disregarding which matched.
> Or am I mistaken here?
>

Interesting idea. I'll give it a try.
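
For reference, a minimal sketch of the 'ORing' idea on the Python side,
assuming each known problem is a single regex with no backreferences.
The names PATTERNS, compile_combined, any_match and first_match_name are
hypothetical and not taken from piuparts; the patterns are the ones from
the grep example above. Wrapping each pattern in a named group lets a
single search report which alternative matched:

```python
import re

# Hypothetical problem names mapped to the example patterns quoted above.
# The names must be valid Python identifiers to be usable as group names.
PATTERNS = {
    "foo": r"foo",
    "bar": r"bar",
    "foobar_or_baz": r"f[o0]{2}bar|baz",
}


def compile_combined(patterns):
    """OR all patterns into one regex, one named group per pattern."""
    combined = "|".join(
        "(?P<%s>%s)" % (name, pattern) for name, pattern in patterns.items()
    )
    return re.compile(combined)


def any_match(combined, text):
    """Return True if at least one of the patterns matches the text."""
    return combined.search(text) is not None


def first_match_name(combined, text):
    """Return the name of the pattern behind the leftmost match, or None."""
    match = combined.search(text)
    return match.lastgroup if match else None


if __name__ == "__main__":
    combined = compile_combined(PATTERNS)
    for line in ("xx bar yy", "f00bar", "nothing to see"):
        print(line, "->", first_match_name(combined, line))
```

A single search only reports the leftmost match, so a log that matches
several problems would still need one search per pattern (or a scan with
finditer, which can miss overlapping matches). Whether this is actually
faster than separate searches is untested here.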