[Piuparts-devel] setting priority of sections

Andreas Beckmann debian at abeckmann.de
Sat Nov 26 19:24:38 UTC 2011


On 2011-11-26 19:52, Dave Steele wrote:
> On Sat, Nov 26, 2011 at 9:33 AM, Holger Levsen <holger at layer-acht.org> wrote:

>>> 1) There is much downloading/parsing of the Packages file, and
>>> recalculating package states going on. Both are expensive.
>>
>> For the first problem, we should really fix proxy support :) (It's broken
>> currently, try it... not sure if there is a bug.)

I have a local mirror, so I don't care ... but for squeeze-security I'd
like to be able to use a proxy to not put excessive load on Debian
servers ...
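
(Just to illustrate what I mean: if the Packages download went through
urllib, it would already honour http_proxy, so a local caching proxy could
absorb the repeated fetches. The URL and proxy address below are only
examples, not what piuparts actually does.)

  import os
  import urllib.request

  # Assumed local caching proxy, e.g. squid or apt-cacher-ng on port 3128.
  os.environ.setdefault("http_proxy", "http://localhost:3128")

  url = ("http://security.debian.org/dists/squeeze/updates/"
         "main/binary-amd64/Packages.gz")
  with urllib.request.urlopen(url) as response:      # picks up http_proxy
      data = response.read()
  print("fetched %d bytes via the proxy" % len(data))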

>> For the second problem, well, it's somewhat inevitable, and I also don't
>> think those 1-2 minutes hurt that much.

And with more complex dependencies between sections (see below) I don't
know how this could be cached.

>>> 2) The master process doesn't like package state changes it didn't
>>> cause. There are opportunities for race conditions with multiple
>>> slaves/master processes, which would need to be worked out.
>>
>> There should always only be one master...
> 
> This is not enforced. If you have two slaves, you can have two active
> masters working in the same section tree. I assumed that is why there
> is a shuffle() call in reserve_package().

There is one master directory on one master host, but several concurrent
piuparts-master processes may be running on it, driven by different
slaves ... and all writing to the same log file ==> BUG/RACE!
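
(For illustration only: one way to serialize such concurrent writers would
be an advisory lock around each append. File name and message format are
made up; this is not how piuparts-master works today.)

  # Sketch: let several processes append to a shared log file without
  # interleaving their lines, using a POSIX advisory lock.
  import fcntl
  import os
  import time

  def append_log_line(path, message):
      with open(path, "a") as log:
          fcntl.flock(log, fcntl.LOCK_EX)   # block until we own the file
          log.write("%s pid=%d %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"),
                                        os.getpid(), message))
          log.flush()
          fcntl.flock(log, fcntl.LOCK_UN)

  append_log_line("master.log", "reserved some package")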

> Bug? I don't think so, currently. Reserve() handles races in creating
> the reserve log file.

correct.
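
For reference, the race-free part boils down to exclusive file creation:
whichever process creates the reserved log first wins. A minimal sketch of
the idea (directory layout and names are made up, this is not the actual
Reserve() code):

  import errno
  import os

  def try_reserve(reserved_dir, package, version):
      # Atomically claim a package by creating its reserved log; O_EXCL
      # guarantees that only one of the racing processes succeeds.
      path = os.path.join(reserved_dir, "%s_%s.log" % (package, version))
      try:
          fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
      except OSError as e:
          if e.errno == errno.EEXIST:
              return None        # somebody else reserved it first
          raise
      os.close(fd)
      return path

  os.makedirs("reserved", exist_ok=True)
  print(try_reserve("reserved", "hello", "2.8-1") or "already reserved")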

>> How do you cause rescheduling? The best way is really to just delete the log.
>>
> 
> That is essentially what the patch is doing. The trick is to do that
> automatically, while maintaining a quick response to new submissions.
> If master could provide visibility into the size of his prioritized
> work queues, a fairly efficient compromise could be implemented in a
> cron job (delete log files to get the queue up to a threshold number
> of entries).

My patch in the branch
  feature/master/state-command
allows accessing the package counts on master.
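
With those counts exposed, the cron job you describe could be as simple as
the following sketch (how the queue length is obtained, the directory
layout and the threshold are all assumptions, just to show the shape of it):

  # Hypothetical cron job: top up the work queue by deleting the oldest
  # fail logs until a threshold is reached.  Deleting a log makes master
  # consider the package untested again on its next run.
  import glob
  import os

  THRESHOLD = 500                  # desired minimum queue length (made up)

  def queue_length():
      # Stand-in: in reality this would come from master, e.g. via the
      # state-command patch mentioned above.
      return 150

  fail_logs = glob.glob(os.path.join("sid", "fail", "*.log"))
  fail_logs.sort(key=os.path.getmtime)          # oldest first
  for path in fail_logs[:max(THRESHOLD - queue_length(), 0)]:
      os.unlink(path)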

> It's just something to think about as you work on adding priority
> support. The best bang-for-the-buck may come from a different angle of
> attack, such as tailored/schedule-able slaves, or a persistent master
> daemon.

> The Piuparts sid statistics over the last couple of days show piatti
> running way below the '3 day sid' benchmark mentioned in the wiki. The
> summary suggests that it would take more like 3 months to work off the
> "broken symlink" warnings. Do we know why it is taking so long to
> rerun the failed queue?

Are there logs available from which we could get the number of packages
processed per day and per section, as well as the idle time?
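
If nothing better is available, even the timestamps of the pass/fail logs
would give a rough figure, along these lines (the section/pass and
section/fail layout is an assumption):

  # Rough sketch: count finished logs per day and per section, going by
  # the files' modification times.
  import collections
  import datetime
  import glob
  import os

  counts = collections.Counter()
  for path in glob.glob("*/pass/*.log") + glob.glob("*/fail/*.log"):
      section = path.split(os.sep)[0]
      day = datetime.date.fromtimestamp(os.path.getmtime(path))
      counts[(section, day)] += 1

  for (section, day), n in sorted(counts.items()):
      print("%s %s %d" % (day, section, n))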

piuparts-master already takes 250-350 MB of RAM for a single section;
you don't want to keep multiple sections persistent :-)
(piuparts-report takes > 3 GB of RAM in my setup)

What I have running locally is a configuration with e.g. sid/main,
sid/non-free and sid/contrib, where contrib and non-free form some kind of
circular dependency and both depend on main. Of course each package is only
tested once, but a change in one section may propagate to other sections
by enabling more packages to be tested, so I'm not sure how caching could
help (although the file system cache probably helps a lot for queries like
os.path.exists(...)). It takes about 5 minutes to cycle through
(experimental, sid, wheezy, squeeze) x (main, contrib, non-free) to
download the Packages file(s) (from a local mirror), parse them and compute
the package states. (For sid/non-free you need the Packages files of
sid/main and sid/contrib, too - and for experimental/non-free even more.)
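
To make that last point concrete: computing states for sid/non-free means
loading and merging several Packages files first. A minimal sketch of the
idea, with example mirror paths and a deliberately simplistic parser (not
piuparts internals):

  import gzip

  def parse_packages(path):
      # Collect a minimal name -> Depends map from a Packages.gz file.
      packages = {}
      name, depends = None, ""
      with gzip.open(path, "rt", errors="replace") as f:
          for line in f:
              if line.startswith("Package:"):
                  name = line.split(":", 1)[1].strip()
              elif line.startswith("Depends:"):
                  depends = line.split(":", 1)[1].strip()
              elif not line.strip() and name:
                  packages[name] = depends
                  name, depends = None, ""
      if name:
          packages[name] = depends
      return packages

  # sid/non-free can only be resolved together with main and contrib.
  merged = {}
  for component in ("main", "contrib", "non-free"):
      merged.update(parse_packages(
          "/srv/mirror/debian/dists/sid/%s/binary-amd64/Packages.gz" % component))
  print("%d packages known for dependency resolution" % len(merged))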


Andreas


