[Reproducible-builds] Patch V2 for build nodes pools

Mon Dec 21 17:39:08 UTC 2015

On 2015-12-21, Holger Levsen wrote:
> On Samstag, 19. Dezember 2015, Vagrant Cascadian wrote:
>> I didn't spend any time really figuring out which nodes to add to the
>> example 16th build job, so that might need some adjusting.
>
> put some 4cores in one pool, and 2cores in another?

It could be done any number of ways; I merely added it to show how it
would work with the code I was proposing. I was hoping the pool code
could be ready enough to use with the new nodes that should be coming by
the end of the year, and that they'd be reasonable first tests...

>> - Split load estimating into its own script, and add support for
>> available memory.
>
> I'd still suggest to measure the load constantly by a job outside the build
> script… (then it's also easy to read "not updated node load since $time" as
> "node is too busy to be scheduled on"…)
>
>> - Call timeout so that the ssh processes don't take too long to complete.
>
> see above, don't ssh from the build script please.

Implementing that outside of the build script would make this much more
complicated...

The second build needs to check the load right before it is run.
Checking when build_rebuild is called doesn't make sense, as the load of
the machines is likely to change between the first and second build
(unless you run both build1 and build2 in parallel... but that's a whole
different proposal).

I'm not sure how to do all that outside the build script and keep the
code reasonably simple.
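
For comparison, the build-script side of the externally measured variant
might be sketched roughly as below. Everything here is an assumption for
illustration: the LOAD_DIR layout, the MAX_AGE threshold and the
read_cached_load name are hypothetical, and the external job that writes
the files (which is where the extra complexity would live) is left out.

```shell
# Hypothetical layout: an external job writes each node's load to
# $LOAD_DIR/<node>; none of these names come from the existing scripts.
LOAD_DIR="${LOAD_DIR:-/var/tmp/node-load}"
MAX_AGE="${MAX_AGE:-600}"	# assumed: seconds before a reading is stale

# Print the cached load for a node, or fail if the reading is missing or
# stale (a stale reading is treated as "node too busy to be scheduled on").
read_cached_load() {
	local file="$LOAD_DIR/$1" now mtime
	[ -r "$file" ] || return 1
	now=$(date +%s)
	# stat -c %Y is the GNU coreutils way to get the mtime in seconds.
	mtime=$(stat -c %Y "$file") || return 1
	[ $((now - mtime)) -le "$MAX_AGE" ] || return 1
	cat "$file"
}
```

The second build would still have to call this right before it starts,
so the timing concern above applies either way.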

What's the primary concern with ssh from within the build script? Taking
too long to get a response?

>> diff --git a/bin/reproducible_build.sh b/bin/reproducible_build.sh
>
> I'll only comment on the most "pressing" issues now.
>
>>  build_rebuild() {
>>  	FTBFS=1
>>  	mkdir b1 b2
>> +	local selected_node
>> +	selected_node=$(select_least_loaded_node $NODE1_POOL)
>
> please make this somehow conditional so that this code path is not used for 
> "normal operation" (=without this new pooling), so we can test this easily on 
> one builder job, but not on all.

It basically is conditional in that the select_least_loaded_node
function simply returns the node if only one argument is passed.
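
A minimal sketch of that behaviour, not the actual patch: get_node_load
is a hypothetical probe, the 30 second timeout is an arbitrary choice,
and using only the 1-minute load average from /proc/loadavg is an
assumption (the available memory check is left out here). Nodes are
taken to be in the "host:port" form used in the job config.

```shell
# Hypothetical probe: print the 1-minute load average of a "host:port"
# node. timeout keeps a hung ssh from stalling the build.
get_node_load() {
	local node="$1"
	timeout 30 ssh -p "${node##*:}" "${node%%:*}" cat /proc/loadavg \
		2>/dev/null | cut -d ' ' -f1
}

select_least_loaded_node() {
	# With a single node there is nothing to choose, so no ssh is done
	# and jobs without a pool behave as before.
	if [ "$#" -eq 1 ]; then
		echo "$1"
		return 0
	fi
	local node load best_node="" best_load=""
	for node in "$@"; do
		load=$(get_node_load "$node")
		# Skip nodes that did not answer in time.
		[ -n "$load" ] || continue
		# awk does the floating point comparison the shell cannot.
		if [ -z "$best_load" ] || \
		   awk -v a="$load" -v b="$best_load" 'BEGIN { exit !(a < b) }'; then
			best_node="$node"
			best_load="$load"
		fi
	done
	# Fail if no node in the pool responded.
	[ -n "$best_node" ] || return 1
	echo "$best_node"
}
```

Since build_rebuild passes $NODE1_POOL unquoted, a space-separated pool
arrives as several arguments and a plain node as exactly one.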

> so for builder_armhf_16…:
>
>> +++ b/job-cfg/reproducible.yaml
>> +                - '16': { my_node1: 'wbd0-armhf-rb:2223 wbq0-armhf-r:2225', my_node2: 'bpi0-armhf-rb:2222 odxu4-armhf-rb:2229' }
>> +            my_shell: '/srv/jenkins/bin/reproducible_build.sh "{my_node1}"
>
> …reproducible_build.sh should probably be called with "experimental-pooling" 
> as first param, which is then shifted away…

That shouldn't be too hard, sure.

Alternatively, the pool could be flagged in the node list itself,
something like:

   - '16': { my_node1: 'pool,wbd0-armhf-rb:2223,wbq0-armhf-rb:2225',
             my_node2: 'pool,bpi0-armhf-rb:2222,odxu4-armhf-rb:2229' }
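
On the script side, a spec in that form could be unpacked with something
like the following rough sketch. The expand_node_spec name is
hypothetical; the only thing taken from the example above is the "pool,"
prefix followed by comma-separated host:port entries.

```shell
# Hypothetical helper: expand a node spec into a space-separated list.
# "pool,a:1,b:2" becomes "a:1 b:2"; anything without the "pool," prefix
# is passed through unchanged, so existing single-node jobs are unaffected.
expand_node_spec() {
	local spec="$1"
	case "$spec" in
		pool,*)
			echo "${spec#pool,}" | tr ',' ' '
			;;
		*)
			echo "$spec"
			;;
	esac
}
```

Only jobs whose config carries the prefix would take the pooling code
path, which keeps the test limited to one builder job.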

Maybe this should be written in two stages, first implementing a simpler
patch just providing failover, and then adding the load checks later.
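
The failover-only first stage might look roughly like this. It is a
sketch under assumptions, not a proposed patch: node_is_reachable and
select_first_reachable_node are hypothetical names, and "reachable" is
assumed to mean that a trivial ssh command returns within 30 seconds.

```shell
# Hypothetical check: succeed if the "host:port" node answers a trivial
# ssh command before the timeout expires.
node_is_reachable() {
	local node="$1"
	timeout 30 ssh -p "${node##*:}" "${node%%:*}" true >/dev/null 2>&1
}

# Print the first node in the list that responds; fail if none do.
select_first_reachable_node() {
	local node
	for node in "$@"; do
		if node_is_reachable "$node"; then
			echo "$node"
			return 0
		fi
	done
	return 1
}
```

The load checks could later replace the "first that responds" rule
without changing how the function is called.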

live well,
  vagrant