again, and now for longer (Re: Debian CI builds disabled for the moment)

Holger Levsen holger at layer-acht.org
Thu Aug 31 10:11:20 BST 2023


On Tue, Aug 29, 2023 at 07:05:05PM +0000, Holger Levsen wrote:
> fwiw, at 18:48:18 UTC jenkins.d.n had a load of 3.79 (with 23 cores),
> then mosh lost connection for 12min, then at 19:00:03 the load was 155.37,
> then at 19:01:09 the load was 60.10, a min later the load was 25.
> 
> Last time we looked it was the diffoscope process causing this load,
> though I'm now surprised how something built this quickly can cause diffoscope
> to do this... so maybe another red herring...

from my local logs:

2023-04-29 08:46 UTC, jenkins, powercycle, no ping
2023-05-08 18:54 UTC, jenkins, powercycle, no ping
2023-07-22 22:40 UTC, jenkins, powercycle, no ping
2023-07-23 12:48 UTC, jenkins, powercycle, no ping
2023-07-23 16:12 UTC, jenkins, powercycle, no ping
2023-07-24 11:28 UTC, jenkins, powercycle, no ping
2023-07-25 15:46 UTC, jenkins, powercycle, no ping
2023-07-26 15:15 UTC, jenkins, powercycle, no ping
2023-07-28 12:20 UTC, jenkins, powercycle, no ping
2023-07-29 00:28 UTC, jenkins, powercycle, no ping
2023-07-29 08:35 UTC, jenkins, powercycle, no ping
2023-08-07 09:05 UTC, jenkins, powercycle, no ping
2023-08-11 16:54 UTC, jenkins, powercycle, no ping
2023-08-27 10:02 UTC, jenkins, powercycle, no ping
2023-08-29 07:38 UTC, jenkins, powercycle, no ping
2023-08-29 08:40 UTC, jenkins, powercycle, no ping
2023-08-29 23:08 UTC, jenkins, powercycle, no ping
2023-08-30 22:53 UTC, jenkins, powercycle, no ping
2023-08-31 08:39 UTC, jenkins, powercycle, no ping

(what's not visible here is the cleanup work required after each of these
useless powercycles...)
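
To make the trend in the list above easier to see, the entries can be tallied per month. This is only a minimal sketch: it assumes the log keeps the exact "date time UTC, host, action, reason" layout shown above, and the names parse_line and powercycles_per_month are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# The powercycle log entries, copied verbatim from the list above.
LOG = """\
2023-04-29 08:46 UTC, jenkins, powercycle, no ping
2023-05-08 18:54 UTC, jenkins, powercycle, no ping
2023-07-22 22:40 UTC, jenkins, powercycle, no ping
2023-07-23 12:48 UTC, jenkins, powercycle, no ping
2023-07-23 16:12 UTC, jenkins, powercycle, no ping
2023-07-24 11:28 UTC, jenkins, powercycle, no ping
2023-07-25 15:46 UTC, jenkins, powercycle, no ping
2023-07-26 15:15 UTC, jenkins, powercycle, no ping
2023-07-28 12:20 UTC, jenkins, powercycle, no ping
2023-07-29 00:28 UTC, jenkins, powercycle, no ping
2023-07-29 08:35 UTC, jenkins, powercycle, no ping
2023-08-07 09:05 UTC, jenkins, powercycle, no ping
2023-08-11 16:54 UTC, jenkins, powercycle, no ping
2023-08-27 10:02 UTC, jenkins, powercycle, no ping
2023-08-29 07:38 UTC, jenkins, powercycle, no ping
2023-08-29 08:40 UTC, jenkins, powercycle, no ping
2023-08-29 23:08 UTC, jenkins, powercycle, no ping
2023-08-30 22:53 UTC, jenkins, powercycle, no ping
2023-08-31 08:39 UTC, jenkins, powercycle, no ping
"""


def parse_line(line):
    """Split one log line into (timestamp, host, action, reason)."""
    stamp, host, action, reason = [field.strip() for field in line.split(",")]
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M UTC")
    return when, host, action, reason


def powercycles_per_month(log_text):
    """Count powercycle entries per calendar month (keys look like '2023-07')."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        when, _host, action, _reason = parse_line(line)
        if action == "powercycle":
            counts[when.strftime("%Y-%m")] += 1
    return counts


if __name__ == "__main__":
    for month, count in sorted(powercycles_per_month(LOG).items()):
        print(month, count)
```

For the entries listed here this gives one powercycle each in April and May, nine in July and eight in August.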

so I've disabled the Debian r-b CI builds again, to see if this makes this
issue (the machine is so loaded it doesn't even respond to pings anymore)
go away.

Sadly, it's rather hard to see if this "helps", so it will be some days
until I reenable them.

what changed in July is that this host was upgraded to bookworm. what
also changed is that diffoscope was upgraded (it is constantly upgraded to
the sid version), though I don't see any relevant changes in its changelog.

Investigating this is also really difficult, as you might imagine. help much
welcome. So far we could see that the load is getting really high and that
it's probably the diffoscope process, not java.
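
One possible way to collect more evidence before the next hang would be a small sampler which appends the load average to a file and, once the load is high, also records the busiest processes. This is an untested sketch, not something running on the host: the threshold factor, the log path and the function names are all assumptions. It relies only on os.getloadavg() and on ps from procps.

```python
import os
import subprocess
import time


def is_overloaded(load1, cores, factor=2.0):
    """True if the 1-minute load exceeds factor * cores (factor is an assumption)."""
    return load1 > factor * cores


def top_cpu_processes(n=5):
    """Return the ps header line plus the n most CPU-hungry processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,comm", "--sort=-pcpu"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    return out[: n + 1]


def sample(logfile, cores):
    """Append one load sample, plus the top processes if the host is overloaded."""
    load1, load5, load15 = os.getloadavg()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime())
    with open(logfile, "a") as fh:
        fh.write(f"{stamp} load {load1:.2f} {load5:.2f} {load15:.2f}\n")
        if is_overloaded(load1, cores):
            fh.write("\n".join(top_cpu_processes()) + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # try to get the sample on disk before a powercycle


def run(logfile, cores, interval=10, count=None):
    """Take a sample every `interval` seconds; count=None means run until killed."""
    taken = 0
    while count is None or taken < count:
        sample(logfile, cores)
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

It could be started with something like run("/var/tmp/load-samples.log", cores=os.cpu_count() or 1), where the path is just a placeholder. With the numbers quoted above and 23 cores, a load of 155.37 or 60.10 would trigger the process listing, while 3.79 or 25 would not.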

one idea would be to run diffoscope from stable, though as this is both
less than ideal and means some work to implement, I've refrained from
trying this so far. I guess this will be my next step^wpoke.

help much welcome.


-- 
cheers,
	Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

20230709: Today was the warmest day on earth in 125,000 years. Today was also
the day with the most planes in the air at one time ever in history. By the time
you read this both of these records have probably been broken.