[Git][qa/jenkins.debian.net][master] 2 commits: djm: improve UX when rebooting a node fails
Holger Levsen (@holger)
gitlab at salsa.debian.org
Wed Jul 12 13:08:16 BST 2023
Holger Levsen pushed to branch master at Debian QA / jenkins.debian.net
Commits:
1289b81d by Holger Levsen at 2023-07-12T11:43:40+02:00
djm: improve UX when rebooting a node fails
Signed-off-by: Holger Levsen <holger at layer-acht.org>
- - - - -
8d82b529 by Holger Levsen at 2023-07-12T14:07:16+02:00
reproducible system health: ignore less than 10 unkillable zombies.
this just happens (and could be mitigated with more isolation I guess)
but is also harmless and can only be fixed by rebooting the node in question.
Signed-off-by: Holger Levsen <holger at layer-acht.org>
- - - - -
4 changed files:
- bin/djm
- bin/reproducible_maintenance.sh
- bin/reproducible_system_health.sh
- logparse/reproducible.rules
Changes:
=====================================
bin/djm
=====================================
@@ -552,7 +552,7 @@ djm_do() {
# action
#
case $ACTION in
- reboot) ( ssh $NODE "sudo reboot || ( echo press enter ; read a ) " || true ) & sleep 1
+ reboot) ( ssh $NODE "sudo reboot" || xterm -T "$SHORTNODE / $ACTION failed" -class deploy-jenkins -bg $BG -fa 'DejaVuSansMono' -fs 10 -e "echo -e 'ssh to $NODE failed, thus rebooting failed.\n\npress enter to continue' ; read a " )
run_xterm2wait4node_comeback
;;
powercycle) case $SHORTNODE in
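
Editorial sketch (not part of the commit): the reboot hunk above replaces a silent "|| true" with a visible fallback that runs only when the ssh command exits non-zero. The same "command || notifier" shape can be tried without ssh or xterm. The names run_or_notify and notify_failure are hypothetical, and the notifier prints to stderr instead of opening an xterm window.

```shell
#!/bin/sh
# Hypothetical stand-in for the xterm window used in djm:
# report the failed action on stderr instead of opening a terminal.
notify_failure() {
    printf '%s failed on %s\n' "$2" "$1" >&2
}

# Run a command on behalf of a node and an action.
# The notifier is called only if the command exits non-zero.
run_or_notify() {
    node=$1
    action=$2
    shift 2
    "$@" || notify_failure "$node" "$action"
}

# Example: "false" simulates a failing ssh call, "true" a successful one.
run_or_notify examplenode reboot false
run_or_notify examplenode reboot true
```

Only the first call prints a message; the second stays silent because its command succeeds.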
=====================================
bin/reproducible_maintenance.sh
=====================================
@@ -743,7 +743,11 @@ for i in $PBUIDS ; do
done
done
if [ -n "$PSCALL" ] ; then
- echo -e "Warning: processes found which should not be there and which could not be killed. Please fix manually:"
+ if [ $(ps -F -p "$PSCALL" | wc -l) -lt 10 ] ; then
+ echo "Info: ignoring less than ten processes found which should not be there and which could not be killed, because those are probably just a few harmless zombies, which can only be removed by rebooting...."
+ else
+ echo "Warning: found more than ten processes which should not be there and which could not be killed. Please investigate and reboot or ignore them...:"
+ fi
ps -F -p "$PSCALL"
echo
fi
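
Editorial sketch (not part of the commit): ps prints a header line by default, so piping "ps -F -p ..." into "wc -l" yields the number of processes plus one. The hunk above therefore appears to compare that larger number against 10. A minimal way to count only the process rows, assuming the header is always the first line, is shown below. The helper names count_ps_rows and classify_leftovers are hypothetical, and the threshold of 10 is taken from the commit.

```shell
#!/bin/sh
# Hypothetical helper: count process rows in ps-style output read from stdin,
# skipping the first (header) line. tr strips padding some wc versions add.
count_ps_rows() {
    tail -n +2 | wc -l | tr -d ' '
}

# Hypothetical helper: map a process count to the severity used in the commit.
# Fewer than 10 leftover processes is "Info", anything else is "Warning".
classify_leftovers() {
    if [ "$1" -lt 10 ] ; then
        echo "Info"
    else
        echo "Warning"
    fi
}

# Example with canned input instead of a live ps call: one header, two rows.
ROWS=$(printf 'UID PID CMD\njenkins 101 a\njenkins 102 b\n' | count_ps_rows)
echo "$ROWS leftover processes: $(classify_leftovers "$ROWS")"
```

With live data the same count could be taken from something like ps -F -p "$PSCALL" | count_ps_rows, which is an assumption about how the helper would be wired in, not what the script does.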
=====================================
bin/reproducible_system_health.sh
=====================================
@@ -178,7 +178,7 @@ for JOB_NAME in $(ls -1d reproducible_* | sort ) ; do
small_note "session failed for user jenkins"
elif $(grep -q "etckeeper.service loaded failed" $LOG) ; then
small_note "etckeeper.service problem, manual intervention required"
- elif $(grep -E -q "^Warning: processes found which should not be there and which could not be killed." $LOG) ; then
+ elif $(grep -E -q "^Warning: found more than ten processes which should not be there" $LOG) ; then
small_note "unkillable unwanted processes"
elif $(grep -q "failed failed pbuilder_build" $LOG) ; then
small_note "pbuilder build scope failed"
=====================================
logparse/reproducible.rules
=====================================
@@ -7,7 +7,7 @@ warning /Warning: .+ contains invalid yaml, please fix./
warning /Warning: lock .+ still exists, exiting./
warning /^Warning: failed to end schroot session:/
warning /Warning: Tried, but failed to delete these/
-warning /Warning: processes found which should not be there/
+warning /Warning: found more than ten processes which should not be there/
warning /Warning: found reproducible_build.sh processes which have pid 1 as parent.+/
warning /Warning: Found files with bad permissions.+/
warning /Warning: .+ could not be fully removed.+/
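
Editorial sketch (not part of the commit): the health check and the logparse rule both key on the reworded warning text, so the new "Info:" line should no longer be flagged. This can be checked with the same grep -E -q call used in reproducible_system_health.sh. The message strings are copied from the hunks above; the function name is_flagged is hypothetical.

```shell
#!/bin/sh
# Pattern as used in the updated reproducible_system_health.sh.
PATTERN='^Warning: found more than ten processes which should not be there'

# Messages as emitted by the updated reproducible_maintenance.sh.
WARN='Warning: found more than ten processes which should not be there and which could not be killed. Please investigate and reboot or ignore them...:'
INFO='Info: ignoring less than ten processes found which should not be there and which could not be killed, because those are probably just a few harmless zombies, which can only be removed by rebooting....'
# Warning text from before the commit, for comparison.
OLD='Warning: processes found which should not be there and which could not be killed. Please fix manually:'

# Hypothetical helper: succeed if a log line would be flagged by the pattern.
is_flagged() {
    printf '%s\n' "$1" | grep -E -q "$PATTERN"
}

for MSG in "$WARN" "$INFO" "$OLD" ; do
    if is_flagged "$MSG" ; then
        echo "flagged:     $MSG"
    else
        echo "not flagged: $MSG"
    fi
done
```

Only the new warning is flagged. The info line and the pre-commit wording both fall through, which matches the intent stated in the commit message.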
View it on GitLab: https://salsa.debian.org/qa/jenkins.debian.net/-/compare/3138d6fa96a043ba05bff226bf0a5993f5001d18...8d82b52996729618201d86b4e601b895a58e4661