[Qa-jenkins-scm] [Git][qa/jenkins.debian.net][master] 2 commits: reproducible node health check: automatically fix some failed systemd units
Holger Levsen
gitlab at salsa.debian.org
Wed Jul 22 19:45:48 BST 2020
Holger Levsen pushed to branch master at Debian QA / jenkins.debian.net
Commits:
052153a5 by Holger Levsen at 2020-07-22T20:27:55+02:00
reproducible node health check: automatically fix some failed systemd units
Signed-off-by: Holger Levsen <holger at layer-acht.org>
- - - - -
cbf6c3a8 by Holger Levsen at 2020-07-22T20:28:59+02:00
reproducible Debian: mark some armhf nodes down
Signed-off-by: Holger Levsen <holger at layer-acht.org>
- - - - -
2 changed files:
- bin/reproducible_node_health_check.sh
- jenkins-home/offline_nodes
Changes:
=====================================
bin/reproducible_node_health_check.sh
=====================================
@@ -146,10 +146,12 @@ fi
 #
 echo "$(date -u) - checking whether all services are running fine..."
 if ! systemctl is-system-running > /dev/null; then
-	if [ -n "$(systemctl list-units --state=error,failed | grep pbuilder_build)" ] ; then
-		echo "$(date -u) - resetting failed services (once) as some failed pbuilder_build have been found..."
-		sudo systemctl reset-failed
-	fi
+	for UNIT in pbuilder_build acpid rc-local session- ; do
+		if [ -n "$(systemctl list-units --state=error,failed | grep '$UNIT')" ] ; then
+			echo "$(date -u) - resetting failed unit $UNIT..."
+			sudo systemctl reset-failed $UNIT | ( echo "Warning: failed to systemctl reset-failed $UNIT" ; DIRTY=true )
+		fi
+	done
 	if ! systemctl is-system-running > /dev/null; then
 		systemctl status|head -5
 		echo "Warning: systemd is reporting errors:"
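Note on the hunk above: as committed, the single quotes in grep '$UNIT' stop the shell from expanding the variable, the single | pipes the output of reset-failed into the warning instead of running the warning only on failure, and DIRTY=true is set inside a ( ... ) subshell, so the assignment is lost. A minimal sketch of what the loop appears to intend follows. The helper names list_failed_units, reset_unit and reset_failed_matching are hypothetical and not part of the commit. The sketch also passes the full unit name to reset-failed rather than the fragment, on the assumption that a fragment such as session- is not itself a valid unit name.

```shell
#!/bin/bash
# Sketch only: all function names below are hypothetical, not from the commit.

DIRTY=false

# Print the names of units in the error/failed state, one per line.
list_failed_units() {
	systemctl list-units --state=error,failed --plain --no-legend | awk '{print $1}'
}

# Clear the failed state of a single unit.
reset_unit() {
	sudo systemctl reset-failed "$1"
}

# For each name fragment given, reset every failed unit whose name contains it.
reset_failed_matching() {
	local pattern unit
	for pattern in "$@" ; do
		# Double quotes so that $pattern is expanded; -F matches it literally.
		for unit in $(list_failed_units | grep -F -- "$pattern") ; do
			echo "$(date -u) - resetting failed unit $unit..."
			# '||' runs the right-hand side only if the reset fails; braces
			# (not a subshell) keep the DIRTY assignment visible afterwards.
			reset_unit "$unit" || { echo "Warning: failed to systemctl reset-failed $unit" ; DIRTY=true ; }
		done
	done
}
```

Under these assumptions, the loop in the hunk would correspond to calling reset_failed_matching pbuilder_build acpid rc-local session- and checking $DIRTY afterwards.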
=====================================
jenkins-home/offline_nodes
=====================================
@@ -13,7 +13,10 @@
# Also see https://pad.sfconservancy.org/p/rb-build-nodes-keep
-## stable problems
+# uncategorized problems
+odxu4c-armhf-rb.debian.net
+opi2b-armhf-rb.debian.net
+jtx1a-armhf-rb.debian.net
# Down here nodes are automatically added by the maintenance job when they have
# been failing their health check for too long.
View it on GitLab: https://salsa.debian.org/qa/jenkins.debian.net/-/compare/a9e0a53b4aae916251ef0e226a556e02400f1ec0...cbf6c3a8a8e520ee1b4efe79fa13a2e72c17101d