[Qa-jenkins-scm] [Git][qa/jenkins.debian.net][master] 2 commits: reproducible node health check: automatically fix some failed systemd units

Holger Levsen gitlab at salsa.debian.org
Wed Jul 22 19:45:48 BST 2020



Holger Levsen pushed to branch master at Debian QA / jenkins.debian.net


Commits:
052153a5 by Holger Levsen at 2020-07-22T20:27:55+02:00
reproducible node health check: automatically fix some failed systemd units

Signed-off-by: Holger Levsen <holger at layer-acht.org>

- - - - -
cbf6c3a8 by Holger Levsen at 2020-07-22T20:28:59+02:00
reproducible Debian: mark some armhf nodes down

Signed-off-by: Holger Levsen <holger at layer-acht.org>

- - - - -


2 changed files:

- bin/reproducible_node_health_check.sh
- jenkins-home/offline_nodes


Changes:

=====================================
bin/reproducible_node_health_check.sh
=====================================
@@ -146,10 +146,12 @@ fi
 #
 echo "$(date -u) - checking whether all services are running fine..."
 if ! systemctl is-system-running > /dev/null; then
-	if [ -n "$(systemctl list-units --state=error,failed | grep pbuilder_build)" ] ; then
-		echo "$(date -u) - resetting failed services (once) as some failed pbuilder_build have been found..."
-	        sudo systemctl reset-failed
-	fi
+	for UNIT in pbuilder_build acpid rc-local session- ; do
+		if [ -n "$(systemctl list-units --state=error,failed | grep '$UNIT')" ] ; then
+			echo "$(date -u) - resetting failed unit $UNIT..."
+		        sudo systemctl reset-failed $UNIT | ( echo "Warning: failed to systemctl reset-failed $UNIT" ; DIRTY=true )
+		fi
+	done
 	if ! systemctl is-system-running > /dev/null; then
 		systemctl status|head -5
 		echo "Warning: systemd is reporting errors:"
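
A note on the loop in this hunk, with a minimal sketch of the same logic as a standalone function. Two shell details matter in a loop like this: a grep pattern containing $UNIT needs double quotes for the variable to expand, and a fallback that should run only on failure and set a flag in the calling shell needs `||` with a `{ ...; }` group rather than a pipe into a `( ... )` subshell. The function and wrapper names below (reset_known_failed_units, list_failed_units, reset_unit) are hypothetical and not part of the script. DIRTY is assumed to be a flag the surrounding script evaluates later, and passing the bare name to `systemctl reset-failed` simply mirrors the commit.

```shell
#!/bin/sh
# Sketch only: the function names here are hypothetical, not taken from
# bin/reproducible_node_health_check.sh.

# Assumed to be a flag that the surrounding script checks later on.
DIRTY=false

# Thin wrappers so the systemctl calls can be replaced when testing.
list_failed_units() { systemctl list-units --state=error,failed ; }
reset_unit() { sudo systemctl reset-failed "$1" ; }

reset_known_failed_units() {
	local UNIT FAILED
	FAILED="$(list_failed_units)"
	for UNIT in "$@" ; do
		# Double quotes let $UNIT expand; in single quotes grep would
		# search for the literal string $UNIT instead.
		if printf '%s\n' "$FAILED" | grep -q -- "$UNIT" ; then
			echo "$(date -u) - resetting failed unit $UNIT..."
			# '||' runs the fallback only if the reset fails, and the
			# { } group runs in the current shell, so DIRTY stays set.
			reset_unit "$UNIT" || { echo "Warning: failed to systemctl reset-failed $UNIT" ; DIRTY=true ; }
		fi
	done
}
```

With these definitions, the loop from the hunk would correspond to a call such as `reset_known_failed_units pbuilder_build acpid rc-local session-`.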


=====================================
jenkins-home/offline_nodes
=====================================
@@ -13,7 +13,10 @@
 
 # Also see https://pad.sfconservancy.org/p/rb-build-nodes-keep
 
-## stable problems
+# uncategorized problems
+odxu4c-armhf-rb.debian.net
+opi2b-armhf-rb.debian.net
+jtx1a-armhf-rb.debian.net
 
 # Down here nodes are automatically added by the maintenance job when they have
 # been failing their health check for too long.



View it on GitLab: https://salsa.debian.org/qa/jenkins.debian.net/-/compare/a9e0a53b4aae916251ef0e226a556e02400f1ec0...cbf6c3a8a8e520ee1b4efe79fa13a2e72c17101d
