[Qa-jenkins-scm] [Git][qa/jenkins.debian.net][master] 6 commits: change the slave configuration, so that the nodes are brought online only on demand

Mattia Rizzolo gitlab at salsa.debian.org
Thu Jun 14 13:07:44 BST 2018


Mattia Rizzolo pushed to branch master at Debian QA / jenkins.debian.net


Commits:
f936f7e8 by Mattia Rizzolo at 2018-06-14T13:48:07+02:00
change the slave configuration, so that the nodes are brought online only on demand

This will:
 * reduce the amount of used memory: currently any given slave.jar
   process takes something between 200 and 350 MB of resident memory,
   which easily goes over the 10 GB mark…
 * in general, reduce the number of running processes on the master,
   which can't possibly be a bad thing
 * make it possible to not even start jobs on the slaves if a host goes offline
   for whatever reason
At the same time, starting slave.jar is not such an expensive process,
and the delay in starting the jobs from an offline node is totally
negligible

Signed-off-by: Mattia Rizzolo <mattia at debian.org>

- - - - -
32fd0174 by Mattia Rizzolo at 2018-06-14T13:55:22+02:00
update_jdn: install files with the correct permissions instead of fixing them up later

Signed-off-by: Mattia Rizzolo <mattia at debian.org>

- - - - -
20fa7690 by Mattia Rizzolo at 2018-06-14T13:55:57+02:00
Add a file with a list of known offline nodes

Signed-off-by: Mattia Rizzolo <mattia at debian.org>

- - - - -
5af41bfb by Mattia Rizzolo at 2018-06-14T13:58:17+02:00
Don't start the slave.jar if a node is marked as offline in the "black file"

Signed-off-by: Mattia Rizzolo <mattia at debian.org>

- - - - -
8494a038 by Mattia Rizzolo at 2018-06-14T13:58:53+02:00
reproducible debian: _worker.sh: stop the worker and don't try to build anything if any of the nodes in the pair is marked as offline

Signed-off-by: Mattia Rizzolo <mattia at debian.org>

- - - - -
11d5a5eb by Mattia Rizzolo at 2018-06-14T14:06:40+02:00
Mark ff2a-armhf-rb.debian.net as "known offline"

Signed-off-by: Mattia Rizzolo <mattia at debian.org>

- - - - -


5 changed files:

- INSTALL
- bin/reproducible_worker.sh
- bin/start-slave.sh
- + jenkins-home/offline_nodes
- update_jdn.sh


Changes:

=====================================
INSTALL
=====================================
--- a/INSTALL
+++ b/INSTALL
@@ -105,7 +105,9 @@ Process to follow to add a new node to jenkins:
   * 'Usage': select "Only build jobs with label expressions matching this node"
   * 'Launch method': select "Launch agent via execution of command on the master"
      * 'Launch command': `/srv/jenkins/bin/start-slave.sh`
-  * 'Availability': select "Keep this agent online as much as possible"
+  * 'Availability': select "Take this agent online when in demand, and offline when idle"
+    * 'In demand delay': 0 (so that builds will start right away)
+    * 'Idle delay': 5 (this is an arbitrary amount of time)
 
 The slave setup is done so that the slave.jar program doesn't get run on the remote nodes,
 to avoid needing Java available in there.


=====================================
bin/reproducible_worker.sh
=====================================
--- a/bin/reproducible_worker.sh
+++ b/bin/reproducible_worker.sh
@@ -57,6 +57,15 @@ while true ; do
 		echo "The lockfile $LOCKFILE is present, thus stopping this"
 		break
 	fi
+	JENKINS_OFFLINE_LIST="/var/lib/jenkins/offline_nodes"
+	if [ -f "$JENKINS_OFFLINE_LIST" ]; then
+		for n in "$NODE1" "$NODE2"; do
+			if grep -q "$n" "$JENKINS_OFFLINE_LIST"; then
+				echo "$n is currently marked as offline, stopping the worker."
+				break 2
+			fi
+		done
+	fi
 
 	# sleep up to 2.3 seconds (additionally to the random sleep reproducible_build.sh does anyway)
 	/bin/sleep $(echo "scale=1 ; $(shuf -i 1-23 -n 1)/10" | bc )
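
The offline check for a node pair can also be sketched as two small functions. This is only an illustration, not the code that was committed: the function names `node_is_offline` and `first_offline_node` are hypothetical, and matching whole lines with `grep -xF` is an assumption about how the list should be read.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if NODE appears as a whole line in LIST.
# -x matches whole lines only and -F treats the name as a fixed string,
# so comment lines and longer host names cannot match by accident.
node_is_offline() {
	_node="$1"
	_file="${2:-/var/lib/jenkins/offline_nodes}"
	# An empty name would match blank lines, so reject it up front.
	[ -n "$_node" ] && [ -f "$_file" ] && grep -qxF -- "$_node" "$_file"
}

# Hypothetical helper: prints the first offline node among the arguments
# that follow the list file, and fails if none of them is listed.
first_offline_node() {
	_list="$1"
	shift
	for _n in "$@"; do
		if node_is_offline "$_n" "$_list"; then
			echo "$_n"
			return 0
		fi
	done
	return 1
}
```

Inside the worker's `while` loop this could be called as `if n=$(first_offline_node "$JENKINS_OFFLINE_LIST" "$NODE1" "$NODE2"); then echo "$n is currently marked as offline, stopping the worker."; break; fi`. Because the `break` then sits directly in the `while` loop, it ends the worker; in POSIX sh a plain `break` inside a nested `for` leaves only that inner loop.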


=====================================
bin/start-slave.sh
=====================================
--- a/bin/start-slave.sh
+++ b/bin/start-slave.sh
@@ -1,6 +1,20 @@
-#!/bin/bash
+#!/bin/sh
 
 # slave.jar has to be downloaded from http://localhost/jnlpJars/slave.jar
 
+# There doesn't seem to be any better way to figure out the slave name
+# from here, let's just hope all WORKSPACE have been set correctly
+NODE_NAME="$(basename "${WORKSPACE}")"
+
+echo "Starting slave.jar for ${NODE_NAME}..."
+
+f="/var/lib/jenkins/offline_nodes"
+if [ -f "$f" ]; then
+    if grep -q "$NODE_NAME" "$f"; then
+        echo "This node is currently marked as offline, not starting slave.jar"
+        exit 1
+    fi
+fi
+
 echo "This jenkins slave.jar will run as PID $$."
 exec java -jar /var/lib/jenkins/slave.jar
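
The node name is taken from the last path component of `WORKSPACE`. A minimal sketch of that derivation using only parameter expansion, so that paths containing spaces or a trailing slash are handled; the function name `node_name_from_workspace` and the example path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the last path component of a workspace
# directory without calling an external program.
node_name_from_workspace() {
	_ws="${1%/}"                # drop one trailing slash, if present
	printf '%s\n' "${_ws##*/}"  # strip everything up to the last slash
}
```

For example, `NODE_NAME="$(node_name_from_workspace "$WORKSPACE")"` with a workspace of `/some/dir/ff2a-armhf-rb.debian.net` would set `NODE_NAME` to `ff2a-armhf-rb.debian.net`.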


=====================================
jenkins-home/offline_nodes
=====================================
--- /dev/null
+++ b/jenkins-home/offline_nodes
@@ -0,0 +1,5 @@
+# The hosts listed below are known to be down atm.
+# No builds will be dispatched to those nodes, and the nodes will be marked
+# as offline in the jenkins UI.
+
+ff2a-armhf-rb.debian.net
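
The file holds one host name per line, with `#` comments and blank lines in between. A small sketch of how the active entries could be listed; the function name `list_offline_nodes` is hypothetical, and treating every line that starts with `#` as a comment is an assumption based on the file above.

```shell
#!/bin/sh
# Hypothetical helper: print the host names in an offline list,
# skipping comment lines and lines that are empty or whitespace only.
list_offline_nodes() {
	grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' -- "${1:-/var/lib/jenkins/offline_nodes}"
}
```

`grep -v` exits non-zero when no lines remain, so the return status also tells a caller whether any node is currently listed.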


=====================================
update_jdn.sh
=====================================
--- a/update_jdn.sh
+++ b/update_jdn.sh
@@ -597,16 +597,14 @@ else
 fi
 
 
-sudo mkdir -p /var/lib/jenkins/.ssh
+sudo mkdir -m 700 /var/lib/jenkins/.ssh
 if [ "$HOSTNAME" = "jenkins" ] ; then
-	sudo cp jenkins-home/procmailrc /var/lib/jenkins/.procmailrc
-	sudo cp jenkins-home/authorized_keys /var/lib/jenkins/.ssh/authorized_keys
+	sudo -u jenkins install -m 600 jenkins-home/authorized_keys /var/lib/jenkins/.ssh/authorized_keys
+	sudo -u jenkins cp jenkins-home/procmailrc /var/lib/jenkins/.procmailrc
+	sudo -u jenkins cp jenkins-home/offline_nodes /var/lib/jenkins/offline_nodes
 else
 	sudo cp jenkins-nodes-home/authorized_keys /var/lib/jenkins/.ssh/authorized_keys
 fi
-sudo chown -R jenkins:jenkins /var/lib/jenkins/.ssh
-sudo chmod 700 /var/lib/jenkins/.ssh
-sudo chmod 600 /var/lib/jenkins/.ssh/authorized_keys
 explain "scripts and configurations for jenkins updated."
 
 if [ "$HOSTNAME" = "jenkins" ] ; then
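
Installing files with their final permissions in one step can be sketched as below. This is an illustration under assumptions, not the committed code: the function name `install_private_file` is hypothetical, and `install -d` is used in place of `mkdir -m 700` because it also succeeds when the directory already exists.

```shell
#!/bin/sh
# Hypothetical helper: create DIR with mode 700 if needed, then copy SRC
# into it as NAME with mode 600.
install_private_file() {
	_src="$1"
	_dir="$2"
	_name="$3"
	install -d -m 700 "$_dir" &&
	install -m 600 "$_src" "$_dir/$_name"
}
```

When run as root, GNU `install` also accepts `-o` and `-g` to set owner and group in the same call, for example `sudo install -d -m 700 -o jenkins -g jenkins /var/lib/jenkins/.ssh`. Whether the directory should be owned by `jenkins` is an assumption here.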



View it on GitLab: https://salsa.debian.org/qa/jenkins.debian.net/compare/78e3a51a8ec44e6df94c7aa1935c8cbc0cb04c4d...11d5a5ebb8db08216d48185b914878a38bd86c0b
