[DRE-maint] unicorn: native systemd service

Christos Trochalakis yatiohi at ideopolis.gr
Fri Jun 26 11:41:29 UTC 2015


On Thu, Jun 25, 2015 at 11:26:26PM +0000, Eric Wong wrote:
>+Cc unicorn-public list
>Christos Trochalakis <christos at skroutz.gr> wrote:
>> Hello all,
>>
>> I have recently migrated our main ruby application to systemd implementing zero
>> downtime upgrades.
>>
>> systemd doesn't like replacing the binary on the fly. There is one exception to
>> this, services with PIDFile. When PIDFile is set, systemd reads it when the
>> main process exits and replaces the main process.  nginx also had this issue a
>> few months ago [0].
>>
>> So, in order to support zero-downtime upgrades we have to use a pid file.
>
>I don't think so.  You should be able to bind the listen socket in
>systemd and rely on the socket activation features (setting the
>UNICORN_FD environment variable to the created FD).
>
>You still need to have the matching "listen" directive in the unicorn
>config file so unicorn does not close it.
>
>With socket activation, you should just be able to kill unicorn using
>SIGQUIT (just master, or even all workers) and restart without ever
>dropping a connection.  I do NOT suggest using SIGTERM for unicorn,
>since that'll cause the master to kill all workers ASAP.
>

Yes, you are right, socket activation is also an option! I ran some
experiments with a simple rack app to test it.

systemd sets the LISTEN_FDS env variable to an integer giving the number of
inherited file descriptors. Those FDs have consecutive numbers starting from
`SD_LISTEN_FDS_START`, which is `3` (man sd_listen_fds).
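
The numbering can be sketched in a few lines of Ruby. This is only an
illustration of the arithmetic: the helper name `unicorn_fd_from_listen_fds`
is made up here and is not part of unicorn or systemd.

```ruby
# sd_listen_fds(3) documents 3 as the first inherited descriptor.
SD_LISTEN_FDS_START = 3

# Hypothetical helper: build the comma-separated UNICORN_FD value from
# the LISTEN_FDS count. Returns nil when LISTEN_FDS is unset or empty,
# i.e. when the process was not socket-activated.
def unicorn_fd_from_listen_fds(env = ENV)
  count = env['LISTEN_FDS'].to_s
  return nil if count.empty?

  n = Integer(count, 10)
  Array.new(n) { |i| SD_LISTEN_FDS_START + i }.join(',')
end
```

Called with a hash such as `{ 'LISTEN_FDS' => '2' }` in place of ENV, it
yields the descriptor list for two inherited sockets.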

So for example if LISTEN_FDS="2", UNICORN_FD should be "3,4". I used a
simple wrapper script for that. Here is the full configuration:

$ tail -n+1 /srv/uni/* /etc/systemd/system/uni.*

==> /srv/uni/config.ru <==
app = proc do |env|
  sleep 5
  [
    200,
    { 'Content-Type' => 'text/plain' },
    ["Socket Activated!\n", "pid:#{$$}\n", "ppid:#{Process.ppid}\n"]
  ]
end

run app

==> /srv/uni/unicorn.conf.rb <==
worker_processes 2
working_directory "/srv/uni"

# Keep in sync with uni.socket
listen 9000, :tcp_nopush => true
listen 9001, :tcp_nopush => true

==> /srv/uni/wrapper <==
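#!/bin/bash

# Not socket-activated: run the command unchanged.
[ -z "$LISTEN_FDS" ] && exec "$@"

# Inherited FDs are numbered consecutively from 3 (SD_LISTEN_FDS_START).
UNICORN_FD=""
for fd in $(seq 3 $((LISTEN_FDS + 2))); do
	UNICORN_FD="${UNICORN_FD:+${UNICORN_FD},}${fd}"
done
export UNICORN_FD

echo "wrapped fds: ${UNICORN_FD}"

exec "$@"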

==> /etc/systemd/system/uni.service <==
[Unit]
Description=Unicorn Server

[Service]
ExecStart=/srv/uni/wrapper /usr/bin/unicorn -c /srv/uni/unicorn.conf.rb -d
KillSignal=SIGQUIT
KillMode=mixed

==> /etc/systemd/system/uni.socket <==
[Unit]
Description=Unicorn Socket

[Socket]
ListenStream=0.0.0.0:9000
ListenStream=0.0.0.0:9001

[Install]
WantedBy=sockets.target

Make the wrapper executable and activate the systemd units:
chmod +x /srv/uni/wrapper
systemctl daemon-reload
systemctl enable uni.socket
systemctl start  uni.socket

The application sleeps for 5 seconds before replying.

I ran the following commands from 3 different terminals:

$ curl localhost:9000 [blocked for 5 seconds]
# systemctl stop uni.service [issues SIGQUIT on the running unicorn, killing
                              the 2nd worker and waiting for the 1st to finish]
$ curl localhost:9000 [blocked since there are no more workers to accept right now]

After the first request is served, unicorn dies and systemd respawns a new master.
The second request is accepted by the new master (notice the different ppid).

Some notes:

TCP socket options are not applied by unicorn on inherited sockets (TCPSocket
=== sock is false). systemd socket files have support for most options now but
we might want unicorn to `setsockopt` them as well. For example,
'DeferAcceptSec', 'KeepAliveIntervalSec', 'NoDelay' are supported since v216, so
they are not available in jessie (v215).
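
As a rough idea of what applying the options from Ruby could look like, here
is a minimal sketch. It is not unicorn's code: `tune_inherited_socket` is a
hypothetical helper, and it assumes the inherited descriptor is a TCP socket
(TCP_NODELAY would fail on a UNIX socket).

```ruby
require 'socket'

# Hypothetical helper: wrap an inherited descriptor number and set the
# options that the socket unit in systemd v215 cannot set for us.
def tune_inherited_socket(fd, nodelay: true, keepalive: true)
  sock = Socket.for_fd(fd)
  # Keep the descriptor open even if this wrapper object is collected.
  sock.autoclose = false
  sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1) if nodelay
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, 1) if keepalive
  sock
end
```

With the setup above this would be called once per inherited descriptor
(3 and 4), before the workers start accepting.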

Socket activation is a really interesting setup, but personally I would not run
it with a large application. Clients would have to wait for the new master to
be up and running before a reply is returned, and that could take tens of
seconds. The USR2 re-exec solves that problem since both old and new workers
are accepting on the socket and we can kill the old ones when we are ready. In
that case the PIDFile trick is handy to support zero-downtime restarts with no
added latency.
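
The USR2 sequence can be sketched as a small Ruby script. This is an
illustration only: `usr2_restart` is a hypothetical helper, the `kill:`
parameter exists just so the signalling can be stubbed, and it relies on
unicorn's documented behaviour of moving the old master's pid file to
"<pid file>.oldbin" while the new master writes the original path.

```ruby
# Hypothetical helper: drive a USR2 re-exec from outside unicorn and
# return the pid of the new master.
def usr2_restart(pid_file, timeout: 60, kill: Process.method(:kill))
  old_pid = Integer(File.read(pid_file).strip)
  kill.call(:USR2, old_pid) # old master forks and execs a new master

  # Wait until the pid file names a different process. This only shows
  # that the new master wrote its pid; a real deployment would probably
  # also probe the application before retiring the old master.
  deadline = Time.now + timeout
  loop do
    new_pid = File.exist?(pid_file) ? File.read(pid_file).to_i : 0
    break if new_pid > 0 && new_pid != old_pid
    raise "new master did not come up within #{timeout}s" if Time.now > deadline
    sleep 0.1
  end

  kill.call(:QUIT, old_pid) # old master finishes in-flight requests, then exits
  Integer(File.read(pid_file).strip)
end
```

With PIDFile set in the service unit, systemd re-reads the file when the old
master exits and follows the new one.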
