[debian-mysql] Bug#881034: Sid: Galera-3 default configuration; nodes beyond primary will not connect.

Tue Nov 7 09:45:57 UTC 2017

Package: galera-3
Version: 25.3.19-2
Severity: minor
Tags: sid

The default configuration of Galera-3 systemd scripts for the mariaDB server in debian Sid is incorrect. The secondary (tertiary, etc.) nodes do not connect to the primary node.
The problem may also be with one of the 'systemd' or 'mariadb' packages, and the script may be provided upstream. This information is unknown to me.

Reproducing the issue:
1) Create 3 connected VMs or connect 3 physical machines on a LAN with a default debian install.
2) Install the galera-3, mariadb-client, mariadb-server, and rsync packages.
3) Configure /etc/mysql/conf.d/galera.cnf with the recommended settings. In my case, each of 3 servers had a configuration file like this (the two bottom lines are changed on a server-to-server basis):

[mysqld]
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0

#galera settings
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="osdb_cluster"
wsrep_cluster_address=gcomm://10.0.40.111,10.0.40.112,10.0.40.113
wsrep_sst_method=rsync

wsrep_node_address="10.0.40.111"
wsrep_node_name="galera01"

4) Bootstrap the cluster by using the recommended script on the primary node, by calling
    $ galera_new_cluster
5) Using the recommended way, check whether the cluster is started:

MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+
1 row in set (0.01 sec)

6) Now, on the second node, try the recommended way of starting this node (as described on galera's home page):
    $ systemctl start mysql
Here's where the bug happens. This operation fails with the mysql server not starting on the second node, while we're expecting a two-node cluster.  Below is the output passed to the file '/var/log/mysql/error.log'.

2017-11-06 17:10:13 139930521723456 [Note] WSREP: Read nil XID from storage engines, skipping position init
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_load(): Galera 3.19(rb98f92f) by Codership Oy <info at codership.com> loaded successfully.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: CRC-32C: using hardware acceleration.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootsrap: 1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.0.40.113; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ig
2017-11-06 17:10:13 139930521723456 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(00000000-0000-0000-0000-000000000000:-1)
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_sst_grab()
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Start replication
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: protonet asio version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Using CRC-32C for message checksums.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: backend: asio
2017-11-06 17:10:13 139930521723456 [Note] WSREP: gcomm thread scheduling priority set to other:0
2017-11-06 17:10:13 139930521723456 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2017-11-06 17:10:13 139930521723456 [Note] WSREP: restore pc from disk failed
2017-11-06 17:10:13 139930521723456 [Note] WSREP: GMCast version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: EVS version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: gcomm: connecting to group 'osdb_cluster', peer '10.0.40.111:,10.0.40.112:,10.0.40.113:'
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') connection established to f877f818 tcp://10.0.40.113:4567
2017-11-06 17:10:13 139930521723456 [Warning] WSREP: (f877f818, 'tcp://0.0.0.0:4567') address 'tcp://10.0.40.113:4567' points to own listening address, blacklisting
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') connection established to f5e5497f tcp://10.0.40.112:4567
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') connection established to 71b33d37 tcp://10.0.40.111:4567
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2017-11-06 17:10:13 139930521723456 [Note] WSREP: declaring 71b33d37 at tcp://10.0.40.111:4567 stable
2017-11-06 17:10:13 139930521723456 [Note] WSREP: declaring f5e5497f at tcp://10.0.40.112:4567 stable
2017-11-06 17:10:13 139930521723456 [Note] WSREP: view(view_id(NON_PRIM,71b33d37,360) memb {
        71b33d37,0
        f5e5497f,0
        f877f818,0
} joined {
} left {
} partitioned {
        1d6324f7,0
        1e18a4f1,0
        2416f453,0
        3355448a,0
        5fc6bb86,0
        62a46c5c,0
        b1235628,0
})
2017-11-06 17:10:16 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') connection to peer f877f818 with addr tcp://10.0.40.113:4567 timed out, no messages seen in PT3S
2017-11-06 17:10:16 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-11-06 17:10:39 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.0.40.112:4567
2017-11-06 17:10:40 139930521723456 [Note] WSREP: (f877f818, 'tcp://0.0.0.0:4567') reconnecting to f5e5497f (tcp://10.0.40.112:4567), attempt 0
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1380: Failed to open channel 'osdb_cluster' at 'gcomm://10.0.40.111,10.0.40.112,10.0.40.113': -110 (Connection timed out)
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: gcs connect failed: Connection timed out
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: wsrep::connect(gcomm://10.0.40.111,10.0.40.112,10.0.40.113) failed: 7
2017-11-06 17:10:43 139930521723456 [ERROR] Aborting

As one can see; the node will not connect to the base node. Normally, this issue would be caused by a network problem. I checked using 'nmap', and the ports galera wants to use by default are not in use by another service or blocked by a firewall. This is a simple test setup without any firewall or password settings.

WORKAROUND:
I believe the problem is caused by an error in the systemctl script for starting mysql. I found a workaround for the problem like so:

7) On the second node, manually start mysqld with
$ mysqld --wsrep-new-cluster
8) Connect with a second terminal and manually kill the mysqld process. (E.g. kill -9 12741, where 12741 is whatever PID got assigned to mysql)
9) Now start mysql via the recommended way on node 2:
$ systemctl start mysql
10) Back on node 1; we can run the query to check that the new node was added to the cluster;

MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+
1 row in set (0.01 sec)

This time, there is no error, and the second cluster node starts up correctly (this process can be repeated for the 3rd, 4th, etc. nodes).

CONFIGURATION:

Apart from the configuration files mentioned above, no configuration files have been modified (e.g. the config files are the unmodified defaults shipped with mariadb, which sets up a basic database server with no password protection).

$ uname -a
Linux st01 4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64 GNU/Linux

$ cat /etc/issue
Debian GNU/Linux 9 \n \l
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.alioth.debian.org/pipermail/pkg-mysql-maint/attachments/20171107/f29ce04d/attachment-0001.html>