Bug#935948: can't start containers with different names but same prefix (xxxxxxxxxxx-1 and xxxxxxxxxxx-2)
Trent W. Buck
trentbuck at gmail.com
Wed Aug 28 12:11:22 BST 2019
Package: systemd-container
Version: 242-4
Severity: minor
Due to IFNAMSIZ, nspawn's network interface names are truncated.
The possibility of collisions should be clearly documented.
My test containers have reasonably long names:
root@not-omega:~# ls -l /var/lib/machines/
total 75
drwxr-xr-x 2 root root 2 May 15 13:59 alamo
drwxr-xr-x 21 1087766528 1087766528 21 Aug 8 13:25 dns-test
drwxr-xr-x 21 1075707904 1075707904 21 Jan 1 2019 my-new-container
-rw-r--r-- 1 root root 0 Aug 8 17:55 my-new-container.nspawn~
drwxr-xr-x 21 root root 21 Aug 7 19:46 nft-test-downstream
drwxr-xr-x 21 1242628096 1242628096 21 Aug 7 19:48 nft-test-upstream
drwxr-xr-x 21 1486684160 1486684160 21 May 15 15:32 not-alamo
drwxr-xr-x 21 1669005312 1669005312 21 Jan 1 2019 test-alloc-1566986334
drwxr-xr-x 21 1024851968 1024851968 21 Jan 1 2019 test-alloc-1566988389
drwxr-xr-x 21 1678049280 1678049280 21 Aug 9 01:36 upstream-container
-rw-rw-r-- 1 root root 0 Aug 8 17:55 upstream-container.nspawn
I noticed that the interfaces created by systemd-nspawn do not use the full name:
root@not-omega:~# machinectl status test-alloc-1566986334 | grep Iface
Iface: ve-test-alloc-
root@not-omega:~# ip -o l | grep alloc
11: ve-test-alloc-@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000\ link/ether 26:b1:44:74:31:e8 brd ff:ff:ff:ff:ff:ff link-netnsid 0
I wondered what would happen if the "unique" interface name was
already in use by another container.
The answer is that systemd-nspawn simply fails with an unhelpful error:
+ machinectl start test-alloc-1566988389
Job for systemd-nspawn@test-alloc-1566988389.service failed because the control process exited with error code.
See "systemctl status systemd-nspawn@test-alloc-1566988389.service" and "journalctl -xe" for details.
root@not-omega:~# journalctl -u systemd-nspawn@test-alloc-1566988389.service
-- Logs begin at Wed 2019-05-01 20:15:10 AEST, end at Wed 2019-08-28 20:36:53 AEST. --
Aug 28 20:36:53 not-omega systemd[1]: Starting Container test-alloc-1566988389...
Aug 28 20:36:53 not-omega systemd-nspawn[5357]: Failed to add new veth interfaces (ve-test-alloc-:host0): File exists
Aug 28 20:36:53 not-omega systemd[1]: systemd-nspawn@test-alloc-1566988389.service: Main process exited, code=exited, status=1/FAILURE
Aug 28 20:36:53 not-omega systemd[1]: systemd-nspawn@test-alloc-1566988389.service: Failed with result 'exit-code'.
Aug 28 20:36:53 not-omega systemd[1]: Failed to start Container test-alloc-1566988389.
Aug 28 20:36:53 not-omega systemd[1]: systemd-nspawn@test-alloc-1566988389.service: Consumed 339ms CPU time, no IP traffic.
This limitation is not obvious (to me).
The systemd-nspawn manpage indicates that ve-X should match the machine name.
I THINK this is happening due to IFNAMSIZ in
src/nspawn/nspawn-network.c:setup_veth(), which is:
src/basic/linux/if.h:32:#define IFNAMSIZ 16
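As a rough illustration of that suspicion, the Python sketch below mimics the truncation. The helper name veth_name and the 14-character limit are assumptions: the limit is inferred from the "ve-test-alloc-" name in the output above, not read from the nspawn source.

```python
IFNAMSIZ = 16      # from src/basic/linux/if.h; includes the terminating NUL
OBSERVED_LEN = 14  # assumption: length of "ve-test-alloc-" as reported by machinectl


def veth_name(machine: str, limit: int = OBSERVED_LEN) -> str:
    """Hypothetical helper: prefix the machine name with "ve-" and truncate."""
    return ("ve-" + machine)[:limit]


for machine in ("test-alloc-1566986334", "test-alloc-1566988389"):
    print(machine, "->", veth_name(machine))
```

Both test-alloc-* machine names map to the same interface name under this model, which is consistent with the "File exists" failure above.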
If this is an unavoidable limitation due to Linux, please at least
warn about it in the systemd-nspawn manpage.
Maybe systemd-nspawn or machinectl could even look for this collision
and specifically warn about it, e.g.
systemd-nspawn: cannot create interface "ve-X" for container X-2, because another container (X-1) is already using it. Either rename a container, or use non-default networking (i.e. don't use --network-veth).
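A check along those lines might look like the following sketch. It is only an illustration in Python, not how systemd-nspawn or machinectl work; find_collisions is a hypothetical helper, and the 14-character limit is an assumption inferred from the observed "ve-test-alloc-" interface name.

```python
from collections import defaultdict

OBSERVED_LEN = 14  # assumption: taken from the truncated name seen in machinectl status


def find_collisions(machines, limit=OBSERVED_LEN):
    """Hypothetical helper: group machine names by their truncated ve- name."""
    by_iface = defaultdict(list)
    for name in machines:
        by_iface[("ve-" + name)[:limit]].append(name)
    # Keep only interface names claimed by more than one machine.
    return {iface: names for iface, names in by_iface.items() if len(names) > 1}


# Directory names from the /var/lib/machines listing in this report.
machines = [
    "alamo", "dns-test", "my-new-container", "nft-test-downstream",
    "nft-test-upstream", "not-alamo", "test-alloc-1566986334",
    "test-alloc-1566988389", "upstream-container",
]

for iface, names in find_collisions(machines).items():
    print(f'interface "{iface}" would be shared by: {", ".join(names)}')
```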
A quick test with a (non-systemd) client suggests this is indeed a fundamental constraint:
root@not-omega:~# ip link add waffle type veth peer jaffa
root@not-omega:~# ip link set waffle name wafflexxxxxxxxxxxxxx
Error: argument "wafflexxxxxxxxxxxxxx" is wrong: "name" not a valid ifname
root@not-omega:~# ip link set waffle name wafflexxxxxxxxxxx
Error: argument "wafflexxxxxxxxxxx" is wrong: "name" not a valid ifname
root@not-omega:~# ip link set waffle name wafflexxxxxxxx
root@not-omega:~#
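The same length rule can be sketched in Python. This assumes the only constraint being hit is that the name plus its terminating NUL must fit in IFNAMSIZ (16) bytes; fits_ifnamsiz is a hypothetical helper and does not reproduce any other checks that ip performs.

```python
IFNAMSIZ = 16  # from src/basic/linux/if.h; includes the terminating NUL


def fits_ifnamsiz(name: str) -> bool:
    """Hypothetical helper: True if name is non-empty and fits with its NUL."""
    return 0 < len(name.encode()) < IFNAMSIZ


# The three names tried with "ip link set" above.
for name in ("wafflexxxxxxxxxxxxxx", "wafflexxxxxxxxxxx", "wafflexxxxxxxx"):
    print(name, len(name), "ok" if fits_ifnamsiz(name) else "too long")
```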
PS: systemd-nspawn is picky about container names (e.g. can't have an
underscore). If this is ultimately based on RFC 952, note that
RFC 952 allows up to 24 bytes (longer than IFNAMSIZ).