<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:"courier new",monospace"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Aug 10, 2024 at 12:58 PM Simon McVittie <<a href="mailto:smcv@debian.org">smcv@debian.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Control: reassign -1 autopkgtest<br>

Control: retitle -1 autopkgtest-virt-podman: document how to give systemd CAP_SYS_ADMIN<br>

Control: severity -1 wishlist<br>

Control: forwarded -1 <a href="https://salsa.debian.org/ci-team/autopkgtest/-/merge_requests/396" rel="noreferrer" target="_blank">https://salsa.debian.org/ci-team/autopkgtest/-/merge_requests/396</a><br>

Control: tags -1 + help<br>

<br>

On Sat, 10 Aug 2024 at 11:24:51 -0400, Reinhard Tartler wrote:<br>

> I personally find that wording a bit too strong. How about something like<br>

> this:<br>

> <br>

...<br>

> >     However, this also introduces an additional<br>

> >     attack surface in the<br>

> >     kernel if malicious code tried to escape the container sandbox.<br>

<br>

Are you thinking here of malicious code in a systemd service inside the<br>

container trying to escape from systemd's sandboxing to be privileged<br>

within the podman container, or are you thinking about malicious code<br>

inside the podman container (possibly as unconfined root) escaping<br>

from the podman container to harm the host system?<br></blockquote><div><br></div><div class="gmail_default" style="font-family:"courier new",monospace">I'm mostly thinking of maintainer scripts (e.g., postinst scripts) from dependencies</div><div class="gmail_default" style="font-family:"courier new",monospace">that are being pulled in, for instance via the --add-apt-source option. I don't think</div><div class="gmail_default" style="font-family:"courier new",monospace">it is unreasonable to expect podman to provide a sandbox that protects against</div><div class="gmail_default" style="font-family:"courier new",monospace">(potentially) harmful maintainer script code in packages in such a situation.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

In the autopkgtest use-case, I think in general we trust (or distrust!)<br>

everything inside the podman container equally: they're all coming from<br>

the same apt source(s), and can execute arbitrary code as container root<br>

via their maintainer scripts. The only reason that systemd's sandboxing<br>

of system services matters to us is that one of the things we ideally<br>

want to test is that the maintainer of the package under test didn't<br>

configure overly-strict systemd sandboxing that breaks their service's<br>

intended functionality.<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:"courier new",monospace">If you trust all maintainer scripts and other code as much as you trust systemd in</div><div class="gmail_default" style="font-family:"courier new",monospace">the container, then running the whole container with CAP_SYS_ADMIN should be fine.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> > (It might also be appropriate to add a shorthand form for that, to avoid<br>

> > needing to use the "pass arbitrary options to podman-run" mechanism,<br>

> > but that would need some more design to choose a suitable name for<br>

> > that option. --trust-root-in-testbed, perhaps, if my understanding of<br>

> > the impact of CAP_SYS_ADMIN is correct.)<br>

> <br>

> I'd love to see such a shortcut, but it is not obvious to me how to name it.<br>

> Your suggestion seems too strong to me, because there are typically still<br>

> other security features in play, such as seccomp, selinux or apparmor.<br>

<br>

Yes, hence my question about how dangerous it is to allow CAP_SYS_ADMIN.<br>

<br>

One way to phrase it is: are the other security mechanisms that podman<br>

uses (seccomp and LSMs) meant to be sufficiently strong that, if container<br>

root can escape to the host (even with CAP_SYS_ADMIN), that would justify<br>

a CVE in either podman or the kernel? If yes, then denying CAP_SYS_ADMIN<br>

is just hardening, rather than being security-critical in its own right.<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:"courier new",monospace">OK, if you phrase the question that way, then I'd say "just hardening".</div></div><div class="gmail_default" style="font-family:"courier new",monospace">I don't see an obvious way to escape the container with CAP_SYS_ADMIN;</div><div class="gmail_default" style="font-family:"courier new",monospace">AFAIUI you'd need to exploit at least one other podman or kernel bug, which</div><div class="gmail_default" style="font-family:"courier new",monospace">I would fully expect to be documented with a CVE and addressed with a </div><div class="gmail_default" style="font-family:"courier new",monospace">stable release and/or security update for stable releases.</div><div class="gmail_default" style="font-family:"courier new",monospace"><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> Re-reading through <a href="https://github.com/systemd/systemd/issues/29860" rel="noreferrer" target="_blank">https://github.com/systemd/systemd/issues/29860</a> clarifies<br>

> that systemd has a number of additional security hardening features, such as<br>

> `DynamicUser=`, but also things like `PrivateDevices=`, `ProtectHome=`,<br>

> `ProtectSystem=`, `MountFlags=`, `PrivateTmp=`, `ReadWriteDirectories=`,<br>

> `ReadOnlyDirectories=`, and `InaccessibleDirectories=`.<br>

<br>

Yes. Some of these are orthogonal to CAP_SYS_ADMIN; some of them need<br>

CAP_SYS_ADMIN to be effective, but are automatically disabled (with<br>

a warning) in its absence; and some need CAP_SYS_ADMIN, and service<br>

startup fails in its absence. It's the inconsistency between those<br>

last two categories that initially led me to think that this could be<br>

a systemd bug.<br>

<br>

> > If nothing is going to be done about this in systemd, and nothing can be<br>

> > done about it in podman, then it'll probably have to end up as a<br>

> > documentation improvement in autopkgtest-virt-podman(1).<br>

> <br>

> I tend to agree.<br>

<br>

Reassigning to autopkgtest(-virt-podman) for that.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> I personally would be comfortable running containers<br>

> that have systemd inside with CAP_SYS_ADMIN because that is closer to<br>

> how systemd runs on a real system. Also, podman provides other additional<br>

> security features, such as seccomp and apparmor/selinux.<br>

<br>

Thanks, that's a useful data point.<br></blockquote><div><br></div><div class="gmail_default" style="font-family:"courier new",monospace">Awesome, thanks for reaching out!</div><div class="gmail_default" style="font-family:"courier new",monospace"><br></div></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature">regards,<br>    Reinhard</div></div>