[pkg-gnupg-maint] Bug#840669: Bug#840669: Need way to avoid agent, or reliable way to kill agent

Fri Oct 14 16:58:17 UTC 2016

Hi Ian--

On Fri 2016-10-14 07:10:29 -0400, Ian Jackson wrote:
> Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#840669: Need way to avoid agent, or reliable way to kill agent"):
>> Does your test suite delete its test homedirs after it is finished?  If
>> not, would you be willing to include removal of the test homedirs as
>> part of the tests, or as a final post-test cleanup phase?
>
> No.  It does not do anything after it is finished.
>
> Each script mentioned in debian/tests/control simply exits zero and
> then (according to DEP-8) it's the responsibility of the test runner
> to delete the tree if it cares.
>
> When run under adt-run with a filesystem snapshotting virt server,
> there is to need (of course) to delete the temporary area, since it's
> in a filesystem a snapshot which is going to be deleted.

i think you mean "no need" here, not "to need", right?

I'm sympathetic to your use case, but am ambivalent about whether a
major re-architecting of gpg-agent is the way to solve it.

There are multiple ways that processes from a given test run could
potentially linger with files open in that snapshotted filesystem.
gpg-agent happens to be one of them.  what do you propose to do about
the others?

If you want a truly bounded/scoped test run setup, you probably want
something like cgroups and a supervised way of terminating all spawned
processes before unmounting the filesystem.  Asking gpg-agent for
a significantly-modified behavior isn't going to solve the general
problem that your use case implies.
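
To make the "what about the others?" question concrete, here is a
minimal sketch (not anything gnupg or adt-run ships) of how a test
runner on Linux might list every process that still has a file or its
working directory under a given tree.  The function name is
hypothetical, it relies on /proc, and it only sees processes the caller
is allowed to inspect.  Note that sockets show up in /proc as
"socket:[inode]" rather than as a path, so a process holding only a
listening socket under the tree would not be found this way.

```python
import os


def pids_with_open_files_under(root):
    """Return the set of PIDs with an open fd or cwd under root (Linux /proc only)."""
    root = os.path.realpath(root)
    prefix = root.rstrip(os.sep) + os.sep
    found = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join("/proc", entry)
        links = [os.path.join(base, "cwd")]
        try:
            fd_dir = os.path.join(base, "fd")
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to look
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == root or target.startswith(prefix):
                found.add(int(entry))
                break
    return found
```

A runner could call this just before unmounting and refuse to proceed
(or terminate the listed processes) if the set is non-empty.  A cgroup
would still be the more robust tool, since it tracks processes by
ancestry rather than by which files they happen to hold open.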

>> One thing worth observing is that if the agent's socket is deleted, it
>> will eventually terminate itself after a minute or two anyway.  This
>> will happen even without inotify (but obviously the inotify trigger
>> should really work to automate cleanup on platforms that support inotify)
>
> That's not really fast enough.

agreed, that's why i'm encouraging upstream to fix their inotify
detection:

   https://bugs.gnupg.org/gnupg/issue2756

> Then perhaps it should take a lock.  I assume there must be some kind
> of locking anyway, or concurrent startups would occasionally fail.

I'll let Werner (who i hope is reading this) answer whether locking is
actually happening and what these tradeoffs might be.  I'd be pretty
unhappy if complicating this ends up with an increased possibility of
gpg deadlock, though. :/

> I propose a design something like this:
>
>  * When GNUPG_AGENT_LIFETIME_FD is set:
>     - gpg-agent does not create a rendezvous socket.  Instead,
>       it selects/polls this fd for reading.
>     - When the fd is readable, gpg-agent expect to receive one end
>       of an AF_UNIX socketpair over it by fd passing.  That
>       received fd is treated the same way the answer from accept()
>       would be.
>     - If the master fd signals EOF, gpg-agent exits.
>     - gpg-agent should nevertheless take out a lock on a common
>       name in $GNUPGHOME.
>
>  * When gpg needs an agent but none is running discoverable in the
>    current global way [1]:
>    If GNUPG_AGENT_LIFETIME_FD is not already set,
>     - gpg creates an AF_UNIX anonymous socketpair.  One end will
>       become the "agent" fd and is set ~CLOEXEC.  The other end
>       becomes gpg's and is set CLOEXEC.
>     - gpg sets GNUPG_AGENT_LIFETIME_FD to the "agent" end of
>       the socketpair
>     - gpg spawns the agent
>     - Now GNUPG_AGENT_LIFETIME_FD is set.
>    When gpg wants to connect to the agent, it creates a new
>    socketpair and uses sendmsg to pass one end to the agent;
>    it then closes that end and uses the other end as if it had just
>    been connect()ed.
>
>  * Creating a socketpair and setting GNUPG_AGENT_LIFETIME_FD should be
>    documented as a way to get a privately-scoped gnupg.  For example,
>    dgit might like to do this so that it can make its three signatures
>    with one user authorisation and then terminate the authorisation
>    when it is done.  The document should say that:
>      - Programs must not set GNUPG_AGENT_LIFETIME_FD if it is already
>        set, so they must check it first.  (Otherwise deadlock will
>        occur.)
>      - Programs need not check for the existence of an `ambient' gnupg
>        agent in the user's account/session/environment; if there is
>        one, the GNUPG_AGENT_LIFETIME_FD setting will be ignored.

If dgit is willing to do this work, it should also be willing to just
invoke "gpgconf --kill gpg-agent" when it's ready to terminate the
agent, right?  Why craft all this new mechanism and guidance?
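
As a rough sketch of that simpler approach (an illustration only, not
an upstream recommendation), a caller could wrap its gpg work in a
helper that points GNUPGHOME at a throwaway directory and runs
"gpgconf --kill gpg-agent" on the way out.  The helper name
scoped_gnupg_home is hypothetical; the only gnupg interfaces assumed
are the GNUPGHOME variable and the gpgconf invocation quoted above.

```python
import os
import shutil
import subprocess
import tempfile
from contextlib import contextmanager


@contextmanager
def scoped_gnupg_home():
    """Yield an environment with a private GNUPGHOME; kill its agent on exit."""
    home = tempfile.mkdtemp(prefix="gnupg-test-")  # mkdtemp creates it mode 0700
    env = dict(os.environ, GNUPGHOME=home)
    try:
        yield env
    finally:
        # Ask the agent for this homedir (if one was started) to exit.
        if shutil.which("gpgconf"):
            subprocess.run(
                ["gpgconf", "--kill", "gpg-agent"],
                env=env,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
        shutil.rmtree(home, ignore_errors=True)
```

Usage would look like "with scoped_gnupg_home() as env:" followed by
subprocess calls to gpg that pass env=env.  This only covers agents
started for that one homedir; it does nothing about other lingering
processes.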

>  * The result is that the scope of an auto-spawned agent is a single
>    GNUPG_AGENT_LIFETIME_FD (normally, a single gnupg2 gpg program).
>    Other concurrent calls to gpg will block.

yuck.  I really don't like the idea that all concurrent calls to gpg
should block.

>  * [1] If you don't think this behaviour is good by default, it could
>    be a config option.  But I would argue that it ought to be the
>    default is the environment has not provided an agent, because it
>    provides the same authorisation lifetime as was implemented in
>    gnupg1.

This proposal represents a major change in how gpg-agent works right
now. It also involves a bunch of mechanism that may or may not be
portable; upstream gpg intends to be portable, as i understand it.
Furthermore, it seems likely that this will be complex and difficult for
most people to use, even more so than saying "please exec 'gpgconf --kill
gpg-agent' when you're done".  It's also not a pattern i've seen
elsewhere, which will likely limit its adoption.

While i think this is an interesting proposal, it sounds to me like it
will likely cause as many problems as it solves, so i'm reluctant to
endorse it.
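
For readers unfamiliar with the mechanism in the quoted proposal, the
following is a small single-process sketch of its fd-passing step: a
long-lived "lifetime" socketpair, a per-connection socketpair whose one
end is handed over with sendmsg and SCM_RIGHTS, and EOF on the lifetime
channel as the exit signal.  Nothing here is gnupg code; the function
names are made up for illustration, and both "sides" run in the same
process purely to keep the example short.

```python
import array
import socket


def send_fd(master, fd):
    """Pass one file descriptor over an AF_UNIX socket using SCM_RIGHTS."""
    rights = array.array("i", [fd])
    master.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])


def recv_fd(master):
    """Receive one passed descriptor; return None on EOF."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = master.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    if not msg and not ancdata:
        return None  # peer closed the lifetime channel
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    return None


def client_connect(master):
    """Client side: make a fresh pair, hand one end over, keep the other."""
    mine, theirs = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_fd(master, theirs.fileno())
    theirs.close()  # the receiver now holds its own copy
    return mine


def agent_accept(master):
    """Agent side: treat a received fd like the result of accept()."""
    fd = recv_fd(master)
    return None if fd is None else socket.socket(fileno=fd)


# The lifetime channel; in the proposal the agent's end would be the fd
# named by GNUPG_AGENT_LIFETIME_FD.
agent_master, client_master = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

conn = client_connect(client_master)
served = agent_accept(agent_master)
conn.sendall(b"ping")
reply = served.recv(4)

client_master.close()              # the client goes away
eof = agent_accept(agent_master)   # the agent sees EOF and would exit
```

Even in this toy form the portability concern is visible: the sketch
depends on AF_UNIX socketpairs and SCM_RIGHTS, which are not available
on every platform.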

>> > In particular, `gpgconf --kill gpg-agent' should be documented.
>> 
>> Where would you like it documented?  it's in gpgconf(1):
>
> In gpg-agent(1) !

I've just submitted a documentation patch upstream:

   https://lists.gnupg.org/pipermail/gnupg-devel/2016-October/031833.html

> For now I have bodged this with a new schroot script
> /etc/schroot/setup.d/71killagent.  Attached.

This looks like an unusual way to do "killall gpg-agent", but that's ok
with me.

> This doesn't solve the problem when the test scripts are run ad-hoc.
> It won't solve the problem for other programs which set HOME (or
> GNUPGHOME) as part of test suites or whatever.

right, i agree that there is a general problem here, which is how to
deal with test suites that involve manipulation of gpg secret key
material.  It would be great for upstream to produce a clear set of
instructions that say "if you're dealing with gpg secret key material in
a test suite, here's what we recommend you do".

Regards,

  --dkg