[Cupt-devel] dpkg transactions

Guillem Jover guillem at debian.org
Tue Jan 13 20:39:36 UTC 2015


[ Adding apt and cupt back into the CCs, as this is frontend related,
  and I'm not sure if you all are subscribed to debian-dpkg. ]

Hi!

Ok, coming back to this, now that the big stuff has passed by. Be warned
this is a long reply, though. :)

On Sun, 2014-11-23 at 18:40:32 +0100, David Kalnischkies wrote:
> On Sat, Nov 15, 2014 at 06:24:16PM +0100, Guillem Jover wrote:
> > And w/o wanting to get tiresome with this, take into account that
> > frontends that use any of the dpkg --force-* options as normal course
> > of action, will most probably produce intermediate broken states. For
> 
> Well, that is kinda the point to have intermediate broken states. If
> not, we would all be just doing dpkg -i *.deb, right? But no, we ask for
> the unpack/configuration of specific packages to have these packages out
> of the "broken" state space early as for a user/system there isn't much
> of a difference if dpkg is called a thousand times and in between these
> calls the system state is broken or if dpkg is called only once, but the
> call just takes an hour to complete in which the system is broken.
> The important/visible difference is which packages are broken for how long.

My point is that there's never a need for intermediate broken states,
those are artificially inserted there by frontends. And dpkg
internally never produces those broken states.

For me frontends are there just to: 1) track and fetch package indices
from external repositories, 2) generate an upgrade solution that the
user is satisfied with from whatever logic seems best, 3) fetch the
selected packages from the external repository, 4) feed those packages
or archives to dpkg (possibly in an appropriate order) so that it can
install, upgrade or remove them.

> Also, you tend to disregard all disadvantages of the usage of
> selections: Many systems have all sorts of strange dormant selections
> already you really don't want to pick up, but also don't want to just
> discard – after all its user configuration.

Selections are an integral part of how dpkg works; the longer
frontends keep ignoring them, the more their world views diverge.

I have to point out, though, that there's quite a distinction between
the different selection states.

  hold:
    This one is going to be taken into account on upgrades and
    similar already anyway (except with --force-hold).
  deinstall, purge:
    These ones are also going to be taken into account already, and
    packages will be removed by dpkg if it finds some other action can
    only be performed by removing the offending package.
  install:
    This one can be split into two different instances. One I'll call
    immediate, which comes from the archives passed on the command
    line, and it's used by dpkg to know if it should perform some
    actions given that the package is or will be processed during this
    run. The other instance I'll call prospective, which is pretty
    much ignored by dpkg, as it cannot do anything about it; frontends
    are the ones that can provide the .deb archives from the repository
    to fulfill this selection. Basically only dselect does that
    currently, AFAIK.
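
To check which selections are already recorded on a system, the output
of «dpkg --get-selections» can be filtered by state. A minimal sketch,
assuming the usual whitespace-separated "package state" output; the
helper name filter_selections is made up for this example and is not
part of dpkg:

```shell
#!/bin/sh
# filter_selections STATE: read "package<whitespace>state" lines on stdin
# (the format printed by «dpkg --get-selections») and print the names of
# the packages whose selection state equals STATE.
# NOTE: filter_selections is a hypothetical helper, not a dpkg command.
filter_selections() {
    awk -v want="$1" '$2 == want { print $1 }'
}

# Typical read-only use on a real system (no root needed):
#   dpkg --get-selections | filter_selections hold
#   dpkg --get-selections | filter_selections deinstall
```

This only reads the database, so it is safe to run before deciding
whether any dormant selections need attention.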

> As a user on the other hand
> I don't expect apt to do more than it was asked for, which it does if we
> would do a "dpkg --remove --pending" and the user happened to have some
> deinstall selections already.¹

There's a difference between an explicit action like «apt install» or
«apt remove» and «apt full-upgrade». Only something like the latter
should probably be doing a «dpkg --remove --pending».

> Besides, it's hard to get the information
> I have fed into it back out again: Without ever setting deinstall
> I have e.g. plenty of them on my system already (= all rc packages).

I'm not sure I understand this comment; packages in «rc» have been
removed, and as such were requested for deinstall, which is the
current state they are in.

> Which leads me to my last point: I think its a horrible interface.

It's maybe a bad interface if including the "install" states because
there's also only support for one available file, and one possible
version candidate from that available file, but you can probably ignore
those for now. This part of the interaction between dpkg and frontends
does not worry me much for now. I'd like to ideally provide something
better in the future, but that will require design work. What I'm
proposing is already there.

Or are there other parts of the interface that you do feel are not
good? I'd be really interested to know.

> It is not like I would be telling dpkg beforehand which packages I
> am going to install, so why are removes that special?

As mentioned above, what to install is mostly information for the
frontend. But what to remove is a very important part of what dpkg
needs to know beforehand, because it might need to decide that it
can solve a dependency problem or similar by removing the offending
package, w/o requiring apt to place the packaging system in a broken
state with --force-* options. dpkg is mostly concerned with what is
in front of it, stuff that it does not see (like future packages to
unpack) are mostly irrelevant to it, because they don't help right
now when, say, satisfying dependencies and similar.

Telling dpkg beforehand what we want to remove allows for example to
safely and cleanly switch from mail-transport-agents w/o any force
option, nor any scary message from dpkg.

And if frontends started setting up selections for the list of packages
to remove, dpkg would stop emitting those scary aforementioned messages
on upgrade, which are yet another sign of the problematic usage of dpkg.

> And why do I have to share this information with everyone else on
> the machine, including the user?

I'm not sure I understand this comment either. You are just telling
dpkg, because at the end it's the one that has to remove them, and
might need to do that earlier than expected.

> You keep mentioning "dpkg transactions": I have really no idea what you
> mean by that, but my personal vision is that it means "dear dpkg, please
> install A, B & C and remove D, E & F. Do it in any order you see fit. If
> you are done, they should be all in the requested states, otherwise tell
> me with a failure (except if I said --no-triggers, then the trigger
> states are fine, too)." And while I am dreaming, lets say iteration two
> can also express "Oh, and if you can, prefer doing B first" (because
> that is openssh-server and user wants it operational at all times).
> That would allow apt (and presumably all other dpkg frontends) to drop
> all of the ordering code² and replace it with one giant dpkg call and
> everyone would be very happy as we now don't have more or less the same
> code in every frontend (+ dpkg itself), but in one place.

When I talk about “dpkg transactions” (probably abusing the
terminology a bit :), I mean giving dpkg enough information so
that it can perform its actions on this current frontend run w/o any
force options. This basically implies feeding deinstall and purge
selections (setting up the “transaction”), before proceeding with
any other part of the frontend run, which is then kind of committing
the changes into dpkg state and filesystem. Hope that clarifies. Or I
can stop using that terminology and just say selections. :)
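
Setting up such a “transaction” amounts to writing "package state"
lines to «dpkg --set-selections». A rough sketch of how a frontend
could generate them; make_selections is a hypothetical helper, and the
package names in the usage comment are placeholders only:

```shell
#!/bin/sh
# make_selections STATE PKG...: print one "package state" line per
# package, which is the input format read by «dpkg --set-selections».
# NOTE: make_selections is a hypothetical helper for illustration.
make_selections() {
    state=$1
    shift
    for pkg in "$@"; do
        printf '%s %s\n' "$pkg" "$state"
    done
}

# Feeding the selections for the current run (as root) could look like:
#   { make_selections deinstall exim4-daemon-light
#     make_selections purge some-old-package
#   } | dpkg --set-selections
```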

The only thing that dpkg cannot (currently) do by itself is to get
packages being Pre-Depended on, fully installed first. And that's one
of the things that a frontend needs to spoon feed to dpkg. After those
are in place, yes, a huge dpkg call unpacking everything else, and then
another configuring everything else should be all that's needed.

The case you mention about wanting to reduce the time between unpack
and configure for specific packages (like openssh-server), I think it's
purely a packaging decision, if one wants minimal downtime, then the
packaging should be done in a way that the daemon gets just restarted
on configure, and not stopped and then started after a while.

> In reality on the other hand I have to micromanage dpkg with individual
> commands in the right order, which I can only guess as I don't have all
> the information (triggers, maintainerscripts, fileconflicts, …) at hand
> to do it properly. And now, you are suggesting that I should make some
> of these commands be global "orders of the day" where I have no idea
> when they are executed, but it is nonetheless my responsibility to cancel
> the orders if something goes wrong during the day while micromanaging the
> rest.³ So that this is effectively increasing the amount of micromanaging
> I have to perform as I have to keep track of the status of the orders of
> the day now as well…

I'm not sure where I seemed to suggest that, but that's not it. I'd
like to get frontends to reverse course, and in fact would like them
to stop micromanaging dpkg.

The way to interact with dpkg is broadly speaking:

  * Feed deinstall and purge selections for the current run to
    «dpkg --set-selections».
  * Deal with Pre-Depend'ed packages:
    - Unpack any packages (and their transitive dependencies) being
      Pre-Depended on by any package to be installed.
    - Configure them with «dpkg --configure --pending».
  * Unpack the rest in a big dpkg call.
  * Configure the rest simply with «dpkg --configure --pending».

That's it, there should be no need to micromanage (except maybe for
the Pre-Depends part), no need for force options, etc. The system will
always be in a consistent state, one where it can be rebooted at any
moment, and from where dpkg can recover as well at any point. Most of
the ordering is performed by dpkg, dealing with much of the stuff.
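
The steps above can be sketched as a small shell function. This is
only an illustration under stated assumptions: the names run and
frontend_run, the RUN variable and the way the .deb lists are passed
in are all invented for this example, and by default it just prints
the dpkg commands instead of executing them:

```shell
#!/bin/sh
# run CMD...: print the command, or execute it when RUN=1 (needs root).
# NOTE: run, frontend_run and RUN are hypothetical names.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# frontend_run PREDEP_DEBS REST_DEBS
#   stdin:       "package state" lines (deinstall and purge selections)
#   PREDEP_DEBS: space-separated .deb files that are Pre-Depended on,
#                together with their transitive dependencies
#   REST_DEBS:   space-separated .deb files for everything else
# Paths containing whitespace are not supported by this sketch.
frontend_run() {
    predeps=${1:-}
    rest=${2:-}

    # 1. Feed the deinstall and purge selections for this run.
    run dpkg --set-selections

    # 2. Pre-Depends: unpack them first, then configure them.
    if [ -n "$predeps" ]; then
        # Word splitting of the list is intentional here.
        run dpkg --unpack $predeps
        run dpkg --configure --pending
    fi

    # 3. Unpack the rest in one big dpkg call.
    if [ -n "$rest" ]; then
        run dpkg --unpack $rest
    fi

    # 4. Configure whatever is still pending.
    run dpkg --configure --pending
}
```

Note that no --force-* option appears anywhere, and the only ordering
decision left to the frontend is the Pre-Depends step.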

Ideally even the Pre-Depends case would be handled by dpkg, but that
would require support for true transactions, probably with help from
--command-fd or similar. But the above would be a huge improvement over
the current situation.

> ¹ On an intellectual level I have the same issue with all --pending
> operations, just that apt is described as going from one good state to
> another, so I have less problems saying that its okay for it to fix up
> all the mess currently present, rather than limiting itself to what it
> was explicitly told to do, with e.g. --configure --pending as this
> improves the situation. Unneeded removals on the other hand…

Well, something I've always found a bit perplexing is that when apt
finds that the dpkg status is in an inconsistent state, it requests
the user to run «dpkg --configure -a», which seems odd to me (instead
of doing it itself :).

Packages selected for removal should have been requested by the user
at some point, if frontends were taking those into account.

> ³ not to mention that apts orders are promoted by e.g. a power failure
> to direct user orders because I obviously can't revert them in that
> case while all install requests are lost. Not exactly ideal.

I'm not sure I follow what you mean here. But it feels to me you are
not happy that apt is creating broken dpkg states, so it needs to
reduce the time those are active, and as such has to create small
installation groups, in effect increasing the self-inflicted pain? :)

Also in the dpkg world, the normal way to fix install/upgrade/remove
problems is usually to just upgrade to a fixed package, and not to
revert the transaction, because to begin with, downgrades are not
really supported. But maybe I misunderstood and you meant something
else.

In any case due to the force usage, currently I've seen apt do stuff
like:

  force remove libaudit0
  install libaudit1
  upgrade package linked against libaudit0

which makes not only the packaging state broken but also the filesystem,
and a power failure before the upgrade of the affected package will
make it inoperable.

> Also, temporary removes become very permanent this way. Very bad.

dpkg should never perform "temporary removes", it should only possibly
remove things that it has been told will no longer be needed at the end
of the install/upgrade/remove run. Or did I misunderstand this?


I'll be documenting the ideal way for frontends to interact with dpkg
in /usr/share/doc/dpkg-dev/frontend.txt, but please let me know if
there are specific implications or details of how dpkg works internally
that are not clear, and I'll be glad to clarify or document properly.

Thanks,
Guillem
