[Debian-ha-maintainers] Split of HA agents into multiple binary packages

Lucas Kanashiro kanashiro at ubuntu.com
Thu Mar 23 13:08:57 GMT 2023


Hi,

First of all, thanks for the feedback Valentin!

On 22/03/2023 19:13, Valentin Vidic wrote:
> Yes, I agree the current setup for agents is a simplistic one, but it
> is also quite flexible and has worked for many years. I don't remember
> users opening bugs about this before, maybe there are requests for this
> from Ubuntu users? As for RedHat, I think they have more resources to
> work on this, so maybe a more complex setup is acceptable there.

On the Ubuntu side, we are trying to improve the HA ecosystem, and I 
believe we can work together with the Debian maintainers (as we are 
already doing with some upstream projects, like pcs) to benefit both 
projects by sharing the maintenance load. I really think many users are 
already used to the RedHat way of doing things, so at least trying to 
follow what they are doing is a good start IMHO.

> In general, my thinking is that a production cluster setup is not something
> you can do in an hour just installing a bunch of packages on a few
> machines. It requires a lot of time for testing and tuning the setup.
> Also since many machines are required, you probably need some form of
> automation like Ansible to set it up and there is not much benefit from
> helping these advanced users here as they would just spend more time
> fighting the more rigid packaging setup.

I see your point. On the other hand, in Debian we try to make things 
work out of the box right after installation (thinking about other 
packages, we start services and so on), so I believe we should ship 
binary packages that just work once installed. Advanced users will know 
how to work around their situation. Regarding pulling in dependencies, 
for instance, we can make them Recommends, so advanced users can avoid 
installing them via apt options/config. Now imagine a new/junior 
sysadmin learning the HA bits: it is harder for them to identify what is 
needed by themselves, and they will install a package only to find that 
even the needed dependencies are not in place (which would be 
frustrating IMO, as it was for me when I started to touch those 
packages).
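To make the Recommends idea concrete, here is a rough sketch of what a 
per-agent stanza in debian/control could look like (the package split 
and the ipmitool dependency are illustrative assumptions, not the exact 
contents of my proposal):

```
Package: fence-agents-ipmilan
Architecture: all
Depends: ${misc:Depends}, ${python3:Depends}
Recommends: ipmitool
Description: IPMI fence agent for Pacemaker clusters
 Ships fence_ipmilan, which powers nodes off through their
 IPMI management interface.
```

An advanced user who does not want the extra dependency can then run 
`apt install --no-install-recommends fence-agents-ipmilan`, or set 
`APT::Install-Recommends "false";` in their apt configuration, while 
everyone else gets a working agent out of the box.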

> I would think it will be very difficult to test and curate fence-agents
> because most of them require special hardware and can't be automated
> easily. Debian definitely does not have resources to do this.

I agree. What we have curated so far does not require special hardware, 
or we were able to simulate it in some way. However, this is something I 
can try to sort out on the Ubuntu side if needed.

> Also, I don't see that resource-agents are split in the upstream spec
> file and not sure how this would work at all? For example if there is
> resource-agents-nginx and resource-agents-apache they will probably
> fail to install at the same time because of the port conflict. Maybe
> the proposal here is to remove resource-agents package and only go
> with specific (more than 100) packages? Another approach would be
> to have a resource-agents-nginx package that depends on resource-agents
> and nginx. But the question again is, how do users find out this package
> exists at all and is it really useful if you can do the same thing in
> few lines of Ansible (or some other automation).

Right, the nature of resource-agents is indeed different from that of 
fence-agents. For fence-agents, I believe there is a need to have 
everything in place (multiple agents do not depend on services running 
on the same port, as we have with resource-agents), because you will run 
the script on a remote node, which might not have what is needed, to 
"shoot the target node in the head". So I think the needed dependencies 
should be installed by the package, to ease the setup of the fencing 
node. On the other hand, a resource agent will likely run on the same 
node as the resource in question, which will likely already have 
everything set up. For instance, if we want to manage an nginx instance, 
the sysadmin will likely have set it up beforehand, and when the agent 
is installed there is no need to pull in the nginx package again. I 
believe this is the reasoning RedHat maintainers used for not splitting 
resource-agents.

> One advantage of the current resource-agents package is also that you
> don't need to use the Debian package for e.g. nginx if you need a newer
> version for some reason. With a strict dependency this is not possible
> anymore.

Makes sense. This seems to be a valid use case to be considered.


> For fence-agents this introduces around 70 new packages. I don't think
> this will help with discoverability and most users will just install
> fence-agents that will pull in everything again. Not to mention that
> pacemaker recommends fence-agents, again causing everything to be
> installed. Also since the size of fence-agents is really small (250KB),
> even with dependencies I don't expect more than few MB to be saved, so
> the disk usage argument also does not hold.

I think a good point here is compatibility with the RedHat world: 
people managing multiple nodes running multiple OSes will not have the 
cognitive load of keeping track of the differences between them. I have 
not checked the dependencies of pacemaker in RedHat, but we could try to 
follow them as well if we think it is reasonable.

From what I have seen, some users really want to have installed just 
the things they are using, so giving them the possibility to filter out 
everything they do not use (in this case, agents that are not needed) is 
a good thing. Users will still be able to install all fence agents at 
once (fence-agents becomes a metapackage in my proposal), but the ones 
needing/requiring just a few of them can be satisfied as well. As for 
discoverability, we will of course add those changes to the release 
notes and maybe to the HA docs; the users with more constraints will go 
after this information, I guess, and apt search is their friend as well :)
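Concretely, the metapackage could be an empty package whose only job is 
to pull in the per-agent packages, roughly like this (the individual 
package names below are illustrative):

```
Package: fence-agents
Architecture: all
Depends: fence-agents-ipmilan,
         fence-agents-scsi,
         fence-agents-virsh
Description: fence agents for Pacemaker clusters (metapackage)
 Installs all of the individual fence-agents-* packages.
```

And something like `apt search fence-agents-` would then enumerate the 
individual agents for users who only want a subset.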

> Additionally, every time a new agent is introduced we need to go through
> the NEW queue for approval, further complicating the package updates for
> us. Same holds for agent removal since we need to file a bug to remove a
> binary package from the pool.

This is true. However, I do not remember the last time a new agent was 
added TBH; it might have been a while ago. As I mentioned, the upstream 
project is quite stable and I do not think it will require much of this 
type of work. To be fair, I have seen some deprecations (which would 
lead to removals), but still, it does not happen very often. Please 
correct me if I am wrong; this is just my impression since I started to 
track those things.

> The change looks ok to me, but as I explained above, I see a lot of
> complexity and downsides with this going forward and not that much
> benefit. So for Debian I would choose a setup that is less complex and
> easier to maintain, but still works for the majority of users.

OK. Thanks for taking a look at the Salsa MR anyway. I tried to give you 
another perspective above, and I hope you will reconsider. IMHO it would 
improve the current state, and we would be able to satisfy a "bigger" 
majority of our users, if I can say so :)

> Instead, what I would like to have more is automated testing with
> autopkgtests, Ansible or some other tool that we could use to test the
> whole stack in more complex scenarios. For example, I only recently
> discovered through manual testing that ocfs2 does not work anymore.

We can definitely work together on that. As I pointed out in my initial 
email, we are already trying to do that, starting from less complex 
scenarios, but the goal is to cover even more. We can start a separate 
thread to discuss this in more detail.
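For example, a first superficial autopkgtest could just check that every 
shipped agent still starts and prints its metadata. A sketch, assuming 
the agents keep installing into /usr/sbin and accept the standard 
`-o metadata` action:

```
# debian/tests/control
Tests: agents-smoke
Depends: fence-agents
Restrictions: superficial
```

```
#!/bin/sh
# debian/tests/agents-smoke
set -e
for agent in /usr/sbin/fence_*; do
    # every agent should at least answer -o metadata without crashing
    "$agent" -o metadata >/dev/null
done
```

It would not exercise real fencing, of course, but it would catch 
missing dependencies and interpreter breakage, which is exactly the kind 
of regression the split packages are meant to avoid.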

Thanks for reading and considering everything, appreciated!

-- 
Lucas Kanashiro



