How to cope with patches sanely

Mon Feb 25 10:18:15 UTC 2008

On Mon, 25 Feb 2008 10:33:48 +0100, Pierre Habouzit <madcoder at debian.org> said: 

> On Mon, Feb 25, 2008 at 03:37:07AM +0000, Manoj Srivastava wrote:
>> On Sun, 24 Feb 2008 21:17:10 -0500, David Nusinow
>> <dnusinow at speakeasy.net> said:
>> 
>> > On Sun, Feb 24, 2008 at 06:08:17PM -0800, Russ Allbery wrote:
>> >> David Nusinow <dnusinow at speakeasy.net> writes:
>> >> 
>> >> > The problem is that you and Manoj assume that this is the only
>> >> > way to do things. I don't believe this. Pierre Habouzit has been
>> >> > experimenting with an alternative method of feature branches
>> >> > that exports to a linear stack of diffs just fine. Just because
>> >> > Manoj is doing something one way right now doesn't mean it's the
>> >> > only or even the correct way to do it.
>> 
>> I would be interested in details of this, and whether this approach
>> works with pure feature branches where the features are being
>> developed contemporaneously with each other and upstream development;
>> and thus the branches overlap both temporally and in code space.

>   I'm planning to write a textual version of what I demonstrated at
> FOSDEM, with some more ideas that I had talking with Julien Cristau on
> the grass after.

>   You developed them contemporaneously, okay, but in the end you
> merge them one after the other.

        No, I do not. I developed feature A a bit, merged that. Then I
 developed feature B a bit, and merged that. Then I developed feature A
 some more. Then there came an upstream version. Then feature B ...

> If you're doing criss-cross merges, well, I can do nothing for you, and
> you're creating really messy histories, and yes, you need an SCM to
> represent that in a satisfying way.

        Thanks. So most of my packages will not get any help from the
 tool you are talking about -- and thus, it can't be made into a policy
 requirement.

> But if you really merge one feature branch on top of the other, and
> it's in my experience *way enough* for what we need in Debian
> packaging, then multiple branches are just multiple series to be
> applied in a specific relative order.

        But that is not how development happens in long running sets of
 features, which have been under development, incrementally, over a
 large number of upstream versions.


>   When it comes to specific patches of yours, I really believe that
> topic branches like you advertise them are the best answer. Git makes
> merging easy (s/Git/reasonable $DSCM/ for this matter btw) in the
> sense that merging is fast enough, and easy enough when the branches
> you merge have not diverged too much. I mean, no matter which SCM you
> use, merging from a branch that is _very_ old, and still not merged
> upstream is just a pain.

        Depends. I keep the topic branches updated with each upstream
 release, and I have carried fvwm/make/flex patches around for years and
 several upstream upgrades, and not had many problems.  Indeed, most
 upstream upgrades have taken _no_ manual effort.

> And it's again not an SCM issue. A patch queue _is_ a branch in
> itself. Really. There are two ways to look at that. Either you say, I
> always want to remember I started from this point, and then you merge
> and merge and merge, and your history looks like that:

> R are upstream releases, M your repeated merges to keep the feature
> branch current.

> -o---o---o--[...]--R--[...]--R--[...]--o--
>   \                \         \
>    p--p--p--p-------M---------M----...[feature branch]

        Right.

> Well with this approach, upstream will have to take a messy history
> with a _lot_ of merge points they don't care about, and won't be able
> to try your feature branch on top of their current work and maybe
> eventually adopt it. And worse, if you have to add new patches along
> the way, you get a history with a mixed suite of patches and merges,
> which is unreadable to upstream.

        Heh. Most of my upstream are fed just a patch, since lots of
 them are using CVS.

        In Git terms, I always rebase my patch on each upstream version,
 and can then feed a nice, coherent, minimal patch with no real complex
 history.

> The other way is to forget about giving depth *in* the SCM to the
> patches history. Because it's what it's about. What you really want
> IMSHO is: I have this patch queue [pq] and at upstream R0 it was in
> state pq0, in upstream R1 it was in state pq1 and so on. Without any
> useless merge points in it. This way your feature branch is a free (as
> in only attached to history by its base) branch that you rewrite for
> each new upstream, serialize under debian/patches/<featurebranch>. In
> git, there is this awesome git-quiltimport command that allows you to
> rebuild a branch from a quilt series in a trivial way. If you want to
> work on the patch queue, you just need to make it a branch again, do
> your stuff, serialize it again, and you're done.
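As a rough illustration of the serialize-and-rebuild cycle described above, here is a minimal Python sketch of writing a patch queue out as a quilt-style series and reading the order back. It is not git-quiltimport itself. The function names, the patch names, and the "feature-a" directory are hypothetical, and the parsing assumes the usual quilt convention of one patch name per line, '#' comment lines, and optional options after the name.

```python
import tempfile
from pathlib import Path


def write_series(patch_dir, patches):
    """Write each (name, text) patch plus a quilt-style 'series' index file."""
    patch_dir = Path(patch_dir)
    patch_dir.mkdir(parents=True, exist_ok=True)
    for name, text in patches:
        (patch_dir / name).write_text(text)
    # One patch name per line, in application order.
    (patch_dir / "series").write_text("".join(name + "\n" for name, _ in patches))


def read_series(patch_dir):
    """Return patch names in application order, skipping blanks and '#' lines."""
    names = []
    for line in (Path(patch_dir) / "series").read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Keep only the file name; drop any trailing options such as -p1.
        names.append(line.split()[0])
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical queue for one feature branch.
        queue_dir = Path(tmp) / "debian" / "patches" / "feature-a"
        write_series(queue_dir, [
            ("01-first-step.patch", "placeholder diff text\n"),
            ("02-second-step.patch", "placeholder diff text\n"),
        ])
        print(read_series(queue_dir))
```

The point of the sketch is only that the queue is an ordered list: the order in the series file is the order in which the patches would be replayed onto a new upstream base.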

        Err, I see little benefit in doing that; and I think that I
 prefer my current  feature branch mechanism as being less hassle. I
 periodically build and test each feature branch (this is why having
 ./debian as a submodule is very useful), and this is easily done as a
 feature branch. 

>   While doing that, your workflow allows people to do meaningful
> changes to your package (by adding patches to a given queue), that
> you'll transparently *painlessly* import into your workflow. Whereas
> with your current one, you'll have to extract whatever the NMUer did
> that is a flat debdiff, and split it. It's horrible for you, please
> don't pretend otherwise, I won't believe you.

        NMUs to my package happen seldom enough, and the patches
 applied are simple enough, for this not to be a consideration for me.

> The other gain, is that upstream can look at a current, unencumbered
> patch queue about the feature you added, and can take a decent
> decision about the fact that it's good to take upstream or not, and
> it's trivial to export such a branch to upstream:

        It is even easier to generate a feature patch, since each of my
 branches is built and tested, and generating a diff  against the
 upstream branch is trivial. This is something that I think I can do
 better with a feature branch than trying to extract a patch from the
 middle of a linear set of patches that all depend on every other feature
 prior to it in the series.
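To make the "diff against the upstream branch" step concrete, here is a small Python sketch that uses the standard difflib module to produce one unified diff for a feature. It models a branch as a single file's text, which is a simplification; the function name feature_patch and the sample file contents are made up for illustration, and a real workflow would use the SCM's own diff command.

```python
import difflib


def feature_patch(upstream, feature, path):
    """Return a unified diff turning the upstream text of `path` into the feature text."""
    return "".join(
        difflib.unified_diff(
            upstream.splitlines(keepends=True),
            feature.splitlines(keepends=True),
            fromfile="a/" + path,
            tofile="b/" + path,
        )
    )


if __name__ == "__main__":
    # Hypothetical upstream file and the same file on a feature branch.
    upstream = "int main(void)\n{\n    return 0;\n}\n"
    feature = 'int main(void)\n{\n    puts("hi");\n    return 0;\n}\n'
    print(feature_patch(upstream, feature, "main.c"), end="")
```

Because the diff is taken directly between the upstream state and the feature state, it does not depend on any other feature having been applied first.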

>   Last but not least, what I recommend here for packaging would be a
> complete hell if you diverge a _lot_ from your upstream. But if you
> diverge a lot, then you should rather officially start a fork, and do
> your feature branches and stuff here, because for long features and
> complicated ones I _do_ agree that it's the sole approach that makes
> sense, and then you'll package the fork. Of course, I don't pretend
> that for patch queues with 100 patches in it, rebasing is nice. It
> isn't. But I also assumed that you never have 100-patches-long
> feature branches at all.

        I have been packaging Debian packages for 12.5 years. Some of my
 packages have indeed diverged a lot, before the patches were merged
 upstream (flex, for one). As to the number of changes in my feature
 branches, well, not all of them have been trivial (especially for
 packages where we have effectively become upstream). As they have
 developed over the years, I probably have made dozens of commits to
 some of the feature branches. I could split those changes into a long
 series -- but since I generally feed them upstream as a single patch,
 I have not calculated this metric.

>   FWIW I'm willing to write a git-quiltexport tool generating
> basically those kind of stuff for you.

        Thanks. I currently use arch, but I have plans of using
 git-submodules to base my debian packaging on in the future -- perhaps
 around the time we release lenny.

        manoj
-- 
Do not underestimate the power of the Force.
Manoj Srivastava <srivasta at debian.org> <http://www.debian.org/~srivasta/>  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C