[Debian-med-packaging] Bug#914814: spades: FTBFS with jemalloc/experimental

Andreas Tille andreas at fam-tille.de
Sat Jan 5 22:54:53 GMT 2019


Hi Anton,

thanks for your quick response.  I admit that if we have some evidence
that the modified source provides a relevant performance gain, we need
to reconsider using the Debian-packaged version.

Could you please point us to some relevant discussion with jemalloc
upstream?  We would include this in the package documentation to
explain why we deviate from Debian policy in this case.  On the
other hand, if you could provide your changes as patches against the
recent jemalloc version, we might be able to convince the jemalloc
Debian maintainer to accept them.

Kind regards

       Andreas.

On Sat, Jan 05, 2019 at 05:08:03PM +0300, Anton Korobeynikov wrote:
> Hi Andreas,
> 
> Thanks for the report.
> 
> SPAdes ships its own version of jemalloc for a reason: we modified
> and configured it exactly for SPAdes' needs. These changes make a
> huge difference in performance and in users' experience of doing
> genome assemblies with SPAdes.
> 
> Given that the SPAdes package Debian ships is already broken and
> limited due to local Debian patches, and the vendor is not willing
> to hear about our reasons and design choices, we do not actually
> care about these patches anymore.
> 
> Users can always build SPAdes from source, use pre-built binaries,
> or use other package managers like Linuxbrew or Conda.
> 
> Thanks for your understanding.
> 
> On Sat, Jan 5, 2019 at 2:35 PM Andreas Tille <andreas at an3as.eu> wrote:
> >
> > Hi,
> >
> > the Debian package is using the Debian packaged jemalloc as per Debian
> > policy which says you should not use code copies of packaged software.
> > Since there was a conflict with the new jemalloc ABI, there was a
> > bug report
> >
> >     http://bugs.debian.org/914814
> >
> > about breakage with jemalloc 5.1, which will soon be the default
> > in Debian.
> >
> > The attached mail gives some hints for the SPAdes developers,
> > which I am hereby forwarding.
> >
> > Kind regards
> >
> >       Andreas.
> >
> > ----- Forwarded message from Faidon Liambotis <paravoid at debian.org> -----
> >
> > Date: Sat, 5 Jan 2019 11:05:38 +0200
> > From: Faidon Liambotis <paravoid at debian.org>
> > To: Andreas Tille <tille at debian.org>, 914814 at bugs.debian.org
> > Cc: Adam Borowski <kilobyte at angband.pl>
> > Subject: Re: Bug#914814: spades: FTBFS with jemalloc/experimental
> >
> > tags 914814 - help + patch
> > thanks
> >
> > Hi there,
> >
> > jemalloc (wannabe) maintainer here :) Thanks Adam for filing this!
> >
> > On Sat, Dec 01, 2018 at 07:34:45AM +0100, Andreas Tille wrote:
> > > SPAdes originally carries a code copy.  I'm afraid upstream will
> > > advise using the code they are shipping, which we want to avoid.
> > > Could you possibly provide some patch which I could use and
> > > provide to upstream once tested?  Unfortunately I have no idea
> > > about jemalloc.
> >
> > It looks like spades is using rallocm(), which was an experimental API
> > that was deprecated in 3.5.0 in favor of the *allocx variants, and
> > eventually removed in 4.0. (Debian currently carries 3.6.0, and the
> > intention is to move to 5.1.0.)
> >
> > The whole block is a bit odd, as it's trying to grow the allocation
> > in-place and then fall back to malloc/memcpy/free... which is really
> > something that the allocator (any allocator!) can just do automatically
> > with (je_)realloc(). Unfortunately the class's method is already poorly
> > named "realloc" and conflicts with a realloc() invocation. While
> > upstream renaming that method to avoid the conflict would be a trivial
> > fix, it would be a more invasive patch than I'd be comfortable
> > carrying as a Debian patch.
> >
> > Therefore, attached is a patch that addresses this using the
> > je_rallocx API. It also deals with the deprecation of the
> > stats.cactive statistic (as of jemalloc 4.0) by using stats.active
> > instead.
> >
> > The patch should be 3.6.0-compatible as well and can go in any time,
> > ahead of a potential jemalloc transition. It is lightly tested (i.e.
> > it builds and runs the test suite).
> >
> > More broadly, my recommendation to upstream would be to drop the ancient
> > (and patched?) embedded copy of jemalloc, use up-to-date standard API
> > interfaces (malloc/realloc) and opportunistically use the jemalloc
> > allocator when present without any special calls to it other than
> > perhaps the mallctl stats.
> >
> > Regards,
> > Faidon
> >
> > --- a/src/common/adt/kmer_vector.hpp
> > +++ b/src/common/adt/kmer_vector.hpp
> > @@ -26,18 +26,12 @@ private:
> >
> >      ElTy *realloc() {
> >  #ifdef SPADES_USE_JEMALLOC
> > -        // First, try to expand in-place
> > -        if (storage_ && sizeof(ElTy) * capacity_ * el_sz_ > 4096 &&
> > -            je_rallocm((void **) &storage_, NULL, sizeof(ElTy) * capacity_ * el_sz_, 0, ALLOCM_NO_MOVE) ==
> > -            ALLOCM_SUCCESS)
> > -            return storage_;
> > -
> > -        // Failed, do usual malloc / memcpy / free cycle
> > -        ElTy *res = (ElTy *) je_malloc(sizeof(ElTy) * capacity_ * el_sz_);
> > -        if (storage_)
> > -            std::memcpy(res, storage_, size_ * sizeof(ElTy) * el_sz_);
> > -        je_free(storage_);
> > -        storage_ = res;
> > +        // could simply be je_realloc(), if the realloc name wasn't redefined here
> > +        if (storage_) {
> > +            storage_ = (ElTy *) je_rallocx(storage_, sizeof(ElTy) * capacity_ * el_sz_, 0);
> > +        } else {
> > +            storage_ = (ElTy *) je_malloc(sizeof(ElTy) * capacity_ * el_sz_);
> > +        }
> >  #else
> >          // No JEMalloc, no cookies
> >          ElTy *res = new ElTy[capacity_ * el_sz_];
> > --- a/src/common/utils/logger/logger_impl.cpp
> > +++ b/src/common/utils/logger/logger_impl.cpp
> > @@ -106,17 +106,13 @@ void logger::log(level desired_level, const char* file, size_t line_num, const c
> >    size_t max_rss;
> >
> >  #ifdef SPADES_USE_JEMALLOC
> > -  const size_t *cmem = 0;//, *cmem_max = 0;
> > +  size_t cmem = 0;
> >    size_t clen = sizeof(cmem);
> >
> > -  je_mallctl("stats.cactive", &cmem, &clen, NULL, 0);
> > -  //je_mallctl("stats.cactive_max", &cmem_max, &clen, NULL, 0);
> > -  mem = (*cmem) / 1024;
> > -  //max_rss = (*cmem_max) / 1024;
> > -  max_rss = utils::get_max_rss();
> > -#else
> > -  max_rss = utils::get_max_rss();
> > +  je_mallctl("stats.active", (void *)&cmem, &clen, NULL, 0);
> > +  mem = cmem / 1024;
> >  #endif
> > +  max_rss = utils::get_max_rss();
> >
> >    for (auto it = writers_.begin(); it != writers_.end(); ++it)
> >      (*it)->write_msg(time, mem, max_rss, desired_level, file, line_num, source, msg);
> > --- a/src/common/utils/memory_limit.cpp
> > +++ b/src/common/utils/memory_limit.cpp
> > @@ -83,11 +83,11 @@ size_t get_max_rss() {
> >
> >  size_t get_used_memory() {
> >  #ifdef SPADES_USE_JEMALLOC
> > -    const size_t *cmem = 0;
> > +    size_t cmem;
> >      size_t clen = sizeof(cmem);
> >
> > -    je_mallctl("stats.cactive", &cmem, &clen, NULL, 0);
> > -    return *cmem;
> > +    je_mallctl("stats.active", &cmem, &clen, NULL, 0);
> > +    return cmem;
> >  #else
> >      return get_max_rss();
> >  #endif
> > --
> > 2.20.1
> >
> >
> >
> > ----- End forwarded message -----
> >
> > --
> > http://fam-tille.de
> 
> 
> 
> -- 
> With best regards, Anton Korobeynikov
> Department of Statistical Modelling, Saint Petersburg State University
> 

-- 
http://fam-tille.de



More information about the Debian-med-packaging mailing list