[med-svn] [Git][med-team/libmaus2][master] 9 commits: routine-update: New upstream version
Étienne Mollier (@emollier)
gitlab at salsa.debian.org
Thu Sep 2 18:33:11 BST 2021
Étienne Mollier pushed to branch master at Debian Med / libmaus2
Commits:
5b9566e3 by Étienne Mollier at 2021-08-29T22:10:52+02:00
routine-update: New upstream version
- - - - -
ca1d2846 by Étienne Mollier at 2021-08-29T22:10:53+02:00
New upstream version 2.0.801+dfsg
- - - - -
055b2a61 by Étienne Mollier at 2021-08-29T22:10:59+02:00
Update upstream source from tag 'upstream/2.0.801+dfsg'
Update to upstream version '2.0.801+dfsg'
with Debian dir 894a8b6a171399fe3c7dc619f14a42ba64a500f4
- - - - -
9f989f99 by Étienne Mollier at 2021-08-30T20:55:38+02:00
routine-update: New upstream version
- - - - -
30ae1f9f by Étienne Mollier at 2021-08-30T20:55:39+02:00
New upstream version 2.0.802+dfsg
- - - - -
3c4b8c9e by Étienne Mollier at 2021-08-30T20:55:45+02:00
Update upstream source from tag 'upstream/2.0.802+dfsg'
Update to upstream version '2.0.802+dfsg'
with Debian dir eda50e9aa7959e88cc4c87da93dd753b4db52e78
- - - - -
877df6b7 by Étienne Mollier at 2021-08-30T20:55:51+02:00
Remove duplicate line from changelog.
Changes-By: lintian-brush
- - - - -
1b99e86b by Étienne Mollier at 2021-08-30T21:16:26+02:00
routine-update: Ready to upload to unstable
- - - - -
0cbf640b by Étienne Mollier at 2021-08-30T21:26:02+02:00
update changelog
- - - - -
20 changed files:
- .gitignore
- ChangeLog
- configure.ac
- debian/changelog
- − debian/patches/fix-build-on-non-x86.patch
- − debian/patches/series
- src/Makefile.am
- src/libmaus2/lcs/AlignmentOneAgainstManyAVX2.cpp
- src/libmaus2/lz/BgzfDeflateHeaderFunctions.cpp
- src/libmaus2/parallel/threadpool/ThreadPool.hpp
- src/libmaus2/parallel/threadpool/bam/BamHeaderSeqData.hpp
- src/libmaus2/parallel/threadpool/bgzf/BgzfBlockInfoEnqueDecompressHandler.hpp
- + src/libmaus2/parallel/threadpool/bgzf/BgzfCompressContext.cpp
- + src/libmaus2/parallel/threadpool/bgzf/BgzfCompressContext.hpp
- + src/libmaus2/parallel/threadpool/bgzf/BgzfDecompressContext.cpp
- src/libmaus2/parallel/threadpool/bgzf/BgzfDecompressContext.hpp
- + src/libmaus2/sorting/RankSearch.hpp
- src/libmaus2/vcf/VCFParser.hpp
- src/test/testThreadPoolCramEncode.cpp
- src/test/testsort.cpp
Changes:
=====================================
.gitignore
=====================================
@@ -44,10 +44,12 @@ src/test[a-zA-Z]*
*/*/.dirstamp
*/*/*/.dirstamp
*/*/*/*/.dirstamp
+*/*/*/*/*/.dirstamp
*/.deps
*/*/.deps
*/*/*/.deps
*/*/*/*/.deps
+*/*/*/*/*/.deps
log*
compile_gcc_*.sh
compile.sh
=====================================
ChangeLog
=====================================
@@ -1,3 +1,32 @@
+libmaus2 (2.0.802-1) unstable; urgency=medium
+
+ * Fix compilation without libdeflate
+
+ -- German Tischler-Höhle <germant at miltenyibiotec.de> Mon, 30 Aug 2021 10:29:07 +0200
+
+libmaus2 (2.0.801-1) unstable; urgency=medium
+
+ * Do not try to instantiate non-existent context class in AlignmentOneAgainstManyAVX2 if LIBMAUS2_HAVE_ALIGNMENT_ONE_TO_MANY_AVX2 is not set
+ * Add thread pool based sorting interface in testsort
+ * Add ensureDispatcherRegistered method in ThreadPool
+ * Add delayed search merging in testsort
+ * Add more sorting/testing code in testsort.cpp
+ * Add planning code for parallel swapping of non-adjacent blocks in testsort
+ * Make error message more useful in VCFParser
+ * Add timing test for new sorting code
+ * Add some new sorting code in testsort
+ * Add computation of compressBound in BgzfCompressContext
+ * Add BgzfCompressContext class
+ * Reformat BgzfDecompressContext
+ * Replace numerical value by symbolic one in BgzfDeflateHeaderFunctions
+ * Construct missing REF_CACHE elements if required and possible in testThreadPoolCramEncode
+ * Use context interface for crc32 in BgzfBlockInfoEnqueDecompressHandler
+ * Add BgzfDecompressContext.cpp in Makefile.am
+ * Add support for libdeflate in BgzfDecompressContext
+ * Add sequencesMissing method in BamHeaderSeqData
+
+ -- German Tischler-Höhle <germant at miltenyibiotec.de> Fri, 27 Aug 2021 11:03:24 +0200
+
libmaus2 (2.0.800-1) unstable; urgency=medium
* Rename GetObject to GetCObject to avoid name clash on w32/w64
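The "delayed search merging" and "rank search" entries in the changelog above refer to splitting two sorted runs at a global rank so that independent slices can be merged in parallel. A minimal sketch of that split, for plain `std::vector<int>` rather than the templated iterators of `RankSearch.hpp` (names and signature are illustrative, not the library's API):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Given two sorted ranges a and b, find split positions (i, j) with
// i + j == p such that every element of a[0..i) and b[0..j) is <= every
// element of a[i..) and b[j..).  Each parallel merge worker can then
// merge its [p, q) slice of the output without touching the others.
std::pair<std::size_t, std::size_t>
rankSplit(std::vector<int> const &a, std::vector<int> const &b, std::size_t p)
{
    std::size_t lo = p > b.size() ? p - b.size() : 0;  // min elements from a
    std::size_t hi = std::min(p, a.size());            // max elements from a
    while (lo < hi)
    {
        std::size_t const i = (lo + hi) / 2;  // candidate count from a
        std::size_t const j = p - i;          // remainder comes from b
        if (j > 0 && i < a.size() && b[j - 1] > a[i])
            lo = i + 1;                       // need more elements of a
        else if (i > 0 && j < b.size() && a[i - 1] > b[j])
            hi = i;                           // need fewer elements of a
        else
            return {i, j};
    }
    return {lo, p - lo};
}
```

The actual `rankSearch`/`rankSearchMulti` code in the diff below additionally handles equal runs via `std::equal_range` and generalizes to more than two input ranges.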
=====================================
configure.ac
=====================================
@@ -1,5 +1,5 @@
-AC_INIT(libmaus2,2.0.800,[germant at miltenyibiotec.de],[libmaus2],[https://gitlab.com/german.tischler/libmaus2])
-LIBRARY_VERSION=2:800:0
+AC_INIT(libmaus2,2.0.802,[germant at miltenyibiotec.de],[libmaus2],[https://gitlab.com/german.tischler/libmaus2])
+LIBRARY_VERSION=2:802:0
AC_MSG_NOTICE([Configuring for source in directory ${srcdir}])
AC_CANONICAL_SYSTEM
AC_CANONICAL_HOST
=====================================
debian/changelog
=====================================
@@ -1,3 +1,10 @@
+libmaus2 (2.0.802+dfsg-1) unstable; urgency=medium
+
+ * New upstream version
+ * removed fix-build-on-non-x86.patch; bug fixed upstream.
+
+ -- Étienne Mollier <emollier at debian.org> Mon, 30 Aug 2021 20:57:35 +0200
+
libmaus2 (2.0.800+dfsg-3) unstable; urgency=medium
* d/control: Breaks: biobambam2 (<< 2.0.183~); testing migration caught that
=====================================
debian/patches/fix-build-on-non-x86.patch deleted
=====================================
@@ -1,24 +0,0 @@
-Description: fix build failures on non-x86 platforms
-Author: Étienne Mollier <emollier at debian.org>
-Forwarded: https://gitlab.com/german.tischler/libmaus2/-/issues/37
-Last-Update: 2021-08-26
----
-This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
---- libmaus2.orig/src/libmaus2/lcs/AlignmentOneAgainstManyAVX2.cpp
-+++ libmaus2/src/libmaus2/lcs/AlignmentOneAgainstManyAVX2.cpp
-@@ -312,7 +312,6 @@
- }
- }
- };
--#endif
-
- libmaus2::lcs::AlignmentOneAgainstManyAVX2::AlignmentOneAgainstManyAVX2()
- : context(new AlignmentOneAgainstManyAVX2Context,[](auto p){delete reinterpret_cast<AlignmentOneAgainstManyAVX2Context *>(p);})
-@@ -324,6 +323,7 @@
- throw lme;
- #endif
- }
-+#endif
-
- void libmaus2::lcs::AlignmentOneAgainstManyAVX2::process(
- uint8_t const * qa,
=====================================
debian/patches/series deleted
=====================================
@@ -1 +0,0 @@
-fix-build-on-non-x86.patch
=====================================
src/Makefile.am
=====================================
@@ -219,7 +219,9 @@ libmaus2_la_SOURCES = libmaus2/StaticInitialization.cpp \
libmaus2/aio/SecrecyInputStreamFactory.cpp \
libmaus2/aio/SecrecyOutputStreamFactory.cpp \
libmaus2/bambam/Scram.c \
- libmaus2/bambam/parallel/ScramCramEncoding.cpp
+ libmaus2/bambam/parallel/ScramCramEncoding.cpp \
+ libmaus2/parallel/threadpool/bgzf/BgzfDecompressContext.cpp \
+ libmaus2/parallel/threadpool/bgzf/BgzfCompressContext.cpp
libmaus2_la_CPPFLAGS = ${AM_CPPFLAGS} @LIBMAUSCPPFLAGS@ @SNAPPYCPPFLAGS@ @ZLIBCPPFLAGS@ @NETTLECPPFLAGS@ @IGZIPCPPFLAGS@ @GMPCPPFLAGS@ @LIBMAUS2_LZMA_CPPFLAGS@ @LIBDEFLATECPPFLAGS@ @LIBMAUS2_LIBCURL_CPPFLAGS@ @LIBMAUS2_LIBSECRECY_CPPFLAGS@ @LIBMAUS2_PARASAIL_CPPFLAGS@ @IOLIBCPPFLAGS@
libmaus2_la_CXXFLAGS = @LIBMAUSCXXFLAGS@ ${AM_CXXFLAGS} @WARNCXXFLAGS@
libmaus2_la_LDFLAGS = @LIBMAUSLDFLAGS@ @SNAPPYLDFLAGS@ -version-info ${LIBRARY_VERSION} @LIBMAUSSTATIC@ @IGZIPLDFLAGS@ @NETTLELDFLAGS@ @GMPLDFLAGS@ @LIBDEFLATELDFLAGS@ @IOLIBLDFLAGS@
@@ -1762,7 +1764,8 @@ libmaus2sorting_include_HEADERS=\
libmaus2/sorting/InterleavedRadixSort.hpp \
libmaus2/sorting/ParallelRunLengthRadixUnsort.hpp \
libmaus2/sorting/ParallelRunLengthRadixSort.hpp \
- libmaus2/sorting/ParallelExternalRadixSort.hpp
+ libmaus2/sorting/ParallelExternalRadixSort.hpp \
+ libmaus2/sorting/RankSearch.hpp
libmaus2hashing_includedir=$(includedir)/libmaus2/hashing
libmaus2hashing_include_HEADERS=\
=====================================
src/libmaus2/lcs/AlignmentOneAgainstManyAVX2.cpp
=====================================
@@ -315,7 +315,11 @@ struct AlignmentOneAgainstManyAVX2Context
#endif
libmaus2::lcs::AlignmentOneAgainstManyAVX2::AlignmentOneAgainstManyAVX2()
-: context(new AlignmentOneAgainstManyAVX2Context,[](auto p){delete reinterpret_cast<AlignmentOneAgainstManyAVX2Context *>(p);})
+: context(
+ #if defined(LIBMAUS2_HAVE_ALIGNMENT_ONE_TO_MANY_AVX2)
+ new AlignmentOneAgainstManyAVX2Context,[](auto p){delete reinterpret_cast<AlignmentOneAgainstManyAVX2Context *>(p);}
+ #endif
+ )
{
#if! defined(LIBMAUS2_HAVE_ALIGNMENT_ONE_TO_MANY_AVX2)
libmaus2::exception::LibMausException lme;
=====================================
src/libmaus2/lz/BgzfDeflateHeaderFunctions.cpp
=====================================
@@ -33,7 +33,7 @@ uint8_t const * libmaus2::lz::BgzfDeflateHeaderFunctions::fillHeaderFooter(
)
{
// set size of compressed block in header
- unsigned int const blocksize = getBgzfHeaderSize()/*header*/+8/*footer*/+payloadsize-1;
+ unsigned int const blocksize = getBgzfHeaderSize()/*header*/+getBgzfFooterSize()/*footer*/+payloadsize-1;
assert ( blocksize < getBgzfMaxBlockSize() );
outbuf[16] = (blocksize >> 0) & 0xFF;
outbuf[17] = (blocksize >> 8) & 0xFF;
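The hunk above replaces the magic number 8 with `getBgzfFooterSize()`. The field being filled is BGZF's BSIZE: the total block length minus one, stored little-endian in header bytes 16 and 17. A self-contained sketch of the write and the inverse read, using assumed constants (18-byte header, 8-byte footer, per the BGZF layout in the SAM specification) in place of the member functions:

```cpp
#include <cassert>
#include <cstdint>

// Assumed BGZF constants: 18-byte gzip header carrying the BC extra
// field, 8-byte footer (CRC32 + ISIZE).
constexpr unsigned kBgzfHeaderSize = 18;
constexpr unsigned kBgzfFooterSize = 8;

// Store BSIZE (total block length - 1) little-endian at bytes 16/17,
// mirroring fillHeaderFooter with named constants instead of literals.
void writeBsize(uint8_t *header, unsigned payloadsize)
{
    unsigned const blocksize = kBgzfHeaderSize + kBgzfFooterSize + payloadsize - 1;
    header[16] = blocksize & 0xFF;
    header[17] = (blocksize >> 8) & 0xFF;
}

// Recover the full block length from a BGZF header.
unsigned readBlockSize(uint8_t const *header)
{
    return (unsigned(header[16]) | (unsigned(header[17]) << 8)) + 1;
}
```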
=====================================
src/libmaus2/parallel/threadpool/ThreadPool.hpp
=====================================
@@ -60,6 +60,7 @@ namespace libmaus2
std::atomic<int> panicFlag;
std::atomic<std::size_t> seq;
+ std::mutex dispatcherMapLock;
libmaus2::avl::AtomicAVLPtrValueMap<std::type_index,ThreadPoolDispatcher> dispatcherMap;
libmaus2::parallel::AtomicPtrStack<libmaus2::exception::LibMausException> panicStack;
libmaus2::parallel::AtomicPtrStack< std::thread > Vthread;
@@ -100,9 +101,16 @@ namespace libmaus2
terminatedFlag = 1;
}
+ static void staticDispatch(ThreadPool * ptr)
+ {
+ ptr->dispatch();
+ }
+
template<typename data_type>
void registerDispatcher(libmaus2::util::shared_ptr<ThreadPoolDispatcher> ptr)
{
+ std::lock_guard<decltype(dispatcherMapLock)> slock(dispatcherMapLock);
+
std::type_index const type(typeid(data_type));
auto const it = dispatcherMap.find(type);
@@ -118,9 +126,29 @@ namespace libmaus2
dispatcherMap.insert(std::type_index(typeid(data_type)),ptr);
}
- static void staticDispatch(ThreadPool * ptr)
+ template<typename data_type>
+ void ensureDispatcherRegistered(libmaus2::util::shared_ptr<ThreadPoolDispatcher> ptr)
{
- ptr->dispatch();
+ std::lock_guard<decltype(dispatcherMapLock)> slock(dispatcherMapLock);
+
+ std::type_index const type(typeid(data_type));
+
+ auto const it = dispatcherMap.find(type);
+
+ if ( it == dispatcherMap.end() )
+ dispatcherMap.insert(std::type_index(typeid(data_type)),ptr);
+ }
+
+ ThreadPoolDispatcher * getDispatcher(std::type_index const & TI)
+ {
+ std::lock_guard<decltype(dispatcherMapLock)> slock(dispatcherMapLock);
+
+ auto it = dispatcherMap.find(TI);
+
+ if ( it == dispatcherMap.end() )
+ return nullptr;
+
+ return it->second.load().get();
}
void dispatch()
@@ -142,9 +170,14 @@ namespace libmaus2
assert ( pack );
assert ( pack->data.load() );
- auto it = dispatcherMap.find(pack->data.load()->getTypeIndex());
+ auto * dispatcher = getDispatcher(pack->data.load()->getTypeIndex());
- if ( it == dispatcherMap.end() )
+ if ( dispatcher )
+ {
+ dispatcher->dispatch(pack,this);
+ putPackage(pack);
+ }
+ else
{
std::string const packageTypeName = pack->data.load()->getPackageTypeName();
putPackage(pack);
@@ -154,12 +187,6 @@ namespace libmaus2
lme.finish();
throw lme;
}
- else
- {
- it->second.load()->dispatch(pack,this);
- putPackage(pack);
- }
-
}
else if ( terminatedFlag.load() )
break;
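The ThreadPool hunk above introduces `dispatcherMapLock` so that registration, idempotent registration, and lookup all serialize on one mutex, with `getDispatcher` returning a raw pointer (or null) instead of exposing the map iterator. A stripped-down sketch of that pattern with a plain `std::map` standing in for the atomic AVL map (all names here are illustrative):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <typeindex>
#include <typeinfo>

struct Dispatcher { virtual ~Dispatcher() = default; };

// Type-indexed dispatcher registry; every access takes the lock, as in
// the patched ThreadPool.
struct Registry
{
    std::mutex lock;
    std::map<std::type_index, std::shared_ptr<Dispatcher>> map;

    // Like ensureDispatcherRegistered: a no-op (not an error) if a
    // dispatcher for data_type already exists.  Returns true if inserted.
    template <typename data_type>
    bool ensureRegistered(std::shared_ptr<Dispatcher> d)
    {
        std::lock_guard<std::mutex> g(lock);
        // std::map::insert keeps the existing entry when the key is present
        return map.insert({std::type_index(typeid(data_type)), d}).second;
    }

    // Like getDispatcher: nullptr when no dispatcher is registered.
    Dispatcher *get(std::type_index const &t)
    {
        std::lock_guard<std::mutex> g(lock);
        auto it = map.find(t);
        return it == map.end() ? nullptr : it->second.get();
    }
};
```

Funneling lookups through `get()` also lets `dispatch()` drop the lock before calling into the dispatcher, which is why the hunk rewrites the dispatch loop around a pointer test rather than an iterator comparison.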
=====================================
src/libmaus2/parallel/threadpool/bam/BamHeaderSeqData.hpp
=====================================
@@ -20,6 +20,7 @@
#include <cassert>
#include <regex>
+#include <filesystem>
#include <libmaus2/exception/LibMausException.hpp>
#include <libmaus2/aio/InputStreamInstance.hpp>
@@ -97,6 +98,38 @@ namespace libmaus2
}
}
+
+ bool sequencesMissing(std::filesystem::path const & ref_cache_path) const
+ {
+ bool needfascan = false;
+
+ for ( auto P : M )
+ {
+ auto const id = P.first;
+ auto const LM = P.second;
+
+ auto const it_M5 = LM.find("M5");
+
+ if ( it_M5 == LM.end() )
+ {
+ libmaus2::exception::LibMausException lme;
+ lme.getStream() << "[E] no M5 attribute found for " << id << std::endl;
+ lme.finish();
+ throw lme;
+ }
+
+ std::string const m5 = it_M5->second;
+
+ std::filesystem::path dictpath = ref_cache_path;
+ dictpath /= m5;
+
+ if ( ! std::filesystem::exists(dictpath) )
+ needfascan = true;
+ }
+
+ return needfascan;
+ }
+
BamHeaderSeqData(std::istream & ISI)
{
init(ISI);
=====================================
src/libmaus2/parallel/threadpool/bgzf/BgzfBlockInfoEnqueDecompressHandler.hpp
=====================================
@@ -339,12 +339,7 @@ namespace libmaus2
blocklen
);
- uint32_t lcrc = crc32(0,0,0);
- lcrc = crc32(
- lcrc,
- reinterpret_cast<Bytef const *>(ddata->B.load().get() + ddata->o.load()),
- blocklen
- );
+ uint32_t const lcrc = context->crc32(ddata->B.load().get() + ddata->o.load(), blocklen);
if ( lcrc != crc )
{
=====================================
src/libmaus2/parallel/threadpool/bgzf/BgzfCompressContext.cpp
=====================================
@@ -0,0 +1,165 @@
+/*
+ libmaus2
+ Copyright (C) 2020 German Tischler-Höhle
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>..
+*/
+#include <libmaus2/parallel/threadpool/bgzf/BgzfCompressContext.hpp>
+#include <libmaus2/lz/BgzfConstants.hpp>
+
+#if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+#include <libdeflate.h>
+#else
+#include <zlib.h>
+#endif
+
+#if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+static struct libdeflate_compressor * allocateCompressor(int level)
+{
+ if ( level < 0 )
+ level = 6;
+
+ auto * lcompressor = libdeflate_alloc_compressor(level);
+
+ if ( ! lcompressor )
+ throw libmaus2::exception::LibMausException("[E] BgzfCompressContext: libdeflate_alloc_compressor failed");
+
+ return lcompressor;
+}
+#else
+std::shared_ptr<zlib_z_stream> allocateStream(int level)
+{
+ if ( level < 0 )
+ level = Z_DEFAULT_COMPRESSION;
+
+ std::shared_ptr<z_stream> stream(new z_stream);
+ std::memset(stream.get(),0,sizeof(z_stream));
+ stream->zalloc = nullptr;
+ stream->zfree = nullptr;
+ stream->opaque = nullptr;
+ stream->avail_in = 0;
+ stream->next_in = nullptr;
+ int const r = deflateInit2(stream.get(), level, Z_DEFLATED, -15 /* window size */, 8 /* mem level, gzip default */, Z_DEFAULT_STRATEGY);
+
+ if ( r != Z_OK )
+ throw libmaus2::exception::LibMausException("[E] BgzfCompressContext: deflateInit2 failed");
+
+ return stream;
+}
+#endif
+
+std::size_t libmaus2::parallel::threadpool::bgzf::BgzfCompressContext::computeBound()
+{
+ std::size_t const maxpayload = libmaus2::lz::BgzfConstants::getBgzfMaxPayLoad();
+
+ std::size_t low = 0;
+ std::size_t high = maxpayload;
+
+ assert ( compressBound(0) < maxpayload );
+
+ while ( high-low > 1 )
+ {
+ std::size_t const m = (low+high)/2;
+
+ assert ( m > low );
+ assert ( m < high );
+
+ std::size_t const b = compressBound(m);
+
+ // bound too large?
+ if ( b > maxpayload )
+ high = m;
+ else
+ low = m;
+ }
+
+ return 0;
+}
+
+libmaus2::parallel::threadpool::bgzf::BgzfCompressContext::BgzfCompressContext(int level)
+:
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ compressor(
+ allocateCompressor(level),
+ [](auto p) {libdeflate_free_compressor(p);}
+
+ )
+ #else
+ stream(allocateStream(level))
+ #endif
+ ,
+ bound(computeBound())
+{
+}
+
+libmaus2::parallel::threadpool::bgzf::BgzfCompressContext::~BgzfCompressContext()
+{
+ #if !defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ deflateEnd(stream.get());
+ #endif
+}
+
+std::size_t libmaus2::parallel::threadpool::bgzf::BgzfCompressContext::compressBound(std::size_t const s)
+{
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ return libdeflate_deflate_compress_bound(compressor.get(),s);
+ #else
+ return deflateBound(stream.get(),s);
+ #endif
+}
+
+std::size_t libmaus2::parallel::threadpool::bgzf::BgzfCompressContext::compress(
+ char * in,
+ size_t n_in,
+ char * out,
+ size_t n_out
+)
+{
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ return libdeflate_deflate_compress(compressor.get(),in,n_in,out,n_out);
+ #else
+ assert ( stream );
+ if ( deflateReset(stream.get()) != Z_OK )
+		throw libmaus2::exception::LibMausException("[E] BgzfCompressContext::compress: deflateReset failed");
+
+ stream->avail_in = n_in;
+ stream->next_in = reinterpret_cast<Bytef*>(in);
+ stream->avail_out = n_out;
+ stream->next_out = reinterpret_cast<Bytef*>(out);
+
+ int const r = deflate(stream.get(),Z_FINISH);
+
+ bool const r_ok = (r == Z_STREAM_END);
+
+ // return 0 on failure
+ if ( !r_ok )
+ return 0;
+
+ return n_out - stream->avail_out;
+ #endif
+}
+
+uint32_t libmaus2::parallel::threadpool::bgzf::BgzfCompressContext::crc32(
+ char const * in,
+ size_t const n_in
+)
+{
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ return libdeflate_crc32(0 /* init value */, in, n_in);
+ #else
+ uint32_t lcrc = ::crc32(0,0,0);
+ lcrc = ::crc32(lcrc,reinterpret_cast<Bytef const *>(in),n_in);
+ return lcrc;
+ #endif
+}
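The `computeBound` bisection above searches for the largest input size whose worst-case compressed size still fits in one BGZF block payload. A self-contained sketch of the same search, using a mock monotone bound function in place of `deflateBound`/`libdeflate_deflate_compress_bound` (the mock's formula is an assumption for illustration only; note this sketch returns the search result `low`, the largest size found):

```cpp
#include <cassert>
#include <cstddef>

// Mock worst-case compressed-size bound: monotonically non-decreasing
// overestimate, loosely shaped like deflate's stored-block overhead.
std::size_t mockCompressBound(std::size_t n) { return n + n / 16 + 64; }

// Bisect for the largest m with mockCompressBound(m) <= maxpayload.
// Invariant: bound(low) <= maxpayload < bound(high).
std::size_t computeBound(std::size_t maxpayload)
{
    std::size_t low = 0, high = maxpayload;
    assert(mockCompressBound(0) < maxpayload);
    while (high - low > 1)
    {
        std::size_t const m = (low + high) / 2;
        if (mockCompressBound(m) > maxpayload)
            high = m;  // bound too large, shrink the interval from above
        else
            low = m;   // still fits, grow from below
    }
    return low;        // largest size whose bound fits
}
```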
=====================================
src/libmaus2/parallel/threadpool/bgzf/BgzfCompressContext.hpp
=====================================
@@ -0,0 +1,78 @@
+/*
+ libmaus2
+ Copyright (C) 2020 German Tischler-Höhle
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>..
+*/
+#if ! defined(LIBMAUS2_PARALLEL_THREADPOOL_BGZF_BGZFCOMPRESSCONTEXT_HPP)
+#define LIBMAUS2_PARALLEL_THREADPOOL_BGZF_BGZFCOMPRESSCONTEXT_HPP
+
+#include <memory>
+#include <cstdlib>
+#include <cstring>
+#include <cassert>
+#include <atomic>
+#include <libmaus2/exception/LibMausException.hpp>
+
+#if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+struct libdeflate_compressor;
+#else
+typedef struct z_stream_s zlib_z_stream;
+#endif
+
+namespace libmaus2
+{
+ namespace parallel
+ {
+ namespace threadpool
+ {
+ namespace bgzf
+ {
+				/* zlib/libdeflate compression context */
+ struct BgzfCompressContext
+ {
+ typedef BgzfCompressContext this_type;
+ typedef std::shared_ptr<this_type> shared_ptr_type;
+
+ private:
+ BgzfCompressContext & operator=(BgzfCompressContext const & O) = delete;
+ std::size_t compressBound(std::size_t);
+ std::size_t computeBound();
+
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ std::shared_ptr<struct libdeflate_compressor> compressor;
+ #else
+ std::shared_ptr<zlib_z_stream> stream;
+ #endif
+
+ std::size_t const bound;
+
+ public:
+ BgzfCompressContext(int level = -1);
+ ~BgzfCompressContext();
+
+
+ /**
+					 * raw deflate. Returns the number of compressed bytes on success
+ * and 0 on failure (most likely explanation: compressed data does not
+ * fit in output buffer).
+ **/
+ std::size_t compress(char * in, size_t n_in, char * out, size_t n_out);
+ uint32_t crc32(char const * in, std::size_t const n);
+ };
+ }
+ }
+ }
+}
+#endif
=====================================
src/libmaus2/parallel/threadpool/bgzf/BgzfDecompressContext.cpp
=====================================
@@ -0,0 +1,117 @@
+/*
+ libmaus2
+ Copyright (C) 2020 German Tischler-Höhle
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>..
+*/
+#include <libmaus2/parallel/threadpool/bgzf/BgzfDecompressContext.hpp>
+
+#if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+#include <libdeflate.h>
+#else
+#include <zlib.h>
+#endif
+
+libmaus2::parallel::threadpool::bgzf::BgzfDecompressContext::BgzfDecompressContext()
+{
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ auto * ldecompressor = libdeflate_alloc_decompressor();
+
+ if ( ! ldecompressor )
+ throw libmaus2::exception::LibMausException("[E] BgzfDecompressContext: libdeflate_alloc_decompressor failed");
+
+ decompressor = std::shared_ptr<struct libdeflate_decompressor>(
+ ldecompressor,
+ [](auto p) {libdeflate_free_decompressor(p);}
+ );
+ #else
+ stream = std::shared_ptr<z_stream>(new z_stream);
+ std::memset(stream.get(),0,sizeof(z_stream));
+ stream->zalloc = nullptr;
+ stream->zfree = nullptr;
+ stream->opaque = nullptr;
+ stream->avail_in = 0;
+ stream->next_in = nullptr;
+ int const r = inflateInit2(stream.get(),-15);
+ if ( r != Z_OK )
+ {
+ throw libmaus2::exception::LibMausException("[E] BgzfDecompressContext: inflateInit2 failed");
+ }
+ #endif
+}
+
+libmaus2::parallel::threadpool::bgzf::BgzfDecompressContext::~BgzfDecompressContext()
+{
+ #if !defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ inflateEnd(stream.get());
+ #endif
+}
+
+void libmaus2::parallel::threadpool::bgzf::BgzfDecompressContext::decompress(
+ char * in,
+ size_t n_in,
+ char * out,
+ size_t n_out
+)
+{
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ std::size_t u_out = 0;
+ auto const r = libdeflate_deflate_decompress(decompressor.get(),in,n_in,out,n_out,&u_out);
+
+ bool const ok =
+ (r == LIBDEFLATE_SUCCESS)
+ &&
+ (u_out == n_out);
+
+ if ( !ok )
+ {
+ throw ::libmaus2::exception::LibMausException("[E] BgzfDecompressContext::decompress: libdeflate_deflate_decompress");
+ }
+ #else
+ assert ( stream );
+ if ( inflateReset(stream.get()) != Z_OK )
+ throw libmaus2::exception::LibMausException("[E] BgzfDecompressContext::decompress: inflateReset failed");
+
+ stream->avail_in = n_in;
+ stream->next_in = reinterpret_cast<Bytef*>(in);
+ stream->avail_out = n_out;
+ stream->next_out = reinterpret_cast<Bytef*>(out);
+
+ int const r = inflate(stream.get(),Z_FINISH);
+
+ bool const r_ok = (r == Z_STREAM_END);
+ bool const out_ok = (stream->avail_out == 0);
+ bool const in_ok = (stream->avail_in == 0);
+ bool const ok = r_ok && out_ok && in_ok;
+
+ if ( !ok )
+ {
+ throw ::libmaus2::exception::LibMausException("[E] BgzfDecompressContext::decompress: inflate failed");
+ }
+ #endif
+}
+
+uint32_t libmaus2::parallel::threadpool::bgzf::BgzfDecompressContext::crc32(
+ char const * in,
+ size_t const n_in
+)
+{
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ return libdeflate_crc32(0 /* init value */, in, n_in);
+ #else
+ uint32_t lcrc = ::crc32(0,0,0);
+ lcrc = ::crc32(lcrc,reinterpret_cast<Bytef const *>(in),n_in);
+ return lcrc;
+ #endif
+}
\ No newline at end of file
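The two `crc32()` methods in these context classes differ only in which library does the work: `libdeflate_crc32` versus zlib's `::crc32`, both computing the standard CRC-32 (IEEE polynomial, reflected) that BGZF footers carry. For reference, a bitwise implementation equivalent to either call (a sketch, not the library code):

```cpp
#include <cstddef>
#include <cstdint>

// Reference CRC-32 as used by gzip/BGZF: reflected polynomial 0xEDB88320,
// initial value and final XOR of 0xFFFFFFFF.  One bit per inner step;
// the libraries use table- or hardware-accelerated variants of the same
// function.
uint32_t crc32_ref(uint8_t const *p, std::size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < n; ++i)
    {
        crc ^= p[i];
        for (int k = 0; k < 8; ++k)
            crc = (crc >> 1) ^ (crc & 1u ? 0xEDB88320u : 0u);
    }
    return ~crc;
}
```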
=====================================
src/libmaus2/parallel/threadpool/bgzf/BgzfDecompressContext.hpp
=====================================
@@ -18,7 +18,6 @@
#if ! defined(LIBMAUS2_PARALLEL_THREADPOOL_BGZF_BGZFDECOMPRESSCONTEXT_HPP)
#define LIBMAUS2_PARALLEL_THREADPOOL_BGZF_BGZFDECOMPRESSCONTEXT_HPP
-#include <zlib.h>
#include <memory>
#include <cstdlib>
#include <cstring>
@@ -26,6 +25,12 @@
#include <atomic>
#include <libmaus2/exception/LibMausException.hpp>
+#if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+struct libdeflate_decompressor;
+#else
+typedef struct z_stream_s zlib_z_stream;
+#endif
+
namespace libmaus2
{
namespace parallel
@@ -43,58 +48,23 @@ namespace libmaus2
private:
BgzfDecompressContext & operator=(BgzfDecompressContext const & O) = delete;
- std::shared_ptr<z_stream> stream;
+ #if defined(LIBMAUS2_HAVE_LIBDEFLATE)
+ std::shared_ptr<struct libdeflate_decompressor> decompressor;
+ #else
+ std::shared_ptr<zlib_z_stream> stream;
+ #endif
public:
- BgzfDecompressContext()
- : stream(new z_stream)
- {
- std::memset(stream.get(),0,sizeof(z_stream));
- stream->zalloc = nullptr;
- stream->zfree = nullptr;
- stream->opaque = nullptr;
- stream->avail_in = 0;
- stream->next_in = nullptr;
- int const r = inflateInit2(stream.get(),-15);
- if ( r != Z_OK )
- {
- throw libmaus2::exception::LibMausException("[E] BgzfDecompressContext: inflateInit2 failed");
- }
- }
-
- ~BgzfDecompressContext()
- {
- inflateEnd(stream.get());
- }
+ BgzfDecompressContext();
+ ~BgzfDecompressContext();
void decompress(
char * in,
size_t n_in,
char * out,
size_t n_out
- )
- {
- assert ( stream );
- if ( inflateReset(stream.get()) != Z_OK )
- throw libmaus2::exception::LibMausException("[E] BgzfDecompressContext::decompress: inflateReset failed");
-
- stream->avail_in = n_in;
- stream->next_in = reinterpret_cast<Bytef*>(in);
- stream->avail_out = n_out;
- stream->next_out = reinterpret_cast<Bytef*>(out);
-
- int const r = inflate(stream.get(),Z_FINISH);
-
- bool const r_ok = (r == Z_STREAM_END);
- bool const out_ok = (stream->avail_out == 0);
- bool const in_ok = (stream->avail_in == 0);
- bool const ok = r_ok && out_ok && in_ok;
-
- if ( !ok )
- {
- throw ::libmaus2::exception::LibMausException("[E] BgzfDecompressContext::decompress: inflate failed");
- }
- }
+ );
+ uint32_t crc32(char const * in, std::size_t const n);
};
}
}
=====================================
src/libmaus2/sorting/RankSearch.hpp
=====================================
@@ -0,0 +1,2077 @@
+/*
+ libmaus2
+ Copyright (C) 2021 German Tischler-Höhle
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>..
+*/
+#if ! defined(LIBMAUS2_SORTING_RANKSEARCH_HPP)
+#define LIBMAUS2_SORTING_RANKSEARCH_HPP
+
+#include <libmaus2/parallel/threadpool/ThreadPool.hpp>
+
+namespace libmaus2
+{
+ namespace sorting
+ {
+ struct RankSearch
+ {
+ struct SortOp
+ {
+ virtual ~SortOp() {}
+ virtual void dispatch() = 0;
+ };
+
+ template<
+ typename iterator,
+ typename order_type
+ >
+ struct SortRequest : public SortOp
+ {
+ iterator a;
+ iterator e;
+ order_type const * order;
+
+ SortRequest() : a(), e(), order(nullptr)
+ {
+ }
+ SortRequest(
+ iterator ra,
+ iterator re,
+ order_type const * const rorder
+ ) : a(ra), e(re), order(rorder) {}
+
+ void dispatch()
+ {
+ ::std::sort(a,e,*order);
+ }
+ };
+
+ template<
+ typename iterator,
+ typename order_type
+ >
+ struct StableSortRequest : public SortOp
+ {
+ iterator a;
+ iterator e;
+ order_type const * order;
+
+ StableSortRequest() : a(), e(), order(nullptr)
+ {
+ }
+ StableSortRequest(
+ iterator ra,
+ iterator re,
+ order_type const * const rorder
+ ) : a(ra), e(re), order(rorder) {}
+
+ void dispatch()
+ {
+ ::std::stable_sort(a,e,*order);
+ }
+ };
+
+ template<
+ typename from_iterator,
+ typename to_iterator
+ >
+ struct CopyRequest : public SortOp
+ {
+ from_iterator from_b;
+ from_iterator from_e;
+ to_iterator to_b;
+
+ CopyRequest()
+ {
+
+ }
+ CopyRequest(
+ from_iterator rfrom_b,
+ from_iterator rfrom_e,
+ to_iterator rto_b
+ ) : from_b(rfrom_b), from_e(rfrom_e), to_b(rto_b)
+ {
+
+ }
+
+ void dispatch()
+ {
+ std::copy(from_b,from_e,to_b);
+ }
+ };
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type
+ >
+ struct MergeRequest : public SortOp
+ {
+ const_iterator ab;
+ const_iterator ae;
+ const_iterator bb;
+ const_iterator be;
+ temp_iterator itt;
+ order_type const * order;
+
+ MergeRequest()
+ {}
+ MergeRequest(
+ const_iterator rab,
+ const_iterator rae,
+ const_iterator rbb,
+ const_iterator rbe,
+ temp_iterator ritt,
+ order_type const * rorder
+ ) : ab(rab), ae(rae), bb(rbb), be(rbe), itt(ritt), order(rorder)
+ {}
+
+ void dispatch()
+ {
+ std::merge(
+ ab,ae,
+ bb,be,
+ itt,
+ *order
+ );
+ }
+ };
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type
+ >
+ struct MergeRequestDelayedSearch : public SortOp
+ {
+ const_iterator ab;
+ const_iterator ae;
+ const_iterator bb;
+ const_iterator be;
+ temp_iterator itt;
+ std::size_t from;
+ std::size_t to;
+ order_type const * order;
+
+ MergeRequestDelayedSearch()
+ {}
+ MergeRequestDelayedSearch(
+ const_iterator rab,
+ const_iterator rae,
+ const_iterator rbb,
+ const_iterator rbe,
+ temp_iterator ritt,
+ std::size_t const rfrom,
+ std::size_t const rto,
+ order_type const * rorder
+ ) : ab(rab), ae(rae), bb(rbb), be(rbe), itt(ritt), from(rfrom), to(rto), order(rorder)
+ {}
+
+ void dispatch()
+ {
+ std::vector< std::pair<const_iterator,const_iterator> > const Vsplitin(
+ {
+ std::make_pair(ab,ae),
+ std::make_pair(bb,be)
+ }
+ );
+ auto const Vlow = rankSearchMulti(Vsplitin,from);
+ auto const Vhigh = rankSearchMulti(Vsplitin,to);
+ auto const off = std::distance(ab,Vlow.front()) + std::distance(bb,Vlow.back());
+ assert ( static_cast<std::size_t>(off) == from );
+
+ std::merge(
+ Vlow.front(),Vhigh.front(),
+ Vlow.back(),Vhigh.back(),
+ itt + off,
+ *order
+ );
+ }
+
+ std::string toString(const_iterator ref) const
+ {
+ std::ostringstream ostr;
+
+ ostr << "MergeRequestDelayedSearch("
+ << ab-ref << ","
+ << ae-ref << ","
+ << bb-ref << ","
+ << be-ref << ","
+ << from << ","
+ << to
+ << ")";
+
+ return ostr.str();
+ }
+ };
+
+ template<
+ typename const_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::pair<const_iterator,const_iterator> rankSearch(
+ const_iterator ab, const_iterator ae,
+ const_iterator bb, const_iterator be,
+ typename std::iterator_traits<const_iterator>::difference_type const p,
+ order_type const & order = order_type()
+ )
+ {
+ const_iterator low_a = ab;
+ const_iterator high_a = ae;
+ const_iterator low_b = bb;
+ const_iterator high_b = be;
+
+ bool diff_a = (high_a != low_a) && order(*low_a,high_a[-1]);
+ bool diff_b = (high_b != low_b) && order(*low_b,high_b[-1]);
+
+ assert ( (low_a-ab) + (low_b-bb) <= p );
+
+ while ( diff_a || diff_b )
+ {
+ if ( diff_a )
+ {
+ auto const ln = std::distance(low_a,high_a);
+ auto P = std::equal_range(low_a,high_a,low_a[ln/2],order);
+
+ const_iterator ac;
+
+ if ( P.first == low_a )
+ ac = P.second;
+ else
+ ac = P.first;
+
+ assert ( ac != ae );
+
+ auto const & e = *ac;
+ auto const it = std::lower_bound(low_b,high_b,e,order);
+ typename std::iterator_traits<const_iterator>::difference_type const z = (it - bb) + (ac-ab);
+
+ if ( p >= z )
+ low_a = ac;
+ else
+ high_a = ac;
+
+ assert (
+ (low_a-ab) + (low_b-bb) <= p
+ );
+
+
+ diff_a = order(*low_a,high_a[-1]);
+ }
+ if ( diff_b )
+ {
+ auto const ln = std::distance(low_b,high_b);
+ auto P = std::equal_range(low_b,high_b,low_b[ln/2],order);
+
+ const_iterator bc;
+
+ if ( P.first == low_b )
+ bc = P.second;
+ else
+ bc = P.first;
+
+ assert ( bc != be );
+
+ auto const & e = *bc;
+ auto const it = std::lower_bound(low_a,high_a,e,order);
+ typename std::iterator_traits<const_iterator>::difference_type const z = (it - ab) + (bc-bb);
+
+ if ( p >= z )
+ low_b = bc;
+ else
+ high_b = bc;
+
+ assert ( (low_a-ab) + (low_b-bb) <= p );
+
+ diff_b = order(*low_b,high_b[-1]);
+ }
+ }
+
+ auto av_a = high_a - low_a;
+ auto av_b = high_b - low_b;
+
+ while (
+ (low_a-ab) + (low_b-bb) < p
+ &&
+ (av_a || av_b)
+ )
+ {
+ auto mis = p - ((low_a-ab) + (low_b-bb));
+
+ if ( ! av_a )
+ {
+ auto const av = av_b;
+ auto const use = std::min(av,mis);
+ low_b += use;
+ }
+ else if ( ! av_b )
+ {
+ auto const av = av_a;
+ auto const use = std::min(av,mis);
+ low_a += use;
+ }
+ else if ( order(*low_a,*low_b) )
+ {
+ auto const av = av_a;
+ auto const use = std::min(av,mis);
+ low_a += use;
+ av_a -= use;
+ }
+ else
+ {
+ auto const av = av_b;
+ auto const use = std::min(av,mis);
+ low_b += use;
+ av_b -= use;
+ }
+
+ assert ( (low_a-ab) + (low_b-bb) <= p );
+ }
+
+ return std::make_pair(low_a,low_b);
+ }
+
+ template<
+ typename const_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::vector<const_iterator> rankSearchMulti(
+ std::vector< std::pair<const_iterator,const_iterator> > const & V,
+ typename std::iterator_traits<const_iterator>::difference_type const p,
+ order_type const & order = order_type()
+ )
+ {
+ std::vector< std::pair<const_iterator,const_iterator> > VLH(V);
+ std::vector< bool > Vdiff(V.size());
+ bool anydiff = false;
+
+ for ( std::size_t i = 0; i < VLH.size(); ++i )
+ {
+ const_iterator const & low = VLH[i].first;
+ const_iterator const & high = VLH[i].second;
+ Vdiff[i] = (high != low) && order(*low,high[-1]);
+ anydiff = anydiff || Vdiff[i];
+ }
+
+ typename std::iterator_traits<const_iterator>::difference_type s = 0;
+
+ assert ( s <= p );
+
+ while ( anydiff )
+ {
+ anydiff = false;
+
+ for ( std::size_t i = 0; i < Vdiff.size(); ++i )
+ if ( Vdiff[i] )
+ {
+ const_iterator const & low = VLH[i].first;
+ const_iterator const & high = VLH[i].second;
+
+ auto const ln = std::distance(low,high);
+ auto P = std::equal_range(low,high,low[ln/2],order);
+
+ const_iterator ac;
+
+ if ( P.first == low )
+ ac = P.second;
+ else
+ ac = P.first;
+
+ assert ( ac != V[i].second );
+
+ auto const & e = *ac;
+
+ typename std::iterator_traits<const_iterator>::difference_type z = ac-V[i].first;
+
+ for ( std::size_t j = 0; j < Vdiff.size(); ++j )
+ if ( j != i )
+ {
+ auto const it = std::lower_bound(VLH[j].first,VLH[j].second,e,order);
+ z += it - V[j].first;
+ }
+
+ if ( p >= z )
+ VLH[i].first = ac;
+ else
+ VLH[i].second = ac;
+
+ Vdiff[i] = order(*(VLH[i].first),VLH[i].second[-1]);
+
+ anydiff = anydiff || Vdiff[i];
+ }
+
+ s = 0;
+ for ( std::size_t i = 0; i < VLH.size(); ++i )
+ s += VLH[i].first - V[i].first;
+
+ assert ( s <= p );
+ }
+
+ std::vector < std::size_t> VI(V.size());
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ VI[i] = i;
+
+ // order interval indices by their smallest remaining element;
+ // empty intervals sort last, and index ties keep the comparator
+ // a strict weak ordering (returning true for two empty intervals
+ // in both argument orders would be undefined behaviour in std::sort)
+ std::sort(
+ VI.begin(),
+ VI.end(),
+ [&VLH,&order](auto const i, auto const j)
+ {
+ bool const empty_i = VLH[i].second == VLH[i].first;
+ bool const empty_j = VLH[j].second == VLH[j].first;
+
+ if ( empty_i || empty_j )
+ return (empty_i != empty_j) ? empty_j : (i < j);
+ else if ( order(*(VLH[i].first),*(VLH[j].first)) )
+ return true;
+ else if ( order(*(VLH[j].first),*(VLH[i].first)) )
+ return false;
+ else
+ return i < j;
+ }
+ );
+
+ for ( std::size_t j = 0; s < p && j < VI.size(); ++j )
+ {
+ std::size_t const i = VI[j];
+ auto const av = VLH[i].second - VLH[i].first;
+ auto const mis = p - s;
+ auto const use = std::min(av,mis);
+
+ VLH[i].first += use;
+ s += use;
+ }
+
+ std::vector<const_iterator> R(VLH.size());
+
+ for ( std::size_t i = 0; i < VLH.size(); ++i )
+ R[i] = VLH[i].first;
+
+ return R;
+ }
+
+ template<typename iterator>
+ struct SwapReverse
+ {
+ iterator ab;
+ iterator ae;
+ iterator be;
+
+ SwapReverse(){}
+ SwapReverse(
+ iterator rab,
+ iterator rae,
+ iterator rbe
+ ) : ab(rab), ae(rae), be(rbe)
+ {
+ assert ( ae >= ab );
+ assert ( be >= ae );
+ }
+
+ void dispatch()
+ {
+ while ( ab != ae )
+ std::swap(*ab++,*--be);
+ }
+
+ std::string toString(iterator ref) const
+ {
+ auto const an = std::distance(ab,ae);
+ std::ostringstream ostr;
+ ostr << "SwapReverse([" << ab-ref << "," << ae-ref << "),[" << ((be-an)-ref) << "," << be-ref << "))";
+
+ return ostr.str();
+ }
+
+ void pushIntervals(std::vector< std::pair<iterator,iterator> > & V) const
+ {
+ auto const d = std::distance(ab,ae);
+ V.push_back(std::make_pair(ab,ae));
+ V.push_back(std::make_pair(be-d,be));
+ }
+ };
+
+ template<typename iterator>
+ static void swap(
+ iterator itab,
+ iterator itae,
+ iterator itbb,
+ iterator itbe
+ )
+ {
+ auto const an = std::distance(itab,itae);
+ auto const bn = std::distance(itbb,itbe);
+
+ SwapReverse(itab,itab+an/2,itae).dispatch();
+ SwapReverse(itbb,itbb+bn/2,itbe).dispatch();
+
+ if ( an <= bn )
+ {
+ SwapReverse(itab,itab+an,itbe).dispatch();
+ itab += an;
+ itbe -= an;
+
+ auto const rest = bn - an;
+ auto const rest2 = rest / 2;
+
+ SwapReverse(itbb,itbb+rest2,itbe).dispatch();
+ }
+ else
+ {
+ SwapReverse(itab,itab+bn,itbe).dispatch();
+ itab += bn;
+ itbe -= bn;
+
+ auto const rest = an - bn;
+ auto const rest2 = rest / 2;
+
+ SwapReverse(itab,itab+rest2,itae).dispatch();
+ }
+ }
+
+ template<typename iterator>
+ static void planReverse(
+ iterator itab,
+ iterator itae,
+ std::size_t const threads,
+ std::vector< SwapReverse<iterator> > & V
+ )
+ {
+ auto const an = std::distance(itab,itae);
+
+ if ( an > 1 )
+ {
+ std::size_t const san = an/2;
+ std::size_t const packsize = (san + threads - 1)/threads;
+ std::size_t const numpacks = (san + packsize - 1)/packsize;
+
+ // std::cerr << "an=" << an << " san=" << san << " packsize=" << packsize << std::endl;
+
+ std::size_t rest = san;
+
+ for ( std::size_t i = 0; i < numpacks; ++i )
+ {
+ std::size_t const use = std::min(rest,packsize);
+
+ // std::cerr << "use=" << use << std::endl;
+
+ V.push_back(SwapReverse<iterator>(itab,itab+use,itae));
+
+ itab += use;
+ itae -= use;
+ rest -= use;
+ }
+
+ assert (
+ itab == itae
+ ||
+ itab+1 == itae
+ );
+ }
+ }
+
+ template<typename iterator>
+ static void planSwapReverse(
+ iterator itab,
+ iterator itae,
+ iterator itbe,
+ std::size_t const threads,
+ std::vector< SwapReverse<iterator> > & V
+ )
+ {
+ auto const an = std::distance(itab,itae);
+
+ if ( ! an )
+ return;
+
+ assert ( std::distance(itae,itbe) >= an );
+
+ std::size_t const san = an;
+ std::size_t const packsize = (san + threads - 1)/threads;
+ std::size_t const numpacks = (san + packsize - 1)/packsize;
+
+ std::size_t rest = san;
+
+ for ( std::size_t i = 0; i < numpacks; ++i )
+ {
+ std::size_t const use = std::min(rest,packsize);
+ V.push_back(
+ SwapReverse<iterator>(
+ itab,
+ itab+use,
+ itbe
+ )
+ );
+
+ rest -= use;
+ itab += use;
+ itbe -= use;
+ }
+
+ assert ( rest == 0 );
+ }
+
+ template<typename iterator>
+ static std::pair< std::vector < SwapReverse<iterator> >,std::vector < SwapReverse<iterator> > > planSwap(
+ iterator itab,
+ iterator itae,
+ iterator itbb,
+ iterator itbe,
+ std::size_t const threads
+ )
+ {
+ assert ( itae == itbb );
+
+ auto const an = std::distance(itab,itae);
+ auto const bn = std::distance(itbb,itbe);
+ // decltype(an) dthreads = threads;
+
+ // reverse A and B
+ std::vector < SwapReverse<iterator> > Vfirst;
+ // reverse RARB to obtain BA
+ std::vector < SwapReverse<iterator> > Vsecond;
+
+ planReverse(itab,itae,threads,Vfirst);
+ planReverse(itbb,itbe,threads,Vfirst);
+
+ if ( an <= bn )
+ {
+ planSwapReverse(itab,itab+an,itbe,threads,Vsecond);
+
+ itab += an;
+ itbe -= an;
+
+ planReverse(itbb,itbe,threads,Vsecond);
+ }
+ else
+ {
+ planSwapReverse(itab,itab+bn,itbe,threads,Vsecond);
+
+ itab += bn;
+ itbe -= bn;
+
+ planReverse(itab,itae,threads,Vsecond);
+ }
+
+ std::vector < std::pair<iterator,iterator> > VIfirst;
+ std::vector < std::pair<iterator,iterator> > VIsecond;
+
+ for ( auto R : Vfirst )
+ R.pushIntervals(VIfirst);
+ for ( auto R : Vsecond )
+ R.pushIntervals(VIsecond);
+
+ std::sort(VIfirst.begin(),VIfirst.end());
+ std::sort(VIsecond.begin(),VIsecond.end());
+
+ for ( std::size_t i = 1; i < VIfirst.size(); ++i )
+ assert ( VIfirst[i-1].second <= VIfirst[i].first );
+ for ( std::size_t i = 1; i < VIsecond.size(); ++i )
+ assert ( VIsecond[i-1].second <= VIsecond[i].first );
+
+ return std::make_pair(Vfirst,Vsecond);
+ }
+
+ template<typename iterator>
+ static void swapParallel(
+ iterator itab,
+ iterator itae,
+ iterator itbb,
+ iterator itbe,
+ std::size_t const threads
+ )
+ {
+ assert ( itae == itbb );
+
+ // #define SWAP_PARALLEL_DEBUG
+
+ #if defined(SWAP_PARALLEL_DEBUG)
+ typedef typename std::iterator_traits<iterator>::value_type value_type;
+ std::vector<value_type> VA(itab,itae);
+ std::vector<value_type> VB(itbb,itbe);
+ std::vector<value_type> V;
+ for ( auto i : VB )
+ V.push_back(i);
+ for ( auto i : VA )
+ V.push_back(i);
+ #endif
+
+ auto P = planSwap(itab,itae,itbb,itbe,threads);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t i = 0; i < P.first.size(); ++i )
+ P.first[i].dispatch();
+
+ #if defined(SWAP_PARALLEL_DEBUG)
+ std::vector<value_type> VAR = VA;
+ std::vector<value_type> VBR = VB;
+ std::reverse(VAR.begin(),VAR.end());
+ std::reverse(VBR.begin(),VBR.end());
+ std::vector<value_type> VR;
+ for ( auto i : VAR )
+ VR.push_back(i);
+ for ( auto i : VBR )
+ VR.push_back(i);
+
+ for ( std::size_t i = 0; i < VR.size(); ++i )
+ assert ( VR[i] == itab[i] );
+
+ for ( std::size_t i = 0; i < P.second.size(); ++i )
+ std::cerr << P.second[i].toString(itab) << std::endl;
+ #endif
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t i = 0; i < P.second.size(); ++i )
+ P.second[i].dispatch();
+
+ #if defined(SWAP_PARALLEL_DEBUG)
+ std::reverse(VR.begin(),VR.end());
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ assert ( VR[i] == V[i] );
+
+ bool ok = true;
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ ok = ok && ( itab[i] == V[i] );
+
+ if ( !ok )
+ {
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ std::cerr << "V[" << i << "]=" << V[i] << " itab[" << i << "]=" << itab[i] << std::endl;
+
+ assert ( ok );
+ }
+ #endif
+ }
+
+ template<
+ typename iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void mergeInPlace(iterator ita, iterator itm, iterator ite, order_type const & order = order_type())
+ {
+ typedef std::tuple<iterator,iterator,iterator> element;
+ std::stack<element> S;
+ auto const d = std::distance(ita,ite);
+
+ if ( d > 1 )
+ S.push(element(ita,itm,ite));
+
+ while ( ! S.empty() )
+ {
+ element E = S.top();
+ S.pop();
+
+ std::vector<std::pair<iterator,iterator> > V(
+ {
+ std::make_pair(std::get<0>(E),std::get<1>(E)),
+ std::make_pair(std::get<1>(E),std::get<2>(E))
+ }
+ );
+
+ auto const d = std::distance(std::get<0>(E),std::get<2>(E));
+
+ assert ( d > 1 );
+
+ std::vector<iterator> const R = rankSearchMulti(V,d/2,order);
+
+ auto const I0 = std::make_pair(V.front().first,R.front());
+ auto const I1 = std::make_pair(R.front(),V.front().second);
+ auto const I2 = std::make_pair(V.back().first,R.back());
+ auto const I3 = std::make_pair(R.back(),V.back().second);
+
+ assert ( I0.second == I1.first );
+ assert ( I1.second == I2.first );
+ assert ( I2.second == I3.first );
+
+ swap(I1.first,I1.second,I1.second,I2.second);
+
+ // I0,I2,I1,I3
+ auto const n_0 = std::distance(I0.first,I0.second);
+ auto const n_1 = std::distance(I1.first,I1.second);
+ auto const n_2 = std::distance(I2.first,I2.second);
+ auto const n_3 = std::distance(I3.first,I3.second);
+ auto const n_left = n_0 + n_2;
+ auto const n_right = n_1 + n_3;
+
+ if ( n_right > 1 )
+ {
+ S.push(element(I0.first + n_0 + n_2,I0.first + n_0 + n_2 + n_1, I0.first + n_0 + n_2 + n_1 + n_3));
+ }
+ if ( n_left > 1 )
+ {
+ S.push(element(I0.first,I0.first + n_0,I0.first + n_0 + n_2));
+ }
+ }
+ }
+
+ template<typename iterator>
+ struct MergeInfo
+ {
+ iterator a;
+ iterator m;
+ iterator e;
+ std::size_t parts;
+
+ MergeInfo()
+ {
+
+ }
+
+ MergeInfo(
+ iterator ra,
+ iterator rm,
+ iterator re,
+ std::size_t const rparts
+ ) : a(ra), m(rm), e(re), parts(rparts)
+ {
+
+ }
+
+ template<typename order_type>
+ void dispatch(order_type const & order)
+ {
+ mergeInPlace(a,m,e,order);
+ }
+
+ template<typename temp_iterator, typename order_type>
+ void dispatch(
+ temp_iterator itt,
+ std::size_t const limit,
+ order_type const & order
+ )
+ {
+ mergeLimited(a,m,e,itt,limit,order);
+ }
+ };
+
+
+ template<
+ typename iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void mergeInPlaceParallel(
+ iterator ita,
+ iterator itm,
+ iterator ite,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ std::stack < MergeInfo<iterator> > S;
+ std::vector < MergeInfo<iterator> > V;
+ S.push(MergeInfo<iterator>(ita,itm,ite,threads));
+
+ while ( !S.empty() )
+ {
+ MergeInfo<iterator> M = S.top();
+ S.pop();
+
+ if ( M.parts > 1 )
+ {
+ std::size_t const left = (M.parts + 1) / 2;
+ std::size_t const right = M.parts - left;
+ std::size_t const ln = M.e-M.a;
+ std::size_t const split = (left * ln + M.parts - 1) / M.parts;
+ assert ( split <= ln );
+
+ #if 0
+ std::cerr << "parts=" << M.parts
+ << " left=" << left
+ << " right=" << right
+ << " ln=" << ln
+ << " split=" << split
+ << std::endl;
+ #endif
+
+ auto const Vsplit =
+ rankSearchMulti(
+ std::vector< std::pair<iterator,iterator> >(
+ {
+ std::make_pair(M.a,M.m),
+ std::make_pair(M.m,M.e)
+ }
+ ),
+ split,
+ order
+ );
+
+ iterator const ab = M.a;
+ iterator const am = Vsplit.front();
+ iterator const ae = M.m;
+ iterator const bb = M.m;
+ iterator const bm = Vsplit.back();
+ iterator const be = M.e;
+
+ std::size_t const r0 = am-ab;
+ std::size_t const r1 = ae-am;
+ std::size_t const r2 = bm-bb;
+ std::size_t const r3 = be-bm;
+
+ #if 0
+ std::cerr << "r0=" << r0 << std::endl;
+ std::cerr << "r1=" << r1 << std::endl;
+ std::cerr << "r2=" << r2 << std::endl;
+ std::cerr << "r3=" << r3 << std::endl;
+ #endif
+
+ swapParallel(am,ae,bb,bm,threads);
+
+ S.push(MergeInfo<iterator>(ab+r0+r2,ab+r0+r2+r1,ab+r0+r2+r1+r3,right));
+ S.push(MergeInfo<iterator>(ab,ab+r0,ab+r0+r2,left));
+ }
+ else
+ {
+ V.push_back(M);
+ }
+ }
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ V[i].dispatch(order);
+ }
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void mergeLimitedParallel(
+ iterator ita,
+ iterator itm,
+ iterator ite,
+ temp_iterator itt,
+ std::size_t const tmpsize,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ std::stack < MergeInfo<iterator> > S;
+ std::vector < MergeInfo<iterator> > V;
+ S.push(MergeInfo<iterator>(ita,itm,ite,threads));
+
+ while ( !S.empty() )
+ {
+ MergeInfo<iterator> M = S.top();
+ S.pop();
+
+ if ( M.parts > 1 )
+ {
+ std::size_t const left = (M.parts + 1) / 2;
+ std::size_t const right = M.parts - left;
+ std::size_t const ln = M.e-M.a;
+ std::size_t const split = (left * ln + M.parts - 1) / M.parts;
+ assert ( split <= ln );
+
+ #if 0
+ std::cerr << "parts=" << M.parts
+ << " left=" << left
+ << " right=" << right
+ << " ln=" << ln
+ << " split=" << split
+ << std::endl;
+ #endif
+
+ auto const Vsplit =
+ rankSearchMulti(
+ std::vector< std::pair<iterator,iterator> >(
+ {
+ std::make_pair(M.a,M.m),
+ std::make_pair(M.m,M.e)
+ }
+ ),
+ split,
+ order
+ );
+
+ iterator const ab = M.a;
+ iterator const am = Vsplit.front();
+ iterator const ae = M.m;
+ iterator const bb = M.m;
+ iterator const bm = Vsplit.back();
+ iterator const be = M.e;
+
+ std::size_t const r0 = am-ab;
+ std::size_t const r1 = ae-am;
+ std::size_t const r2 = bm-bb;
+ std::size_t const r3 = be-bm;
+
+ #if 0
+ std::cerr << "r0=" << r0 << std::endl;
+ std::cerr << "r1=" << r1 << std::endl;
+ std::cerr << "r2=" << r2 << std::endl;
+ std::cerr << "r3=" << r3 << std::endl;
+ #endif
+
+ swapParallel(am,ae,bb,bm,threads);
+
+ S.push(MergeInfo<iterator>(ab+r0+r2,ab+r0+r2+r1,ab+r0+r2+r1+r3,right));
+ S.push(MergeInfo<iterator>(ab,ab+r0,ab+r0+r2,left));
+ }
+ else
+ {
+ V.push_back(M);
+ }
+ }
+
+ std::size_t const tmpperthread = (tmpsize + threads - 1)/threads;
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ {
+ std::size_t const tmplow = i * tmpperthread;
+ std::size_t const tmphigh = std::min(tmplow + tmpperthread, tmpsize);
+
+ V[i].dispatch(itt+tmplow,tmphigh-tmplow,order);
+ }
+ }
+
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void mergeLimited(
+ iterator ita,
+ iterator itm,
+ iterator ite,
+ temp_iterator itt,
+ std::size_t const limit,
+ order_type const & order = order_type()
+ )
+ {
+ typedef std::tuple<iterator,iterator,iterator> element;
+ std::stack<element> S;
+ auto const d = std::distance(ita,ite);
+
+ if ( d > 1 )
+ S.push(element(ita,itm,ite));
+
+ while ( ! S.empty() )
+ {
+ element E = S.top();
+ S.pop();
+
+ auto const d = std::distance(std::get<0>(E),std::get<2>(E));
+
+ if ( d <= static_cast<decltype(d)>(limit) )
+ {
+ std::merge(std::get<0>(E),std::get<1>(E),std::get<1>(E),std::get<2>(E),itt);
+ std::copy(itt,itt+d,std::get<0>(E));
+ }
+ else
+ {
+ std::vector<std::pair<iterator,iterator> > V(
+ {
+ std::make_pair(std::get<0>(E),std::get<1>(E)),
+ std::make_pair(std::get<1>(E),std::get<2>(E))
+ }
+ );
+
+ assert ( d > 1 );
+
+ std::vector<iterator> const R = rankSearchMulti(V,d/2,order);
+
+ auto const I0 = std::make_pair(V.front().first,R.front());
+ auto const I1 = std::make_pair(R.front(),V.front().second);
+ auto const I2 = std::make_pair(V.back().first,R.back());
+ auto const I3 = std::make_pair(R.back(),V.back().second);
+
+ assert ( I0.second == I1.first );
+ assert ( I1.second == I2.first );
+ assert ( I2.second == I3.first );
+
+ swap(I1.first,I1.second,I1.second,I2.second);
+
+ // I0,I2,I1,I3
+ auto const n_0 = std::distance(I0.first,I0.second);
+ auto const n_1 = std::distance(I1.first,I1.second);
+ auto const n_2 = std::distance(I2.first,I2.second);
+ auto const n_3 = std::distance(I3.first,I3.second);
+ auto const n_left = n_0 + n_2;
+ auto const n_right = n_1 + n_3;
+
+ if ( n_right > 1 )
+ {
+ S.push(element(I0.first + n_0 + n_2,I0.first + n_0 + n_2 + n_1, I0.first + n_0 + n_2 + n_1 + n_3));
+ }
+ if ( n_left > 1 )
+ {
+ S.push(element(I0.first,I0.first + n_0,I0.first + n_0 + n_2));
+ }
+ }
+ }
+ }
+
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::vector<
+ MergeRequest<
+ const_iterator,
+ temp_iterator,
+ order_type
+ >
+ > planMerge(
+ const_iterator ab,
+ const_iterator ae,
+ const_iterator bb,
+ const_iterator be,
+ temp_iterator itt,
+ std::size_t const parts,
+ bool const parallelSearch,
+ order_type const & order = order_type()
+ )
+ {
+ auto const n = std::distance(ab,ae);
+ auto const m = std::distance(bb,be);
+ auto const s = n + m;
+
+ if ( ! s )
+ return std::vector< MergeRequest<const_iterator,temp_iterator,order_type> >();
+
+ auto const perthread = (s + parts - 1) / parts;
+ auto const numpack = (s + perthread - 1) / perthread;
+
+ std::vector< std::pair<const_iterator,const_iterator> > const Vsplitin(
+ {
+ std::make_pair(ab,ae),
+ std::make_pair(bb,be)
+ }
+ );
+
+ std::vector < std::vector<const_iterator> > VV(numpack+1);
+
+ if ( parallelSearch )
+ {
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(parts) schedule(dynamic,1)
+ #endif
+ for ( std::size_t t = 0; t < numpack; ++t )
+ VV[t] = rankSearchMulti(Vsplitin,t * perthread,order);
+ }
+ else
+ {
+ for ( std::size_t t = 0; t < numpack; ++t )
+ VV[t] = rankSearchMulti(Vsplitin,t * perthread,order);
+ }
+ VV[numpack] = rankSearchMulti(Vsplitin,s,order);
+
+ std::vector < MergeRequest<const_iterator,temp_iterator,order_type> > VREQ;
+ for ( std::size_t t = 0; t < numpack; ++t )
+ {
+ auto const lab = VV.at(t+0).front();
+ auto const lae = VV.at(t+1).front();
+ auto const lbb = VV.at(t+0).back();
+ auto const lbe = VV.at(t+1).back();
+ auto const o = t * perthread;
+
+ assert ( (lab-ab) + (lbb-bb) == static_cast<decltype(n)>(o) );
+
+ VREQ.push_back(MergeRequest<const_iterator,temp_iterator,order_type>(lab,lae,lbb,lbe,itt + o,&order));
+ }
+
+ return VREQ;
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::vector<
+ MergeRequestDelayedSearch<
+ const_iterator,
+ temp_iterator,
+ order_type
+ >
+ > planMergeDelayedSearch(
+ const_iterator ab,
+ const_iterator ae,
+ const_iterator bb,
+ const_iterator be,
+ temp_iterator itt,
+ std::size_t const parts,
+ order_type const & order = order_type()
+ )
+ {
+ auto const n = std::distance(ab,ae);
+ auto const m = std::distance(bb,be);
+ auto const s = n + m;
+
+ if ( ! s )
+ return std::vector< MergeRequestDelayedSearch<const_iterator,temp_iterator,order_type> >();
+
+ auto const perthread = (s + parts - 1) / parts;
+ auto const numpack = (s + perthread - 1) / perthread;
+
+ std::vector < MergeRequestDelayedSearch<const_iterator,temp_iterator,order_type> > VREQ;
+ for ( std::size_t t = 0; t < numpack; ++t )
+ {
+ auto const from = t * perthread;
+ auto const to = (t+1 == numpack) ? s : (from+perthread);
+ VREQ.push_back(MergeRequestDelayedSearch<const_iterator,temp_iterator,order_type>(ab,ae,bb,be,itt,from,to,&order));
+ }
+
+ return VREQ;
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::size_t planMergeDelayedSearchP(
+ std::vector < std::shared_ptr<SortOp> > & Vop,
+ const_iterator ab,
+ const_iterator ae,
+ const_iterator bb,
+ const_iterator be,
+ temp_iterator itt,
+ std::size_t const parts,
+ order_type const & order = order_type()
+ )
+ {
+ auto const n = std::distance(ab,ae);
+ auto const m = std::distance(bb,be);
+ auto const s = n + m;
+
+ if ( ! s )
+ return 0;
+
+ auto const perthread = (s + parts - 1) / parts;
+ auto const numpack = (s + perthread - 1) / perthread;
+
+ for ( std::size_t t = 0; t < numpack; ++t )
+ {
+ auto const from = t * perthread;
+ auto const to = (t+1 == numpack) ? s : (from+perthread);
+ std::shared_ptr< SortOp > ptr(
+ new MergeRequestDelayedSearch<const_iterator,temp_iterator,order_type>(ab,ae,bb,be,itt,from,to,&order)
+ );
+ Vop.emplace_back(ptr);
+ }
+
+ return numpack;
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::vector<
+ MergeRequest<
+ const_iterator,
+ temp_iterator,
+ order_type
+ >
+ > planMerge(
+ const_iterator ab,
+ const_iterator ac,
+ const_iterator ae,
+ temp_iterator itt,
+ std::size_t const parts,
+ bool const parallelSearch,
+ order_type const & order = order_type()
+ )
+ {
+ return planMerge(ab,ac,ac,ae,itt,parts,parallelSearch,order);
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static std::vector<
+ MergeRequestDelayedSearch<
+ const_iterator,
+ temp_iterator,
+ order_type
+ >
+ > planMergeDelayedSearch(
+ const_iterator ab,
+ const_iterator ac,
+ const_iterator ae,
+ temp_iterator itt,
+ std::size_t const parts,
+ order_type const & order = order_type()
+ )
+ {
+ return planMergeDelayedSearch(ab,ac,ac,ae,itt,parts,order);
+ }
+
+ template<
+ typename iterator,
+ typename temp_iterator
+ >
+ static std::size_t
+ planCopyBack(
+ std::vector < std::shared_ptr<SortOp> > & Vop,
+ iterator ab,
+ iterator ae,
+ temp_iterator itt,
+ std::size_t const parts
+ )
+ {
+ auto const n = std::distance(ab,ae);
+
+ if ( ! n )
+ return 0;
+
+ std::size_t const perthread = (n + parts - 1)/parts;
+ std::size_t const numpacks = (n + perthread - 1)/perthread;
+
+ for ( std::size_t i = 0; i < numpacks; ++i )
+ {
+ auto const low = i * perthread;
+ auto const high = (i+1 == numpacks) ? n : low+perthread;
+
+ std::shared_ptr<SortOp> ptr(
+ new CopyRequest<temp_iterator,iterator>(
+ itt + low,
+ itt + high,
+ ab + low
+ )
+ );
+
+ Vop.emplace_back(ptr);
+ }
+
+ return numpacks;
+ }
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static
+ std::pair
+ <
+ std::vector
+ <
+ MergeRequest<
+ iterator,
+ temp_iterator,
+ order_type
+ >
+ >,
+ std::vector
+ <
+ CopyRequest<
+ temp_iterator,
+ iterator
+ >
+ >
+ >
+ planMergeCopyBack(
+ iterator ab,
+ iterator ac,
+ iterator ae,
+ temp_iterator itt,
+ std::size_t const parts,
+ bool const parallelSearch,
+ order_type const & order = order_type()
+ )
+ {
+ auto const Vmerge = planMerge(ab,ac,ae,itt,parts,parallelSearch,order);
+ auto const gn = std::distance(ab,ae);
+ typename std::remove_const<decltype(gn)>::type o = 0;
+ std::vector< CopyRequest<temp_iterator,iterator> > Vcopy(Vmerge.size());
+
+ for ( std::size_t i = 0; i < Vmerge.size(); ++i )
+ {
+ auto const & req = Vmerge[i];
+ auto const n = (req.ae-req.ab) + (req.be-req.bb);
+
+ auto const from = o;
+ auto const to = from + n;
+
+ Vcopy[i] = CopyRequest<temp_iterator,iterator>(itt+from,itt+to,ab+from);
+
+ o = to;
+ }
+
+ assert ( o == gn );
+
+ return std::make_pair(Vmerge,Vcopy);
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static void mergeParallel(
+ const_iterator ab,
+ const_iterator ae,
+ const_iterator bb,
+ const_iterator be,
+ temp_iterator itt,
+ std::size_t const threads,
+ bool const parallelSearch,
+ order_type const & order = order_type()
+ )
+ {
+ auto VREQ = planMerge(ab,ae,bb,be,itt,threads,parallelSearch,order);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t t = 0; t < VREQ.size(); ++t )
+ VREQ[t].dispatch();
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static void mergeParallelDelayedSearch(
+ const_iterator ab,
+ const_iterator ae,
+ const_iterator bb,
+ const_iterator be,
+ temp_iterator itt,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ auto VREQ = planMergeDelayedSearch(ab,ae,bb,be,itt,threads,order);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t t = 0; t < VREQ.size(); ++t )
+ VREQ[t].dispatch();
+ }
+
+ template<
+ typename const_iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ const_iterator
+ >::value_type
+ >
+ >
+ static void mergeParallelCopyBack(
+ const_iterator ab,
+ const_iterator ac,
+ const_iterator ae,
+ temp_iterator itt,
+ std::size_t const threads,
+ bool const parallelSearch,
+ order_type const & order = order_type()
+ )
+ {
+ auto PREQCOPY = planMergeCopyBack(ab,ac,ae,itt,threads,parallelSearch,order);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t t = 0; t < PREQCOPY.first.size(); ++t )
+ PREQCOPY.first[t].dispatch();
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t t = 0; t < PREQCOPY.second.size(); ++t )
+ PREQCOPY.second[t].dispatch();
+ }
+
+ struct SortFinishedInfoInterface
+ {
+ virtual ~SortFinishedInfoInterface() {}
+ };
+
+ struct SortFinishedInterface
+ {
+ virtual ~SortFinishedInterface() {}
+ virtual void sortFinished(std::shared_ptr<SortFinishedInfoInterface>) = 0;
+ };
+
+ struct SortInfo
+ {
+ std::vector < std::vector < std::shared_ptr<SortOp> > > Vop;
+
+ void dispatchOMP(std::size_t const threads)
+ {
+ std::atomic<int> parfailed(0);
+
+ for ( std::size_t i = 0; (! parfailed.load()) && i < Vop.size(); ++i )
+ {
+ std::vector < std::shared_ptr<SortOp> > & Lop = Vop.at(i);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( std::size_t t = 0; t < Lop.size(); ++t )
+ {
+ try
+ {
+ Lop[t]->dispatch();
+ }
+ catch(...)
+ {
+ parfailed.store(1);
+ }
+ }
+ }
+ }
+ };
+
+ struct SortControl
+ {
+ std::shared_ptr<SortInfo> SI;
+ std::vector < std::shared_ptr < std::atomic < std::size_t > > > VA;
+
+ SortControl(std::shared_ptr<SortInfo> rSI)
+ : SI(rSI), VA(SI->Vop.size())
+ {
+ for ( std::size_t i = 0; i < SI->Vop.size(); ++i )
+ {
+ std::shared_ptr< std::atomic<std::size_t> > ptr(
+ new std::atomic<std::size_t>(SI->Vop[i].size())
+ );
+ VA[i] = ptr;
+ }
+ }
+
+ bool dec(std::size_t const j)
+ {
+ std::atomic<std::size_t> & A = *(VA[j]);
+ std::size_t const r = --A;
+ return r == 0;
+ }
+ };
+
+ struct SortPackageData : public libmaus2::parallel::threadpool::ThreadWorkPackageData
+ {
+ libmaus2::util::atomic_shared_ptr<SortControl> SC;
+ std::pair < std::atomic<std::size_t>, std::atomic<std::size_t> > P;
+ std::shared_ptr<SortFinishedInfoInterface> finishedInfo;
+ std::atomic<SortFinishedInterface *> finishedInterface;
+
+ SortPackageData() : SC(), P(std::make_pair(0,0)), finishedInfo(), finishedInterface(nullptr) {}
+ };
+
+ struct SortPackageDispatcher : public libmaus2::parallel::threadpool::ThreadPoolDispatcher
+ {
+ virtual void dispatch(
+ libmaus2::util::shared_ptr<libmaus2::parallel::threadpool::ThreadWorkPackage> package,
+ libmaus2::parallel::threadpool::ThreadPool * pool
+ )
+ {
+ SortPackageData & SPD = dynamic_cast<SortPackageData &>(*(package->data.load()));
+
+ std::size_t const first = SPD.P.first.load();
+ std::size_t const second = SPD.P.second.load();
+
+ SPD.SC.load()->SI->Vop.at(first).at(second)->dispatch();
+
+ bool const leveldone = SPD.SC.load()->dec(first);
+
+ if ( leveldone )
+ {
+ std::size_t const nextlevel = first+1;
+
+ if ( nextlevel < SPD.SC.load()->SI->Vop.size() )
+ {
+ std::size_t const levelsize = SPD.SC.load()->SI->Vop.at(nextlevel).size();
+
+ for ( size_t i = 0; i < levelsize; ++i )
+ {
+ libmaus2::util::shared_ptr<libmaus2::parallel::threadpool::ThreadWorkPackage> newpack(pool->getPackage<SortPackageData>());
+ SortPackageData & newSPD = dynamic_cast<SortPackageData &>(*(newpack->data.load()));
+ newSPD.SC.store(SPD.SC.load());
+ newSPD.P.first.store(nextlevel);
+ newSPD.P.second.store(i);
+ newSPD.finishedInfo = SPD.finishedInfo;
+ newSPD.finishedInterface.store(SPD.finishedInterface.load());
+ pool->enqueue(newpack);
+ }
+ }
+ else
+ {
+ SPD.finishedInterface.load()->sortFinished(SPD.finishedInfo);
+ }
+ }
+ }
+ };
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void sortWithTempSpace(
+ iterator ab,
+ iterator ae,
+ temp_iterator itt,
+ libmaus2::parallel::threadpool::ThreadPool & TP,
+ std::shared_ptr<SortFinishedInfoInterface> finishedInfo,
+ SortFinishedInterface * finishedInterface,
+ order_type const & order = order_type()
+ )
+ {
+ auto const n = std::distance(ab,ae);
+
+ TP.ensureDispatcherRegistered<SortPackageData>(libmaus2::util::shared_ptr<libmaus2::parallel::threadpool::ThreadPoolDispatcher>(new SortPackageDispatcher));
+
+ if ( ! n )
+ {
+ finishedInterface->sortFinished(finishedInfo);
+ return;
+ }
+
+ std::shared_ptr<SortInfo> SI = planSortWithTempSpace(ab,ae,itt,TP.threads,order);
+ libmaus2::util::shared_ptr<SortControl> SC(new SortControl(SI));
+
+ assert ( SI->Vop.size() );
+
+ std::size_t const nextlevel = 0;
+ std::size_t const levelsize = SI->Vop.at(nextlevel).size();
+
+ for ( size_t i = 0; i < levelsize; ++i )
+ {
+ libmaus2::util::shared_ptr<libmaus2::parallel::threadpool::ThreadWorkPackage> newpack(TP.getPackage<SortPackageData>());
+ SortPackageData & newSPD = dynamic_cast<SortPackageData &>(*(newpack->data.load()));
+ newSPD.SC.store(SC);
+ newSPD.P.first.store(nextlevel);
+ newSPD.P.second.store(i);
+ newSPD.finishedInfo = finishedInfo;
+ newSPD.finishedInterface.store(finishedInterface);
+ TP.enqueue(newpack);
+ }
+ }
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void sortWithTempSpaceViaThreadPool(
+ iterator ab,
+ iterator ae,
+ temp_iterator itt,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ libmaus2::parallel::threadpool::ThreadPool TP(threads);
+ TP.registerDispatcher<SortPackageData>(libmaus2::util::shared_ptr<libmaus2::parallel::threadpool::ThreadPoolDispatcher>(new SortPackageDispatcher));
+
+ struct SortFinishedInfo : public SortFinishedInfoInterface
+ {
+ libmaus2::parallel::threadpool::ThreadPool * TP;
+
+ SortFinishedInfo(libmaus2::parallel::threadpool::ThreadPool * rTP)
+ : TP(rTP)
+ {
+ }
+ };
+
+ std::shared_ptr<SortFinishedInfoInterface> info(new SortFinishedInfo(&TP));
+
+ struct SortFinished : public SortFinishedInterface
+ {
+ virtual void sortFinished(std::shared_ptr<SortFinishedInfoInterface> ptr)
+ {
+ SortFinishedInfo & info = dynamic_cast<SortFinishedInfo &>(*ptr);
+ info.TP->terminate();
+ }
+ };
+
+ SortFinished SF;
+
+ sortWithTempSpace(ab,ae,itt,TP,info,&SF,order);
+
+ TP.join();
+ }
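The sortWithTempSpaceViaThreadPool wrapper above blocks in TP.join() until the SortFinished callback terminates the pool. A minimal sketch of that completion pattern, using a plain mutex/condition variable instead of the libmaus2 thread pool (all names here are illustrative, not libmaus2 API):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Completion::signal() plays the role of sortFinished(), and
// Completion::wait() the role of ThreadPool::join(): the caller
// blocks until the worker reports that the whole job is done.
struct Completion
{
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void signal()   // analogue of sortFinished()
    {
        std::lock_guard<std::mutex> lk(m);
        done = true;
        cv.notify_all();
    }

    void wait()     // analogue of ThreadPool::join()
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this]() { return done; });
    }
};
```

A worker thread performs its last piece of work and calls signal(); the launching thread calls wait() and only then tears the workers down.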
+
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static std::shared_ptr<SortInfo> planSortWithTempSpace(
+ iterator ab,
+ iterator ae,
+ temp_iterator itt,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ std::shared_ptr<SortInfo> SI(new SortInfo);
+
+ auto const n = std::distance(ab,ae);
+
+ if ( ! n )
+ return SI;
+
+ typedef typename std::remove_const<decltype(n)>::type index_type;
+
+ index_type const packsize = (n + threads - 1) / threads;
+ index_type const numpacks = (n + packsize - 1) / packsize;
+
+ std::vector < std::pair<index_type,index_type> > Vblock(numpacks);
+
+ std::vector < std::vector < std::shared_ptr<SortOp> > > & Vop = SI->Vop;
+
+ {
+ std::vector < std::shared_ptr<SortOp> > Lop(numpacks);
+
+ for ( index_type t = 0; t < numpacks; ++t )
+ {
+ index_type const low = t * packsize;
+ index_type const high = std::min(static_cast<index_type>(low+packsize),n);
+
+ Vblock[t] = std::make_pair(low,high);
+
+ std::shared_ptr<SortOp> ptr(
+ new SortRequest<iterator,order_type>(ab+low,ab+high,&order)
+ );
+
+ Lop[t] = ptr;
+ }
+
+ Vop.emplace_back(Lop);
+ }
+
+
+ while ( Vblock.size() > 1 )
+ {
+ std::vector < std::pair<index_type,index_type> > Nblock;
+
+ std::vector < std::shared_ptr<SortOp> > Lopmerge;
+ std::vector < std::shared_ptr<SortOp> > Lopcopy;
+
+ std::size_t i = 0;
+
+ for ( ; i + 1 < Vblock.size(); i += 2 )
+ {
+ index_type const low = Vblock[i].first;
+ index_type const middle = Vblock[i].second;
+ assert ( middle == Vblock[i+1].first );
+ index_type const high = Vblock[i+1].second;
+
+ planMergeDelayedSearchP(
+ Lopmerge,
+ ab+low,
+ ab+middle,
+ ab+middle,
+ ab+high,
+ itt+low,
+ threads,
+ order
+ );
+
+ planCopyBack(
+ Lopcopy,
+ ab+low,
+ ab+high,
+ itt+low,
+ threads
+ );
+
+ Nblock.push_back(std::make_pair(low,high));
+ }
+
+ Vop.emplace_back(Lopmerge);
+ Vop.emplace_back(Lopcopy);
+
+ if ( i < Vblock.size() )
+ Nblock.push_back(Vblock[i++]);
+
+ assert ( i == Vblock.size() );
+
+ Nblock.swap(Vblock);
+ }
+
+ return SI;
+ }
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void sortWithTempSpace(
+ iterator ab,
+ iterator ae,
+ temp_iterator itt,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ std::shared_ptr<SortInfo> SI = planSortWithTempSpace(ab,ae,itt,threads,order);
+ SI->dispatchOMP(threads);
+ }
+
+ template<
+ typename iterator,
+ typename temp_iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void sortLimited(
+ iterator ab,
+ iterator ae,
+ temp_iterator itt,
+ std::size_t const limit,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ auto const n = std::distance(ab,ae);
+
+ typedef typename std::remove_const<decltype(n)>::type index_type;
+
+ if ( ! n )
+ return;
+
+ index_type const packsize = (n + threads - 1) / threads;
+ index_type const numpacks = (n + packsize - 1) / packsize;
+
+ std::vector < std::pair<index_type,index_type> > Vblock(numpacks);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( index_type t = 0; t < numpacks; ++t )
+ {
+ index_type const low = t * packsize;
+ index_type const high = std::min(static_cast<index_type>(low+packsize),n);
+
+ Vblock[t] = std::make_pair(low,high);
+
+ std::sort(ab + low, ab + high, order);
+ }
+
+ while ( Vblock.size() > 1 )
+ {
+ std::vector < std::pair<index_type,index_type> > Nblock;
+
+ std::size_t i = 0;
+
+ for ( ; i + 1 < Vblock.size(); i += 2 )
+ {
+ index_type const low = Vblock[i].first;
+ index_type const middle = Vblock[i].second;
+ assert ( middle == Vblock[i+1].first );
+ index_type const high = Vblock[i+1].second;
+
+ mergeLimitedParallel(
+ ab + low,ab + middle,ab + high,
+ itt,
+ limit,
+ threads,
+ order);
+
+ Nblock.push_back(std::make_pair(low,high));
+ }
+
+ if ( i < Vblock.size() )
+ Nblock.push_back(Vblock[i++]);
+
+ assert ( i == Vblock.size() );
+
+ Nblock.swap(Vblock);
+ }
+ }
+
+ template<
+ typename iterator,
+ typename order_type =
+ std::less<
+ typename std::iterator_traits<
+ iterator
+ >::value_type
+ >
+ >
+ static void sortInPlace(
+ iterator ab,
+ iterator ae,
+ std::size_t const threads,
+ order_type const & order = order_type()
+ )
+ {
+ auto const n = std::distance(ab,ae);
+
+ typedef typename std::remove_const<decltype(n)>::type index_type;
+
+ if ( ! n )
+ return;
+
+ index_type const packsize = (n + threads - 1) / threads;
+ index_type const numpacks = (n + packsize - 1) / packsize;
+
+ std::vector < std::pair<index_type,index_type> > Vblock(numpacks);
+
+ #if defined(_OPENMP)
+ #pragma omp parallel for num_threads(threads) schedule(dynamic,1)
+ #endif
+ for ( index_type t = 0; t < numpacks; ++t )
+ {
+ index_type const low = t * packsize;
+ index_type const high = std::min(static_cast<index_type>(low+packsize),n);
+
+ Vblock[t] = std::make_pair(low,high);
+
+ std::sort(ab + low, ab + high, order);
+ }
+
+ while ( Vblock.size() > 1 )
+ {
+ std::vector < std::pair<index_type,index_type> > Nblock;
+
+ std::size_t i = 0;
+
+ for ( ; i + 1 < Vblock.size(); i += 2 )
+ {
+ index_type const low = Vblock[i].first;
+ index_type const middle = Vblock[i].second;
+ assert ( middle == Vblock[i+1].first );
+ index_type const high = Vblock[i+1].second;
+
+ mergeInPlaceParallel(
+ ab + low,ab + middle,ab + high,
+ threads,
+ order);
+
+ Nblock.push_back(std::make_pair(low,high));
+ }
+
+ if ( i < Vblock.size() )
+ Nblock.push_back(Vblock[i++]);
+
+ assert ( i == Vblock.size() );
+
+ Nblock.swap(Vblock);
+ }
+ }
+ };
+ }
+}
+#endif
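The sorters added in this header all follow the same two-phase shape: sort fixed-size blocks concurrently, then merge adjacent block pairs level by level until one block remains. A minimal sketch of that strategy, using std::thread and std::inplace_merge in place of the library's OpenMP and thread-pool machinery (blockSort is a hypothetical name, not libmaus2 API):

```cpp
#include <algorithm>
#include <thread>
#include <utility>
#include <vector>

// Phase 1: split [ab,ae) into up to `threads` blocks and sort each
// in its own thread. Phase 2: merge adjacent sorted blocks pairwise,
// halving the block count per round, until the whole range is sorted.
template<typename It>
void blockSort(It ab, It ae, std::size_t threads)
{
    std::size_t const n = ae - ab;
    if (!n)
        return;

    std::size_t const packsize = (n + threads - 1) / threads;
    std::size_t const numpacks = (n + packsize - 1) / packsize;

    std::vector<std::pair<std::size_t, std::size_t>> blocks(numpacks);
    std::vector<std::thread> T;
    for (std::size_t t = 0; t < numpacks; ++t)
    {
        std::size_t const low = t * packsize;
        std::size_t const high = std::min(low + packsize, n);
        blocks[t] = std::make_pair(low, high);
        T.emplace_back([=]() { std::sort(ab + low, ab + high); });
    }
    for (auto &th : T)
        th.join();

    // merge adjacent sorted blocks; an odd trailing block is carried over
    while (blocks.size() > 1)
    {
        std::vector<std::pair<std::size_t, std::size_t>> next;
        std::size_t i = 0;
        for (; i + 1 < blocks.size(); i += 2)
        {
            std::inplace_merge(ab + blocks[i].first,
                               ab + blocks[i].second,
                               ab + blocks[i + 1].second);
            next.push_back(std::make_pair(blocks[i].first, blocks[i + 1].second));
        }
        if (i < blocks.size())
            next.push_back(blocks[i]);
        blocks.swap(next);
    }
}
```

The library versions additionally parallelise the merges themselves (via rank search) and can run the whole plan through a thread pool; this sketch only serialises the merge phase.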
=====================================
src/libmaus2/vcf/VCFParser.hpp
=====================================
@@ -1274,7 +1274,9 @@ namespace libmaus2
if ( TSC0.size() != TSC1.size() )
{
libmaus2::exception::LibMausException lme;
- lme.getStream() << "[E] malformed VCF line " <<
+ lme.getStream() << "[E] malformed VCF line (sample data for sample "
+ << samplenum << " has " << TSC1.size() << " fields while format field has "
+ << TSC0.size() << ") " <<
std::string(a,T.get(T.size()-1,a).second) << std::endl;
lme.finish();
throw lme;
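The improved message above reports the invariant being checked: each sample column of a VCF line must carry exactly as many ':'-separated fields as the FORMAT column declares. A standalone sketch of that check (split and checkSampleFields are hypothetical helpers, not libmaus2 API; note the simple getline-based split drops a trailing empty field):

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// split a VCF column into its ':'-separated subfields
static std::vector<std::string> split(std::string const &s, char sep)
{
    std::vector<std::string> out;
    std::istringstream in(s);
    std::string tok;
    while (std::getline(in, tok, sep))
        out.push_back(tok);
    return out;
}

// throw with a diagnostic naming the offending sample and both counts,
// mirroring the error text introduced in the patch above
static void checkSampleFields(std::string const &format,
                              std::string const &sample,
                              std::size_t samplenum)
{
    std::size_t const nfmt = split(format, ':').size();
    std::size_t const nsmp = split(sample, ':').size();
    if (nfmt != nsmp)
    {
        std::ostringstream msg;
        msg << "[E] malformed VCF line (sample data for sample " << samplenum
            << " has " << nsmp << " fields while format field has " << nfmt << ")";
        throw std::runtime_error(msg.str());
    }
}
```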
=====================================
src/test/testThreadPoolCramEncode.cpp
=====================================
@@ -21,6 +21,9 @@
#include <libmaus2/util/GetFileSize.hpp>
#include <libmaus2/parallel/NumCpus.hpp>
#include <regex>
+#include <filesystem>
+#include <libmaus2/fastx/FastAReader.hpp>
+#include <libmaus2/digest/md5.hpp>
int main(int argc, char * argv[])
{
@@ -48,7 +51,7 @@ int main(int argc, char * argv[])
return EXIT_FAILURE;
}
- char * ref_cache = getenv("REF_CACHE");
+ char const * ref_cache = getenv("REF_CACHE");
if ( ! ref_cache )
{
@@ -56,8 +59,75 @@ int main(int argc, char * argv[])
return EXIT_FAILURE;
}
+ std::filesystem::path ref_cache_path(ref_cache);
+ std::set<std::filesystem::path> Spathseen;
+
+ while ( std::filesystem::exists(ref_cache_path) && std::filesystem::is_symlink(ref_cache_path) )
+ {
+ if ( Spathseen.find(ref_cache_path) != Spathseen.end() )
+ {
+ std::cerr << "[E] circular loop of symlinks detected for REF_CACHE path" << std::endl;
+ return EXIT_FAILURE;
+ }
+
+ Spathseen.insert(ref_cache_path);
+
+ auto n_ref_cache_path = std::filesystem::read_symlink(ref_cache_path);
+
+ ref_cache_path = n_ref_cache_path;
+ }
+
+ if ( !std::filesystem::exists(ref_cache_path) )
+ {
+ std::cerr << "[E] REF_CACHE directory " << ref_cache_path << " does not exist" << std::endl;
+ return EXIT_FAILURE;
+ }
+
+ // std::cerr << "[V] using REF_CACHE path " << ref_cache_path << std::endl;
+
libmaus2::parallel::threadpool::bam::BamHeaderSeqData seqdata(dictfn);
+ if ( seqdata.sequencesMissing(ref_cache_path) )
+ {
+ std::cerr << "[V] scanning FastA file " << ref << " as REF_CACHE is incomplete" << std::endl;
+
+ libmaus2::fastx::FastAReader FA(ref);
+ libmaus2::fastx::FastAReader::pattern_type pat;
+
+ while ( FA.getNextPatternUnlocked(pat) )
+ {
+ std::size_t o = 0;
+ std::string & s = pat.spattern;
+ for ( std::size_t i = 0; i < s.size(); ++i )
+ if ( s[i] >= 33 && s[i] <= 126 )
+ s[o++] = ::toupper(s[i]);
+ s.resize(o);
+
+ std::string digest;
+ bool const ok = libmaus2::util::MD5::md5(s,digest);
+
+ if ( ! ok )
+ {
+ std::cerr << "[E] failed to compute md5 checksum for " << pat.sid << " in " << ref << std::endl;
+ return EXIT_FAILURE;
+ }
+
+ std::filesystem::path dictpath = ref_cache_path;
+ dictpath /= digest;
+
+ std::cerr << "[V] writing " << dictpath << " for " << pat.sid << " in " << ref << std::endl;
+ libmaus2::aio::OutputStreamInstance OSI(static_cast<std::string>(dictpath));
+ OSI << s;
+ OSI.flush();
+ }
+ }
+
+ if ( seqdata.sequencesMissing(ref_cache_path) )
+ {
+ std::cerr << "[E] unable to create required REF_CACHE files (mismatch between dictionary and FastA?)" << std::endl;
+ return EXIT_FAILURE;
+ }
+
std::size_t const hwthreads = libmaus2::parallel::NumCpus::getNumLogicalProcessors();
std::size_t argthreads = arg.getParsedArgOrDefault<std::size_t>("t", hwthreads);
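The FastA scan above canonicalises each sequence before hashing it for REF_CACHE: only printable characters (ASCII 33 to 126) are kept and upper-cased, so newlines and spaces never enter the MD5. A standalone sketch of that normalisation (normalizeSequence is a hypothetical helper, not libmaus2 API):

```cpp
#include <cctype>
#include <string>

// compact the string in place: keep printable, non-space ASCII
// (33..126), upper-cased; drop everything else (newlines, blanks)
static void normalizeSequence(std::string &s)
{
    std::size_t o = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        unsigned char const c = static_cast<unsigned char>(s[i]);
        if (c >= 33 && c <= 126)
            s[o++] = static_cast<char>(std::toupper(c));
    }
    s.resize(o);
}
```

Note the test must compare the character value, not the loop index; comparing the index (as `i >= 33 || i <= 126`) is always true and would hash layout characters into the digest.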
=====================================
src/test/testsort.cpp
=====================================
@@ -16,7 +16,7 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>..
*/
-
+#include <libmaus2/sorting/RankSearch.hpp>
#include <libmaus2/huffman/LFSupportDecoder.hpp>
#include <libmaus2/huffman/LFRankPosEncoder.hpp>
@@ -767,9 +767,673 @@ void testQuickSort()
}
}
+
int main()
{
+ {
+ std::cerr << "[V] running sort..." << std::endl;
+
+ std::size_t const numtries = 4096;
+ std::size_t const maxlen = 64*1024;
+ int const mod = 8192;
+
+ for ( std::size_t z = 0; z < numtries; ++z )
+ {
+ std::size_t const n = libmaus2::random::Random::rand64() % (maxlen+1);
+
+ // std::size_t const r = n-l;
+
+ std::vector<int> V;
+ for ( std::size_t j = 0; j < n; ++j )
+ V.push_back(libmaus2::random::Random::rand64() % mod);
+ decltype(V) const VR = V;
+
+
+ std::size_t const threads = 10;
+
+ for ( std::size_t z = 0; z < 4; ++z )
+ {
+ V = VR;
+ decltype(V) VT = VR;
+
+ if ( z == 0 )
+ {
+ std::shared_ptr<int[]> T(new int[n]);
+ libmaus2::sorting::RankSearch::sortWithTempSpace(
+ V.begin(),
+ V.end(),
+ T.get(),
+ threads
+ );
+ }
+ else if ( z == 1 )
+ {
+ std::size_t const ln = 1024;
+ std::shared_ptr<int[]> T(new int[ln]);
+ libmaus2::sorting::RankSearch::sortLimited(
+ V.begin(),
+ V.end(),
+ T.get(),
+ ln,
+ threads
+ );
+ }
+ else if ( z == 2 )
+ {
+ libmaus2::sorting::RankSearch::sortInPlace(
+ V.begin(),
+ V.end(),
+ threads
+ );
+ }
+ else if ( z == 3 )
+ {
+ std::shared_ptr<int[]> T(new int[n]);
+ libmaus2::sorting::RankSearch::sortWithTempSpaceViaThreadPool(
+ V.begin(),
+ V.end(),
+ T.get(),
+ threads
+ );
+
+ }
+
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( V[j-1] <= V[j] );
+
+ std::sort(VT.begin(),VT.end());
+ for ( std::size_t j = 0; j < n; ++j )
+ assert ( VT[j] == V[j] );
+ }
+
+ std::size_t const zp = z+1;
+
+ if ( zp % 1024 == 0 )
+ std::cerr << zp << "/" << numtries << std::endl;
+ }
+
+ std::cerr << "[V] running sort done" << std::endl;
+ }
+
+
+ {
+ std::cerr << "[V] running mergeParallelDelayedSearch..." << std::endl;
+
+ std::size_t const numtries = 4096;
+ std::size_t const maxlen = 64*1024;
+ int const mod = 8192;
+
+ for ( std::size_t z = 0; z < numtries; ++z )
+ {
+ std::size_t const n = libmaus2::random::Random::rand64() % (maxlen+1);
+ std::size_t const l = libmaus2::random::Random::rand64() % (n+1);
+
+ // std::size_t const r = n-l;
+
+ std::vector<int> V;
+ for ( std::size_t j = 0; j < n; ++j )
+ V.push_back(libmaus2::random::Random::rand64() % mod);
+
+ std::sort(V.begin(),V.begin()+l);
+ std::sort(V.begin()+l,V.end());
+
+ std::shared_ptr<int[]> T(new int[n]);
+
+ std::size_t const threads = 10;
+ libmaus2::sorting::RankSearch::mergeParallelDelayedSearch(
+ V.begin(),V.begin()+l,
+ V.begin()+l,V.end(),
+ T.get(),
+ threads
+ );
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( T[j-1] <= T[j] );
+
+ std::sort(V.begin(),V.end());
+ for ( std::size_t j = 0; j < n; ++j )
+ assert ( T[j] == V[j] );
+
+ std::size_t const zp = z+1;
+
+ if ( zp % 1024 == 0 )
+ std::cerr << zp << "/" << numtries << std::endl;
+ }
+
+ std::cerr << "[V] running mergeParallelDelayedSearch done" << std::endl;
+ }
+
+
+ {
+ std::cerr << "running short mergeInPlaceParallel test...";
+
+ std::vector<int> VA({1,3,5,7,9,9});
+ std::vector<int> VB({2,4,6,6,8,10});
+ std::vector<int> V;
+ for ( auto i : VA )
+ V.push_back(i);
+ for ( auto i : VB )
+ V.push_back(i);
+
+ libmaus2::sorting::RankSearch::mergeInPlaceParallel(
+ V.begin(),
+ V.begin()+VA.size(),
+ V.end(),
+ 2
+ );
+
+ for ( std::size_t i = 1; i < V.size(); ++i )
+ assert ( V[i-1] <= V[i] );
+
+ std::cerr << "done." << std::endl;
+ }
+
+ {
+ std::cerr << "running random mergeInPlaceParallel test...";
+
+ for ( std::size_t parts = 2; parts <= 16; ++parts )
+ {
+ std::size_t const n0 = 128;
+ std::size_t const n1 = 245;
+ std::vector<int> VA;
+ std::vector<int> VB;
+
+ {
+ int s = libmaus2::random::Random::rand64() % 64;
+
+ for ( std::size_t i = 0; i < n0; ++i )
+ {
+ VA.push_back(s);
+ s += libmaus2::random::Random::rand64() % 4;
+ }
+ }
+ {
+ int s = libmaus2::random::Random::rand64() % 64;
+
+ for ( std::size_t i = 0; i < n1; ++i )
+ {
+ VB.push_back(s);
+ s += libmaus2::random::Random::rand64() % 4;
+ }
+ }
+
+ std::vector<int> V;
+ for ( auto i : VA )
+ V.push_back(i);
+ for ( auto i : VB )
+ V.push_back(i);
+ std::vector<int> VT = V;
+
+ libmaus2::sorting::RankSearch::mergeInPlaceParallel(
+ V.begin(),
+ V.begin()+VA.size(),
+ V.end(),
+ parts
+ );
+
+ for ( std::size_t i = 1; i < V.size(); ++i )
+ assert ( V[i-1] <= V[i] );
+
+ std::sort(VT.begin(),VT.end());
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ assert ( V[i] == VT[i] );
+ }
+
+ std::cerr << "done." << std::endl;
+ }
+
+ {
+ std::cerr << "running random mergeLimitedParallel test...";
+
+ std::size_t const tmpn = 1024;
+ std::shared_ptr<int[]> tptr(new int[tmpn]);
+
+ for ( std::size_t parts = 2; parts <= 16; ++parts )
+ {
+ std::size_t const n0 = 128;
+ std::size_t const n1 = 245;
+ std::vector<int> VA;
+ std::vector<int> VB;
+
+ {
+ int s = libmaus2::random::Random::rand64() % 64;
+
+ for ( std::size_t i = 0; i < n0; ++i )
+ {
+ VA.push_back(s);
+ s += libmaus2::random::Random::rand64() % 4;
+ }
+ }
+ {
+ int s = libmaus2::random::Random::rand64() % 64;
+
+ for ( std::size_t i = 0; i < n1; ++i )
+ {
+ VB.push_back(s);
+ s += libmaus2::random::Random::rand64() % 4;
+ }
+ }
+
+ std::vector<int> V;
+ for ( auto i : VA )
+ V.push_back(i);
+ for ( auto i : VB )
+ V.push_back(i);
+ std::vector<int> VT = V;
+
+ libmaus2::sorting::RankSearch::mergeLimitedParallel(
+ V.begin(),
+ V.begin()+VA.size(),
+ V.end(),
+ tptr.get(),
+ tmpn,
+ parts
+ );
+
+ for ( std::size_t i = 1; i < V.size(); ++i )
+ assert ( V[i-1] <= V[i] );
+
+ std::sort(VT.begin(),VT.end());
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ assert ( V[i] == VT[i] );
+ }
+
+ std::cerr << "done." << std::endl;
+ }
+
+ {
+ std::vector<int> VA({1,3,5,7,9,9});
+ std::vector<int> VB({2,4,6,6,8,10});
+
+
+ std::size_t const zz = VA.size() + VB.size();
+
+ for ( std::size_t i = 0; i <= zz; ++i )
+ {
+ auto const P = libmaus2::sorting::RankSearch::rankSearch(VA.begin(),VA.end(),VB.begin(),VB.end(),i);
+
+ std::size_t const i_a = P.first - VA.begin();
+ std::size_t const i_b = P.second - VB.begin();
+
+ std::cerr << "[0] i=" << i << " i_a=" << i_a << " i_b=" << i_b << " i_a+i_b=" << (i_a+i_b) << std::endl;
+
+ assert ( i_a + i_b == i );
+ }
+
+ for ( std::size_t i = 0; i <= VA.size(); ++i )
+ {
+ auto const P = libmaus2::sorting::RankSearch::rankSearch(VA.begin(),VA.end(),VB.begin(),VB.begin(),i);
+
+ std::size_t const i_a = P.first - VA.begin();
+ std::size_t const i_b = P.second - VB.begin();
+
+ std::cerr << "[1] i=" << i << " i_a=" << i_a << " i_b=" << i_b << " i_a+i_b=" << (i_a+i_b) << std::endl;
+
+ assert ( i_a + i_b == i );
+ }
+
+ for ( std::size_t i = 0; i <= VB.size(); ++i )
+ {
+ auto const P = libmaus2::sorting::RankSearch::rankSearch(VA.begin(),VA.begin(),VB.begin(),VB.end(),i);
+
+ std::size_t const i_a = P.first - VA.begin();
+ std::size_t const i_b = P.second - VB.begin();
+
+ std::cerr << "[2] i=" << i << " i_a=" << i_a << " i_b=" << i_b << " i_a+i_b=" << (i_a+i_b) << std::endl;
+
+ assert ( i_a + i_b == i );
+ }
+
+ for ( std::size_t i = 0; i <= zz; ++i )
+ {
+ typedef std::vector<int>::const_iterator const_iterator;
+
+ {
+ std::vector< std::pair<const_iterator,const_iterator> > const V(
+ {
+ std::make_pair(VA.begin(),VA.end()),
+ std::make_pair(VB.begin(),VB.end())
+ }
+ );
+
+ auto const R = libmaus2::sorting::RankSearch::rankSearchMulti(V,i);
+
+ std::cerr << "[31] i=" << i;
+ std::size_t s = 0;
+
+ for ( std::size_t j = 0; j < R.size(); ++j )
+ {
+ std::size_t const c = R[j] - V[j].first ;
+ s += c;
+ std::cerr << " i_" << j << "=" << c;
+ }
+
+ std::cerr << " sum=" << s << std::endl;
+
+ assert ( s == i );
+ }
+
+ {
+ std::vector< std::pair<const_iterator,const_iterator> > const V(
+ {
+ std::make_pair(VB.begin(),VB.end()),
+ std::make_pair(VA.begin(),VA.end())
+ }
+ );
+
+ auto const R = libmaus2::sorting::RankSearch::rankSearchMulti(V,i);
+
+ std::cerr << "[32] i=" << i;
+ std::size_t s = 0;
+
+ for ( std::size_t j = 0; j < R.size(); ++j )
+ {
+ std::size_t const c = R[j] - V[j].first ;
+ s += c;
+ std::cerr << " i_" << j << "=" << c;
+ }
+
+ std::cerr << " sum=" << s << std::endl;
+
+ assert ( s == i );
+ }
+ }
+
+ {
+ std::vector<int> V;
+
+ for ( std::size_t i = 0; i < VA.size(); ++i )
+ V.push_back(VA[i]);
+ for ( std::size_t i = 0; i < VB.size(); ++i )
+ V.push_back(VB[i]);
+
+ libmaus2::sorting::RankSearch::mergeInPlace(
+ V.begin(),
+ V.begin()+VA.size(),
+ V.begin()+VA.size()+VB.size()
+ );
+
+ for ( std::size_t i = 0; i < V.size(); ++i )
+ std::cerr << "V[" << i << "]=" << V[i] << std::endl;
+ }
+
+
+ {
+ std::cerr << "[V] running mergeParallelCopyBack..." << std::endl;
+
+ std::size_t const numtries = 4096;
+ std::size_t const maxlen = 64*1024;
+ int const mod = 8192;
+
+ for ( std::size_t z = 0; z < numtries; ++z )
+ {
+ std::size_t const n = libmaus2::random::Random::rand64() % (maxlen+1);
+ std::size_t const l = libmaus2::random::Random::rand64() % (n+1);
+
+ // std::size_t const r = n-l;
+
+ std::vector<int> V;
+ for ( std::size_t j = 0; j < n; ++j )
+ V.push_back(libmaus2::random::Random::rand64() % mod);
+ decltype(V) VT = V;
+
+ std::sort(V.begin(),V.begin()+l);
+ std::sort(V.begin()+l,V.end());
+
+ std::shared_ptr<int[]> T(new int[n]);
+
+ std::size_t const threads = 10;
+ bool const parallelSearch = true;
+ libmaus2::sorting::RankSearch::mergeParallelCopyBack(
+ V.begin(),
+ V.begin()+l,
+ V.end(),
+ T.get(),
+ threads,
+ parallelSearch
+ );
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( V[j-1] <= V[j] );
+
+ std::sort(VT.begin(),VT.end());
+ for ( std::size_t j = 0; j < n; ++j )
+ assert ( VT[j] == V[j] );
+
+ std::size_t const zp = z+1;
+
+ if ( zp % 1024 == 0 )
+ std::cerr << zp << "/" << numtries << std::endl;
+ }
+
+ std::cerr << "[V] running mergeParallelCopyBack done" << std::endl;
+ }
+
+
+ {
+ std::cerr << "[V] running mergeParallel..." << std::endl;
+
+ std::size_t const numtries = 4096;
+ std::size_t const maxlen = 64*1024;
+ int const mod = 8192;
+
+ for ( std::size_t z = 0; z < numtries; ++z )
+ {
+ std::size_t const n = libmaus2::random::Random::rand64() % (maxlen+1);
+ std::size_t const l = libmaus2::random::Random::rand64() % (n+1);
+
+ // std::size_t const r = n-l;
+
+ std::vector<int> V;
+ for ( std::size_t j = 0; j < n; ++j )
+ V.push_back(libmaus2::random::Random::rand64() % mod);
+
+ std::sort(V.begin(),V.begin()+l);
+ std::sort(V.begin()+l,V.end());
+
+ std::shared_ptr<int[]> T(new int[n]);
+
+ std::size_t const threads = 10;
+ bool const parallelSearch = true;
+ libmaus2::sorting::RankSearch::mergeParallel(
+ V.begin(),V.begin()+l,
+ V.begin()+l,V.end(),
+ T.get(),
+ threads,
+ parallelSearch
+ );
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( T[j-1] <= T[j] );
+
+ std::sort(V.begin(),V.end());
+ for ( std::size_t j = 0; j < n; ++j )
+ assert ( T[j] == V[j] );
+
+ std::size_t const zp = z+1;
+
+ if ( zp % 1024 == 0 )
+ std::cerr << zp << "/" << numtries << std::endl;
+ }
+
+ std::cerr << "[V] running mergeParallel done" << std::endl;
+ }
+
+
+
+ {
+ std::size_t const numtries = 4096;
+ std::size_t const maxlen = 64*1024;
+ int const mod = 8192;
+
+ for ( std::size_t z = 0; z < numtries; ++z )
+ {
+ std::size_t const n = libmaus2::random::Random::rand64() % (maxlen+1);
+ std::size_t const l = libmaus2::random::Random::rand64() % (n+1);
+
+ // std::size_t const r = n-l;
+
+ std::vector<int> V;
+ for ( std::size_t j = 0; j < n; ++j )
+ V.push_back(libmaus2::random::Random::rand64() % mod);
+
+ std::sort(V.begin(),V.begin()+l);
+ std::sort(V.begin()+l,V.end());
+
+ libmaus2::sorting::RankSearch::mergeInPlace(
+ V.begin(),
+ V.begin()+l,
+ V.end()
+ );
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( V[j-1] <= V[j] );
+
+ std::size_t const zp = z+1;
+
+ if ( zp % 1024 == 0 )
+ std::cerr << zp << "/" << numtries << std::endl;
+ }
+ }
+
+ {
+ std::size_t const numtries = 4096;
+ std::size_t const maxlen = 64*1024;
+ int const mod = 8192;
+ std::size_t const limit = 2048;
+ std::shared_ptr<int[]> tptr(new int[limit]);
+
+ for ( std::size_t z = 0; z < numtries; ++z )
+ {
+ std::size_t const n = libmaus2::random::Random::rand64() % (maxlen+1);
+ std::size_t const l = libmaus2::random::Random::rand64() % (n+1);
+
+ // std::size_t const r = n-l;
+
+ std::vector<int> V;
+ for ( std::size_t j = 0; j < n; ++j )
+ V.push_back(libmaus2::random::Random::rand64() % mod);
+
+ std::sort(V.begin(),V.begin()+l);
+ std::sort(V.begin()+l,V.end());
+
+ libmaus2::sorting::RankSearch::mergeLimited(
+ V.begin(),
+ V.begin()+l,
+ V.end(),
+ tptr.get(),
+ limit
+ );
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( V[j-1] <= V[j] );
+
+ std::size_t const zp = z+1;
+
+ if ( zp % 1024 == 0 )
+ std::cerr << zp << "/" << numtries << std::endl;
+ }
+ }
+
{
+ std::size_t const n = 1024*1024*1024;
+ std::size_t const threads = 24;
+
+ std::cerr << "filling array...";
+ int const mod = 1024*1024;
+ std::vector<int> V(n);
+ for ( std::size_t j = 0; j < n; ++j )
+ V[j] = libmaus2::random::Random::rand64() % mod;
+ std::vector<int> const VRT = V;
+ std::cerr << "done." << std::endl;
+
+ for ( std::size_t z = 0; z < 4; ++z )
+ {
+ std::vector<int> VT = VRT;
+ V = VRT;
+
+ libmaus2::timing::RealTimeClock rtc;
+ rtc.start();
+
+ if ( z == 0 )
+ {
+ std::shared_ptr<int[]> T(new int[n]);
+
+ std::cerr << "[V] sorting with temp space...";
+
+ libmaus2::sorting::RankSearch::sortWithTempSpace(
+ V.begin(),
+ V.end(),
+ T.get(),
+ threads
+ );
+ std::cerr << "done." << std::endl;
+ }
+ else if ( z == 1 )
+ {
+ std::size_t const ln = 64*1024;
+ std::shared_ptr<int[]> T(new int[ln]);
+
+ std::cerr << "[V] sorting with limited temp space...";
+
+ libmaus2::sorting::RankSearch::sortLimited(
+ V.begin(),
+ V.end(),
+ T.get(),
+ ln,
+ threads
+ );
+
+ std::cerr << "done." << std::endl;
+ }
+ else if ( z == 2 )
+ {
+ std::cerr << "[V] sorting without temp space...";
+
+ libmaus2::sorting::RankSearch::sortInPlace(
+ V.begin(),
+ V.end(),
+ threads
+ );
+ std::cerr << "done." << std::endl;
+ }
+ else if ( z == 3 )
+ {
+ std::shared_ptr<int[]> T(new int[n]);
+
+ std::cerr << "[V] sorting with temp space using thread pool...";
+
+ libmaus2::sorting::RankSearch::sortWithTempSpaceViaThreadPool(
+ V.begin(),
+ V.end(),
+ T.get(),
+ threads
+ );
+ std::cerr << "done." << std::endl;
+ }
+ double const tim = rtc.getElapsedSeconds();
+
+ std::cerr << "time " << tim << std::endl;
+
+ rtc.start();
+ std::sort(VT.begin(),VT.end());
+
+ double const timser = rtc.getElapsedSeconds();
+
+ std::cerr << "serial time " << timser << std::endl;
+
+ for ( std::size_t j = 1; j < n; ++j )
+ assert ( V[j-1] <= V[j] );
+ for ( std::size_t j = 0; j < n; ++j )
+ assert ( VT[j] == V[j] );
+ }
+ }
+
+
+ return 0;
+
testQuickSort();
testpairfilesortingSecond(16*1024*1024);
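The rankSearch routine exercised by the tests above answers: given two sorted ranges and a rank i, where do both ranges split so that exactly i elements lie before the cut and none of them exceeds any element after it? A sketch of one standard way to compute this, by binary search over the split position in the first range (rankSearchSketch is an illustrative name, not the libmaus2 implementation):

```cpp
#include <algorithm>
#include <utility>

// Return iterators (pa, pb) with (pa-ab) + (pb-bb) == i such that every
// element before the cut is <= every element after it. Runs in
// O(log min(|A|, |B|)) by bisecting the number i_a taken from A.
template<typename It>
std::pair<It, It> rankSearchSketch(It ab, It ae, It bb, It be, std::size_t i)
{
    std::size_t const na = ae - ab;
    std::size_t const nb = be - bb;
    std::size_t lo = (i > nb) ? (i - nb) : 0;   // smallest feasible i_a
    std::size_t hi = std::min(i, na);           // largest feasible i_a

    while (lo < hi)
    {
        std::size_t const ia = (lo + hi) / 2;   // candidate split in A
        // taking ia from A implies taking i-ia from B; if the last
        // element taken from B exceeds ab[ia], more of A belongs below
        if (ab[ia] < bb[i - ia - 1])
            lo = ia + 1;
        else
            hi = ia;
    }
    return std::make_pair(ab + lo, bb + (i - lo));
}
```

Splitting a merge at ranks k, 2k, ... this way is what lets the merge phases of the sorters above run their pieces independently on separate threads.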
View it on GitLab: https://salsa.debian.org/med-team/libmaus2/-/compare/5dcc41d4ca9cf475e13ec92ff284c528b4836f4b...0cbf640b4377ea51285d3eef56da0e43ac7a4f7e