[med-svn] [Git][med-team/btllib][master] 14 commits: d/rules: simplify

Michael R. Crusoe (@crusoe) gitlab at salsa.debian.org
Mon Jan 13 10:14:52 GMT 2025



Michael R. Crusoe pushed to branch master at Debian Med / btllib


Commits:
7c8e464d by Michael R. Crusoe at 2025-01-13T09:29:30+01:00
d/rules: simplify

- - - - -
bcd13baf by Michael R. Crusoe at 2025-01-13T10:43:45+01:00
d/upstream/metadata: added citation information

- - - - -
3a6fd4d4 by Michael R. Crusoe at 2025-01-13T10:48:31+01:00
debian/patches/find_object_files_at_right_loc.patch: don't enable code coverage measurements.

- - - - -
11e01528 by Michael R. Crusoe at 2025-01-13T10:48:57+01:00
Fixed python package

- - - - -
7fc9a4d5 by Michael R. Crusoe at 2025-01-13T10:50:17+01:00
Disable the Python package so we can upgrade btllib first.

- - - - -
8c221e20 by Michael R. Crusoe at 2025-01-13T11:03:06+01:00
Added autopkgtest using the example code.

- - - - -
33f98734 by Michael R. Crusoe at 2025-01-13T11:03:06+01:00
d/control: Update description of the -tools package to list all 3 programs.

- - - - -
129b1b14 by Michael R. Crusoe at 2025-01-13T11:04:45+01:00
New upstream version 1.7.5+dfsg

- - - - -
21d10c23 by Michael R. Crusoe at 2025-01-13T11:04:45+01:00
New upstream version

- - - - -
622815f0 by Michael R. Crusoe at 2025-01-13T11:04:46+01:00
Update upstream source from tag 'upstream/1.7.5+dfsg'

Update to upstream version '1.7.5+dfsg'
with Debian dir fc25712abe967d06665cee69b0e206224aa5196c

- - - - -
8bb82de7 by Michael R. Crusoe at 2025-01-13T11:04:46+01:00
Standards-Version: 4.7.0 (routine-update)

- - - - -
053a9ecb by Michael R. Crusoe at 2025-01-13T11:05:55+01:00
fix patch fuzz

- - - - -
fc35f5c9 by Michael R. Crusoe at 2025-01-13T11:08:46+01:00
d/control: simplify architecture field using the architecture-properties package

- - - - -
2e1cc9ef by Michael R. Crusoe at 2025-01-13T11:12:38+01:00
routine-update: Ready to upload to unstable

- - - - -


28 changed files:

- .clang-tidy
- README.md
- debian/changelog
- debian/control
- debian/patches/find_object_files_at_right_loc.patch
- + debian/patches/python3
- debian/patches/series
- debian/patches/use_debian_packaged_libs.patch
- − debian/python3-btllib.install
- debian/rules
- + debian/tests/control
- + debian/tests/run-unit-test
- debian/upstream/metadata
- include/btllib/counting_bloom_filter-inl.hpp
- include/btllib/counting_bloom_filter.hpp
- include/btllib/mi_bloom_filter-inl.hpp
- include/btllib/nthash_kmer.hpp
- include/btllib/util.hpp
- meson.build
- recipes/mi_bloom_filter.cpp
- src/btllib/status.cpp
- src/btllib/util.cpp
- tests/counting_bloom_filter.cpp
- tests/large.bam
- tests/python/test_calc_phred_avg.py
- tests/util.cpp
- wrappers/python/btllib.py
- wrappers/python/btllib_wrap.cxx


Changes:

=====================================
.clang-tidy
=====================================
@@ -54,6 +54,24 @@
     -altera-id-dependent-backward-branch,
     -clang-diagnostic-unused-command-line-argument,
     -clang-diagnostic-unneeded-internal-declaration,
+    -readability-convert-member-functions-to-static,
+    -misc-unused-parameters,
+    -misc-include-cleaner,
+    -performance-avoid-endl,
+    -misc-use-anonymous-namespace,
+    -performance-enum-size,
+    -cppcoreguidelines-avoid-const-or-ref-data-members,
+    -cppcoreguidelines-noexcept-move-operations,
+    -clang-analyzer-core.UndefinedBinaryOperatorResult,
+    -misc-header-include-cycle,
+    -misc-const-correctness,
+    -hicpp-noexcept-move,
+    -cppcoreguidelines-avoid-do-while,
+    -readability-static-accessed-through-instance,
+    -performance-noexcept-move-constructor,
+    -hicpp-use-emplace,
+    -readability-avoid-nested-conditional-operator,
+    -modernize-use-emplace,
     -readability-identifier-length',
   WarningsAsErrors: '*',
   CheckOptions: [
@@ -66,4 +84,4 @@
       { key: readability-identifier-naming.GlobalConstantPointerCase, value: UPPER_CASE },
       { key: readability-identifier-naming.ConstexprVariableCase,     value: UPPER_CASE },
   ]
-}
\ No newline at end of file
+}


=====================================
README.md
=====================================
@@ -30,7 +30,8 @@ Using the library
 ---
 - Run time dependencies:
   * SAMtools for reading SAM, BAM, and CRAM files.
-  * gzip, tar, pigz, bzip2, xz, lrzip, zip, and/or 7zip for compressing/decompressing files. Not all of these are necessary, only the ones whose compressions you'll be using. 
+  * gzip, tar, pigz, bzip2, xz, lrzip, zip, and/or 7zip for compressing/decompressing files. Not all of these are necessary, only the ones whose compressions you'll be using.
+    * Note that lrzip is not available on the btllib conda osx-arm64 build
   * wget for downloading sequences from a URL.
 - Building C++ code (`$PREFIX` is the path where btllib is installed):
   * Link your code with `$PREFIX/lib/libbtllib.a` (pass `-L $PREFIX/lib -l btllib` flags to the compiler).
@@ -69,7 +70,7 @@ For btllib developers
 
 The following are all the available `ninja` commands which can be run within `build` directory:
 - `ninja clang-format` formats the whitespace in code (requires clang-format 8+).
-- `ninja wrap` wraps C++ code for Python (requires SWIG 4.0+).
+- `ninja wrap` wraps C++ code for Python (requires SWIG ≥4.0 and <4.3).
 - `ninja clang-tidy` runs clang-tidy on C++ code and makes sure it passes (requires clang-tidy 8+).
 - `ninja` builds the tests and wrapper libraries / makes sure they compile.
 - `ninja test` runs the tests.
@@ -85,6 +86,7 @@ Credits
 - Components:
   - [Hamid Mohamadi](https://github.com/mohamadi) and [Parham Kazemi](https://github.com/parham-k) for [ntHash](https://github.com/bcgsc/ntHash)
   - [Justin Chu](https://github.com/JustinChu) for [MIBloomFilter](https://github.com/bcgsc/btl_bloomfilter)
+  - [Johnathan Wong](https://github.com/jwcodee) for [aaHash](https://github.com/bcgsc/btllib)
 - Included dependencies:
   - [Chase Geigle](https://github.com/skystrife) for [cpptoml](https://github.com/skystrife/cpptoml)
   - Simon Gog, Timo Beller, Alistair Moffat, and Matthias Petri for [sdsl-lite](https://github.com/simongog/sdsl-lite)


=====================================
debian/changelog
=====================================
@@ -1,5 +1,7 @@
-btllib (1.7.0+dfsg-1) UNRELEASED; urgency=medium
+btllib (1.7.5+dfsg-1) unstable; urgency=medium
 
+  * Team upload.
+  [ Andreas Tille ]
   * New upstream version
   * Build-Depends: python3
   * Ignore cmake_options in meson.build
@@ -12,7 +14,22 @@ btllib (1.7.0+dfsg-1) UNRELEASED; urgency=medium
   * Add python3 package:
   TODO: Create modules for different Python3.x versions
 
- -- Andreas Tille <tille at debian.org>  Wed, 07 Feb 2024 08:00:48 +0100
+  [ Michael R. Crusoe ]
+  * d/rules: simplify
+  * d/upstream/metadata: added citation information
+  * debian/patches/find_object_files_at_right_loc.patch: don't enable
+    code coverage measurements.
+  * Fixed Python package.
+  * Disable the Python package so we can upgrade btllib first.
+  * Added autopkgtest using the example code.
+  * d/control: Update description of the -tools package to list all 3
+    programs.
+  * New upstream version
+  * Standards-Version: 4.7.0 (routine-update)
+  * d/control: simplify archicture field using the architecture-
+    properties package
+
+ -- Michael R. Crusoe <crusoe at debian.org>  Mon, 13 Jan 2025 11:11:36 +0100
 
 btllib (1.4.10+dfsg-1) unstable; urgency=medium
 


=====================================
debian/control
=====================================
@@ -4,24 +4,23 @@ Priority: optional
 Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
 Uploaders: Andreas Tille <tille at debian.org>
 Build-Depends: debhelper-compat (= 13),
+               architecture-is-64-bit,
                dh-exec,
-               dh-sequence-python3,
                meson,
                ninja-build,
                libcpptoml-dev,
                libsdsl-dev,
                libomp-dev,
                libargparse-dev,
-               samtools,
-               python3-dev
-Standards-Version: 4.6.2
+               samtools
+Standards-Version: 4.7.0
 Vcs-Browser: https://salsa.debian.org/med-team/btllib
 Vcs-Git: https://salsa.debian.org/med-team/btllib.git
 Homepage: https://github.com/bcgsc/btllib
 Rules-Requires-Root: no
 
 Package: libbtllib-dev
-Architecture: any-amd64 arm64 loong64 mips64el ppc64el s390x ia64 ppc64 riscv64 sparc64 alpha
+Architecture: any
 Section: libdevel
 Depends: ${shlibs:Depends},
          ${misc:Depends},
@@ -44,14 +43,4 @@ Depends: ${shlibs:Depends},
 Description: Bioinformatics Technology Lab common code library tools
  Bioinformatics Technology Lab common code library in C++.
  .
- This package contains the tool indexlr.
-
-Package: python3-btllib
-Architecture: any
-Section: python
-Depends: ${python3:Depends},
-         ${shlibs:Depends},
-         ${misc:Depends},
-Description: Bioinformatics Technology Lab common Python3 wrapper
- This package contains the Python3 wraper for Bioinformatics Technology
- Lab common code.
+ This package contains the tools indexlr, mi_bf_generate, and randseq


=====================================
debian/patches/find_object_files_at_right_loc.patch
=====================================
@@ -1,19 +1,20 @@
 Author: Nilesh Patra
 Last-Update: 2022-09-30 16:32:51 +0530
 Description: Avoid useless cmake checks and trust the known locations in Debian
+Forwarded: not-needed
 
---- a/meson.build
-+++ b/meson.build
+--- btllib.orig/meson.build
++++ btllib/meson.build
 @@ -1,7 +1,7 @@
  project('btllib', 'cpp',
-         version : '1.7.0',
+         version : '1.7.5',
          license : 'GPL3',
 -        default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=true', 'b_coverage=true' ],
-+        default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=false', 'b_coverage=true' ],
++        default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=false', 'b_coverage=false' ],
          meson_version : '>= 0.60.0')
  
  # Configuration
-@@ -47,25 +47,13 @@ add_global_link_arguments(global_link_ar
+@@ -47,25 +47,13 @@
  threads_dep = dependency('threads')
  openmp_dep = dependency('openmp', required : false)
  


=====================================
debian/patches/python3
=====================================
@@ -0,0 +1,11 @@
+From: Michael R. Crusoe <crusoe at debian.org>
+Subject: force python3 to run the tests
+
+--- btllib.orig/scripts/test-wrappers
++++ btllib/scripts/test-wrappers
+@@ -12,4 +12,4 @@
+ cp "${MESON_SOURCE_ROOT}/wrappers/python/btllib.py" "${MESON_BUILD_ROOT}/wrappers/btllib/__init__.py"
+ export PYTHONPATH="${MESON_BUILD_ROOT}/wrappers/"
+ cd "${MESON_SOURCE_ROOT}/tests/python"
+-python -m unittest
++python3 -m unittest


=====================================
debian/patches/series
=====================================
@@ -1,3 +1,4 @@
 find_object_files_at_right_loc.patch
 # shared+static_lib.patch
 use_debian_packaged_libs.patch
+python3


=====================================
debian/patches/use_debian_packaged_libs.patch
=====================================
@@ -1,6 +1,7 @@
 Author: Andreas Tille <tille at debian.org>
 Last-Update: Wed, 05 Oct 2022 12:57:29 +0200
 Description: There is nothing to copy if libcpptoml-dev package is used
+Forwarded: not-needed
 
 --- a/scripts/install-cpptoml
 +++ b/scripts/install-cpptoml


=====================================
debian/python3-btllib.install deleted
=====================================
@@ -1 +0,0 @@
-usr/lib/btllib/python/btllib/*	usr/lib/python3/btllib


=====================================
debian/rules
=====================================
@@ -7,10 +7,5 @@ export DEB_BUILD_MAINT_OPTIONS=hardening=+all
 %:
 	dh $@
 
-override_dh_auto_test:
+execute_before_dh_auto_test:
 	cp -a tests/ obj-*/
-	dh_auto_test
-
-override_dh_missing:
-	find debian -name setup.py -delete
-	dh_missing


=====================================
debian/tests/control
=====================================
@@ -0,0 +1,3 @@
+Tests: run-unit-test
+Depends: @, g++
+Restrictions: allow-stderr


=====================================
debian/tests/run-unit-test
=====================================
@@ -0,0 +1,17 @@
+#!/bin/bash
+set -e
+
+pkg=btllib
+
+export LC_ALL=C.UTF-8
+if [ "${AUTOPKGTEST_TMP}" = "" ] ; then
+  AUTOPKGTEST_TMP=$(mktemp -d /tmp/${pkg}-test.XXXXXX)
+  # shellcheck disable=SC2064
+  trap "rm -rf ${AUTOPKGTEST_TMP}" 0 INT QUIT ABRT PIPE TERM
+fi
+
+cp -a examples/* "${AUTOPKGTEST_TMP}"
+
+cd "${AUTOPKGTEST_TMP}"
+
+g++ nthash_spacedseeds.cpp -std=c++17 -lbtllib -fopenmp && ./a.out


=====================================
debian/upstream/metadata
=====================================
@@ -1,3 +1,14 @@
+Reference:
+  Author: Vladimir Nikolić and Parham Kazemi and Lauren Coombe and Johnathan Wong and Amirhossein Afshinfard and Justin Chu and  René L. Warren and Inanç Birol
+  Title: |
+    btllib: A C++ library with Python interface for efficient genomic sequence processing
+  Journal: Journal of Open Source Software
+  Year: 2022
+  Volume: 7
+  Number: 79
+  DOI: 10.21105/joss.04720
+  URL: https://joss.theoj.org/papers/10.21105/joss.04720
+  Eprint: https://www.theoj.org/joss-papers/joss.04720/10.21105.joss.04720.pdf
 Bug-Database: https://github.com/bcgsc/btllib/issues
 Bug-Submit: https://github.com/bcgsc/btllib/issues/new
 Registry:


=====================================
include/btllib/counting_bloom_filter-inl.hpp
=====================================
@@ -56,13 +56,12 @@ inline CountingBloomFilter<T>::CountingBloomFilter(size_t bytes,
  */
 template<typename T>
 inline void
-CountingBloomFilter<T>::insert(const uint64_t* hashes, T min_val)
+CountingBloomFilter<T>::set(const uint64_t* hashes, T min_val, T new_val)
 {
   // Update flag to track if increment is done on at least one counter
   bool update_done = false;
-  T new_val, tmp_min_val;
+  T tmp_min_val;
   while (true) {
-    new_val = min_val + 1;
     for (size_t i = 0; i < hash_num; ++i) {
       tmp_min_val = min_val;
       update_done |= array[hashes[i] % array_size].compare_exchange_strong(
@@ -80,59 +79,25 @@ CountingBloomFilter<T>::insert(const uint64_t* hashes, T min_val)
 
 template<typename T>
 inline void
-CountingBloomFilter<T>::insert(const uint64_t* hashes)
+CountingBloomFilter<T>::insert(const uint64_t* hashes, T n)
 {
-  contains_insert(hashes);
+  contains_insert(hashes, n);
 }
 
 template<typename T>
 inline void
 CountingBloomFilter<T>::remove(const uint64_t* hashes)
 {
-  // Update flag to track if increment is done on at least one counter
-  bool update_done = false;
   T min_val = contains(hashes);
-  T new_val, tmp_min_val;
-  while (true) {
-    new_val = min_val - 1;
-    for (size_t i = 0; i < hash_num; ++i) {
-      tmp_min_val = min_val;
-      update_done |= array[hashes[i] % array_size].compare_exchange_strong(
-        tmp_min_val, new_val);
-    }
-    if (update_done) {
-      break;
-    }
-    min_val = contains(hashes);
-    if (min_val == std::numeric_limits<T>::max()) {
-      break;
-    }
-  }
+  set(hashes, min_val, min_val > 1 ? min_val - 1 : 0);
 }
 
 template<typename T>
 void
 CountingBloomFilter<T>::clear(const uint64_t* hashes)
 {
-  // Update flag to track if increment is done on at least one counter
-  bool update_done = false;
   T min_val = contains(hashes);
-  T new_val, tmp_min_val;
-  while (true) {
-    new_val = 0;
-    for (size_t i = 0; i < hash_num; ++i) {
-      tmp_min_val = min_val;
-      update_done |= array[hashes[i] % array_size].compare_exchange_strong(
-        tmp_min_val, new_val);
-    }
-    if (update_done) {
-      break;
-    }
-    min_val = contains(hashes);
-    if (min_val == std::numeric_limits<T>::max()) {
-      break;
-    }
-  }
+  set(hashes, min_val, 0);
 }
 
 template<typename T>
@@ -151,23 +116,23 @@ CountingBloomFilter<T>::contains(const uint64_t* hashes) const
 
 template<typename T>
 inline T
-CountingBloomFilter<T>::contains_insert(const uint64_t* hashes)
+CountingBloomFilter<T>::contains_insert(const uint64_t* hashes, T n)
 {
   const auto count = contains(hashes);
-  if (count < std::numeric_limits<T>::max()) {
-    insert(hashes, count);
+  if (count <= std::numeric_limits<T>::max() - n) {
+    set(hashes, count, count + n);
   }
   return count;
 }
 
 template<typename T>
 inline T
-CountingBloomFilter<T>::insert_contains(const uint64_t* hashes)
+CountingBloomFilter<T>::insert_contains(const uint64_t* hashes, T n)
 {
   const auto count = contains(hashes);
-  if (count < std::numeric_limits<T>::max()) {
-    insert(hashes, count);
-    return count + 1;
+  if (count <= std::numeric_limits<T>::max() - n) {
+    set(hashes, count, count + n);
+    return count + n;
   }
   return std::numeric_limits<T>::max();
 }
@@ -179,7 +144,7 @@ CountingBloomFilter<T>::insert_thresh_contains(const uint64_t* hashes,
 {
   const auto count = contains(hashes);
   if (count < threshold) {
-    insert(hashes, count);
+    set(hashes, count, count + 1);
     return count + 1;
   }
   return count;
@@ -192,7 +157,7 @@ CountingBloomFilter<T>::contains_insert_thresh(const uint64_t* hashes,
 {
   const auto count = contains(hashes);
   if (count < threshold) {
-    insert(hashes, count);
+    set(hashes, count, count + 1);
   }
   return count;
 }
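
Note on the overflow guard: contains_insert() above only calls set() when `count <= std::numeric_limits<T>::max() - n`, so that `count + n` still fits in the counter type. The following is a minimal C++ sketch of that check in isolation. The helper names fits_after_increment and saturating_add are hypothetical and are not part of btllib; the clamping variant is an illustration, not something this diff does.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of btllib: true when `count + n` still
// fits in T, i.e. when the guard used by contains_insert() passes.
template<typename T>
constexpr bool
fits_after_increment(T count, T n)
{
  // For narrow types such as uint8_t both operands are promoted to int,
  // so `max() - n` cannot wrap and the comparison is exact. For wider
  // unsigned types `max() - n` is still non-negative because n <= max().
  return count <= std::numeric_limits<T>::max() - n;
}

// Hypothetical helper: clamp to the maximum instead of skipping the update.
template<typename T>
constexpr T
saturating_add(T count, T n)
{
  return fits_after_increment(count, n)
           ? static_cast<T>(count + n)
           : std::numeric_limits<T>::max();
}
```

With an 8-bit counter, saturating_add<std::uint8_t>(250, 5) yields 255 exactly, while saturating_add<std::uint8_t>(251, 5) would overflow and is clamped to 255. Writing the guard with `+ n` instead of `- n` would always pass for 8-bit counters, because the right-hand side is then computed as an int larger than any counter value.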


=====================================
include/btllib/counting_bloom_filter.hpp
=====================================
@@ -75,15 +75,20 @@ public:
    *
    * @param hashes Integer array of the element's hash values. Array size should
    * equal the hash_num argument used when the Bloom filter was constructed.
+   * @param n Increment value
    */
-  void insert(const uint64_t* hashes);
+  void insert(const uint64_t* hashes, T n = 1);
 
   /**
    * Insert an element.
    *
    * @param hashes Integer vector of the element's hash values.
+   * @param n Increment value
    */
-  void insert(const std::vector<uint64_t>& hashes) { insert(hashes.data()); }
+  void insert(const std::vector<uint64_t>& hashes, T n = 1)
+  {
+    insert(hashes.data(), n);
+  }
 
   /**
    * Delete an element.
@@ -142,21 +147,23 @@ public:
    *
    * @param hashes Integer array of the element's hash values. Array size should
    * equal the hash_num argument used when the Bloom filter was constructed.
+   * @param n Increment value
    *
    * @return The count of the queried element before insertion.
    */
-  T contains_insert(const uint64_t* hashes);
+  T contains_insert(const uint64_t* hashes, T n = 1);
 
   /**
    * Get the count of an element and then increment the count.
    *
    * @param hashes Integer vector of the element's hash values.
+   * @param n Increment value
    *
    * @return The count of the queried element before insertion.
    */
-  T contains_insert(const std::vector<uint64_t>& hashes)
+  T contains_insert(const std::vector<uint64_t>& hashes, T n = 1)
   {
-    return contains_insert(hashes.data());
+    return contains_insert(hashes.data(), n);
   }
 
   /**
@@ -165,21 +172,23 @@ public:
    * @param hashes Integer array of the element's hash values. Array size
    * should equal the hash_num argument used when the Bloom filter was
    * constructed.
+   * @param n Increment value
    *
    * @return The count of the queried element after insertion.
    */
-  T insert_contains(const uint64_t* hashes);
+  T insert_contains(const uint64_t* hashes, T n = 1);
 
   /**
    * Increment an element's count and then return the count.
    *
    * @param hashes Integer vector of the element's hash values.
+   * @param n Increment value
    *
    * @return The count of the queried element after insertion.
    */
-  T insert_contains(const std::vector<uint64_t>& hashes)
+  T insert_contains(const std::vector<uint64_t>& hashes, T n = 1)
   {
-    return insert_contains(hashes.data());
+    return insert_contains(hashes.data(), n);
   }
 
   /**
@@ -280,7 +289,7 @@ public:
 private:
   CountingBloomFilter(const std::shared_ptr<BloomFilterInitializer>& bfi);
 
-  void insert(const uint64_t* hashes, T min_val);
+  void set(const uint64_t* hashes, T min_val, T new_val);
 
   friend class KmerCountingBloomFilter<T>;
 
@@ -346,17 +355,22 @@ public:
    *
    * @param hashes Integer array of the k-mer's hash values. Array size should
    * equal the hash_num argument used when the Bloom filter was constructed.
+   * @param n Increment value
    */
-  void insert(const uint64_t* hashes) { counting_bloom_filter.insert(hashes); }
+  void insert(const uint64_t* hashes, T n = 1)
+  {
+    counting_bloom_filter.insert(hashes, n);
+  }
 
   /**
    * Insert a k-mer into the filter.
    *
    * @param hashes Integer vector of the k-mer's hash values.
+   * @param n Increment value
    */
-  void insert(const std::vector<uint64_t>& hashes)
+  void insert(const std::vector<uint64_t>& hashes, T n = 1)
   {
-    counting_bloom_filter.insert(hashes);
+    counting_bloom_filter.insert(hashes, n);
   }
 
   /**
@@ -499,24 +513,26 @@ public:
    *
    * @param hashes Integer array of the k-mers's hash values. Array size should
    * equal the hash_num argument used when the Bloom filter was constructed.
+   * @param n Increment value
    *
    * @return The count of the queried k-mer before insertion.
    */
-  T contains_insert(const uint64_t* hashes)
+  T contains_insert(const uint64_t* hashes, T n = 1)
   {
-    return counting_bloom_filter.contains_insert(hashes);
+    return counting_bloom_filter.contains_insert(hashes, n);
   }
 
   /**
    * Get the count of a k-mer and then increment the count.
    *
    * @param hashes Integer vector of the k-mer's hash values.
+   * @param n Increment value
    *
    * @return The count of the queried k-mer before insertion.
    */
-  T contains_insert(const std::vector<uint64_t>& hashes)
+  T contains_insert(const std::vector<uint64_t>& hashes, T n = 1)
   {
-    return counting_bloom_filter.contains_insert(hashes);
+    return counting_bloom_filter.contains_insert(hashes, n);
   }
 
   /**
@@ -547,24 +563,26 @@ public:
    * @param hashes Integer array of the k-mer's hash values. Array size
    * should equal the hash_num argument used when the Bloom filter was
    * constructed.
+   * @param n Increment value
    *
    * @return The count of the queried k-mer after insertion.
    */
-  T insert_contains(const uint64_t* hashes)
+  T insert_contains(const uint64_t* hashes, T n = 1)
   {
-    return counting_bloom_filter.insert_contains(hashes);
+    return counting_bloom_filter.insert_contains(hashes, n);
   }
 
   /**
    * Increment a k-mer's count and then return the count.
    *
    * @param hashes Integer vector of the k-mer's hash values.
+   * @param n Increment value
    *
    * @return The count of the queried k-mer after insertion.
    */
-  T insert_contains(const std::vector<uint64_t>& hashes)
+  T insert_contains(const std::vector<uint64_t>& hashes, T n = 1)
   {
-    return counting_bloom_filter.insert_contains(hashes);
+    return counting_bloom_filter.insert_contains(hashes, n);
   }
 
   /**


=====================================
include/btllib/mi_bloom_filter-inl.hpp
=====================================
@@ -190,7 +190,7 @@ MIBloomFilter<T>::insert_id(const uint64_t* hashes, const T& id)
 {
   assert(bv_insertion_completed && !id_insertion_completed);
 
-  uint rand = std::rand(); // NOLINT
+  uint32_t rand = std::rand(); // NOLINT
   for (unsigned i = 0; i < hash_num; ++i) {
     uint64_t rank = get_rank_pos(hashes[i]);
     uint16_t count = ++counts_array[rank];


=====================================
include/btllib/nthash_kmer.hpp
=====================================
@@ -493,7 +493,7 @@ private:
     bool has_n = true;
     while (pos <= seq_len - k + 1 && has_n) {
       has_n = false;
-      for (unsigned i = 0; i < k; i++) {
+      for (unsigned i = 0; i < k && pos <= seq_len - k + 1; i++) {
         if (SEED_TAB[(unsigned char)seq[pos + k - i - 1]] == SEED_N) {
           pos += k - i;
           has_n = true;


=====================================
include/btllib/util.hpp
=====================================
@@ -111,6 +111,19 @@ get_basename(const std::string& path);
 std::string
 get_dirname(const std::string& path);
 
+/**
+ * Calculate the sum of the phred scores of a string.
+ *
+ * @param qual The quality string to calculate the sum from.
+ * @param start_pos The start position of the substring. Defaults to 0.
+ * @param len The length of the substring. Defaults to 0. If 0, the whole string
+ * is used.
+ *
+ * @return The sum of the phred scores of the substring.
+ */
+double
+sum_phred(const std::string& qual, size_t start_pos = 0, size_t len = 0);
+
 /**
  * Calculate the average phred score of a string,
  * depending on the start position and length.


=====================================
meson.build
=====================================
@@ -1,5 +1,5 @@
 project('btllib', 'cpp',
-        version : '1.7.0',
+        version : '1.7.5',
         license : 'GPL3',
         default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=true', 'b_coverage=true' ],
         meson_version : '>= 0.60.0')


=====================================
recipes/mi_bloom_filter.cpp
=====================================
@@ -252,15 +252,15 @@ main(int argc, char* argv[])
                                  btllib::SeqReader::Flag::SHORT_MODE,
                                  DEFAULT_SEQ_READER_THREADS);
 #pragma omp parallel default(none) shared(ids,                                 \
-                                          id_counter,                          \
-                                          mi_bf,                               \
-                                          mi_bf_stage,                         \
-                                          hash_num,                            \
-                                          kmer_size,                           \
-                                          by_file,                             \
-                                          spaced_seed_set,                     \
-                                          spaced_seeds,                        \
-                                          reader)
+                                            id_counter,                        \
+                                            mi_bf,                             \
+                                            mi_bf_stage,                       \
+                                            hash_num,                          \
+                                            kmer_size,                         \
+                                            by_file,                           \
+                                            spaced_seed_set,                   \
+                                            spaced_seeds,                      \
+                                            reader)
         try {
           for (const auto record : reader) {
 #pragma omp critical


=====================================
src/btllib/status.cpp
=====================================
@@ -31,25 +31,25 @@ get_time()
 void
 log_info(const std::string& msg)
 {
-  std::cerr << ('[' + get_time() + "]" + PRINT_COLOR_INFO + "[INFO] " +
-                PRINT_COLOR_END + msg + '\n')
-            << std::flush;
+  std::string info_msg = "[" + get_time() + "]" + PRINT_COLOR_INFO + "[INFO] " +
+                         PRINT_COLOR_END + msg;
+  std::cerr << info_msg << std::endl;
 }
 
 void
 log_warning(const std::string& msg)
 {
-  std::cerr << ('[' + get_time() + "]" + PRINT_COLOR_WARNING + "[WARNING] " +
-                PRINT_COLOR_END + msg + '\n')
-            << std::flush;
+  std::string warning_msg = "[" + get_time() + "]" + PRINT_COLOR_WARNING +
+                            "[WARNING] " + PRINT_COLOR_END + msg;
+  std::cerr << warning_msg << std::endl;
 }
 
 void
 log_error(const std::string& msg)
 {
-  std::cerr << ('[' + get_time() + "]" + PRINT_COLOR_ERROR + "[ERROR] " +
-                PRINT_COLOR_END + msg + '\n')
-            << std::flush;
+  std::string error_msg = "[" + get_time() + "]" + PRINT_COLOR_ERROR +
+                          "[ERROR] " + PRINT_COLOR_END + msg;
+  std::cerr << error_msg << std::endl;
 }
 
 void
@@ -111,4 +111,4 @@ check_file_accessibility(const std::string& filepath)
   btllib::check_error(ret != 0, get_strerror() + ": " + filepath);
 }
 
-} // namespace btllib
\ No newline at end of file
+} // namespace btllib


=====================================
src/btllib/util.cpp
=====================================
@@ -3,7 +3,9 @@
 #include "btllib/status.hpp"
 
 #include <algorithm>
+#include <cmath>
 #include <condition_variable>
+#include <cstdlib>
 #include <cstring>
 #include <mutex>
 #include <string>
@@ -158,6 +160,23 @@ get_basename(const std::string& path)
   return path.substr(p + 1);
 }
 
+double
+sum_phred(const std::string& qual, const size_t start_pos, size_t len)
+{
+  double phred_sum = 0;
+  static constexpr double PHRED_OFFSET = 33.0;
+  for (size_t i = start_pos; i < start_pos + len; ++i) {
+    // Convert ASCII character to Phred score
+    int phred_score = (int)(qual.at(i) - PHRED_OFFSET);
+
+    // Delog the Phred score: 10^(-Q/10)
+    double delog_phred = pow(10.0, -phred_score / 10.0);
+
+    phred_sum += delog_phred;
+  }
+  return phred_sum;
+}
+
 double
 calc_phred_avg(const std::string& qual, const size_t start_pos, size_t len)
 {
@@ -170,14 +189,9 @@ calc_phred_avg(const std::string& qual, const size_t start_pos, size_t len)
     std::exit(EXIT_FAILURE); // NOLINT(concurrency-mt-unsafe)
   }
 
-  size_t phred_sum = 0;
-
-  for (size_t i = start_pos; i < start_pos + len; ++i) {
-    phred_sum += (size_t)qual.at(i);
-  }
+  double phred_sum = sum_phred(qual, start_pos, len);
 
-  static constexpr double PHRED_OFFSET = 33.0;
-  return ((double)phred_sum / (double)len) - PHRED_OFFSET;
+  return -10 * log10(phred_sum / len);
 }
 
 void
@@ -194,4 +208,4 @@ Barrier::wait()
   }
 }
 
-} // namespace btllib
\ No newline at end of file
+} // namespace btllib
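
Note on the changed averaging: calc_phred_avg() no longer returns the arithmetic mean of the Phred scores. It now converts each score Q to an error probability 10^(-Q/10), averages the probabilities, and converts the mean back with -10 * log10. The following is a minimal standalone C++ sketch of that computation, assuming Phred+33 encoding as in the diff. The function name phred_avg_sketch is hypothetical, and unlike the library function it does not treat len == 0 as "whole string".

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Illustrative re-statement of the averaging in this diff; the function
// name is hypothetical. Out-of-range positions throw via std::string::at().
// The caller must pass len > 0.
double
phred_avg_sketch(const std::string& qual, std::size_t start, std::size_t len)
{
  constexpr int PHRED_OFFSET = 33; // Phred+33 encoding, as in the diff
  double err_sum = 0.0;
  for (std::size_t i = start; i < start + len; ++i) {
    const int q = qual.at(i) - PHRED_OFFSET;
    err_sum += std::pow(10.0, -q / 10.0); // error probability for score q
  }
  // Average the probabilities, then convert back to the Phred scale.
  return -10.0 * std::log10(err_sum / static_cast<double>(len));
}
```

For the first ten characters of the test string, "$$%%)*0)'%", the scores are 3, 3, 4, 4, 8, 9, 15, 8, 6, 4. Their arithmetic mean is 6.4 (the old expected value), while averaging in probability space gives about 5.3426 (the new expected value), because low-quality bases dominate the mean error probability.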


=====================================
tests/counting_bloom_filter.cpp
=====================================
@@ -215,5 +215,19 @@ main()
     TEST_ASSERT_EQ(cbf.contains(hashes), 0);
   }
 
+  {
+    std::cerr << "Testing CBF element initialization" << std::endl;
+    std::vector<uint64_t> hashes = { 0x47c80ef7eab,
+                                     0x8b4a469ef6,
+                                     0x32e7ab5203 };
+    btllib::CountingBloomFilter8 cbf(64, hashes.size());
+    cbf.insert(hashes, 2);
+    TEST_ASSERT_EQ(cbf.contains(hashes), 2);
+    cbf.insert(hashes, 5);
+    TEST_ASSERT_EQ(cbf.contains(hashes), 7);
+    cbf.insert(hashes);
+    TEST_ASSERT_EQ(cbf.contains(hashes), 8);
+  }
+
   return 0;
 }
\ No newline at end of file
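
Note on the new test: it exercises insert() with an explicit increment. The following is a toy C++ sketch of how a minimum-based counting filter with compare-and-swap updates can behave under the same sequence of calls. The class ToyCountingFilter is hypothetical and deliberately simplified (no retry loop, fixed 8-bit counters); it is not btllib's implementation.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Toy counting Bloom filter for illustration only; the class name and
// layout are hypothetical and do not mirror btllib's implementation.
class ToyCountingFilter
{
public:
  // Counters are value-initialized, so every count starts at zero.
  explicit ToyCountingFilter(std::size_t size)
    : counters(size)
  {
  }

  // The count of an element is the minimum over its counters.
  std::uint8_t contains(const std::vector<std::uint64_t>& hashes) const
  {
    std::uint8_t min_val = std::numeric_limits<std::uint8_t>::max();
    for (const auto h : hashes) {
      const std::uint8_t c = counters[h % counters.size()].load();
      if (c < min_val) {
        min_val = c;
      }
    }
    return min_val;
  }

  // Raise the count by n unless that would overflow the counter type.
  void insert(const std::vector<std::uint64_t>& hashes, std::uint8_t n = 1)
  {
    const std::uint8_t count = contains(hashes);
    if (count > std::numeric_limits<std::uint8_t>::max() - n) {
      return;
    }
    const auto new_val = static_cast<std::uint8_t>(count + n);
    for (const auto h : hashes) {
      std::uint8_t expected = count;
      // Only counters that currently hold the minimum are updated.
      counters[h % counters.size()].compare_exchange_strong(expected, new_val);
    }
  }

private:
  std::vector<std::atomic<std::uint8_t>> counters;
};
```

Inserting the same hashes with increments 2, 5, and the default 1 into an empty ToyCountingFilter of 64 counters leaves contains() at 2, 7, and 8 in turn, mirroring the assertions in the test above. Only counters equal to the current minimum are changed, which is why two hashes that map to the same slot do not cause a double increment.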


=====================================
tests/large.bam
=====================================
Binary files a/tests/large.bam and b/tests/large.bam differ


=====================================
tests/python/test_calc_phred_avg.py
=====================================
@@ -7,9 +7,9 @@ class CalcPhredAvgTests(unittest.TestCase):
     def test_calc_phred_avg(self):
         qual = "$$%%)*0)'%%&$$%&$&'''*)(((((()55561--.12356577-++**++,////.*))((()+))**010/..--+**++*+++)++++78883"
         self.assertAlmostEqual(
-            btllib.calc_phred_avg(qual, 0, 10), 6.4, places=3)
-        self.assertAlmostEqual(btllib.calc_phred_avg(qual), 10.949, places=3)
+            btllib.calc_phred_avg(qual, 0, 10), 5.34264, places=3)
+        self.assertAlmostEqual(btllib.calc_phred_avg(qual), 8.54327, places=3)
         self.assertAlmostEqual(
-            btllib.calc_phred_avg(qual, 0, 4), 3.5, places=3)
+            btllib.calc_phred_avg(qual, 0, 4), 3.47128, places=3)
         self.assertAlmostEqual(
-            btllib.calc_phred_avg(qual, 5, 20), 6.15, places=3)
+            btllib.calc_phred_avg(qual, 5, 20), 5.48923, places=3)


=====================================
tests/util.cpp
=====================================
@@ -32,10 +32,10 @@ main()
   double avg1 = btllib::calc_phred_avg(qual);
   double avg2 = btllib::calc_phred_avg(qual, 0, 4);
   double avg3 = btllib::calc_phred_avg(qual, 5, 20);
-  TEST_ASSERT_LT(std::abs(avg - 6.4), 1e-4);
-  TEST_ASSERT_LT(std::abs(avg1 - 10.949), 1e-4);
-  TEST_ASSERT_LT(std::abs(avg2 - 3.5), 1e-4);
-  TEST_ASSERT_LT(std::abs(avg3 - 6.15), 1e-4);
+  TEST_ASSERT_LT(std::abs(avg - 5.34264), 1e-4);
+  TEST_ASSERT_LT(std::abs(avg1 - 8.54327), 1e-4);
+  TEST_ASSERT_LT(std::abs(avg2 - 3.47128), 1e-4);
+  TEST_ASSERT_LT(std::abs(avg3 - 5.48923), 1e-4);
 
   return 0;
-}
\ No newline at end of file
+}


=====================================
wrappers/python/btllib.py
=====================================
@@ -1,5 +1,5 @@
 # This file was automatically generated by SWIG (https://www.swig.org).
-# Version 4.1.1
+# Version 4.2.1
 #
 # Do not make changes to this file unless you know what you are doing - modify
 # the SWIG interface file instead.


=====================================
wrappers/python/btllib_wrap.cxx
=====================================
The diff for this file was not included because it is too large.


View it on GitLab: https://salsa.debian.org/med-team/btllib/-/compare/d88b1e044bedcaf1fa7ffd67dd4c758d76cfa905...2e1cc9ef3acac3291e689427c1df23447286c135
