[med-svn] [Git][med-team/btllib][master] 14 commits: d/rules: simplify
Michael R. Crusoe (@crusoe)
gitlab at salsa.debian.org
Mon Jan 13 10:14:52 GMT 2025
Michael R. Crusoe pushed to branch master at Debian Med / btllib
Commits:
7c8e464d by Michael R. Crusoe at 2025-01-13T09:29:30+01:00
d/rules: simplify
- - - - -
bcd13baf by Michael R. Crusoe at 2025-01-13T10:43:45+01:00
d/upstream/metadata: added citation information
- - - - -
3a6fd4d4 by Michael R. Crusoe at 2025-01-13T10:48:31+01:00
debian/patches/find_object_files_at_right_loc.patch: don't enable code coverage measurements.
- - - - -
11e01528 by Michael R. Crusoe at 2025-01-13T10:48:57+01:00
Fixed Python package
- - - - -
7fc9a4d5 by Michael R. Crusoe at 2025-01-13T10:50:17+01:00
Disable the Python package so we can upgrade btllib first.
- - - - -
8c221e20 by Michael R. Crusoe at 2025-01-13T11:03:06+01:00
Added autopkgtest using the example code.
- - - - -
33f98734 by Michael R. Crusoe at 2025-01-13T11:03:06+01:00
d/control: Update description of the -tools package to list all 3 programs.
- - - - -
129b1b14 by Michael R. Crusoe at 2025-01-13T11:04:45+01:00
New upstream version 1.7.5+dfsg
- - - - -
21d10c23 by Michael R. Crusoe at 2025-01-13T11:04:45+01:00
New upstream version
- - - - -
622815f0 by Michael R. Crusoe at 2025-01-13T11:04:46+01:00
Update upstream source from tag 'upstream/1.7.5+dfsg'
Update to upstream version '1.7.5+dfsg'
with Debian dir fc25712abe967d06665cee69b0e206224aa5196c
- - - - -
8bb82de7 by Michael R. Crusoe at 2025-01-13T11:04:46+01:00
Standards-Version: 4.7.0 (routine-update)
- - - - -
053a9ecb by Michael R. Crusoe at 2025-01-13T11:05:55+01:00
fix patch fuzz
- - - - -
fc35f5c9 by Michael R. Crusoe at 2025-01-13T11:08:46+01:00
d/control: simplify architecture field using the architecture-properties package
- - - - -
2e1cc9ef by Michael R. Crusoe at 2025-01-13T11:12:38+01:00
routine-update: Ready to upload to unstable
- - - - -
28 changed files:
- .clang-tidy
- README.md
- debian/changelog
- debian/control
- debian/patches/find_object_files_at_right_loc.patch
- + debian/patches/python3
- debian/patches/series
- debian/patches/use_debian_packaged_libs.patch
- − debian/python3-btllib.install
- debian/rules
- + debian/tests/control
- + debian/tests/run-unit-test
- debian/upstream/metadata
- include/btllib/counting_bloom_filter-inl.hpp
- include/btllib/counting_bloom_filter.hpp
- include/btllib/mi_bloom_filter-inl.hpp
- include/btllib/nthash_kmer.hpp
- include/btllib/util.hpp
- meson.build
- recipes/mi_bloom_filter.cpp
- src/btllib/status.cpp
- src/btllib/util.cpp
- tests/counting_bloom_filter.cpp
- tests/large.bam
- tests/python/test_calc_phred_avg.py
- tests/util.cpp
- wrappers/python/btllib.py
- wrappers/python/btllib_wrap.cxx
Changes:
=====================================
.clang-tidy
=====================================
@@ -54,6 +54,24 @@
-altera-id-dependent-backward-branch,
-clang-diagnostic-unused-command-line-argument,
-clang-diagnostic-unneeded-internal-declaration,
+ -readability-convert-member-functions-to-static,
+ -misc-unused-parameters,
+ -misc-include-cleaner,
+ -performance-avoid-endl,
+ -misc-use-anonymous-namespace,
+ -performance-enum-size,
+ -cppcoreguidelines-avoid-const-or-ref-data-members,
+ -cppcoreguidelines-noexcept-move-operations,
+ -clang-analyzer-core.UndefinedBinaryOperatorResult,
+ -misc-header-include-cycle,
+ -misc-const-correctness,
+ -hicpp-noexcept-move,
+ -cppcoreguidelines-avoid-do-while,
+ -readability-static-accessed-through-instance,
+ -performance-noexcept-move-constructor,
+ -hicpp-use-emplace,
+ -readability-avoid-nested-conditional-operator,
+ -modernize-use-emplace,
-readability-identifier-length',
WarningsAsErrors: '*',
CheckOptions: [
@@ -66,4 +84,4 @@
{ key: readability-identifier-naming.GlobalConstantPointerCase, value: UPPER_CASE },
{ key: readability-identifier-naming.ConstexprVariableCase, value: UPPER_CASE },
]
-}
\ No newline at end of file
+}
=====================================
README.md
=====================================
@@ -30,7 +30,8 @@ Using the library
---
- Run time dependencies:
* SAMtools for reading SAM, BAM, and CRAM files.
- * gzip, tar, pigz, bzip2, xz, lrzip, zip, and/or 7zip for compressing/decompressing files. Not all of these are necessary, only the ones whose compressions you'll be using.
+ * gzip, tar, pigz, bzip2, xz, lrzip, zip, and/or 7zip for compressing/decompressing files. Not all of these are necessary, only the ones whose compressions you'll be using.
+ * Note that lrzip is not available on the btllib conda osx-arm64 build
* wget for downloading sequences from a URL.
- Building C++ code (`$PREFIX` is the path where btllib is installed):
* Link your code with `$PREFIX/lib/libbtllib.a` (pass `-L $PREFIX/lib -l btllib` flags to the compiler).
@@ -69,7 +70,7 @@ For btllib developers
The following are all the available `ninja` commands which can be run within `build` directory:
- `ninja clang-format` formats the whitespace in code (requires clang-format 8+).
-- `ninja wrap` wraps C++ code for Python (requires SWIG 4.0+).
+- `ninja wrap` wraps C++ code for Python (requires SWIG ≥4.0 and <4.3).
- `ninja clang-tidy` runs clang-tidy on C++ code and makes sure it passes (requires clang-tidy 8+).
- `ninja` builds the tests and wrapper libraries / makes sure they compile.
- `ninja test` runs the tests.
@@ -85,6 +86,7 @@ Credits
- Components:
- [Hamid Mohamadi](https://github.com/mohamadi) and [Parham Kazemi](https://github.com/parham-k) for [ntHash](https://github.com/bcgsc/ntHash)
- [Justin Chu](https://github.com/JustinChu) for [MIBloomFilter](https://github.com/bcgsc/btl_bloomfilter)
+ - [Johnathan Wong](https://github.com/jwcodee) for [aaHash](https://github.com/bcgsc/btllib)
- Included dependencies:
- [Chase Geigle](https://github.com/skystrife) for [cpptoml](https://github.com/skystrife/cpptoml)
- Simon Gog, Timo Beller, Alistair Moffat, and Matthias Petri for [sdsl-lite](https://github.com/simongog/sdsl-lite)
=====================================
debian/changelog
=====================================
@@ -1,5 +1,7 @@
-btllib (1.7.0+dfsg-1) UNRELEASED; urgency=medium
+btllib (1.7.5+dfsg-1) unstable; urgency=medium
+ * Team upload.
+ [ Andreas Tille ]
* New upstream version
* Build-Depends: python3
* Ignore cmake_options in meson.build
@@ -12,7 +14,22 @@ btllib (1.7.0+dfsg-1) UNRELEASED; urgency=medium
* Add python3 package:
TODO: Create modules for different Python3.x versions
- -- Andreas Tille <tille at debian.org> Wed, 07 Feb 2024 08:00:48 +0100
+ [ Michael R. Crusoe ]
+ * d/rules: simplify
+ * d/upstream/metadata: added citation information
+ * debian/patches/find_object_files_at_right_loc.patch: don't enable
+ code coverage measurements.
+ * Fixed Python package.
+ * Disable the Python package so we can upgrade btllib first.
+ * Added autopkgtest using the example code.
+ * d/control: Update description of the -tools package to list all 3
+ programs.
+ * New upstream version
+ * Standards-Version: 4.7.0 (routine-update)
+ * d/control: simplify architecture field using the architecture-
+ properties package
+
+ -- Michael R. Crusoe <crusoe at debian.org> Mon, 13 Jan 2025 11:11:36 +0100
btllib (1.4.10+dfsg-1) unstable; urgency=medium
=====================================
debian/control
=====================================
@@ -4,24 +4,23 @@ Priority: optional
Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
Uploaders: Andreas Tille <tille at debian.org>
Build-Depends: debhelper-compat (= 13),
+ architecture-is-64-bit,
dh-exec,
- dh-sequence-python3,
meson,
ninja-build,
libcpptoml-dev,
libsdsl-dev,
libomp-dev,
libargparse-dev,
- samtools,
- python3-dev
-Standards-Version: 4.6.2
+ samtools
+Standards-Version: 4.7.0
Vcs-Browser: https://salsa.debian.org/med-team/btllib
Vcs-Git: https://salsa.debian.org/med-team/btllib.git
Homepage: https://github.com/bcgsc/btllib
Rules-Requires-Root: no
Package: libbtllib-dev
-Architecture: any-amd64 arm64 loong64 mips64el ppc64el s390x ia64 ppc64 riscv64 sparc64 alpha
+Architecture: any
Section: libdevel
Depends: ${shlibs:Depends},
${misc:Depends},
@@ -44,14 +43,4 @@ Depends: ${shlibs:Depends},
Description: Bioinformatics Technology Lab common code library tools
Bioinformatics Technology Lab common code library in C++.
.
- This package contains the tool indexlr.
-
-Package: python3-btllib
-Architecture: any
-Section: python
-Depends: ${python3:Depends},
- ${shlibs:Depends},
- ${misc:Depends},
-Description: Bioinformatics Technology Lab common Python3 wrapper
- This package contains the Python3 wraper for Bioinformatics Technology
- Lab common code.
+ This package contains the tools indexlr, mi_bf_generate, and randseq
=====================================
debian/patches/find_object_files_at_right_loc.patch
=====================================
@@ -1,19 +1,20 @@
Author: Nilesh Patra
Last-Update: 2022-09-30 16:32:51 +0530
Description: Avoid useless cmake checks and trust the known locations in Debian
+Forwarded: not-needed
---- a/meson.build
-+++ b/meson.build
+--- btllib.orig/meson.build
++++ btllib/meson.build
@@ -1,7 +1,7 @@
project('btllib', 'cpp',
- version : '1.7.0',
+ version : '1.7.5',
license : 'GPL3',
- default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=true', 'b_coverage=true' ],
-+ default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=false', 'b_coverage=true' ],
++ default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=false', 'b_coverage=false' ],
meson_version : '>= 0.60.0')
# Configuration
-@@ -47,25 +47,13 @@ add_global_link_arguments(global_link_ar
+@@ -47,25 +47,13 @@
threads_dep = dependency('threads')
openmp_dep = dependency('openmp', required : false)
=====================================
debian/patches/python3
=====================================
@@ -0,0 +1,11 @@
+From: Michael R. Crusoe <crusoe at debian.org>
+Subject: force python3 to run the tests
+
+--- btllib.orig/scripts/test-wrappers
++++ btllib/scripts/test-wrappers
+@@ -12,4 +12,4 @@
+ cp "${MESON_SOURCE_ROOT}/wrappers/python/btllib.py" "${MESON_BUILD_ROOT}/wrappers/btllib/__init__.py"
+ export PYTHONPATH="${MESON_BUILD_ROOT}/wrappers/"
+ cd "${MESON_SOURCE_ROOT}/tests/python"
+-python -m unittest
++python3 -m unittest
=====================================
debian/patches/series
=====================================
@@ -1,3 +1,4 @@
find_object_files_at_right_loc.patch
# shared+static_lib.patch
use_debian_packaged_libs.patch
+python3
=====================================
debian/patches/use_debian_packaged_libs.patch
=====================================
@@ -1,6 +1,7 @@
Author: Andreas Tille <tille at debian.org>
Last-Update: Wed, 05 Oct 2022 12:57:29 +0200
Description: There is nothing to copy if libcpptoml-dev package is used
+Forwarded: not-needed
--- a/scripts/install-cpptoml
+++ b/scripts/install-cpptoml
=====================================
debian/python3-btllib.install deleted
=====================================
@@ -1 +0,0 @@
-usr/lib/btllib/python/btllib/* usr/lib/python3/btllib
=====================================
debian/rules
=====================================
@@ -7,10 +7,5 @@ export DEB_BUILD_MAINT_OPTIONS=hardening=+all
%:
dh $@
-override_dh_auto_test:
+execute_before_dh_auto_test:
cp -a tests/ obj-*/
- dh_auto_test
-
-override_dh_missing:
- find debian -name setup.py -delete
- dh_missing
=====================================
debian/tests/control
=====================================
@@ -0,0 +1,3 @@
+Tests: run-unit-test
+Depends: @, g++
+Restrictions: allow-stderr
=====================================
debian/tests/run-unit-test
=====================================
@@ -0,0 +1,17 @@
+#!/bin/bash
+set -e
+
+pkg=btllib
+
+export LC_ALL=C.UTF-8
+if [ "${AUTOPKGTEST_TMP}" = "" ] ; then
+ AUTOPKGTEST_TMP=$(mktemp -d /tmp/${pkg}-test.XXXXXX)
+ # shellcheck disable=SC2064
+ trap "rm -rf ${AUTOPKGTEST_TMP}" 0 INT QUIT ABRT PIPE TERM
+fi
+
+cp -a examples/* "${AUTOPKGTEST_TMP}"
+
+cd "${AUTOPKGTEST_TMP}"
+
+g++ nthash_spacedseeds.cpp -std=c++17 -lbtllib -fopenmp && ./a.out
=====================================
debian/upstream/metadata
=====================================
@@ -1,3 +1,14 @@
+Reference:
+ Author: Vladimir Nikolić and Parham Kazemi and Lauren Coombe and Johnathan Wong and Amirhossein Afshinfard and Justin Chu and René L. Warren and Inanç Birol
+ Title: |
+ btllib: A C++ library with Python interface for efficient genomic sequence processing
+ Journal: Journal of Open Source Software
+ Year: 2022
+ Volume: 7
+ Number: 79
+ DOI: 10.21105/joss.04720
+ URL: https://joss.theoj.org/papers/10.21105/joss.04720
+ Eprint: https://www.theoj.org/joss-papers/joss.04720/10.21105.joss.04720.pdf
Bug-Database: https://github.com/bcgsc/btllib/issues
Bug-Submit: https://github.com/bcgsc/btllib/issues/new
Registry:
=====================================
include/btllib/counting_bloom_filter-inl.hpp
=====================================
@@ -56,13 +56,12 @@ inline CountingBloomFilter<T>::CountingBloomFilter(size_t bytes,
*/
template<typename T>
inline void
-CountingBloomFilter<T>::insert(const uint64_t* hashes, T min_val)
+CountingBloomFilter<T>::set(const uint64_t* hashes, T min_val, T new_val)
{
// Update flag to track if increment is done on at least one counter
bool update_done = false;
- T new_val, tmp_min_val;
+ T tmp_min_val;
while (true) {
- new_val = min_val + 1;
for (size_t i = 0; i < hash_num; ++i) {
tmp_min_val = min_val;
update_done |= array[hashes[i] % array_size].compare_exchange_strong(
@@ -80,59 +79,25 @@ CountingBloomFilter<T>::insert(const uint64_t* hashes, T min_val)
template<typename T>
inline void
-CountingBloomFilter<T>::insert(const uint64_t* hashes)
+CountingBloomFilter<T>::insert(const uint64_t* hashes, T n)
{
- contains_insert(hashes);
+ contains_insert(hashes, n);
}
template<typename T>
inline void
CountingBloomFilter<T>::remove(const uint64_t* hashes)
{
- // Update flag to track if increment is done on at least one counter
- bool update_done = false;
T min_val = contains(hashes);
- T new_val, tmp_min_val;
- while (true) {
- new_val = min_val - 1;
- for (size_t i = 0; i < hash_num; ++i) {
- tmp_min_val = min_val;
- update_done |= array[hashes[i] % array_size].compare_exchange_strong(
- tmp_min_val, new_val);
- }
- if (update_done) {
- break;
- }
- min_val = contains(hashes);
- if (min_val == std::numeric_limits<T>::max()) {
- break;
- }
- }
+ set(hashes, min_val, min_val > 1 ? min_val - 1 : 0);
}
template<typename T>
void
CountingBloomFilter<T>::clear(const uint64_t* hashes)
{
- // Update flag to track if increment is done on at least one counter
- bool update_done = false;
T min_val = contains(hashes);
- T new_val, tmp_min_val;
- while (true) {
- new_val = 0;
- for (size_t i = 0; i < hash_num; ++i) {
- tmp_min_val = min_val;
- update_done |= array[hashes[i] % array_size].compare_exchange_strong(
- tmp_min_val, new_val);
- }
- if (update_done) {
- break;
- }
- min_val = contains(hashes);
- if (min_val == std::numeric_limits<T>::max()) {
- break;
- }
- }
+ set(hashes, min_val, 0);
}
template<typename T>
@@ -151,23 +116,23 @@ CountingBloomFilter<T>::contains(const uint64_t* hashes) const
template<typename T>
inline T
-CountingBloomFilter<T>::contains_insert(const uint64_t* hashes)
+CountingBloomFilter<T>::contains_insert(const uint64_t* hashes, T n)
{
const auto count = contains(hashes);
- if (count < std::numeric_limits<T>::max()) {
- insert(hashes, count);
+ if (count <= std::numeric_limits<T>::max() - n) {
+ set(hashes, count, count + n);
}
return count;
}
template<typename T>
inline T
-CountingBloomFilter<T>::insert_contains(const uint64_t* hashes)
+CountingBloomFilter<T>::insert_contains(const uint64_t* hashes, T n)
{
const auto count = contains(hashes);
- if (count < std::numeric_limits<T>::max()) {
- insert(hashes, count);
- return count + 1;
+ if (count <= std::numeric_limits<T>::max() - n) {
+ set(hashes, count, count + n);
+ return count + n;
}
return std::numeric_limits<T>::max();
}
@@ -179,7 +144,7 @@ CountingBloomFilter<T>::insert_thresh_contains(const uint64_t* hashes,
{
const auto count = contains(hashes);
if (count < threshold) {
- insert(hashes, count);
+ set(hashes, count, count + 1);
return count + 1;
}
return count;
@@ -192,7 +157,7 @@ CountingBloomFilter<T>::contains_insert_thresh(const uint64_t* hashes,
{
const auto count = contains(hashes);
if (count < threshold) {
- insert(hashes, count);
+ set(hashes, count, count + 1);
}
return count;
}
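
The hunks above route every counter update through one compare-and-swap helper and guard increments against overflow. The following is a minimal, self-contained sketch of that pattern, not btllib's implementation: the class name TinyCountingFilter is hypothetical, its set() makes a single pass without the retry loop, and the guard mirrors the `count <= max() - n` check used by contains_insert.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for a counting Bloom filter; NOT btllib's class.
template<typename T>
class TinyCountingFilter
{
public:
  // Value-initialised atomics start at zero.
  explicit TinyCountingFilter(size_t size)
    : array(size)
  {
  }

  // The count of an element is the minimum over its counters.
  T contains(const std::vector<uint64_t>& hashes) const
  {
    T min_val = std::numeric_limits<T>::max();
    for (const auto h : hashes) {
      const T v = array[h % array.size()].load();
      if (v < min_val) {
        min_val = v;
      }
    }
    return min_val;
  }

  // Add n only if the result still fits in T; otherwise leave the counters
  // untouched and report the saturated value.
  T insert_contains(const std::vector<uint64_t>& hashes, T n = 1)
  {
    const T count = contains(hashes);
    if (count <= std::numeric_limits<T>::max() - n) {
      set(hashes, count, static_cast<T>(count + n));
      return static_cast<T>(count + n);
    }
    return std::numeric_limits<T>::max();
  }

private:
  // Simplified: one pass, no retry. Only counters that still hold min_val
  // are replaced, so counters already above the minimum are left alone.
  void set(const std::vector<uint64_t>& hashes, T min_val, T new_val)
  {
    for (const auto h : hashes) {
      T expected = min_val;
      array[h % array.size()].compare_exchange_strong(expected, new_val);
    }
  }

  std::vector<std::atomic<T>> array;
};
```

With an 8-bit counter, inserting 2 and then 5 yields a count of 7, while a further insert of 250 is refused because 7 + 250 would not fit. Writing the guard as `max() + n` instead would always pass after integer promotion and let the stored value wrap around.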
=====================================
include/btllib/counting_bloom_filter.hpp
=====================================
@@ -75,15 +75,20 @@ public:
*
* @param hashes Integer array of the element's hash values. Array size should
* equal the hash_num argument used when the Bloom filter was constructed.
+ * @param n Increment value
*/
- void insert(const uint64_t* hashes);
+ void insert(const uint64_t* hashes, T n = 1);
/**
* Insert an element.
*
* @param hashes Integer vector of the element's hash values.
+ * @param n Increment value
*/
- void insert(const std::vector<uint64_t>& hashes) { insert(hashes.data()); }
+ void insert(const std::vector<uint64_t>& hashes, T n = 1)
+ {
+ insert(hashes.data(), n);
+ }
/**
* Delete an element.
@@ -142,21 +147,23 @@ public:
*
* @param hashes Integer array of the element's hash values. Array size should
* equal the hash_num argument used when the Bloom filter was constructed.
+ * @param n Increment value
*
* @return The count of the queried element before insertion.
*/
- T contains_insert(const uint64_t* hashes);
+ T contains_insert(const uint64_t* hashes, T n = 1);
/**
* Get the count of an element and then increment the count.
*
* @param hashes Integer vector of the element's hash values.
+ * @param n Increment value
*
* @return The count of the queried element before insertion.
*/
- T contains_insert(const std::vector<uint64_t>& hashes)
+ T contains_insert(const std::vector<uint64_t>& hashes, T n = 1)
{
- return contains_insert(hashes.data());
+ return contains_insert(hashes.data(), n);
}
/**
@@ -165,21 +172,23 @@ public:
* @param hashes Integer array of the element's hash values. Array size
* should equal the hash_num argument used when the Bloom filter was
* constructed.
+ * @param n Increment value
*
* @return The count of the queried element after insertion.
*/
- T insert_contains(const uint64_t* hashes);
+ T insert_contains(const uint64_t* hashes, T n = 1);
/**
* Increment an element's count and then return the count.
*
* @param hashes Integer vector of the element's hash values.
+ * @param n Increment value
*
* @return The count of the queried element after insertion.
*/
- T insert_contains(const std::vector<uint64_t>& hashes)
+ T insert_contains(const std::vector<uint64_t>& hashes, T n = 1)
{
- return insert_contains(hashes.data());
+ return insert_contains(hashes.data(), n);
}
/**
@@ -280,7 +289,7 @@ public:
private:
CountingBloomFilter(const std::shared_ptr<BloomFilterInitializer>& bfi);
- void insert(const uint64_t* hashes, T min_val);
+ void set(const uint64_t* hashes, T min_val, T new_val);
friend class KmerCountingBloomFilter<T>;
@@ -346,17 +355,22 @@ public:
*
* @param hashes Integer array of the k-mer's hash values. Array size should
* equal the hash_num argument used when the Bloom filter was constructed.
+ * @param n Increment value
*/
- void insert(const uint64_t* hashes) { counting_bloom_filter.insert(hashes); }
+ void insert(const uint64_t* hashes, T n = 1)
+ {
+ counting_bloom_filter.insert(hashes, n);
+ }
/**
* Insert a k-mer into the filter.
*
* @param hashes Integer vector of the k-mer's hash values.
+ * @param n Increment value
*/
- void insert(const std::vector<uint64_t>& hashes)
+ void insert(const std::vector<uint64_t>& hashes, T n = 1)
{
- counting_bloom_filter.insert(hashes);
+ counting_bloom_filter.insert(hashes, n);
}
/**
@@ -499,24 +513,26 @@ public:
*
* @param hashes Integer array of the k-mers's hash values. Array size should
* equal the hash_num argument used when the Bloom filter was constructed.
+ * @param n Increment value
*
* @return The count of the queried k-mer before insertion.
*/
- T contains_insert(const uint64_t* hashes)
+ T contains_insert(const uint64_t* hashes, T n = 1)
{
- return counting_bloom_filter.contains_insert(hashes);
+ return counting_bloom_filter.contains_insert(hashes, n);
}
/**
* Get the count of a k-mer and then increment the count.
*
* @param hashes Integer vector of the k-mer's hash values.
+ * @param n Increment value
*
* @return The count of the queried k-mer before insertion.
*/
- T contains_insert(const std::vector<uint64_t>& hashes)
+ T contains_insert(const std::vector<uint64_t>& hashes, T n = 1)
{
- return counting_bloom_filter.contains_insert(hashes);
+ return counting_bloom_filter.contains_insert(hashes, n);
}
/**
@@ -547,24 +563,26 @@ public:
* @param hashes Integer array of the k-mer's hash values. Array size
* should equal the hash_num argument used when the Bloom filter was
* constructed.
+ * @param n Increment value
*
* @return The count of the queried k-mer after insertion.
*/
- T insert_contains(const uint64_t* hashes)
+ T insert_contains(const uint64_t* hashes, T n = 1)
{
- return counting_bloom_filter.insert_contains(hashes);
+ return counting_bloom_filter.insert_contains(hashes, n);
}
/**
* Increment a k-mer's count and then return the count.
*
* @param hashes Integer vector of the k-mer's hash values.
+ * @param n Increment value
*
* @return The count of the queried k-mer after insertion.
*/
- T insert_contains(const std::vector<uint64_t>& hashes)
+ T insert_contains(const std::vector<uint64_t>& hashes, T n = 1)
{
- return counting_bloom_filter.insert_contains(hashes);
+ return counting_bloom_filter.insert_contains(hashes, n);
}
/**
=====================================
include/btllib/mi_bloom_filter-inl.hpp
=====================================
@@ -190,7 +190,7 @@ MIBloomFilter<T>::insert_id(const uint64_t* hashes, const T& id)
{
assert(bv_insertion_completed && !id_insertion_completed);
- uint rand = std::rand(); // NOLINT
+ uint32_t rand = std::rand(); // NOLINT
for (unsigned i = 0; i < hash_num; ++i) {
uint64_t rank = get_rank_pos(hashes[i]);
uint16_t count = ++counts_array[rank];
=====================================
include/btllib/nthash_kmer.hpp
=====================================
@@ -493,7 +493,7 @@ private:
bool has_n = true;
while (pos <= seq_len - k + 1 && has_n) {
has_n = false;
- for (unsigned i = 0; i < k; i++) {
+ for (unsigned i = 0; i < k && pos <= seq_len - k + 1; i++) {
if (SEED_TAB[(unsigned char)seq[pos + k - i - 1]] == SEED_N) {
pos += k - i;
has_n = true;
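
The one-line change above adds a bounds check inside the scan, since `pos` can jump forward while the inner loop is still running. A simplified sketch of the same idea follows, under stated assumptions: the helper name first_kmer_without_n is hypothetical, it compares against an uppercase 'N' instead of consulting SEED_TAB, and it breaks out of the inner loop after each jump rather than continuing as the upstream loop appears to.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of btllib: returns the start of the first
// k-mer at or after pos that contains no 'N', or std::string::npos.
inline size_t
first_kmer_without_n(const std::string& seq, size_t k, size_t pos = 0)
{
  if (k == 0 || seq.size() < k) {
    return std::string::npos;
  }
  // The window bound is re-checked after every jump, so seq[pos + k - i - 1]
  // is never read past the end of the sequence.
  while (pos + k <= seq.size()) {
    bool has_n = false;
    // Scan right to left so that one jump skips as far as possible.
    for (size_t i = 0; i < k; ++i) {
      if (seq[pos + k - i - 1] == 'N') {
        pos += k - i; // first position after the offending 'N'
        has_n = true;
        break;
      }
    }
    if (!has_n) {
      return pos;
    }
  }
  return std::string::npos;
}
```

For example, with k = 4 the sequence "ACNNACGTA" yields position 4, and "ACGN" yields npos because the jump past the final 'N' leaves no room for another window.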
=====================================
include/btllib/util.hpp
=====================================
@@ -111,6 +111,19 @@ get_basename(const std::string& path);
std::string
get_dirname(const std::string& path);
+/**
+ * Calculate the sum of the phred scores of a string.
+ *
+ * @param qual The quality string to calculate the sum from.
+ * @param start_pos The start position of the substring. Defaults to 0.
+ * @param len The length of the substring. Defaults to 0. If 0, the whole string
+ * is used.
+ *
+ * @return The sum of the phred scores of the substring.
+ */
+double
+sum_phred(const std::string& qual, size_t start_pos = 0, size_t len = 0);
+
/**
* Calculate the average phred score of a string,
* depending on the start position and length.
=====================================
meson.build
=====================================
@@ -1,5 +1,5 @@
project('btllib', 'cpp',
- version : '1.7.0',
+ version : '1.7.5',
license : 'GPL3',
default_options : [ 'cpp_std=c++17', 'warning_level=3', 'werror=true', 'b_coverage=true' ],
meson_version : '>= 0.60.0')
=====================================
recipes/mi_bloom_filter.cpp
=====================================
@@ -252,15 +252,15 @@ main(int argc, char* argv[])
btllib::SeqReader::Flag::SHORT_MODE,
DEFAULT_SEQ_READER_THREADS);
#pragma omp parallel default(none) shared(ids, \
- id_counter, \
- mi_bf, \
- mi_bf_stage, \
- hash_num, \
- kmer_size, \
- by_file, \
- spaced_seed_set, \
- spaced_seeds, \
- reader)
+ id_counter, \
+ mi_bf, \
+ mi_bf_stage, \
+ hash_num, \
+ kmer_size, \
+ by_file, \
+ spaced_seed_set, \
+ spaced_seeds, \
+ reader)
try {
for (const auto record : reader) {
#pragma omp critical
=====================================
src/btllib/status.cpp
=====================================
@@ -31,25 +31,25 @@ get_time()
void
log_info(const std::string& msg)
{
- std::cerr << ('[' + get_time() + "]" + PRINT_COLOR_INFO + "[INFO] " +
- PRINT_COLOR_END + msg + '\n')
- << std::flush;
+ std::string info_msg = "[" + get_time() + "]" + PRINT_COLOR_INFO + "[INFO] " +
+ PRINT_COLOR_END + msg;
+ std::cerr << info_msg << std::endl;
}
void
log_warning(const std::string& msg)
{
- std::cerr << ('[' + get_time() + "]" + PRINT_COLOR_WARNING + "[WARNING] " +
- PRINT_COLOR_END + msg + '\n')
- << std::flush;
+ std::string warning_msg = "[" + get_time() + "]" + PRINT_COLOR_WARNING +
+ "[WARNING] " + PRINT_COLOR_END + msg;
+ std::cerr << warning_msg << std::endl;
}
void
log_error(const std::string& msg)
{
- std::cerr << ('[' + get_time() + "]" + PRINT_COLOR_ERROR + "[ERROR] " +
- PRINT_COLOR_END + msg + '\n')
- << std::flush;
+ std::string error_msg = "[" + get_time() + "]" + PRINT_COLOR_ERROR +
+ "[ERROR] " + PRINT_COLOR_END + msg;
+ std::cerr << error_msg << std::endl;
}
void
@@ -111,4 +111,4 @@ check_file_accessibility(const std::string& filepath)
btllib::check_error(ret != 0, get_strerror() + ": " + filepath);
}
-} // namespace btllib
\ No newline at end of file
+} // namespace btllib
=====================================
src/btllib/util.cpp
=====================================
@@ -3,7 +3,9 @@
#include "btllib/status.hpp"
#include <algorithm>
+#include <cmath>
#include <condition_variable>
+#include <cstdlib>
#include <cstring>
#include <mutex>
#include <string>
@@ -158,6 +160,23 @@ get_basename(const std::string& path)
return path.substr(p + 1);
}
+double
+sum_phred(const std::string& qual, const size_t start_pos, size_t len)
+{
+ double phred_sum = 0;
+ static constexpr double PHRED_OFFSET = 33.0;
+ for (size_t i = start_pos; i < start_pos + len; ++i) {
+ // Convert ASCII character to Phred score
+ int phred_score = (int)(qual.at(i) - PHRED_OFFSET);
+
+ // Delog the Phred score: 10^(-Q/10)
+ double delog_phred = pow(10.0, -phred_score / 10.0);
+
+ phred_sum += delog_phred;
+ }
+ return phred_sum;
+}
+
double
calc_phred_avg(const std::string& qual, const size_t start_pos, size_t len)
{
@@ -170,14 +189,9 @@ calc_phred_avg(const std::string& qual, const size_t start_pos, size_t len)
std::exit(EXIT_FAILURE); // NOLINT(concurrency-mt-unsafe)
}
- size_t phred_sum = 0;
-
- for (size_t i = start_pos; i < start_pos + len; ++i) {
- phred_sum += (size_t)qual.at(i);
- }
+ double phred_sum = sum_phred(qual, start_pos, len);
- static constexpr double PHRED_OFFSET = 33.0;
- return ((double)phred_sum / (double)len) - PHRED_OFFSET;
+ return -10 * log10(phred_sum / len);
}
void
@@ -194,4 +208,4 @@ Barrier::wait()
}
}
-} // namespace btllib
\ No newline at end of file
+} // namespace btllib
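
The change to calc_phred_avg above replaces the arithmetic mean of the Phred scores with the Phred score of the mean error probability, which is why the expected values in the tests shift. A small standalone sketch contrasting the two follows; the function names are hypothetical, Phred+33 encoding is assumed as in the diff, and the input is assumed to be non-empty.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Assumed Phred+33 encoding, matching PHRED_OFFSET in the diff.
constexpr double PHRED_OFFSET = 33.0;

// Old behaviour: arithmetic mean of the Phred scores themselves.
inline double
mean_of_scores(const std::string& qual)
{
  double sum = 0.0;
  for (const char c : qual) {
    sum += c - PHRED_OFFSET;
  }
  return sum / static_cast<double>(qual.size());
}

// New behaviour: convert each score Q to an error probability 10^(-Q/10),
// average the probabilities, then convert the mean back to a Phred score.
inline double
mean_via_error_prob(const std::string& qual)
{
  double prob_sum = 0.0;
  for (const char c : qual) {
    prob_sum += std::pow(10.0, -(c - PHRED_OFFSET) / 10.0);
  }
  return -10.0 * std::log10(prob_sum / static_cast<double>(qual.size()));
}
```

For the first four characters of the test string, "$$%%" (scores 3, 3, 4, 4), the first function gives 3.5 and the second gives about 3.47128, matching the old and new expectations for calc_phred_avg(qual, 0, 4). The probability-based mean is pulled toward the low-quality bases, so it is never larger than the plain mean.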
=====================================
tests/counting_bloom_filter.cpp
=====================================
@@ -215,5 +215,19 @@ main()
TEST_ASSERT_EQ(cbf.contains(hashes), 0);
}
+ {
+ std::cerr << "Testing CBF element initialization" << std::endl;
+ std::vector<uint64_t> hashes = { 0x47c80ef7eab,
+ 0x8b4a469ef6,
+ 0x32e7ab5203 };
+ btllib::CountingBloomFilter8 cbf(64, hashes.size());
+ cbf.insert(hashes, 2);
+ TEST_ASSERT_EQ(cbf.contains(hashes), 2);
+ cbf.insert(hashes, 5);
+ TEST_ASSERT_EQ(cbf.contains(hashes), 7);
+ cbf.insert(hashes);
+ TEST_ASSERT_EQ(cbf.contains(hashes), 8);
+ }
+
return 0;
}
\ No newline at end of file
=====================================
tests/large.bam
=====================================
Binary files a/tests/large.bam and b/tests/large.bam differ
=====================================
tests/python/test_calc_phred_avg.py
=====================================
@@ -7,9 +7,9 @@ class CalcPhredAvgTests(unittest.TestCase):
def test_calc_phred_avg(self):
qual = "$$%%)*0)'%%&$$%&$&'''*)(((((()55561--.12356577-++**++,////.*))((()+))**010/..--+**++*+++)++++78883"
self.assertAlmostEqual(
- btllib.calc_phred_avg(qual, 0, 10), 6.4, places=3)
- self.assertAlmostEqual(btllib.calc_phred_avg(qual), 10.949, places=3)
+ btllib.calc_phred_avg(qual, 0, 10), 5.34264, places=3)
+ self.assertAlmostEqual(btllib.calc_phred_avg(qual), 8.54327, places=3)
self.assertAlmostEqual(
- btllib.calc_phred_avg(qual, 0, 4), 3.5, places=3)
+ btllib.calc_phred_avg(qual, 0, 4), 3.47128, places=3)
self.assertAlmostEqual(
- btllib.calc_phred_avg(qual, 5, 20), 6.15, places=3)
+ btllib.calc_phred_avg(qual, 5, 20), 5.48923, places=3)
=====================================
tests/util.cpp
=====================================
@@ -32,10 +32,10 @@ main()
double avg1 = btllib::calc_phred_avg(qual);
double avg2 = btllib::calc_phred_avg(qual, 0, 4);
double avg3 = btllib::calc_phred_avg(qual, 5, 20);
- TEST_ASSERT_LT(std::abs(avg - 6.4), 1e-4);
- TEST_ASSERT_LT(std::abs(avg1 - 10.949), 1e-4);
- TEST_ASSERT_LT(std::abs(avg2 - 3.5), 1e-4);
- TEST_ASSERT_LT(std::abs(avg3 - 6.15), 1e-4);
+ TEST_ASSERT_LT(std::abs(avg - 5.34264), 1e-4);
+ TEST_ASSERT_LT(std::abs(avg1 - 8.54327), 1e-4);
+ TEST_ASSERT_LT(std::abs(avg2 - 3.47128), 1e-4);
+ TEST_ASSERT_LT(std::abs(avg3 - 5.48923), 1e-4);
return 0;
-}
\ No newline at end of file
+}
=====================================
wrappers/python/btllib.py
=====================================
@@ -1,5 +1,5 @@
# This file was automatically generated by SWIG (https://www.swig.org).
-# Version 4.1.1
+# Version 4.2.1
#
# Do not make changes to this file unless you know what you are doing - modify
# the SWIG interface file instead.
=====================================
wrappers/python/btllib_wrap.cxx
=====================================
The diff for this file was not included because it is too large.
View it on GitLab: https://salsa.debian.org/med-team/btllib/-/compare/d88b1e044bedcaf1fa7ffd67dd4c758d76cfa905...2e1cc9ef3acac3291e689427c1df23447286c135