[med-svn] [Git][med-team/axe-demultiplexer][master] 11 commits: New upstream version 0.3.3+dfsg

Andreas Tille gitlab at salsa.debian.org
Thu Apr 26 14:31:38 BST 2018


Andreas Tille pushed to branch master at Debian Med / axe-demultiplexer


Commits:
9e85e528 by Andreas Tille at 2018-04-26T15:06:49+02:00
New upstream version 0.3.3+dfsg
- - - - -
6f5ac38c by Andreas Tille at 2018-04-26T15:06:50+02:00
Update upstream source from tag 'upstream/0.3.3+dfsg'

Update to upstream version '0.3.3+dfsg'
with Debian dir 1bcda52bf21fd2914fa385704d70c0cea762fa2b
- - - - -
8dfb0bf0 by Andreas Tille at 2018-04-26T15:06:50+02:00
New upstream version

- - - - -
c6799521 by Andreas Tille at 2018-04-26T15:06:57+02:00
Point Vcs fields to salsa.debian.org

- - - - -
1e3ae3f0 by Andreas Tille at 2018-04-26T15:06:57+02:00
Standards-Version: 4.1.4

- - - - -
1ecfcdd3 by Andreas Tille at 2018-04-26T15:06:57+02:00
debhelper 11

- - - - -
c58c04bb by Andreas Tille at 2018-04-26T15:08:57+02:00
Adapt patches

- - - - -
5d459cdc by Andreas Tille at 2018-04-26T15:12:26+02:00
Build-Depends: s/python-sphinx/python3-sphinx/

- - - - -
9d642430 by Andreas Tille at 2018-04-26T15:14:51+02:00
Do not parse d/changelog

- - - - -
3d4ae943 by Andreas Tille at 2018-04-26T15:28:20+02:00
Enable running tests using Python3 instead of Python2

- - - - -
dc322c5b by Andreas Tille at 2018-04-26T15:31:10+02:00
Upload to unstable

- - - - -


21 changed files:

- README.md
- − TODO.md
- debian/changelog
- debian/compat
- debian/control
- debian/patches/0003-Use-standard-include-path-of-gsl-headers.patch
- + debian/patches/0005-do_not_depend_python2.patch
- debian/patches/series
- debian/rules
- docs/CMakeLists.txt
- docs/algorithm.rst
- docs/conf.py
- docs/gitversion.py
- docs/index.rst
- + docs/tutorial.rst
- docs/usage.rst
- src/axe.c
- src/axe.h
- src/axe_config.h.in
- src/main.c
- tests/CMakeLists.txt


Changes:

=====================================
README.md
=====================================
--- a/README.md
+++ b/README.md
@@ -20,7 +20,7 @@ existing de-multiplexers. Unscientific trials show AXE processes more than
 **Warning**: Axe has not yet been comprehensively tested. However, in the
 spirit of "release early and often", here it is.
 
-###Tests:
+### Tests:
 
 | TravisCI          | [![Build Status](https://travis-ci.org/kdmurray91/axe.svg?branch=master)](https://travis-ci.org/kdmurray91/axe) |
 | ----------------- | --- |
@@ -97,23 +97,11 @@ Full documentation, including a basic description of the algorithm, is hosted
 at https://axe-demultiplexer.readthedocs.org/en/latest/ .
 
 
-Implementation Progress:
-------------------------
-
- - [x] Single ended read de-multiplexing
- - [x] Interleaved/Paired input and output with single-ended de-multiplexing
- - [x] Combinatorial de-multiplexing
- - [ ] CLI integration tests
- - [ ] Comprehensive `libaxe` tests
- - [ ] Comprehensive CLI tests
-
-See also TODO.md
-
 
 Publication
 -----------
 
-A publication is coming soon, if the reviewer gods decide to smile upon us.
+We have a preprint describing AXE up at http://www.biorxiv.org/content/early/2017/07/07/160606
 
 Versioning
 ----------


=====================================
TODO.md deleted
=====================================
--- a/TODO.md
+++ /dev/null
@@ -1,9 +0,0 @@
-TODO list:
-==========
-
-- Re-read documentation; fix errors
-- Exhaustive CLI tests
-- Refactor trie lookups to hide all datrie operations from main body of code
-- Summary info
-- Valgrind integration tests - maybe a CLI flag to enable
-- Investigate shipping the gsl functions bundled in source.


=====================================
debian/changelog
=====================================
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,16 @@
+axe-demultiplexer (0.3.3+dfsg-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version
+  * Point Vcs fields to salsa.debian.org
+  * Standards-Version: 4.1.4
+  * debhelper 11
+  * Build-Depends: s/python-sphinx/python3-sphinx/
+  * Do not parse d/changelog
+  * Enable running tests using Python3 instead of Python2
+
+ -- Andreas Tille <tille at debian.org>  Thu, 26 Apr 2018 15:24:43 +0200
+
 axe-demultiplexer (0.3.2+dfsg1-1) unstable; urgency=medium
 
   [ Kevin Murray ]


=====================================
debian/compat
=====================================
--- a/debian/compat
+++ b/debian/compat
@@ -1 +1 @@
-10
+11


=====================================
debian/control
=====================================
--- a/debian/control
+++ b/debian/control
@@ -3,15 +3,15 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.
 Uploaders: Kevin Murray <kdmfoss at gmail.com>
 Section: science
 Priority: optional
-Standards-Version: 3.9.8
-Build-Depends: debhelper (>= 10),
+Build-Depends: debhelper (>= 11~),
                cmake,
                zlib1g-dev,
                libqes-dev,
                libgsl-dev,
-               python-sphinx,
-Vcs-Browser: https://anonscm.debian.org/git/debian-med/axe-demultiplexer.git
-Vcs-Git: https://anonscm.debian.org/git/debian-med/axe-demultiplexer.git
+               python3-sphinx
+Standards-Version: 4.1.4
+Vcs-Browser: https://salsa.debian.org/med-team/axe-demultiplexer
+Vcs-Git: https://salsa.debian.org/med-team/axe-demultiplexer.git
 Homepage: https://github.com/kdmurray91/axe
 
 Package: axe-demultiplexer


=====================================
debian/patches/0003-Use-standard-include-path-of-gsl-headers.patch
=====================================
--- a/debian/patches/0003-Use-standard-include-path-of-gsl-headers.patch
+++ b/debian/patches/0003-Use-standard-include-path-of-gsl-headers.patch
@@ -6,8 +6,6 @@ Subject: Use standard include path of gsl headers
  src/axe.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
-diff --git a/src/axe.c b/src/axe.c
-index 6390418..9326947 100644
 --- a/src/axe.c
 +++ b/src/axe.c
 @@ -23,7 +23,7 @@
@@ -16,6 +14,6 @@ index 6390418..9326947 100644
  #include "axe.h"
 -#include "gsl_combination.h"
 +#include <gsl/gsl_combination.h>
- 
- /* Holds the current timestamp, so we don't have to free the returned string
-  * from now(). */
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <unistd.h>


=====================================
debian/patches/0005-do_not_depend_python2.patch
=====================================
--- /dev/null
+++ b/debian/patches/0005-do_not_depend_python2.patch
@@ -0,0 +1,30 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Thu, 26 Apr 2018 15:14:56 +0200
+Description: Enable running tests using Python3 instead of Python2
+ When switching Build-Depends from python-sphinx to python3-sphinx
+ the Python 2 interpreter was not installed any more in the build
+ time environment.  This was used to trigger the build time test
+ suite.  By changing the script tp Python3 it is possible to get
+ rid of this dependency.
+
+--- a/tests/axe_cli_tests.py
++++ b/tests/axe_cli_tests.py
+@@ -1,5 +1,5 @@
+-#!/usr/bin/env python
+-from __future__ import print_function
++#!/usr/bin/env python3
++
+ import hashlib
+ import logging
+ import os
+--- a/tests/CMakeLists.txt
++++ b/tests/CMakeLists.txt
+@@ -19,7 +19,7 @@ ADD_TEST(NAME "UnitTests" COMMAND test_a
+ SET(COVERAGE_CMD test_axe)
+ SET(COVERAGE_OUT "${CMAKE_BINARY_DIR}/coverage_html")
+ 
+-ADD_TEST(NAME "IntegrationTests" COMMAND python
++ADD_TEST(NAME "IntegrationTests" COMMAND python3
+          ${CMAKE_BINARY_DIR}/bin/axe_cli_tests.py
+          ${CMAKE_BINARY_DIR})
+ 


=====================================
debian/patches/series
=====================================
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -2,3 +2,4 @@ cmake-version
 0002-Remove-libqes-from-CML.txt.patch
 0003-Use-standard-include-path-of-gsl-headers.patch
 0004-Dynamically-link-to-libqes.patch
+0005-do_not_depend_python2.patch


=====================================
debian/rules
=====================================
--- a/debian/rules
+++ b/debian/rules
@@ -2,12 +2,12 @@
 
 export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 
-DEBPKGNAME     := $(shell dpkg-parsechangelog | awk '/^Source:/ {print $$2}')
+include /usr/share/dpkg/default.mk
 
 %:
 	dh $@ --buildsystem=cmake
 
 override_dh_installdocs:
 	dh_installdocs
-	mkdir -p debian/$(DEBPKGNAME)/usr/share/doc/$(DEBPKGNAME)/tests
-	cp -a tests/data debian/$(DEBPKGNAME)/usr/share/doc/$(DEBPKGNAME)/tests
+	mkdir -p debian/$(DEB_SOURCE)/usr/share/doc/$(DEB_SOURCE)/tests
+	cp -a tests/data debian/$(DEB_SOURCE)/usr/share/doc/$(DEB_SOURCE)/tests


=====================================
docs/CMakeLists.txt
=====================================
--- a/docs/CMakeLists.txt
+++ b/docs/CMakeLists.txt
@@ -17,7 +17,7 @@ IF(SPHINXBUILD)
 	COMMAND ${CMAKE_COMMAND} -E remove_directory "${CMAKE_BINARY_DIR}/doc"
         WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
         )
-    ADD_CUSTOM_TARGET(doc ALL DEPENDS doc_man doc_onehtml)
+    ADD_CUSTOM_TARGET(doc ALL DEPENDS doc_man doc_html doc_onehtml)
     INSTALL(FILES ${CMAKE_BINARY_DIR}/doc/man/axe.1 DESTINATION "share/man/man1" RENAME "axe-demux.1")
     SET_DIRECTORY_PROPERTIES(PROPERTY ADDITIONAL_MAKE_CLEAN_FILES doc/)
 ELSE()


=====================================
docs/algorithm.rst
=====================================
--- a/docs/algorithm.rst
+++ b/docs/algorithm.rst
@@ -4,17 +4,17 @@ Axe's matching algorithm
 
 Axe uses an algorithm based on longest-prefix-in-trie matching to match a
 variable length from the start of each read against a set of 'mutated'
-barcodes.
+indexes.
 
 Hamming distance matching
 -------------------------
 
 While for most applications in high-throughput sequencing hamming distances are
-a frowned-upon metric, it is typical for HTS read barcodes to be designed to
+a frowned-upon metric, it is typical for HTS read indexes to be designed to
 tolerate a certain level of hamming mismatches. Given these sequences are short
 and typically occur at the 5' end of reads, insertions and deletions rarely
 need be considered, and the increased rate of assignment of reads with many
-errors is offset by the risk of falsely assigning barcodes to an incorrect
+errors is offset by the risk of falsely assigning indexes to an incorrect
 sample. In any case, reads with more than 1-2 sequencing errors in their first
 several bases are likely to be poor quality, and will simply be filtered out
 during downstream quality control.
@@ -22,47 +22,47 @@ during downstream quality control.
 Hamming mismatch tries
 ----------------------
 
-Typically, reads are matched to a set of barcodes by calculating the hamming
-distance between the barcode, and the first :math:`l` bases of a read for a
-barcode of length :math:`l`. The "correct" barcode is then selected by
-recording either the barcode with the lowest hamming distance to the read
-(competitive matching) or by simply accepting the first barcode with a hamming
+Typically, reads are matched to a set of indexes by calculating the hamming
+distance between the index, and the first :math:`l` bases of a read for a
+index of length :math:`l`. The "correct" index is then selected by
+recording either the index with the lowest hamming distance to the read
+(competitive matching) or by simply accepting the first index with a hamming
 distance below a certain threshold.  These approaches are both very
 computationally expensive, and can have lower accuracy than the algorithm I
-propose. Additionally, implementations of these methods rarely handle barcodes
-of differing length and combinatorial barcoding well, if at all.
+propose. Additionally, implementations of these methods rarely handle indexes
+of differing length and combinatorial indexing well, if at all.
 
 Central to Axe's algorithm is the concept of hamming-mismatch tries. A trie is
 a N-ary tree for an N letter alphabet. In the case of high-throughput
 sequencing reads, we have the alphabet ``AGCT``, corresponding to the four
 nucleotides of DNA, plus ``N``, used to represent ambiguous base calls. Instead
-of matching each barcode to each read, we pre-calculate all allowable sequences
+of matching each index to each read, we pre-calculate all allowable sequences
 at each mismatch level, and store these in level-wise tries. For  example, to
 match to a hamming distance of 2, we create three tries: One containing all
-barcodes, verbatim, and two tries where every sequence within a hamming
-distance of 1 and 2 of each barcode respectively. Hereafter, these tries are
+indexes, verbatim, and two tries where every sequence within a hamming
+distance of 1 and 2 of each index respectively. Hereafter, these tries are
 referred to  as the 0, 1 and 2-mm tries, for a hamming distance (mismatch) of
 0, 1 and 2. Then, we find the longest prefix in each sequence read in the 0mm
 trie. If this prefix is not a valid leaf in the 0mm trie, we find the longest
 prefix in the 1mm trie, and so on for all tries in ascending order. If no
 prefix of the read is a complete sequence in any trie, the read is assigned to
-an "non-barcoded" output file.
+an "non-indexed" output file.
 
-This algorithm ensures optimal barcode matching in many ways, but is also
-extremely fast. In situations with barcodes of differing length, we ensure that
-the *longest* acceptable barcode at a given hamming distance is chosen;
-assuming that sequence is random after the barcode, the probability of false
+This algorithm ensures optimal index matching in many ways, but is also
+extremely fast. In situations with indexes of differing length, we ensure that
+the *longest* acceptable index at a given hamming distance is chosen;
+assuming that sequence is random after the index, the probability of false
 assignments using this method is low. We also ensure that short perfect matches
-are preferred to longer inexact matches, as we firstly only consider barcodes
-with no error, then 1 error, and so on. This ensures that reads with barcodes
+are preferred to longer inexact matches, as we firstly only consider indexes
+with no error, then 1 error, and so on. This ensures that reads with indexes
 that are followed by random sequence that happens to inexactly match a longer
-barcode in the set are not falsely assigned to this longer barcode.
+index in the set are not falsely assigned to this longer index.
 
 The speed of this algorithm is largely due to the constant time matching
-algorithm with respect to the number of barcodes to match. The time taken to
-match each read is proportional instead to the length of the barcodes, as for a
-barcode of length :math:`l`, at most :math:`l + 1` trie level descents are
+algorithm with respect to the number of indexes to match. The time taken to
+match each read is proportional instead to the length of the indexes, as for a
+index of length :math:`l`, at most :math:`l + 1` trie level descents are
 required to find an entry in the trie. As this length is more-or-less constant
 and small, the overall complexity of axe's algorithm is :math:`O(n)` for
 :math:`n` reads, as opposed to :math:`O(nm)` for :math:`n` reads and :math:`m`
-barcodes as is typical for traditional matching algorithms
+indexes as is typical for traditional matching algorithms


=====================================
docs/conf.py
=====================================
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -31,7 +31,7 @@ import gitversion
 # ones.
 extensions = [
     'sphinx.ext.todo',
-    'sphinx.ext.pngmath',
+    'sphinx.ext.imgmath',
     'sphinx.ext.ifconfig',
 ]
 
@@ -188,10 +188,10 @@ htmlhelp_basename = 'axedoc'
 
 latex_elements = {
 # The paper size ('letterpaper' or 'a4paper').
-'papersize': 'a4paper',
+#'papersize': 'a4paper',
 
 # The font size ('10pt', '11pt' or '12pt').
-'pointsize': '11pt',
+#'pointsize': '11pt',
 
 # Additional stuff for the LaTeX preamble.
 #'preamble': '',


=====================================
docs/gitversion.py
=====================================
--- a/docs/gitversion.py
+++ b/docs/gitversion.py
@@ -8,8 +8,8 @@
 # versioneer-0.13 (https://github.com/warner/python-versioneer)
 
 # these strings will be replaced by git during git-archive
-git_refnames = " (HEAD -> master, tag: 0.3.2)"
-git_full = "0faf8e3e05ba45343357e0cfa6c9657cedee4cc9"
+git_refnames = " (tag: 0.3.3)"
+git_full = "4b985d2ac95b046e96f0a83cb98f1f5e31fa49bd"
 
 # these strings are filled in when 'setup.py versioneer' creates _version.py
 tag_prefix = ""


=====================================
docs/index.rst
=====================================
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -7,16 +7,17 @@ Welcome to axe's documentation!
 ===============================
 
 Axe is a read de-multiplexer, useful in situations where sequence reads contain
-the barcodes that uniquely distinguish samples. Axe uses a rapid and accurate
+the indexes that uniquely distinguish samples. Axe uses a rapid and accurate
 algorithm based on hamming mismatch tries to competitively match the prefix of
-a sequencing read against a set of barcodes. Axe supports combinatorial
-barcoding schemes.
+a sequencing read against a set of indexes. Axe supports combinatorial
+indexing schemes.
 
 Contents:
 
 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1
 
+   tutorial
    usage
    algorithm
 


=====================================
docs/tutorial.rst
=====================================
--- /dev/null
+++ b/docs/tutorial.rst
@@ -0,0 +1,74 @@
+************
+Axe Tutorial
+************
+
+In this tutorial, we'll use Axe to demultiplex some paired-end,
+combinatorially-index Genotyping-by-Sequencing reads. The data for this
+tutorial is available from figshare:
+https://figshare.com/articles/axe-tutorial_tar/6143720 .
+
+Axe should be run as the initial step of any analysis: don't use sequence QC
+tools like AdapterRemoval or Trimmomatic before using axe, as indexes may be
+trimmed away, or pairing information removed.
+
+Step 0: Download the trial data
+-------------------------------
+
+This will download the trial data, and extract it on the fly:
+
+.. code-block:: bash
+
+   curl -LS https://ndownloader.figshare.com/files/11094782 | tar xv
+
+Step 1: prepare a key file
+--------------------------
+
+The key file associates index sequences with sample names. A key file can be
+prepared in a spreadsheet editor, like LibreOffice Calc, or Excel. The format
+is quite strict, and is described in detail in the online usage documentation.
+
+Let's now inspect the keyfile I have provided for the tutorial.
+
+.. code-block:: bash
+
+   head axe-keyfile.tsv
+
+
+Step 2: Demultiplex with Axe
+----------------------------
+
+
+In this step, we will demultiplex our interleaved input file to per-sample
+interleaved output files. To see a full range of Axe's options, please run
+``axe-demux -h``, or inspect the online usage documentation.
+
+First, let's inspect the input.
+
+.. code-block:: bash
+
+   zcat axe-tutorial.fastq.gz | head -n 8
+
+Then, we need to ensure that axe has somewhere to put the demultiplexed reads.
+Axe outputs one file (or more, depending on pairing) per sample. Axe does so by
+appending the sample name to some prefix (as given by the ``-I``, ``-F``,
+and/or ``-R`` options). If this prefix is a directory, then sample fastq files
+will be created in that sub-directory, but the directory must exist. Let's make
+an output directory:
+
+.. code-block:: bash
+
+   mkdir -p output
+
+Now, let's demultiplex the reads!
+
+.. code-block:: bash
+
+   axe-demux -i axe-tutorial.fastq.gz -I output/ \
+      -c -b axe-keyfile.tsv -t demux-stats.tsv -z 1
+
+The command above demultiplexes reads from ``axe-tutorial.fastq.gz`` into
+separate files under ``output``, based on the combinatorial (``-c``)
+sample-to-index-sequence mapping described in ``axe-keyfile.tsv``, and saves a
+file of statistics as ``demux-stats.tsv``. Note that we have enabled
+compression of output files using the ``-z`` option, in case you don't have
+much disk space available. This will make Axe slightly slower.


=====================================
docs/usage.rst
=====================================
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -9,15 +9,16 @@ Axe Usage
   did not change.
 
 Axe has several usage modes. The primary distinction is between the two
-alternate barcoding schemes, single and combinatorial barcoding. Single barcode
-matching is used when only the first read contains barcode sequences.
-Combinatorial barcoding is used when both reads in a read pair contain
-independent (typically different) barcode sequences.
+alternate indexing schemes, single and combinatorial indexing. Single index
+matching is used when only the first read contains index sequences.
+Combinatorial indexing is used when both reads in a read pair contain
+independent (typically different) index sequences.
 
 For concise reference, the command-line usage of ``axe-demux`` is reproduced
 below:
 
 .. literalinclude:: usage.txt
+   :language: text
 
 Inputs and Outputs
 ------------------
@@ -37,8 +38,8 @@ default) and 9, where 0 indicates plain text output (``gzopen`` mode "wT"), and
 fastest and 9 is most compact.
 
 The output flags should be prefixes that are used to generate the output file
-name based on the barcode's (or barcode pair's) ID. The names are generated as:
-``prefix`` + ``_`` + ``barcode ID`` + ``_`` + ``read number`` + ``.extension``.
+name based on the index's (or index pair's) ID. The names are generated as:
+``prefix`` + ``_`` + ``index ID`` + ``_`` + ``read number`` + ``.extension``.
 The output file for reads that could not be demultiplexed is ``prefix`` + ``_``
 + ``unknown`` + ``_`` + ``read number`` + ``.extension``.  The read number is
 omitted unless the paired read file scheme is used, and is "il" for interleaved
@@ -51,65 +52,65 @@ The corresponding CLI flags are:
  - ``-r`` and ``-R``: Paired R2 file input and output.
  - ``-i`` and ``-I``: Interleaved paired input and output.
 
-The barcode file
+The index file
 ----------------
 
-The barcode file is a tab-separated file with an optional header. It is
+The index file is a tab-separated file with an optional header. It is
 mandatory, and is always supplied using the ``-b`` command line flag. The exact
-format is dependent on barcoding mode, and is described further in the sections
+format is dependent on indexing mode, and is described further in the sections
 below. If a header is present, the header line must start with either
-`Barcode` or ``barcode``, or it will be interpreted as a barcode line, leading
+`Barcode` or ``index``, or it will be interpreted as a index line, leading
 to a parsing error. Any line starting with ';' or '#' is ignored, allowing
-comments to be added in line with barcodes. Please ensure that the software
-used to produce the barcode uses ASCII encoding, and does not insert a
+comments to be added in line with indexes. Please ensure that the software
+used to produce the index uses ASCII encoding, and does not insert a
 Byte-order Mark (BoM) as many text editors can silently use Unicode-based
 encoding schemes. I recommend the use of
 `LibreOffice Calc <www.libreoffice.org>`_ (part of a free and open source
-office suite) to generate barcode tables; Microsoft Excel can also be used.
+office suite) to generate index tables; Microsoft Excel can also be used.
 
 Mismatch level selection
 ------------------------
 
-Independent of barcode mode, the ``-m`` flag is used to select the maximum
-allowable hamming distance between a read's prefix and a barcode to be
-considered as a match. As "mutated" barcodes must be unique, a hamming distance
-of one is the default as typically barcodes are designed to differ by a hamming
+Independent of index mode, the ``-m`` flag is used to select the maximum
+allowable hamming distance between a read's prefix and a index to be
+considered as a match. As "mutated" indexes must be unique, a hamming distance
+of one is the default as typically indexes are designed to differ by a hamming
 distance of at least two. Optionally, (using the ``-p`` flag), axe will allow
-selective mismatch levels, where, if clashes are observed, the barcode will
-only be matched exactly. This allows one to process datasets with barcodes that
+selective mismatch levels, where, if clashes are observed, the index will
+only be matched exactly. This allows one to process datasets with indexes that
 don't have a sufficiently high distance between them.
 
-Single barcode mode
+Single index mode
 -------------------
 
-Single barcode mode is the default mode of operation. Barcodes are matched
-against read one (hereafter the forward read), and the barcode is trimmed from
+Single index mode is the default mode of operation. Barcodes are matched
+against read one (hereafter the forward read), and the index is trimmed from
 only the forward read, unless the ``-2`` command line flag is given, in which
-case a prefix the same length as the matched barcode is also trimmed from the
+case a prefix the same length as the matched index is also trimmed from the
 second or reverse read. Note that sequence of this second read is not checked
 before trimming.
 
-In single barcode mode, the barcode file has two columns: ``Barcode`` and
+In single index mode, the index file has two columns: ``Barcode`` and
 ``ID``.
 
-Combinatorial barcode mode
+Combinatorial index mode
 --------------------------
 
-Combinatorial barcode mode is activated by giving the ``-c`` flag on the
-command line. Forward read barcodes are matched against the forward read, and
-reverse read barcodes are matched against the reverse read. The optimal
-barcodes are selected independently, and the barcode pair is selected from
-these two barcodes. The respective  barcodes are trimmed from both reads; the
-``-2`` command line flag has no effect in combinatorial barcode mode.
+Combinatorial index mode is activated by giving the ``-c`` flag on the
+command line. Forward read indexes are matched against the forward read, and
+reverse read indexes are matched against the reverse read. The optimal
+indexes are selected independently, and the index pair is selected from
+these two indexes. The respective  indexes are trimmed from both reads; the
+``-2`` command line flag has no effect in combinatorial index mode.
 
-In combinatorial barcode mode, the barcode file has three columns:
-``Barcode1``, ``Barcode2`` and ``ID``. Individual barcodes can occur many times
-within the forward and reverse barcodes, but barcode pairs must be unique
+In combinatorial index mode, the index file has three columns:
+``Barcode1``, ``Barcode2`` and ``ID``. Individual indexes can occur many times
+within the forward and reverse indexes, but index pairs must be unique
 combinations.
 
-The Demultipexing Statistics File
----------------------------------
+The Demultiplexing Statistics File
+----------------------------------
 
 The ``-t`` option allows the output of per-sample read counts to a
 tab-separated file. The file will have a header describing its format, and
-includes a line for unbarcoded reads.
+includes a line for reads which could not be demultiplexed.


=====================================
src/axe.c
=====================================
--- a/src/axe.c
+++ b/src/axe.c
@@ -3,7 +3,7 @@
  *
  *       Filename:  axe.c
  *    Description:  Demultiplex reads by 5' barcodes
- *      Copyright:  2014-2015 Kevin Murray <spam at kdmurray.id.au>
+ *      Copyright:  2014-2016 Kevin Murray <kdmfoss at gmail.com>
  *        License:  GNU GPL v3+
  *
  * This program is free software: you can redistribute it and/or modify it
@@ -24,6 +24,9 @@
 
 #include "axe.h"
 #include "gsl_combination.h"
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
 
 /* Holds the current timestamp, so we don't have to free the returned string
  * from now(). */
@@ -169,16 +172,15 @@ _axe_format_outfile_path (const char *prefix, const char *id, int read,
     char buf[4096];
     int res = 0;
     char *our_prefix = NULL;
-    char lastchr = '\0';
     size_t prefix_len = 0;
+    struct stat statbuff;
 
     if (prefix == NULL || id == NULL) {
         return NULL;
     }
 
     prefix_len = strlen(prefix);
-    lastchr = prefix[prefix_len - 1];
-    if (lastchr == '/' || lastchr == '\\') {
+    if (stat(prefix, &statbuff) == 0 && S_ISDIR(statbuff.st_mode)) {
         /* Our prefix is a directory, don't add '_' */
         our_prefix = strdup(prefix);
     } else {


=====================================
src/axe.h
=====================================
--- a/src/axe.h
+++ b/src/axe.h
@@ -3,7 +3,7 @@
  *
  *       Filename:  axe.h
  *    Description:  Demultiplex reads by 5' barcodes
- *      Copyright:  2014-2015 Kevin Murray <spam at kdmurray.id.au>
+ *      Copyright:  2014-2016 Kevin Murray <kdmfoss at gmail.com>
  *        License:  GNU GPL v3+
  *
  * This program is free software: you can redistribute it and/or modify it


=====================================
src/axe_config.h.in
=====================================
--- a/src/axe_config.h.in
+++ b/src/axe_config.h.in
@@ -3,7 +3,7 @@
  *
  *       Filename:  axe_config.h.in
  *    Description:  Pull in build system definitions
- *      Copyright:  2014-2015 Kevin Murray <spam at kdmurray.id.au>
+ *      Copyright:  2014-2016 Kevin Murray <kdmfoss at gmail.com>
  *        License:  GNU GPL v3+
  *
  * This program is free software: you can redistribute it and/or modify it


=====================================
src/main.c
=====================================
--- a/src/main.c
+++ b/src/main.c
@@ -3,7 +3,7 @@
  *
  *       Filename:  axe_main.c
  *    Description:  Main loop for axe
- *      Copyright:  2014-2015 Kevin Murray <spam at kdmurray.id.au>
+ *      Copyright:  2014-2016 Kevin Murray <kdmfoss at gmail.com>
  *        License:  GNU GPL v3+
  *
  * This program is free software: you can redistribute it and/or modify it


=====================================
tests/CMakeLists.txt
=====================================
--- a/tests/CMakeLists.txt
+++ b/tests/CMakeLists.txt
@@ -73,7 +73,7 @@ IF (CMAKE_BUILD_TYPE STREQUAL "Coverage")
 
 		# Capturing lcov counters and generating report
 		COMMAND ${LCOV_PATH} --rc lcov_branch_coverage=1 --directory ${CMAKE_BINARY_DIR}/src --capture --output-file ${COVERAGE_OUT}.info
-		COMMAND ${LCOV_PATH} --rc lcov_branch_coverage=1 --output-file ${COVERAGE_OUT}.info --remove ${COVERAGE_OUT}.info src/kmlib/*
+		COMMAND ${LCOV_PATH} --rc lcov_branch_coverage=1 --output-file ${COVERAGE_OUT}.info --remove ${COVERAGE_OUT}.info src/libqes/*
 		COMMAND ${LCOV_PATH} --rc lcov_branch_coverage=1 --output-file ${COVERAGE_OUT}.info --remove ${COVERAGE_OUT}.info src/datrie/*
 		COMMAND ${GENHTML_PATH} --branch-coverage -o ${COVERAGE_OUT} ${COVERAGE_OUT}.info
 		COMMAND ${CMAKE_COMMAND} -E remove ${COVERAGE_OUT}.info



View it on GitLab: https://salsa.debian.org/med-team/axe-demultiplexer/compare/31305598d383b75ede2a2c6e9b2b4c5287dddfaa...dc322c5b3526e88431285dc0826b4bd7d33581c6

---
View it on GitLab: https://salsa.debian.org/med-team/axe-demultiplexer/compare/31305598d383b75ede2a2c6e9b2b4c5287dddfaa...dc322c5b3526e88431285dc0826b4bd7d33581c6
You're receiving this email because of your account on salsa.debian.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20180426/9fb756f2/attachment-0001.html>


More information about the debian-med-commit mailing list