[med-svn] [Git][med-team/sortmerna][master] 5 commits: Remove reference to old doc

Wed Feb 2 12:52:06 GMT 2022


Andreas Tille pushed to branch master at Debian Med / sortmerna


Commits:
1ad59957 by Andreas Tille at 2022-02-02T10:49:41+01:00
Remove reference to old doc

- - - - -
d4e09336 by Andreas Tille at 2022-02-02T11:06:05+01:00
Review copyright

- - - - -
c713757d by Andreas Tille at 2022-02-02T11:07:17+01:00
Do not install cmake files

- - - - -
70b32ffe by Andreas Tille at 2022-02-02T11:22:20+01:00
Deactivate autopkgtest which is based on binary indexdb_rna which is not available any more

- - - - -
b7de51cb by Andreas Tille at 2022-02-02T13:51:49+01:00
Upload to experimental

- - - - -


8 changed files:

- debian/changelog
- debian/copyright
- − debian/doc_source/SortMeRNA-User-Manual-2.0.tex
- − debian/doc_source/get
- debian/examples
- − debian/man/indexdb_rna.1
- debian/rules
- debian/tests/control → debian/tests/control_deactivated


Changes:

=====================================
debian/changelog
=====================================
@@ -1,4 +1,4 @@
-sortmerna (4.3.4-1) UNRELEASED; urgency=medium
+sortmerna (4.3.4-1) experimental; urgency=medium
 
   [ Steffen Möller ]
   * Update watch - we missed a couple of versions
@@ -13,8 +13,11 @@ sortmerna (4.3.4-1) UNRELEASED; urgency=medium
   * Build-Depends: libgflags-dev, libsnappy-dev, libzstd-dev
     (needed by librocksdb-dev (>= 6.25.3) - see #1004665)
   * Build-Depends: rapidjson-dev, libconcurrentqueue-dev
+  * Review copyright
+  * Deactivate autopkgtest which is based on binary indexdb_rna
+    which is not available any more
 
- -- Andreas Tille <tille at debian.org>  Mon, 31 Jan 2022 13:37:04 +0100
+ -- Andreas Tille <tille at debian.org>  Wed, 02 Feb 2022 11:22:30 +0100
 
 sortmerna (2.1-5) unstable; urgency=medium
 


=====================================
debian/copyright
=====================================
@@ -11,7 +11,7 @@ Copyright: 2012-2014 Bonsai Bioinformatics Research Group, LIFL and
            jenya.kopylov at gmail.com, laurent.noe at lifl.fr, helene.touzet at lifl.fr
 License: LGPL-3+
 
-Files: alp/*
+Files: 3rdparty/alp/*
 Copyright: John Spouge, Sergey Sheetlin
 License: PublicDomain
                            PUBLIC DOMAIN NOTICE
@@ -34,19 +34,12 @@ License: PublicDomain
  .
  Please cite the author in any work or product based on this material.
 
-Files: SortMeRNA-User-Manual-2.1.pdf
-Copyright: 2014 Evguenia Kopylova <jenya.kopylov at gmail.com>
-License: LGPL-3+
-Comment: The source for this file was obtained from Git and is
- available in debian/doc_source.
- It will be included in the next upstream release.
-
 Files: debian/*
 Copyright: 2015 Tim Booth <tbooth at ceh.ac.uk>
            2015 Andreas Tille <tille at debian.org>
 License: LGPL-3+
 
-Files: src/ssw.c
+Files: src/sortmerna/ssw.c
 Copyright: 2012-1015 Boston College.
            Mengyao Zhao <zhangmp at bc.edu>
 License: expat


=====================================
debian/doc_source/SortMeRNA-User-Manual-2.0.tex deleted
=====================================
@@ -1,996 +0,0 @@
-\documentclass[10pt,a4paper]{article}
-
-\usepackage[utf8]{inputenc}
-\usepackage[english]{babel}
-\usepackage{amsmath, amsthm, amssymb}
-\usepackage[english]{isodate}
-\usepackage[parfill]{parskip}
-\usepackage{url}
-\usepackage{keystroke}
-\usepackage{graphicx}
-\usepackage{fancyvrb}
-\usepackage{color}
-\usepackage[usenames,dvipsnames]{xcolor}
-\usepackage{booktabs}
-\usepackage{multirow}
-\usepackage{hyperref}
-
-\hypersetup{
-    colorlinks,
-    citecolor=black,
-    filecolor=black,
-    linkcolor=black,
-    urlcolor=black
-}
-
-\usepackage{tikz} % graphic diagrams
-\usetikzlibrary{positioning,patterns,backgrounds,decorations.pathreplacing,decorations.markings,shapes,fit,calc,shadows} % fitting shapes to coordinates
-\usetikzlibrary{automata,trees}
-
-\newcommand\verbbf[1]{\textcolor[rgb]{0,0,1}{\textbf{#1}}}
-
-\title{SortMeRNA User Manual}
-\author{Evguenia Kopylova\\ {\em jenya.kopylov at gmail.com}}
-\date{Oct 2014, version 2.0}
-
-\addtolength{\oddsidemargin}{-.7in}
-\addtolength{\textwidth}{1.2in}
-
-
-\begin{document}
-\maketitle
-
-\newpage
-\tableofcontents
-
-\newpage
-\section{Introduction}
-
-Copyright (C) 2012-2015 Bonsai Bioinformatics Research Group \\
-(LIFL - Universit\'{e} Lille 1), CNRS UMR 8022, INRIA Nord-Europe \\
-\url{http://bioinfo.lifl.fr/RNA/sortmerna/} \\
-OTU-picking extensions and continuous support developed in the Knight Lab, \\
-BioFrontiers Institute, University of Colorado at Boulder, CO \\
-\url{https://knightlab.colorado.edu}
-
-SortMeRNA is a local sequence alignment tool for filtering, mapping and OTU-picking.
-The core algorithm is based on approximate seeds and allows for fast and sensitive analyses
-of NGS reads. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data.
-Additional applications include OTU-picking and taxonomy assignation available through QIIME v1.9+ (\url{http://qiime.org}, currently the development version to be released in early December).
-SortMeRNA takes as input a file of reads (fasta or fastq format) and one or multiple rRNA
-database file(s), and sorts apart aligned and rejected reads into two files specified by the user.
-SortMeRNA works with Illumina, 454, Ion Torrent and PacBio data, and can produce SAM and
-BLAST-like alignments.
-
-For questions \& help, please contact:
-
-\begin{verbatim}
- 1. Evguenia Kopylova     evguenia.kopylova at lifl.fr
- 2. Laurent Noe           laurent.noe at lifl.fr
- 3. Helene Touzet         helene.touzet at lifl.fr
-\end{verbatim}
-
-{\bf Important:} This user manual is strictly for SortMeRNA version 2.0. 
-
-
-\section{Installation}
-
-\begin{figure}[here!]
-\caption{\texttt{sortmerna-2.0} directory tree}~\\
-\centering
-\tikzstyle{every node}=[draw=black,thick,anchor=west]
-\tikzstyle{selected}=[draw=red,fill=red!30]
-\tikzstyle{root}=[fill=gray!50]
-\begin{tikzpicture}[%
-  grow via three points={one child at (0.5,-0.7) and
-  two children at (0.5,-0.7) and (0.5,-1.4)},
-  edge from parent path={(\tikzparentnode.south) |- (\tikzchildnode.west)}]
-  \node [root] {sortmerna-2.0} 
-    child { node {alp}}
-    child { node {cmph}}
-    child { node {src}}		
-    child { node {include}}
-    child { node {scripts}}
-    child { node {tests}}
-    child { node {rRNA\_databases}
-      child { node {silva-bac-16s-id90.fasta}}
-      child { node {...}}
-    }
-    child [missing] {}
-    child [missing] {}
-    child { node [selected] {sortmerna} }
-    child { node [selected] {indexdb\_rna} }
-    ;
-\end{tikzpicture}
-\label{fig:systemtree}
-\end{figure}
-
-\subsection{Install from tarball release}
-\label{sec:install}
-
-\begin{enumerate}
- \item Download \texttt{sortmerna-2.0.tar.gz} from \url{https://github.com/biocore/sortmerna/releases}
- \item Extract the source code package into a directory of your choice, enter \texttt{sortmerna-2.0} directory and type,
-  \begin{verbatim}
-  > bash ./build.sh
- \end{verbatim}
-  \item At this point, two executables \texttt{indexdb\_rna} and \texttt{sortmerna} will be located
- in the \texttt{sortmerna-2.0} directory. 
- If the user would like to install the executables into their default installation directory (\texttt{/usr/local/bin} for Linux or \texttt{/opt/local/bin} for Mac) then type,
- \begin{verbatim}
-  > make install (with root permissions)
- \end{verbatim}
- \item To begin using SortMeRNA, type `\texttt{indexdb\_rna -h}' or `\texttt{sortmerna -h}'. Databases must first be indexed using \texttt{indexdb\_rna}.
-\end{enumerate}
-
-\subsection{Install development version from git}
-\label{sec:install}
-
-\begin{enumerate}
- \item Clone the sortmerna directory to your local system
- \begin{verbatim}
-  > git clone https://github.com/biocore/sortmerna.git
- \end{verbatim}
- \item Build sortmerna
-  \begin{verbatim}
-  > cd sortmerna
-  > bash ./build.sh
- \end{verbatim}
-
-
-
-\end{enumerate}
-
-\subsection{Install from precompiled code}
-
-\begin{enumerate}
- \item Download the latest binary distribution of SortMeRNA from \url{http://bioinfo.lifl.fr/RNA/sortmerna}
- \item Extract the source code package into a directory of your choice,
- \begin{verbatim}
-  > tar -xvf sortmerna-2.0.tar.gz 
-  > cd sortmerna-2.0
- \end{verbatim}
- \item To begin using SortMeRNA, type `\texttt{indexdb\_rna -h}' or `\texttt{sortmerna -h}'. The user must firstly index 
- the databases with the command \texttt{indexdb\_rna} before they can run the command \texttt{sortmerna}.
- 
-\end{enumerate}
-
-\subsection{Uninstall}
-
-\noindent If the user installed SortMeRNA using the command \texttt{`make install'}, then they can use the command \texttt{`make uninstall'} to
-uninstall SortMeRNA (with root permissions).
-
-\section{Databases}
-\noindent SortMeRNA comes prepackaged with 8 databases,\\
-
-\resizebox{6.4in}{!}{
-\begin{tabular}{l|l|l|l|l}
- \textbf{representative database} & \textbf{\%id} & $\#$ \textbf{seq (clustered)} & \textbf{origin} & $\#$ \textbf{seq (original)}  \\
- \hline
- silva-bac-16s-id90 & 90 & 12798 & SILVA SSU Ref NR v.119 & 464618 \\
- silva-arc-16s-id95 & 95 &  3193 & SILVA SSU Ref NR v.119 & 18797  \\
- silva-euk-18s-id95 & 95 &  7348 & SILVA SSU Ref NR v.119 & 51553  \\
- silva-bac-23s-id98 & 98 &  4488 & SILVA LSU Ref v.119 & 43822  \\
- silva-arc-23s-id98 & 98 &  251 & SILVA LSU Ref v.119 & 629  \\
- silva-euk-28s-id98 & 98 &  4935 & SILVA LSU Ref v.119 & 13095 \\
- rfam-5s-id98 & 98 & 59513 & RFAM & 116760 \\
- rfam-5.8s-id98 & 98 & 13034 & RFAM & 225185 \\
-\end{tabular}}
-~\\
-
-HMMER 3.1b1 and SumaClust v1.0.00 were used to reduce the size of the original databases to the similarity listed in column 2 (\%id) of the table above
-(see {\tt/sortmerna/rRNA\_databases/README.txt} for a list of complete steps). 
-
-These representative databases were specifically made for fast filtering of rRNA. Approximately the same number of rRNA will be filtered
-using silva-bac-16s-id90 (12802 rRNA) as using Greengenes 97\% (99322 rRNA), but the former will run significantly faster.
-
-\noindent \textbf{id} $\%$: members of the cluster must have identity at least this \% id with the representative sequence \\
-
-\noindent \textbf{Remark}: The user must first index the fasta database by using the command \texttt{indexdb\_rna} and 
-then filter/map reads against the database using the command \texttt{sortmerna}.
-
-\section{How to run SortMeRNA}
-\subsection{Index the rRNA database: command `\texttt{indexdb\_rna}'}
-
-\noindent The executable \texttt{indexdb\_rna} indexes an rRNA database.\\
-
-\noindent To see the man page for \texttt{indexdb\_rna}, 
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> indexdb_rna -h
-
-  Program:     SortMeRNA version 2.0, 29/11/2014
-  Copyright:   2012-2015 Bonsai Bioinformatics Research Group:
-               LIFL, University Lille 1, CNRS UMR 8022, INRIA Nord-Europe
-               OTU-picking extensions and continuing support developed in the Knight Lab,
-               BioFrontiers Institute, University of Colorado at Boulder
-  Disclaimer:  SortMeRNA comes with ABSOLUTELY NO WARRANTY; without even the
-               implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-               See the GNU Lesser General Public License for more details.
-  Contact:     Evguenia Kopylova, jenya.kopylov at gmail.com 
-               Laurent Noe, laurent.noe at lifl.fr
-               Helene Touzet, helene.touzet at lifl.fr
-
-
-  usage:   ./indexdb_rna --ref db.fasta,db.idx [OPTIONS]:
-
-  --------------------------------------------------------------------------------------------------------
-  | parameter        value           description                                                 default |
-  --------------------------------------------------------------------------------------------------------
-     --ref           STRING,STRING   FASTA reference file, index file                            mandatory
-                                      (ex. --ref /path/to/file1.fasta,/path/to/index1)
-                                       If passing multiple reference sequence files, separate
-                                       them by ':',
-                                      (ex. --ref /path/to/file1.fasta,/path/to/index1:/path/to/file2.fasta,path/to/index2)
-   [OPTIONS]:
-     --fast          BOOL            suggested option for aligning ~99% related species          off
-     --sensitive     BOOL            suggested option for aligning ~75-98% related species       on
-     --tmpdir        STRING          directory where to write temporary files
-     -m              INT             the amount of memory (in Mbytes) for building the index     3072 
-     -L              INT             seed length                                                 18
-     --max_pos       INT             maximum number of positions to store for each unique L-mer  10000
-                                      (setting --max_pos 0 will store all positions)
-     -v              BOOL            verbose
-     -h              BOOL            help
-\end{Verbatim}
-
-
-~\\~\\
-\noindent There are eight rRNA representative databases provided in the `\texttt{sortmerna-2.0/rRNA\_databases}' folder. 
-All databases were derived from the SILVA SSU and LSU databases (release 119) and the RFAM databases using HMMER 3.1b1 and SumaClust v1.0.00.
-Additionally, the user can index their own database. \\
-
-\subsubsection{Example 1: indexdb\_rna using one database}
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> ./indexdb_rna --ref ./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db -v
-
-  Program:     SortMeRNA version 2.0, 29/11/2014
-  Copyright:   2012-2015 Bonsai Bioinformatics Research Group:
-               LIFL, University Lille 1, CNRS UMR 8022, INRIA Nord-Europe
-               OTU-picking extensions and continuing support developed in the Knight Lab,
-               BioFrontiers Institute, University of Colorado at Boulder
-  Disclaimer:  SortMeRNA comes with ABSOLUTELY NO WARRANTY; without even the
-               implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-               See the GNU Lesser General Public License for more details.
-  Contact:     Evguenia Kopylova, jenya.kopylov at gmail.com 
-               Laurent Noe, laurent.noe at lifl.fr
-               Helene Touzet, helene.touzet at lifl.fr
-
-
-  Parameters summary: 
-    K-mer size: 19
-    K-mer interval: 1
-    Maximum positions to store per unique K-mer: 10000
-
-  Total number of databases to index: 1
-
-  Begin indexing file ./rRNA_databases/silva-bac-16s-id90.fasta under index name ./index/silva-bac-16s-db: 
-  Collecting sequence distribution statistics ..  done  [1.133206 sec]
-
-  start index part # 0: 
-    (1/3) building burst tries .. done  [23.643256 sec]
-    (2/3) building CMPH hash .. done  [22.306709 sec]
-    (3/3) building position lookup tables .. done [54.958680 sec]
-    total number of sequences in this part = 12798
-      writing kmer data to ./index/silva-bac-16s-db.kmer_0.dat
-      writing burst tries to ./index/silva-bac-16s-db.bursttrie_0.dat
-      writing position lookup table to ./index/silva-bac-16s-db.pos_0.dat
-      writing nucleotide distribution statistics to ./index/silva-bac-16s-db.stats
-    done.
-    
-\end{Verbatim}
-
-~\\
-
-\subsubsection{Example 2: indexdb\_rna using multiple databases}
-
-Multiple databases can be indexed simultaneously by passing them as a `:' separated list to \texttt{--ref} (no spaces allowed). 
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> ./indexdb_rna --ref ./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db:\
-./rRNA_databases/silva-bac-23s-id98.fasta,./index/silva-bac-23s-db:\
-./rRNA_databases/silva-arc-16s-id95.fasta,./index/silva-arc-16s-db:\
-./rRNA_databases/silva-arc-23s-id98.fasta,./index/silva-arc-23s-db:\
-./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db:\
-./rRNA_databases/silva-euk-28s-id98.fasta,./index/silva-euk-28s:\
-./rRNA_databases/rfam-5s-database-id98.fasta,./index/rfam-5s-db:\
-./rRNA_databases/rfam-5.8s-database-id98.fasta,./index/rfam-5.8s-db
-\end{Verbatim}
-
-\newpage
-\subsection{A guide to choosing `{\bf sortmerna}' parameters for filtering and read mapping}
-
-In SortMeRNA version 1.99 beta and up, users have the option to output sequence alignments for their matching rRNA reads in
-the SAM or BLAST-like formats. Depending on the desired quality of alignments, different parameters choices must be set. 
-Table~\ref{tab:guide} presents a guide to setting parameters choices for most use cases. In all cases, output alignments are always guaranteed to reach
-the threshold E-value score (default E-value=1). An E-value of 1 signifies that one random alignment is expected for aligning
-\textbf{all} reads against the reference database. The E-value in SortMeRNA is computed for the entire search space, not per read.
-
-\begin{table}[htp!]
-\caption{SortMeRNA alignment parameter guide}
-\label{tab:guide}
-    \centering
-    \footnotesize
-	\begin{tabular}{l | l | l}
-	\toprule
-	\parbox[t]{0.6in}{\sf option} & {\sf speed} & \parbox[t]{0.45in}{\sf description}  \\
-	\midrule
-	\multirow{8}{*}{{\tt --num-alignments INT}} 
-	& Very fast for {\tt INT = 1}&  \parbox{6cm}{Output the first alignment passing E-value threshold ({\bf best choice if only filtering is needed})} \\
-	\cmidrule{2-3}
-	& Speed decreases for higher value {\tt INT} &  \parbox{6cm}{Higher {\tt INT} signifies more alignments will be made \& output }\\
-	\cmidrule{2-3}
-	& Very slow for {\tt INT = 0} &  \parbox{6cm}{All alignments reaching the E-value threshold are reported (this option is not suggested for high similarity rRNA databases, due to many possible alignments per read causing a very large file output)} \\
-	\midrule
-	\multirow{4}{*}{{\tt --best INT}} 
-	& Fast for {\tt INT = 1} &  \parbox{6cm}{Only one high-candidate reference sequence will be searched for alignments (determined heuristically using a Longest Increasing Subsequence of seed matches). The single best alignment of those will be reported }\\
-	\cmidrule{2-3}
-	& Speed decreases for higher value {\tt INT}  &  \parbox{6cm}{Higher {\tt INT} signifies more alignments will be made, though only the best one will be reported } \\
-	\cmidrule{2-3}
-	& Very slow for {\tt INT = 0} &   \parbox{6cm}{All high-candidate reference sequences will be searched for alignments, though only the best one will be reported }\\
-	\bottomrule
-	\end{tabular}
-\end{table}
-
-
-
-
-
-\newpage
-\subsection{Filter rRNA reads}
-
-\noindent The executable \texttt{sortmerna} can filter rRNA reads against an indexed rRNA database.\\
-
-\noindent To see the man page for \texttt{sortmerna}, 
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> ./sortmerna -h
-
-  Program:     SortMeRNA version 2.0, 29/11/2014
-  Copyright:   2012-2015 Bonsai Bioinformatics Research Group:
-               LIFL, University Lille 1, CNRS UMR 8022, INRIA Nord-Europe
-               OTU-picking extensions and continuing support developed in the Knight Lab,
-               BioFrontiers Institute, University of Colorado at Boulder
-  Disclaimer:  SortMeRNA comes with ABSOLUTELY NO WARRANTY; without even the
-               implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-               See the GNU Lesser General Public License for more details.
-  Contact:     Evguenia Kopylova, jenya.kopylov at gmail.com 
-               Laurent Noe, laurent.noe at lifl.fr
-               Helene Touzet, helene.touzet at lifl.fr
-
-
-  usage:   ./sortmerna --ref db.fasta,db.idx --reads file.fa --aligned base_name_output [OPTIONS]:
-
-  -------------------------------------------------------------------------------------------------------------
-  | parameter          value           description                                                    default |
-  -------------------------------------------------------------------------------------------------------------
-     --ref             STRING,STRING   FASTA reference file, index file                               mandatory
-                                         (ex. --ref /path/to/file1.fasta,/path/to/index1)
-                                         If passing multiple reference files, separate 
-                                         them using the delimiter ':',
-                                         (ex. --ref /path/to/file1.fasta,/path/to/index1:/path/to/file2.fasta,path/to/index2)
-     --reads           STRING          FASTA/FASTQ reads file                                         mandatory
-     --aligned         STRING          aligned reads filepath + base file name                        mandatory
-                                         (appropriate extension will be added)
-
-   [COMMON OPTIONS]: 
-     --other           STRING          rejected reads filepath + base file name
-                                         (appropriate extension will be added)
-     --fastx           BOOL            output FASTA/FASTQ file                                        off
-                                         (for aligned and/or rejected reads)
-     --sam             BOOL            output SAM alignment                                           off
-                                         (for aligned reads only)
-     --SQ              BOOL            add SQ tags to the SAM file                                    off
-     --blast           INT             output alignments in various Blast-like formats                
-                                        0 - pairwise
-                                        1 - tabular (Blast -m 8 format)
-                                        2 - tabular + column for CIGAR 
-                                        3 - tabular + columns for CIGAR and query coverage
-     --log             BOOL            output overall statistics                                      off
-     --num_alignments  INT             report first INT alignments per read reaching E-value          -1
-                                        (--num_alignments 0 signifies all alignments will be output)
-       or (default)
-     --best            INT             report INT best alignments per read reaching E-value           1
-                                         by searching --min_lis INT candidate alignments
-                                        (--best 0 signifies all candidate alignments will be searched)
-     --min_lis         INT             search all alignments having the first INT longest LIS         2
-                                         LIS stands for Longest Increasing Subsequence, it is 
-                                         computed using seeds' positions to expand hits into
-                                         longer matches prior to Smith-Waterman alignment. 
-     --print_all_reads BOOL            output null alignment strings for non-aligned reads            off
-                                         to SAM and/or BLAST tabular files
-     --paired_in       BOOL            both paired-end reads go in --aligned fasta/q file             off
-                                         (interleaved reads only, see Section 4.2.4 of User Manual)
-     --paired_out      BOOL            both paired-end reads go in --other fasta/q file               off
-                                         (interleaved reads only, see Section 4.2.4 of User Manual)
-     --match           INT             SW score (positive integer) for a match                        2
-     --mismatch        INT             SW penalty (negative integer) for a mismatch                   -3
-     --gap_open        INT             SW penalty (positive integer) for introducing a gap            5
-     --gap_ext         INT             SW penalty (positive integer) for extending a gap              2
-     -N                INT             SW penalty for ambiguous letters (N's)                         scored as --mismatch
-     -F                BOOL            search only the forward strand                                 off
-     -R                BOOL            search only the reverse-complementary strand                   off
-     -a                INT             number of threads to use                                       1
-     -e                DOUBLE          E-value threshold                                              1
-     -m                INT             INT Mbytes for loading the reads into memory                   1024
-                                        (maximum -m INT is 4096)
-     -v                BOOL            verbose                                                        off
-
-
-   [OTU PICKING OPTIONS]: 
-     --id              DOUBLE          %id similarity threshold (the alignment must                   0.97
-                                         still pass the E-value threshold)
-     --coverage        DOUBLE          %query coverage threshold (the alignment must                  0.97
-                                         still pass the E-value threshold)
-     --de_novo_otu     BOOL            FASTA/FASTQ file for reads matching database < %id             off
-                                         (set using --id) and < %cov (set using --coverage) 
-                                         (alignment must still pass the E-value threshold)
-     --otu_map         BOOL            output OTU map (input to QIIME's make_otu_table.py)            off
-
-
-   [ADVANCED OPTIONS] (see SortMeRNA user manual for more details): 
-    --passes           INT,INT,INT     three intervals at which to place the seed on the read         L,L/2,3
-                                         (L is the seed length set in ./indexdb_rna)
-    --edges            INT             number (or percent if INT followed by % sign) of               4
-                                         nucleotides to add to each edge of the read
-                                         prior to SW local alignment 
-    --num_seeds        INT             number of seeds matched before searching                       2
-                                         for candidate LIS 
-    --full_search      BOOL            search for all 0-error and 1-error seed                        off
-                                         matches in the index rather than stopping
-                                         after finding a 0-error match (<1% gain in
-                                         sensitivity with up four-fold decrease in speed)
-    --pid              BOOL            add pid to output file names                                   off
-
-
-   [HELP]:
-     -h                BOOL            help
-     --version         BOOL            SortMeRNA version number
-
-
-\end{Verbatim}
-
-~\\
-\noindent The user can adjust the amount of memory allocated for loading the reads through the 
-command option \texttt{-m}. By default, \texttt{-m} is set to be high enough for 1GB.
-If the reads file is larger than 1GB, then \texttt{sortmerna} internally divides the file into partial sections of 
-1GB and executes one section at a time. Hence, if a user has an input file of 15GB and only 1GB of RAM to store it, the 
-file will be processed in partial sections using \texttt{mmap} without having to physically split it prior to execution. Otherwise, the user
-can increase \texttt{-m} to map larger portions of the file. The limit for \texttt{-m} is given by typing \texttt{sortmerna -h}.
-
-
-\newpage
- 
-\subsubsection{Example 3: multiple databases and the fastest alignment option}
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> time ./sortmerna --ref ./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db:\
-./rRNA_databases/silva-bac-23s-id98.fasta,./index/silva-bac-23s-db:\
-./rRNA_databases/silva-arc-16s-id95.fasta,./index/silva-arc-16s-db:\
-./rRNA_databases/silva-arc-23s-id98.fasta,./index/silva-arc-23s-db:\
-./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db:\
-./rRNA_databases/silva-euk-28s-id98.fasta,./index/silva-euk-28s:\
-./rRNA_databases/rfam-5s-database-id98.fasta,./index/rfam-5s-db:\
-./rRNA_databases/rfam-5.8s-database-id98.fasta,./index/rfam-5.8s-db\
- --reads SRR106861.fasta --sam --num_alignments 1 --fastx --aligned SRR105861_rRNA\
- --other SRR105861_non_rRNA --log -v
-
-
-  Program:     SortMeRNA version 2.0, 29/11/2014
-  Copyright:   2012-2015 Bonsai Bioinformatics Research Group:
-               LIFL, University Lille 1, CNRS UMR 8022, INRIA Nord-Europe
-               OTU-picking extensions and continuing support developed in the Knight Lab,
-               BioFrontiers Institute, University of Colorado at Boulder
-  Disclaimer:  SortMeRNA comes with ABSOLUTELY NO WARRANTY; without even the
-               implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-               See the GNU Lesser General Public License for more details.
-  Contact:     Evguenia Kopylova, jenya.kopylov at gmail.com 
-               Laurent Noe, laurent.noe at lifl.fr
-               Helene Touzet, helene.touzet at lifl.fr
-
-
-  Computing read file statistics ... done [2.16 sec]
-  size of reads file: 35238748 bytes
-  partial section(s) to be executed: 1 of size 35238748 bytes 
-  Parameters summary:
-    Number of seeds = 2
-    Edges = 4 (as integer)
-    SW match = 2
-    SW mismatch = -3
-    SW gap open penalty = 5
-    SW gap extend penalty = 2
-    SW ambiguous nucleotide = -3
-    SQ tags are not output
-    Number of threads = 1
-
-  Begin mmap reads section # 1:
-  Time to mmap reads and set up pointers [0.11 sec]
-
-  Begin analysis of: ./rRNA_databases/silva-bac-16s-id90.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.602397
-    Gumbel K = 0.328927
-    Minimal SW score based on E-value = 54
-    Loading index part 1/1 ...  done [4.67 sec]
-    Begin index search ...  done [83.53 sec]
-    Freeing index ...  done [0.87 sec]
-
-  Begin analysis of: ./rRNA_databases/silva-bac-23s-id98.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.603075
-    Gumbel K = 0.330488
-    Minimal SW score based on E-value = 53
-    Loading index part 1/1 ...  done [3.63 sec]
-    Begin index search ...  done [94.76 sec]
-    Freeing index ...  done [0.41 sec]
-
-  Begin analysis of: ./rRNA_databases/silva-arc-16s-id95.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.596230
-    Gumbel K = 0.322143
-    Minimal SW score based on E-value = 52
-    Loading index part 1/1 ...  done [1.14 sec]
-    Begin index search ...  done [22.63 sec]
-    Freeing index ...  done [0.14 sec]
-
-  Begin analysis of: ./rRNA_databases/silva-arc-23s-id98.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.597749
-    Gumbel K = 0.325630
-    Minimal SW score based on E-value = 49
-    Loading index part 1/1 ...  done [0.50 sec]
-    Begin index search ...  done [13.27 sec]
-    Freeing index ...  done [0.06 sec]
-
-  Begin analysis of: ./rRNA_databases/silva-euk-18s-id95.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.612228
-    Gumbel K = 0.334926
-    Minimal SW score based on E-value = 52
-    Loading index part 1/1 ...  done [3.23 sec]
-    Begin index search ...  done [30.28 sec]
-    Freeing index ...  done [0.45 sec]
-
-  Begin analysis of: ./rRNA_databases/silva-euk-28s-id98.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.612068
-    Gumbel K = 0.344763
-    Minimal SW score based on E-value = 53
-    Loading index part 1/1 ...  done [3.43 sec]
-    Begin index search ...  done [35.69 sec]
-    Freeing index ...  done [0.48 sec]
-
-  Begin analysis of: ./rRNA_databases/rfam-5s-database-id98.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.616617
-    Gumbel K = 0.341306
-    Minimal SW score based on E-value = 51
-    Loading index part 1/1 ...  done [1.77 sec]
-    Begin index search ...  done [13.50 sec]
-    Freeing index ...  done [0.22 sec]
-
-  Begin analysis of: ./rRNA_databases/rfam-5.8s-database-id98.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.617817
-    Gumbel K = 0.340589
-    Minimal SW score based on E-value = 49
-    Loading index part 1/1 ...  done [0.60 sec]
-    Begin index search ...  done [8.78 sec]
-    Freeing index ...  done [0.07 sec]
-    Total number of reads mapped (incl. all reads file sections searched): 104243
-    Writing aligned FASTA/FASTQ ...  done [1.13 sec]
-    Writing not-aligned FASTA/FASTQ ...  done [0.10 sec]
-              
-\end{Verbatim}
-
-~\\
-\noindent The option `\texttt{--log}' will create an overall statistics file,\\
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> cat SRR105861_rRNA.log 
- Time and date
-
- Command: sortmerna --ref ./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db:\
- ./rRNA_databases/silva-bac-23s-id98.fasta,./index/silva-bac-23s-db:\
- ./rRNA_databases/silva-arc-16s-id95.fasta,./index/silva-arc-16s-db:\
- ./rRNA_databases/silva-arc-23s-id98.fasta,./index/silva-arc-23s-db:\
- ./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db:\
- ./rRNA_databases/silva-euk-28s-id98.fasta,./index/silva-euk-28s:\
- ./rRNA_databases/rfam-5s-database-id98.fasta,./index/rfam-5s-db:\
- ./rRNA_databases/rfam-5.8s-database-id98.fasta,./index/rfam-5.8s-db\
-  --reads /Users/jenya/Downloads/SRR106861.fasta --sam --num_alignments 1\
-   --fastx --aligned SRR105861_rRNA --other SRR105861_non_rRNA.fasta fasta -v 
- Process pid = 1957
- Parameters summary:
-    Index: ./index/silva-bac-16s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.602397
-     Gumbel K = 0.328927
-     Minimal SW score based on E-value = 54
-    Index: ./index/silva-bac-23s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.603075
-     Gumbel K = 0.330488
-     Minimal SW score based on E-value = 53
-    Index: ./index/silva-arc-16s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.596230
-     Gumbel K = 0.322143
-     Minimal SW score based on E-value = 52
-    Index: ./index/silva-arc-23s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.597749
-     Gumbel K = 0.325630
-     Minimal SW score based on E-value = 49
-    Index: ./index/silva-euk-18s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.612228
-     Gumbel K = 0.334926
-     Minimal SW score based on E-value = 52
-    Index: ./index/silva-euk-28s
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.612068
-     Gumbel K = 0.344763
-     Minimal SW score based on E-value = 53
-    Index: ./index/rfam-5s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.616617
-     Gumbel K = 0.341306
-     Minimal SW score based on E-value = 51
-    Index: ./index/rfam-5.8s-db
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.617817
-     Gumbel K = 0.340589
-     Minimal SW score based on E-value = 49
-    Number of seeds = 2
-    Edges = 4 (as integer)
-    SW match = 2
-    SW mismatch = -3
-    SW gap open penalty = 5
-    SW gap extend penalty = 2
-    SW ambiguous nucleotide = -3
-    SQ tags are not output
-    Number of threads = 1
-    Reads file = SRR106861.fasta
-
- Results:
-    Total reads = 113128
-    Total reads passing E-value threshold = 104243 (92.15%)
-    Total reads failing E-value threshold = 8885 (7.85%)
-    Minimum read length = 59
-    Maximum read length = 1253
-    Mean read length = 267
- By database:
-    ./rRNA_databases/silva-bac-16s-id90.fasta		25.73%
-    ./rRNA_databases/silva-bac-23s-id98.fasta		64.37%
-    ./rRNA_databases/silva-arc-16s-id95.fasta		0.00%
-    ./rRNA_databases/silva-arc-23s-id98.fasta		0.00%
-    ./rRNA_databases/silva-euk-18s-id95.fasta		0.00%
-    ./rRNA_databases/silva-euk-28s-id98.fasta		0.00%
-    ./rRNA_databases/rfam-5s-database-id98.fasta		2.04%
-    ./rRNA_databases/rfam-5.8s-database-id98.fasta		0.00%
-    
- \end{Verbatim}
-
-\subsubsection{Filtering paired-end reads}
-
-When writing aligned and non-aligned reads to FASTA/Q files, sometimes the situation arises 
-where one of the paired-end reads aligns and the other one doesn't. Since SortMeRNA
-looks at each read individually, by default the reads will be split into two separate files. That is, the read that
-aligned will go into the {\tt--aligned} FASTA/Q file and the pair that didn't align will go into the
-{\tt--other} FASTA/Q file.
-
-This situation would result in the splitting of some paired reads in the
-output files and not optimal for users who require paired order of the reads for
-downstream analyses.
-
-For users who  wish to keep the order of their paired-ended reads, two options are available.
-If one read aligns and the other one not then,
-
-\begin{enumerate}
- \item[(1)] {\tt--paired-in} will put both reads into the file specified by {\tt--aligned}
- \item[(2)] {\tt--paired-out} will put both reads into the file specified by {\tt--other}
-\end{enumerate}
-
-The first option, {\tt--paired-in} is optimal for users that want all reads in the {\tt--other} file
-to be non-rRNA. However, there are small chances that reads which are non-rRNA will also be
-put into the {\tt--aligned} file.
-
-The second option, {\tt--paired-out} is optimal for users that want only rRNA reads in the
-{\tt--aligned} file. However, there are small chances that reads which are rRNA will also be
-put into the {\tt--other} file.
-
-If neither of these two options is added to the {\tt sortmerna} command, then aligned and
-non-aligned reads will be properly output to the {\tt--aligned} and {\tt--other} files, possibly breaking
-the order for a set of paired reads between two output files.
-
-{\bf It's important to note} that regardless of the options used, the {\tt--log} file will always
-report the true number of reads classified as rRNA (not the number of reads in the {\tt--aligned}
-file).
-
-\subsubsection{Example 4: forward-reverse paired-end reads (2 input files)}
-
-\begin{figure}[here!]
-\centering
-\resizebox{5in}{!}{
-\tikzstyle{mybox} = [draw=OliveGreen, fill=blue!5, very thick,
-    rectangle, rounded corners, inner sep=10pt, inner ysep=20pt]
-\tikzstyle{fancytitle} =[fill=OliveGreen, text=white, rectangle, rounded corners]
-%
-\begin{tikzpicture}
-\node [mybox] (box1) {%
-    \begin{minipage}[t!]{2in}
-    {\footnotesize
-       @SEQUENCE\_ID\_1/\textbf{1} \\
-       ACTT..\\
-       +\\
-       QUALITY\_1/1\\
-       @SEQUENCE\_ID\_2/\textbf{1} \\
-       GTTA..\\
-       +\\
-       QUALITY\_2/1\\
-       ..
-    }
-    \end{minipage}
-};
-
-\node [mybox] (box2) [right=of box1,xshift=2cm] {%
-    \begin{minipage}[t!]{2in}
-    {\footnotesize
-       @SEQUENCE\_ID\_1/\textbf{2} \\
-       GTAC..\\
-       +\\
-       QUALITY\_1/2\\
-       @SEQUENCE\_ID\_2/\textbf{2} \\
-       CCAC..\\
-       +\\
-       QUALITY\_2/2\\
-       ..
-    }
-    \end{minipage}
-};
-    
-\node[fancytitle] at (box1.north) {{\small FASTQ forward reads}};
-\node[fancytitle] at (box2.north) {{\small FASTQ reverse reads}};
-
-\draw [decorate,color=black!80,decoration={brace,mirror,amplitude=5pt,raise=2pt}] (3,0.3) --  node[right=10pt]{$~~pair~\#~1$}(3,1.8);
-\draw [decorate,color=black!80,decoration={brace,amplitude=5pt,raise=2pt}] (5.8,0.3) --  node[right=10pt]{~}(5.8,1.8);
-
-\draw [decorate,color=black!80,decoration={brace,amplitude=5pt,raise=2pt}] (3,0) --  node[right=10pt]{$~~pair~\#~2$}(3,-1.5);
-\draw [decorate,color=black!80,decoration={brace,mirror,amplitude=5pt,raise=2pt}] (5.8,0) --  node[right=10pt]{~}(5.8,-1.5);
-
-\end{tikzpicture}
-}%resizebox
-\caption{Forward and reverse reads in paired-end sequencing format}
-\label{fig:format2}
-\end{figure}
-
-\begin{figure}[here!]
-\centering
-\resizebox{2.3in}{!}{
-\tikzstyle{mybox} = [draw=OliveGreen, fill=blue!5, very thick,
-    rectangle, rounded corners, inner sep=10pt, inner ysep=20pt]
-\tikzstyle{fancytitle} =[fill=OliveGreen, text=white, rectangle, rounded corners]
-%
-\begin{tikzpicture}
-\node [mybox] (box) {%
-    \begin{minipage}[t!]{2in}
-    {\footnotesize
-       @SEQUENCE\_ID\_1/\textbf{1} \\
-       ACTT..\\
-       +\\
-       QUALITY\_1/1\\
-       @SEQUENCE\_ID\_1/\textbf{2} \\
-       GTAC..\\
-       +\\
-       QUALITY\_1/2\\
-       ..
-    }
-    \end{minipage}
-    };
-\node[fancytitle] at (box.north) {{\small FASTQ paired-end reads}};
-\draw [decorate,color=black!80,decoration={brace,mirror,amplitude=10pt,raise=2pt}] (0.5,-1.5) --  node[right=10pt]{$~pair~\#~1$}(0.5,1.8);
-\end{tikzpicture}
-}%resizebox
-\caption{Paired-end read format accepted by SortMeRNA}
-\label{fig:format1}
-\end{figure}
-
-\noindent SortMeRNA accepts only 1 file as input for the reads. If a user has two input files, in the case for the 
-foward and reverse paired-end reads (see Figure~\ref{fig:format2}), they may use the \texttt{merge-paired-reads.sh} script found in 
-\texttt{`sortmerna/scripts'} folder to interleave the paired reads into the format of Figure~\ref{fig:format1}.\\
-
-\noindent The command for \texttt{merge-paired-reads.sh} is the following,
-\begin{verbatim}
- > bash ./merge-paired-reads.sh forward-reads.fastq reverse-reads.fastq outfile.fastq
-\end{verbatim}
-
-\noindent Now, the user may input \texttt{outfile.fastq} to SortMeRNA for analysis.
-
-\noindent Similarly, for unmerging the paired reads back into two separate files, use the command,
-{\small
-\begin{verbatim}
- > bash ./unmerge-paired-reads.sh merged-reads.fastq forward-reads.fastq reverse-reads.fastq 
-\end{verbatim}}
-{\bf Important:} unmerge-paired-reads.sh should only be used if one of the options {\tt--paired\_in} or {\tt--paired\_out}
-was used during filtering. Otherwise it may give incorrect results if a paired-read was split during alignment (one
-read aligned and the other one not).
-
-\newpage
-\subsection{Read mapping}
-
-\subsubsection{Mapping reads for classification}
-
-Although SortMeRNA is very sensitive with the small rRNA databases distributed with the source code,
-these databases are not optimal for classification since often alignments with 75-90\% identity
-will be returned (there are only several thousand rRNA in most of the databases, compared to the original
-SILVA or Greengenes databases containing millions of rRNA). Classification at the species level generally
-considers alignments at 97\% and above, so it is suggested to use a larger database is species classification
-is the main goal.
-
-Moreover, SortMeRNA is a local alignment tool, so it's also important to look at the query coverage \% for
-each alignment. In the SAM output format, neither \% id or query coverage are reported. If the user wishes
-for these values, then the Blast tabular format with CIGAR + query coverage option {(\tt--blast 3)} is the way to go.
-
-\subsubsection{Example 5: mapping reads against the 16S Greengenes 97\% id database with multithreading}
-
-This example will generate SAM and BLAST tabular output files. Alignments are classified as significant
-based on the E-value cutoff (default 1). SortMeRNA's E-value takes into consideration the full size of the
-reference database as well as the query file, thus the E-value is higher than BLAST's (ex. equivalent to BLAST's 1e-5).
-
-\begin{Verbatim}[fontsize=\footnotesize]
->> sortmerna --ref 97_otus_gg_13_8.fasta,./index/97_otus_gg_13_8\
- --reads SRR106861.fasta --blast 3 --sam --log --aligned SRR106861_gg_rRNA -a 20 -v 
-
-
-  Program:     SortMeRNA version 2.0, 29/11/2014
-  Copyright:   2012-2015 Bonsai Bioinformatics Research Group:
-               LIFL, University Lille 1, CNRS UMR 8022, INRIA Nord-Europe
-               OTU-picking extensions and continuing support developed in the Knight Lab,
-               BioFrontiers Institute, University of Colorado at Boulder
-  Disclaimer:  SortMeRNA comes with ABSOLUTELY NO WARRANTY; without even the
-               implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-               See the GNU Lesser General Public License for more details.
-  Contact:     Evguenia Kopylova, jenya.kopylov at gmail.com 
-               Laurent Noe, laurent.noe at lifl.fr
-               Helene Touzet, helene.touzet at lifl.fr
-
-
-  Computing read file statistics ... done [0.44 sec]
-  size of reads file: 35238748 bytes
-  partial section(s) to be executed: 1 of size 35238748 bytes 
-  Parameters summary:
-    Number of seeds = 2
-    Edges = 4 (as integer)
-    SW match = 2
-    SW mismatch = -3
-    SW gap open penalty = 5
-    SW gap extend penalty = 2
-    SW ambiguous nucleotide = -3
-    SQ tags are not output
-    Number of threads = 20
-
-  Begin mmap reads section # 1:
-  Time to mmap reads and set up pointers [0.10 sec]
-
-  Begin analysis of: 97_otus_gg_13_8.fasta
-    Seed length = 18
-    Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-    Gumbel lambda = 0.600470
-    Gumbel K = 0.327880
-    Minimal SW score based on E-value = 57
-    Loading index part 1/1 ...  done [10.76 sec]
-    Begin index search ...  done [23.75 sec]
-    Freeing index ...  done [1.44 sec]
-    Total number of reads mapped (incl. all reads file sections searched): 29089
-    Writing alignments ...  done [7.71 sec]
-              
-\end{Verbatim}
-
-This is almost the same number of 16S rRNA as identified by SortMeRNA using the smaller provided database,
-
-\begin{Verbatim}[fontsize=\footnotesize]
-
->> cat SRR106861_gg_rRNA.log 
- Date and time
-
- Command: sortmerna --ref 97_otus_gg_13_8.fasta,./index/97_otus_gg_13_8\
-  --reads SRR106861.fasta --blast 3 --sam --log --aligned SRR106861_gg_rRNA -a 20 -v 
- Process pid = 44246
- Parameters summary:
-    Index: ./index/97_otus_gg_13_8
-     Seed length = 18
-     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
-     Gumbel lambda = 0.600470
-     Gumbel K = 0.327880
-     Minimal SW score based on E-value = 57
-    Number of seeds = 2
-    Edges = 4 (as integer)
-    SW match = 2
-    SW mismatch = -3
-    SW gap open penalty = 5
-    SW gap extend penalty = 2
-    SW ambiguous nucleotide = -3
-    SQ tags are not output
-    Number of threads = 20
-    Reads file = SRR106861.fasta
-
- Results:
-    Total reads = 113128
-    Total reads passing E-value threshold = 29089 (25.71%)
-    Total reads failing E-value threshold = 84039 (74.29%)
-    Minimum read length = 59
-    Maximum read length = 1253
-    Mean read length = 267
- By database:
-    97_otus_gg_13_8.fasta		25.71%
-    
-\end{Verbatim}
-
-\newpage
-\subsection{OTU-picking}
-
-SortMeRNA is implemented in QIIME's closed-reference and open-reference OTU-picking workflows.
-The readers are referred to QIIME's tutorials for an in-depth discussion of these methods
-\url{http://qiime.org/tutorials/otu_picking.html}.
-
-\section{SortMeRNA advanced options}
-
-\subsection*{{\tt --num\_seeds INT}}
-The threshold number of seeds required to match in the primary seed-search filter before 
-moving on to the secondary seed-cluster filter. More specifically, the threshold number of 
-seeds required before searching for a longest increasing subsequence (LIS) of the seeds' positions
-between the read and the closest matching reference sequence. By default, this is set to 2 seeds. 
-
-\subsection*{{\tt --passes INT,INT,INT}}
-In the primary seed-search filter, SortMeRNA moves a seed of length $L$ (parameter of {\tt indexdb\_rna}) 
-across the read using three passes. If at the end of each pass a threshold number of seeds (defined by {\tt --num\_seeds}) 
-did not match to the reference database, SortMeRNA attempts to find more seeds by decreasing the interval at which the 
-seed is placed along the read by using another pass. In default mode, these intervals are set to 
-$L,L/2,3$ for Pass 1, 2 and 3, respectively. Usually, if the read is highly similar to the reference
-database, a threshold number of seeds will be found in the first pass.
-
-\subsection*{{\tt --edges INT(\%)}}
-The number (or percentage if followed by \%) of nucleotides to add to each edge of the alignment region 
-on the reference sequence before performing Smith-Waterman alignment. By default, this is set to 4 nucleotides.
-
-\subsection*{{\tt --full\_search FLAG}}
-During the index traversal, if a seed match is found with 0-errors, SortMeRNA will stop searching for further
-1-error matches. This heuristic is based upon the assumption that 0-error matches are more significant than
-1-error matches. By turning it off using the {\tt--full\_search} flag, the sensitivity may increase (often
-by less than 1\%) but with up to four-fold decrease in speed.
-
-\subsection*{{\tt --pid FLAG}}
-The pid of the running {\tt sortmerna} process will be added to the output files in order to avoid over-writing output if the same
-{\tt --aligned STRING} base name is provided for different runs. 
-
-\section{Help}
-
-Any issues or bug reports should be reported to \url{https://github.com/biocore/sortmerna/issues} or by e-mail
-to the authors (see list of e-mails in Section 1 of this document). Comments and suggestions are also always appreciated!
-
-\section{Citation}
-
-If you use SortMeRNA please cite,
-
-Kopylova E., No\'{e} L. and Touzet H., ``SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data", {\it Bioinformatics} (2012), doi: 10.1093/bioinformatics/bts611.
-
-\end{document}
-


=====================================
debian/doc_source/get deleted
=====================================
@@ -1 +0,0 @@
-wget https://github.com/biocore/sortmerna/raw/master/SortMeRNA-User-Manual-2.0.tex


=====================================
debian/examples
=====================================
@@ -1 +1 @@
-tests/*
\ No newline at end of file
+tests/sortmerna/*
\ No newline at end of file


=====================================
debian/man/indexdb_rna.1 deleted
=====================================
@@ -1,63 +0,0 @@
-.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.1.
-.TH INDEXDB_RNA "1" "August 2015" "indexdb_rna 2.0" "User Commands"
-.SH NAME
-indexdb_rna \- tool for filtering, mapping and OTU-picking NGS reads (indexdb)
-.SH SYNOPSIS
-.B indexdb_rna
-\fB\-\-ref\fR db.fasta,db.idx [OPTIONS]
-.SH DESCRIPTION
-.P
-SortMeRNA is a biological sequence analysis tool for filtering, mapping and
-OTU-picking NGS reads. The core algorithm is based on approximate seeds and
-allows for fast and sensitive analyses of nucleotide sequences. The main
-application of SortMeRNA is filtering rRNA from metatranscriptomic data.
-Additional applications include OTU-picking and taxonomy assignation available
-through QIIME v1.9+ (http://qiime.org - v1.9.0-rc1).
-.P
-SortMeRNA takes as input a file of reads (fasta or fastq format) and one or
-multiple rRNA database file(s), and sorts apart rRNA and rejected reads into
-two files specified by the user. Optionally, it can provide high quality local
-alignments of rRNA reads against the rRNA database. SortMeRNA works with
-Illumina, 454, Ion Torrent and PacBio data, and can produce SAM and
-BLAST-like alignments.
-.SH OPTIONS
-.SS MANDATORY OPTIONS
-.TP
-\fB\-\-ref\fR \fISTRING,STRING\fR
-FASTA reference file, index file
-.br
-Example:
-.br
-\fB\-\-ref\fR \fI\,/path/to/file1.fasta\/,/path/to/index1\fP
-.br
-If passing multiple reference sequence files, separate them by ':'
-.br
-Example:
-.br
-\fB\-\-ref\fR \fI/path/f1.fasta,/path/index1:/path/f2.fasta,path/index2\fP
-.SS OPTIONAL OPTIONS
-.TP
-\fB\-\-fast\fR \fIBOOL\fR
-suggested option for aligning ~99% related species (default: off)
-.TP
-\fB\-\-sensitive\fR \fIBOOL\fR
-suggested option for aligning ~75\-98% related species (default: on)
-.TP
-\fB\-\-tmpdir\fR \fISTRING\fR
-directory where to write temporary files
-.TP
-\fB\-m\fR \fIINT\fR
-the amount of memory (in Mbytes) for building the index (default: 3072)
-.TP
-\fB\-L\fR \fIINT\fR
-seed length (default: 18)
-.TP
-\fB\-\-max_pos\fR \fIINT\fR
-maximum number of positions to store for each unique L\-mer (default: 10000,
-setting \fB\-\-max_pos\fR 0 will store all positions)
-.TP
-\fB\-v\fR \fIBOOL\fR
-verbose
-.TP
-\fB\-h\fR \fIBOOL\fR
-help       


=====================================
debian/rules
=====================================
@@ -14,10 +14,11 @@ override_dh_clean:
 	dh_clean
 	rm -f include/memcheck.h
 
+override_dh_install:
+	dh_install
+	find debian -name cmake -type d | xargs rm -rf
+
 override_dh_fixperms:
 	dh_fixperms
 	find debian/sortmerna -name '*.sh' -exec chmod +x '{}' ';'
 	find debian/sortmerna -name '*.py' -exec chmod -x '{}' ';'
-
-override_dh_compress:
-	dh_compress --exclude=.pdf


=====================================
debian/tests/control → debian/tests/control_deactivated
=====================================



View it on GitLab: https://salsa.debian.org/med-team/sortmerna/-/compare/d9b2d7ff6ebf068b5cffc7a147d580a84ef53a1e...b7de51cb13589e39e6212d2d37c8f5c3f435d736

-- 
View it on GitLab: https://salsa.debian.org/med-team/sortmerna/-/compare/d9b2d7ff6ebf068b5cffc7a147d580a84ef53a1e...b7de51cb13589e39e6212d2d37c8f5c3f435d736
You're receiving this email because of your account on salsa.debian.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20220202/5bf9bfa3/attachment-0001.htm>