[med-svn] [Git][med-team/unicycler][debian/stretch-backports] 24 commits: Add autogenerated manpages

Andreas Tille gitlab at salsa.debian.org
Wed Nov 27 14:58:49 GMT 2019



Andreas Tille pushed to branch debian/stretch-backports at Debian Med / unicycler


Commits:
f4b2ba13 by Andreas Tille at 2018-10-24T07:04:39Z
Add autogenerated manpages

- - - - -
9118c256 by Andreas Tille at 2018-10-24T07:30:36Z
Manual edits to man pages

- - - - -
b0b0cac3 by Liubov Chuprikova at 2018-10-27T17:07:15Z
Add autopkgtest

- - - - -
96bf168e by Liubov Chuprikova at 2018-10-27T17:42:08Z
Split data files and docs in unicycler-data

- - - - -
f011ee13 by Andreas Tille at 2018-10-28T15:30:16Z
Upload to unstable

- - - - -
fd8e7ad4 by Steffen Möller at 2018-11-18T14:42:11Z
Updated RRIDs in d/u/metadata
- - - - -
2623e20a by Michael R. Crusoe at 2019-01-18T10:35:16Z
Inherit and use LDFLAGS and CPPFLAGS

- - - - -
04164425 by Michael R. Crusoe at 2019-01-18T10:35:59Z
Mark unicycler-data as Multi-Arch: foreign, as recommended by the Multiarch hinter.

- - - - -
6d5abf3d by Michael R. Crusoe at 2019-01-18T11:05:29Z
Standards-Version: 4.3.0, no changes needed

- - - - -
88ff2479 by Steffen Möller at 2019-03-09T16:51:45Z
corrected bio.tools ref
- - - - -
bb7fd72e by Steffen Moeller at 2019-10-09T12:08:05Z
d/u/metadata: yamllint

- - - - -
5a91834e by Steffen Möller at 2019-10-15T21:18:00Z
Update metadata with ref to conda
- - - - -
40f4b70d by Andreas Tille at 2019-11-12T15:01:56Z
New upstream version

- - - - -
2c389812 by Andreas Tille at 2019-11-12T15:01:57Z
New upstream version 0.4.8+dfsg
- - - - -
7a937b65 by Andreas Tille at 2019-11-12T15:02:09Z
Update upstream source from tag 'upstream/0.4.8+dfsg'

Update to upstream version '0.4.8+dfsg'
with Debian dir fb6c05a896c268a02adc8591c358093e5b585271
- - - - -
a48b00ec by Andreas Tille at 2019-11-12T15:02:09Z
debhelper-compat 12

- - - - -
58ee9a24 by Andreas Tille at 2019-11-12T15:02:13Z
Standards-Version: 4.4.1

- - - - -
01cd1a33 by Andreas Tille at 2019-11-12T15:02:14Z
Remove unnecessary Team Upload line in changelog.

Fixes lintian: unnecessary-team-upload
See https://lintian.debian.org/tags/unnecessary-team-upload.html for more details.

- - - - -
76409e78 by Andreas Tille at 2019-11-12T15:02:17Z
Set upstream metadata fields: Repository.
- - - - -
3eec66f7 by Andreas Tille at 2019-11-12T15:22:07Z
(Build-)Depends: bcftools

- - - - -
4325cc32 by Andreas Tille at 2019-11-18T15:36:01Z
Versioned (Build-)Depends of Python3 enabled spades

- - - - -
1c8150f7 by Andreas Tille at 2019-11-18T15:36:38Z
(Build-)Depends: miniasm

- - - - -
5e0ac616 by Andreas Tille at 2019-11-18T16:00:45Z
Upload to unstable

- - - - -
e23f2a66 by Andreas Tille at 2019-11-27T14:45:45Z
Rebuild for stretch-backports-sloppy

- - - - -


26 changed files:

- README.md
- debian/changelog
- − debian/compat
- debian/control
- + debian/createmanpages
- + debian/manpages
- + debian/mans/unicycler.1
- + debian/mans/unicycler_align.1
- + debian/mans/unicycler_check.1
- + debian/mans/unicycler_polish.1
- + debian/mans/unicycler_scrub.1
- + debian/patches/append_flags
- debian/patches/series
- debian/tests/control
- debian/tests/run-unit-test
- + debian/unicycler-data.docs
- debian/unicycler-data.install
- + debian/unicycler.docs
- debian/upstream/metadata
- unicycler/assembly_graph.py
- unicycler/bridge_long_read_simple.py
- unicycler/settings.py
- unicycler/spades_func.py
- unicycler/src/miniasm/hit.cpp
- unicycler/unicycler.py
- unicycler/version.py


Changes:

=====================================
README.md
=====================================
@@ -428,6 +428,8 @@ SPAdes assembly:
   --depth_filter DEPTH_FILTER    Filter out contigs lower than this fraction of the chromosomal
                                  depth, if doing so does not result in graph dead ends (default:
                                  0.25)
+  --largest_component            Only keep the largest connected component of the assembly graph
+                                 (default: keep all connected components)
   --spades_tmp_dir SPADES_TMP_DIR
                                  Specify SPAdes temporary directory using the SPAdes --tmp-dir
                                  option (default: make a temporary directory in the output
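
The `--largest_component` option added to the README above keeps only the largest connected component of the assembly graph. The idea can be sketched as follows. This is a minimal illustration, not Unicycler's implementation: the function name `largest_component`, the `lengths` mapping and the `links` list are hypothetical, and "largest" is taken to mean the greatest summed segment length, which is how the `unicycler/assembly_graph.py` change in this push measures it. Unlike that change, the sketch keeps only the first component on a tie.

```python
from collections import defaultdict


def largest_component(lengths, links):
    """Return the segment ids of the component with the greatest total length.

    lengths: dict mapping segment id -> segment length (hypothetical input shape)
    links:   iterable of (id_a, id_b) pairs, treated as undirected edges
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    best, best_length = set(), -1
    for start in lengths:
        if start in seen:
            continue
        # Depth-first walk to collect every segment reachable from `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        total = sum(lengths[n] for n in component)
        if total > best_length:
            best, best_length = component, total
    return best


# Two components: {1, 2} totals 8000 bp, {3, 4} totals 650 bp.
lengths = {1: 5000, 2: 3000, 3: 400, 4: 250}
links = [(1, 2), (3, 4)]
print(sorted(largest_component(lengths, links)))  # [1, 2]
```

Every segment outside the returned set would then be removed from the graph.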


=====================================
debian/changelog
=====================================
@@ -1,3 +1,37 @@
+unicycler (0.4.8+dfsg-1~bpo9+1) stretch-backports-sloppy; urgency=medium
+
+  * Rebuild for stretch-backports-sloppy.
+
+ -- Andreas Tille <tille at debian.org>  Wed, 27 Nov 2019 15:44:46 +0100
+
+unicycler (0.4.8+dfsg-1) unstable; urgency=medium
+
+  [ Michael R. Crusoe ]
+  * Inherit and use LDFLAGS and CPPFLAGS
+  * Mark unicycler-data as Multi-Arch: foreign, as recommended by the
+    Multiarch hinter.
+
+  [ Andreas Tille ]
+  * New upstream version
+  * debhelper-compat 12
+  * Standards-Version: 4.4.1
+  * Set upstream metadata fields: Repository.
+  * (Build-)Depends: bcftools, miniasm
+  * Versioned (Build-)Depends of Python3 enabled spades
+
+ -- Andreas Tille <tille at debian.org>  Mon, 18 Nov 2019 16:48:24 +0100
+
+unicycler (0.4.7+dfsg-2) unstable; urgency=medium
+
+  [ Andreas Tille ]
+  * Add manpages
+
+  [ Liubov Chuprikova ]
+  * Add autopkgtest
+  * Split data files and docs in unicycler-data
+
+ -- Liubov Chuprikova <chuprikovalv at gmail.com>  Wed, 24 Oct 2018 09:04:28 +0200
+
 unicycler (0.4.7+dfsg-1~bpo9+1) stretch-backports; urgency=medium
 
   * Rebuild for stretch-backports.


=====================================
debian/compat deleted
=====================================
@@ -1 +0,0 @@
-11


=====================================
debian/control
=====================================
@@ -4,20 +4,22 @@ Uploaders: Andreas Tille <tille at debian.org>,
            Liubov Chuprikova <chuprikovalv at gmail.com>
 Section: science
 Priority: optional
-Build-Depends: debhelper (>= 11~),
+Build-Depends: debhelper-compat (= 12),
                dh-python,
                python3-all,
                python3-setuptools,
                default-jdk,
+               bcftools,
                bowtie2,
+               miniasm,
                ncbi-blast+,
                pilon,
                racon,
                samtools,
-               spades,
+               spades (>= 3.13.1),
                libseqan2-dev,
                zlib1g-dev
-Standards-Version: 4.2.1
+Standards-Version: 4.4.1
 Vcs-Browser: https://salsa.debian.org/med-team/unicycler
 Vcs-Git: https://salsa.debian.org/med-team/unicycler.git
 Homepage: https://github.com/rrwick/Unicycler
@@ -29,12 +31,14 @@ Depends: ${python3:Depends},
          ${misc:Depends},
          python3-setuptools,
          default-jre,
+         bcftools,
          bowtie2,
+         miniasm,
          ncbi-blast+,
          pilon,
          racon,
          samtools,
-         spades
+         spades (>= 3.13.1)
 Recommends: unicycler-data
 Description: hybrid assembly pipeline for bacterial genomes
  Unicycler is an assembly pipeline for bacterial genomes. It can assemble
@@ -45,6 +49,7 @@ Description: hybrid assembly pipeline for bacterial genomes
 
 Package: unicycler-data
 Architecture: all
+Multi-Arch: foreign
 Depends: ${misc:Depends}
 Description: hybrid assembly pipeline for bacterial genomes (data package)
  Unicycler is an assembly pipeline for bacterial genomes. It can assemble


=====================================
debian/createmanpages
=====================================
@@ -0,0 +1,52 @@
+#!/bin/sh
+MANDIR=debian/mans
+mkdir -p $MANDIR
+
+VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
+NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
+PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//' | head -n1`
+
+AUTHOR=".SH AUTHOR\nThis manpage was written by $DEBFULLNAME for the Debian distribution and
+can be used for any other usage of the program.
+"
+
+# If program name is different from package name or title should be
+# different from package short description change this here
+progname=${PROGNAME}
+help2man --no-info --no-discard-stderr  \
+         --name="assembly pipeline for bacterial genomes" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=unicycler_align
+help2man --no-info --no-discard-stderr  \
+         --name="sensitive semi-global long read aligner" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+
+progname=unicycler_check
+help2man --no-info --no-discard-stderr  \
+         --name="long read assembly checker" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=unicycler_polish
+help2man --no-info --no-discard-stderr  \
+         --name="Unicycler polish - hybrid assembly polishing" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=unicycler_scrub
+help2man --no-info --no-discard-stderr  \
+         --name="read trimming, chimera detection and misassembly detection" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+echo "$MANDIR/*.1" > debian/manpages
+
+cat <<EOT
+Please enhance the help2man output.
+The following web page might be helpful in doing so:
+    http://liw.fi/manpages/
+EOT
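
The `VERSION=` line in `debian/createmanpages` above derives the upstream version from the Debian version with three `sed` substitutions: drop a leading epoch, drop everything from the first hyphen onwards, then drop a trailing `+dfsg` or `~dfsg`. A rough Python equivalent of just that transformation, assuming the version string has already been read from the changelog (the function name `upstream_version` is made up for this sketch):

```python
import re


def upstream_version(debian_version):
    """Mirror the three sed expressions used in debian/createmanpages."""
    version = re.sub(r'^[0-9]*:', '', debian_version)  # s/^[0-9]*://   (epoch)
    version = re.sub(r'-.*', '', version, count=1)     # s/-.*//        (Debian revision)
    version = re.sub(r'[+~]dfsg$', '', version)        # s/[+~]dfsg$//  (repack suffix)
    return version


print(upstream_version("0.4.8+dfsg-1~bpo9+1"))  # 0.4.8
print(upstream_version("0.4.7+dfsg-2"))         # 0.4.7
```

The second result matches the `unicycler 0.4.7` string that appears in the `.TH` lines of the generated man pages.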


=====================================
debian/manpages
=====================================
@@ -0,0 +1 @@
+debian/mans/*.1


=====================================
debian/mans/unicycler.1
=====================================
@@ -0,0 +1,96 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH UNICYCLER "1" "October 2018" "unicycler 0.4.7" "User Commands"
+.SH NAME
+unicycler \- assembly pipeline for bacterial genomes
+.SH SYNOPSIS
+.B unicycler
+[\-h] [\-\-help_all] [\-\-version] [\-1 SHORT1] [\-2 SHORT2]
+[\-s UNPAIRED] [\-l LONG] \fB\-o\fR OUT [\-\-verbosity VERBOSITY]
+[\-\-min_fasta_length MIN_FASTA_LENGTH] [\-\-keep KEEP]
+[\-t THREADS] [\-\-mode {conservative,normal,bold}]
+[\-\-linear_seqs LINEAR_SEQS] [\-\-vcf]
+.SH DESCRIPTION
+Unicycler is an assembly pipeline for bacterial genomes. It can assemble
+Illumina-only read sets where it functions as a SPAdes-optimiser. It can
+also assemble long-read-only sets (PacBio or Nanopore) where it runs a
+miniasm+Racon pipeline. For the best possible assemblies, give it both
+Illumina reads and long reads, and it will conduct a hybrid assembly.
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+Show this help message and exit
+.TP
+\fB\-\-help_all\fR
+Show a help message with all program options
+.TP
+\fB\-\-version\fR
+Show Unicycler's version number
+.SS Input
+.TP
+\fB\-1\fR SHORT1, \fB\-\-short1\fR SHORT1
+FASTQ file of first short reads in each pair
+(required)
+.TP
+\fB\-2\fR SHORT2, \fB\-\-short2\fR SHORT2
+FASTQ file of second short reads in each pair
+(required)
+.TP
+\fB\-s\fR UNPAIRED, \fB\-\-unpaired\fR UNPAIRED
+FASTQ file of unpaired short reads (optional)
+.TP
+\fB\-l\fR LONG, \fB\-\-long\fR LONG
+FASTQ or FASTA file of long reads (optional)
+.SS Output
+.TP
+\fB\-o\fR OUT, \fB\-\-out\fR OUT
+Output directory (required)
+.TP
+\fB\-\-verbosity\fR VERBOSITY
+Level of stdout and log file information (default: 1)
+.IP
+0 = no stdout,
+.IP
+1 = basic progress indicators,
+.IP
+2 = extra info,
+.IP
+3 = debugging info
+.TP
+\fB\-\-min_fasta_length\fR MIN_FASTA_LENGTH
+Exclude contigs from the FASTA file which are
+shorter than this length (default: 100)
+.TP
+\fB\-\-keep\fR KEEP
+Level of file retention (default: 1)
+.IP
+0 = only keep final files: assembly (FASTA,GFA and log),
+.IP
+1 = also save graphs at main checkpoints,
+.IP
+2 = also keep SAM (enables fast rerun in different mode),
+.IP
+3 = keep all temp files and save all graphs (for debugging)
+.TP
+\fB\-\-vcf\fR
+Produce a VCF by mapping the short reads to the
+final assembly (experimental, default: do not
+produce a vcf file)
+.SS Other
+.TP
+\fB\-t\fR THREADS, \fB\-\-threads\fR THREADS
+Number of threads used (default: 4)
+.TP
+\fB\-\-mode\fR {conservative,normal,bold}
+Bridging mode (default: normal)
+.IP
+conservative = smaller contigs, lowest misassembly rate
+.IP
+normal = moderate contig size and misassembly rate
+.IP
+bold = longest contigs, higher misassembly rate
+.TP
+\fB\-\-linear_seqs\fR LINEAR_SEQS
+The expected number of linear (i.e. non\-circular)
+sequences in the underlying sequence (default: 0)
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/mans/unicycler_align.1
=====================================
@@ -0,0 +1,68 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH UNICYCLER_ALIGN "1" "October 2018" "unicycler_align 0.4.7" "User Commands"
+.SH NAME
+unicycler_align \- sensitive semi-global long read aligner
+.SH SYNOPSIS
+.B unicycler_align
+[\-h] \fB\-\-ref\fR REF \fB\-\-reads\fR READS \fB\-\-sam\fR SAM
+[\-\-contamination CONTAMINATION] [\-\-scores SCORES]
+[\-\-low_score LOW_SCORE] [\-\-keep_bad]
+[\-\-sensitivity SENSITIVITY] [\-\-threads THREADS]
+[\-\-verbosity VERBOSITY] [\-\-min_len MIN_LEN]
+[\-\-allowed_overlap ALLOWED_OVERLAP]
+.SH DESCRIPTION
+Unicycler align \- a sensitive semi\-global long read aligner
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.TP
+\fB\-\-ref\fR REF
+FASTA file containing one or more reference
+sequences
+.TP
+\fB\-\-reads\fR READS
+FASTQ or FASTA file of long reads
+.TP
+\fB\-\-sam\fR SAM
+SAM file of resulting alignments
+.TP
+\fB\-\-contamination\fR CONTAMINATION
+FASTA file of known contamination in long reads
+.TP
+\fB\-\-scores\fR SCORES
+Comma\-delimited string of alignment scores: match,
+mismatch, gap open, gap extend (default: 3,\-6,\-5,\-2)
+.TP
+\fB\-\-low_score\fR LOW_SCORE
+Score threshold \- alignments below this are
+considered poor (default: set threshold
+automatically)
+.TP
+\fB\-\-keep_bad\fR
+Include alignments in the results even if they are
+below the low score threshold (default: low\-scoring
+alignments are discarded)
+.TP
+\fB\-\-sensitivity\fR SENSITIVITY
+A number from 0 (least sensitive) to 3 (most
+sensitive) (default: 0)
+.TP
+\fB\-\-threads\fR THREADS
+Number of threads used (default: number of CPUs, up
+to 8)
+.TP
+\fB\-\-verbosity\fR VERBOSITY
+Level of stdout information (0 to 4) (default: 1)
+.TP
+\fB\-\-min_len\fR MIN_LEN
+Minimum alignment length (bp) \- exclude alignments
+shorter than this length (default: 100)
+.TP
+\fB\-\-allowed_overlap\fR ALLOWED_OVERLAP
+Allow this much overlap between alignments in a
+single read (default: 100)
+.SH SEE ALSO
+unicycler(1)
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
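
The `--scores` option documented above takes a comma-delimited string in the order match, mismatch, gap open, gap extend, with a default of `3,-6,-5,-2`. A small sketch of how such a string could be split into named values. This is illustrative only and is not taken from Unicycler's source; `Scores` and `parse_scores` are hypothetical names.

```python
from typing import NamedTuple


class Scores(NamedTuple):
    match: int
    mismatch: int
    gap_open: int
    gap_extend: int


def parse_scores(text="3,-6,-5,-2"):
    """Split a comma-delimited score string into its four integer fields."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated scores, got %d" % len(parts))
    return Scores(*(int(part) for part in parts))


scores = parse_scores()
print(scores.match, scores.mismatch, scores.gap_open, scores.gap_extend)  # 3 -6 -5 -2
```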


=====================================
debian/mans/unicycler_check.1
=====================================
@@ -0,0 +1,78 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH UNICYCLER_CHECK "1" "October 2018" "unicycler_check 0.4.7" "User Commands"
+.SH NAME
+unicycler_check \- long read assembly checker
+.SH SYNOPSIS
+.B unicycler_check
+[\-h] \fB\-\-sam\fR SAM \fB\-\-ref\fR REF \fB\-\-reads\fR READS
+[\-\-min_len MIN_LEN]
+[\-\-error_window_size ERROR_WINDOW_SIZE]
+[\-\-depth_window_size DEPTH_WINDOW_SIZE]
+[\-\-error_rate_threshold ERROR_RATE_THRESHOLD]
+[\-\-depth_p_val DEPTH_P_VAL]
+[\-\-window_tables WINDOW_TABLES]
+[\-\-base_tables BASE_TABLES] [\-\-html HTML]
+[\-\-threads THREADS] [\-\-verbosity VERBOSITY]
+.SH DESCRIPTION
+Long read assembly checker
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.TP
+\fB\-\-sam\fR SAM
+Input SAM file of alignments (if this file doesn't
+exist, the alignment will be performed with results
+saved to this file \- you can use the aligner
+arguments with this script)
+.TP
+\fB\-\-ref\fR REF
+FASTA file containing one or more reference
+sequences
+.TP
+\fB\-\-reads\fR READS
+FASTQ file of long reads
+.TP
+\fB\-\-min_len\fR MIN_LEN
+Minimum alignment length (bp) \- exclude alignments
+shorter than this length (default: 100)
+.TP
+\fB\-\-error_window_size\fR ERROR_WINDOW_SIZE
+Window size for error summaries (default: 100)
+.TP
+\fB\-\-depth_window_size\fR DEPTH_WINDOW_SIZE
+Window size for depth summaries (default: 100)
+.TP
+\fB\-\-error_rate_threshold\fR ERROR_RATE_THRESHOLD
+Threshold for high error rates, expressed as the
+fraction between the mean error rate and the random
+alignment error rate (default: 0.3)
+.TP
+\fB\-\-depth_p_val\fR DEPTH_P_VAL
+P\-value for low/high depth thresholds (default:
+0.001)
+.TP
+\fB\-\-window_tables\fR WINDOW_TABLES
+Path and/or prefix for table files summarising
+reference errors for reference windows (default: do
+not save window tables)
+.TP
+\fB\-\-base_tables\fR BASE_TABLES
+Path and/or prefix for table files summarising
+reference errors at each base (default: do not save
+base tables)
+.TP
+\fB\-\-html\fR HTML
+Path for HTML report (default: do not save HTML
+report)
+.TP
+\fB\-\-threads\fR THREADS
+Number of CPU threads used to align (default: the
+number of available CPUs)
+.TP
+\fB\-\-verbosity\fR VERBOSITY
+Level of stdout information (0 to 2) (default: 1)
+.SH SEE ALSO
+unicycler(1)
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/mans/unicycler_polish.1
=====================================
@@ -0,0 +1,160 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH UNICYCLER_POLISH "1" "October 2018" "unicycler_polish 0.4.7" "User Commands"
+.SH NAME
+unicycler_polish \- Unicycler polish - hybrid assembly polishing
+.SH SYNOPSIS
+.B unicycler_polish
+[\-h] \fB\-a\fR ASSEMBLY [\-1 SHORT1] [\-2 SHORT2]
+[\-\-pb_bax PB_BAX [PB_BAX ...]] [\-\-pb_bam PB_BAM]
+[\-\-pb_fasta PB_FASTA] [\-\-long_reads LONG_READS]
+[\-\-no_fix_local] [\-\-min_insert MIN_INSERT]
+[\-\-max_insert MAX_INSERT]
+[\-\-min_align_length MIN_ALIGN_LENGTH]
+[\-\-homopolymer HOMOPOLYMER] [\-\-large LARGE]
+[\-\-illumina_alt ILLUMINA_ALT]
+[\-\-freebayes_qual_cutoff FREEBAYES_QUAL_CUTOFF]
+[\-\-threads THREADS] [\-\-verbosity VERBOSITY]
+[\-\-samtools SAMTOOLS] [\-\-bowtie2 BOWTIE2]
+[\-\-minimap2 MINIMAP2] [\-\-freebayes FREEBAYES]
+[\-\-pitchfork PITCHFORK] [\-\-bax2bam BAX2BAM]
+[\-\-pbalign PBALIGN] [\-\-arrow ARROW] [\-\-pilon PILON]
+[\-\-java JAVA] [\-\-ale ALE] [\-\-racon RACON]
+[\-\-minimap MINIMAP] [\-\-nucmer NUCMER]
+[\-\-showsnps SHOWSNPS]
+.SH DESCRIPTION
+Unicycler polish \- hybrid assembly polishing
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.SS Assembly
+.TP
+\fB\-a\fR ASSEMBLY, \fB\-\-assembly\fR ASSEMBLY
+Input assembly to be polished
+.SS Short reads
+.IP
+To polish with short reads (using Pilon), provide two FASTQ files of
+paired\-end reads
+.TP
+\fB\-1\fR SHORT1, \fB\-\-short1\fR SHORT1
+FASTQ file of short reads (first reads in each pair)
+.TP
+\fB\-2\fR SHORT2, \fB\-\-short2\fR SHORT2
+FASTQ file of short reads (second reads in each
+pair)
+.SS PacBio reads
+.IP
+To polish with PacBio reads (using Arrow), provide one of the following
+.TP
+\fB\-\-pb_bax\fR PB_BAX [PB_BAX ...]
+PacBio raw bax.h5 read files
+.TP
+\fB\-\-pb_bam\fR PB_BAM
+PacBio BAM read file
+.TP
+\fB\-\-pb_fasta\fR PB_FASTA
+FASTA file of PacBio reads
+.SS Generic long reads
+.IP
+To polish with generic long reads, provide the following
+.TP
+\fB\-\-long_reads\fR LONG_READS
+FASTQ/FASTA file of long reads
+.SS Polishing settings
+Various settings for polishing behaviour (defaults should work well in
+most cases)
+.TP
+\fB\-\-no_fix_local\fR
+do not fix local misassemblies (default: False)
+.TP
+\fB\-\-min_insert\fR MIN_INSERT
+minimum valid short read insert size (default: auto)
+.TP
+\fB\-\-max_insert\fR MAX_INSERT
+maximum valid short read insert size (default: auto)
+.TP
+\fB\-\-min_align_length\fR MIN_ALIGN_LENGTH
+Minimum long read alignment length (default: 1000)
+.TP
+\fB\-\-homopolymer\fR HOMOPOLYMER
+Long read polish changes to a homopolymer of this
+length or greater will be ignored (default: 4)
+.TP
+\fB\-\-large\fR LARGE
+Variants of this size or greater will be assessed as
+large variants (default: 10)
+.TP
+\fB\-\-illumina_alt\fR ILLUMINA_ALT
+When assessing long read changes with short read
+alignments, a variant will only be applied if the
+alternative occurrences in the short read alignments
+exceed this percentage (default: 5)
+.TP
+\fB\-\-freebayes_qual_cutoff\fR FREEBAYES_QUAL_CUTOFF
+Reject Pilon substitutions from long reads if the
+FreeBayes quality is less than this value (default:
+10.0)
+.SS Other settings
+.TP
+\fB\-\-threads\fR THREADS
+CPU threads to use in alignment and consensus
+(default: number of CPUs)
+.TP
+\fB\-\-verbosity\fR VERBOSITY
+Level of stdout information (0 to 3, default: 2)
+0 = no stdout, 1 = basic progress indicators,
+2 = extra info, 3 = debugging info
+.SS Tool locations
+If these required tools are not available in your PATH variable, specify
+their location here (depending on which input reads are used, some of
+these tools may not be required)
+.TP
+\fB\-\-samtools\fR SAMTOOLS
+path to samtools executable (default: samtools)
+.TP
+\fB\-\-bowtie2\fR BOWTIE2
+path to bowtie2 executable (default: bowtie2)
+.TP
+\fB\-\-minimap2\fR MINIMAP2
+path to minimap2 executable (default: minimap2)
+.TP
+\fB\-\-freebayes\fR FREEBAYES
+path to freebayes executable (default: freebayes)
+.TP
+\fB\-\-pitchfork\fR PITCHFORK
+Path to Pitchfork installation of PacBio tools
+(should contain bin and lib directories) (default: )
+.TP
+\fB\-\-bax2bam\fR BAX2BAM
+path to bax2bam executable (default: bax2bam)
+.TP
+\fB\-\-pbalign\fR PBALIGN
+path to pbalign executable (default: pbalign)
+.TP
+\fB\-\-arrow\fR ARROW
+path to arrow executable (default: arrow)
+.TP
+\fB\-\-pilon\fR PILON
+path to pilon jar file (default: pilon*.jar)
+.TP
+\fB\-\-java\fR JAVA
+path to java executable (default: java)
+.TP
+\fB\-\-ale\fR ALE
+path to ALE executable (default: ALE)
+.TP
+\fB\-\-racon\fR RACON
+path to racon executable (default: racon)
+.TP
+\fB\-\-minimap\fR MINIMAP
+path to miniasm executable (default: minimap)
+.TP
+\fB\-\-nucmer\fR NUCMER
+path to nucmer executable (default: nucmer)
+.TP
+\fB\-\-showsnps\fR SHOWSNPS
+path to show\-snps executable (default: show\-snps)
+.SH SEE ALSO
+unicycler(1)
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/mans/unicycler_scrub.1
=====================================
@@ -0,0 +1,75 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH UNICYCLER_SCRUB "1" "October 2018" "unicycler_scrub 0.4.7" "User Commands"
+.SH NAME
+unicycler_scrub \- read trimming, chimera detection and misassembly detection
+.SH SYNOPSIS
+.B unicycler_scrub
+[\-h] \fB\-i\fR INPUT \fB\-o\fR OUT [\-r READS] [\-\-trim TRIM]
+[\-\-split SPLIT] [\-\-min_split_size MIN_SPLIT_SIZE]
+[\-\-discard_chimeras] [\-t THREADS] [\-\-keep_paf]
+[\-\-parameters PARAMETERS] [\-\-verbosity VERBOSITY]
+.SH DESCRIPTION
+Unicycler\-scrub \- read trimming, chimera detection and misassembly detection
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.TP
+\fB\-i\fR INPUT, \fB\-\-input\fR INPUT
+These are the reads or assembly to be scrubbed (can
+be FASTA or FASTQ format)
+.TP
+\fB\-o\fR OUT, \fB\-\-out\fR OUT
+The scrubbed reads or assembly will be saved to this
+file (will have the same format as the \fB\-\-input\fR file
+format) or use "none" to not produce an output file
+.TP
+\fB\-r\fR READS, \fB\-\-reads\fR READS
+These are the reads used to scrub \fB\-\-input\fR (can be
+FASTA or FASTQ format) (default: same file as
+\fB\-\-input\fR)
+.TP
+\fB\-\-trim\fR TRIM
+The aggressiveness with which the input will be
+trimmed (0 to 100, where 0 is no trimming and 100 is
+very aggressive trimming) (default: 50)
+.TP
+\fB\-\-split\fR SPLIT
+The aggressiveness with which the input will be
+split (0 to 100, where 0 is no splitting and 100 is
+very aggressive splitting) (default: 50)
+.TP
+\fB\-\-min_split_size\fR MIN_SPLIT_SIZE
+Parts of split sequences will only be outputted if
+they are at least this big (default: 1000)
+.TP
+\fB\-\-discard_chimeras\fR
+If used, chimeric sequences will be discarded
+instead of split (default: False)
+.TP
+\fB\-t\fR THREADS, \fB\-\-threads\fR THREADS
+Number of threads used (default: 4)
+.TP
+\fB\-\-keep_paf\fR
+Save the alignments to file (makes repeated runs
+faster because alignments can be loaded from file)
+(default: False)
+.TP
+\fB\-\-parameters\fR PARAMETERS
+Low\-level parameters (for debugging use only)
+(default: )
+.TP
+\fB\-\-verbosity\fR VERBOSITY
+Level of stdout information (default: 1)
+.IP
+0 = no stdout,
+.IP
+1 = basic progress indicators,
+.IP
+2 = extra info,
+.IP
+3 = debugging info
+.SH SEE ALSO
+unicycler(1)
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/patches/append_flags
=====================================
@@ -0,0 +1,19 @@
+From: Michael R. Crusoe <michael.crusoe at gmail.com>
+Subject: Inherit and use LDFLAGS and CPPFLAGS
+--- unicycler.orig/Makefile
++++ unicycler/Makefile
+@@ -66,7 +66,7 @@
+ 
+ # These flags are required for the build to work.
+ FLAGS        = -std=c++14 -Iunicycler/include -fPIC
+-LDFLAGS      = -shared -lz
++LDFLAGS      += -shared -lz
+ 
+ 
+ # Platform-specific stuff (for Seqan)
+@@ -115,4 +115,4 @@
+ 	$(RM) $(TARGET)
+ 
+ %.o: %.cpp $(HEADERS)
+-	$(CXX) $(FLAGS) $(CXXFLAGS) -c -o $@ $<
++	$(CXX) $(CPPFLAGS) $(FLAGS) $(CXXFLAGS) -c -o $@ $<


=====================================
debian/patches/series
=====================================
@@ -1,3 +1,4 @@
 spades.patch
 # bowtie.patch 
 install_wo_extra_steps.patch
+append_flags


=====================================
debian/tests/control
=====================================
@@ -1,3 +1,3 @@
 Tests: run-unit-test
-Depends: @
+Depends: @, @builddeps@
 Restrictions: allow-stderr


=====================================
debian/tests/run-unit-test
=====================================
@@ -1,15 +1,32 @@
 #!/bin/bash
 set -e
 
-pkg=#PACKAGENAME#
+pkg=unicycler
 
 if [ "$AUTOPKGTEST_TMP" = "" ] ; then
   AUTOPKGTEST_TMP=`mktemp -d /tmp/${pkg}-test.XXXXXX`
   trap "rm -rf $AUTOPKGTEST_TMP" 0 INT QUIT ABRT PIPE TERM
 fi
 
-cp -a /usr/share/doc/${pkg}/examples/* $AUTOPKGTEST_TMP
+if [ -d /usr/share/${pkg}-data/sample_data ] ; then
+  cp -a /usr/share/${pkg}-data/sample_data/* $AUTOPKGTEST_TMP
+else
+  echo "Please install package unicycler-data to run this script"
+  exit 1
+fi
 
 cd $AUTOPKGTEST_TMP
 
-#do_stuff_to_test_package#
+unicycler -1 short_reads_1.fastq.gz -2 short_reads_2.fastq.gz -o illumina_assembly
+
+#unicycler -l long_reads_high_depth.fastq.gz -o long_read_assembly
+# This command fails with the following error:
+#    Assembling contigs and long reads with miniasm
+#    ...
+#    Assembling reads with miniasm... empty result
+#    Error: miniasm assembly failed
+
+# It might be that the reads do not have enough depth. See issue:
+#    https://github.com/rrwick/Unicycler/issues/38
+
+unicycler -1 short_reads_1.fastq.gz -2 short_reads_2.fastq.gz -l long_reads_low_depth.fastq.gz -o hybrid_assembly


=====================================
debian/unicycler-data.docs
=====================================
@@ -0,0 +1,2 @@
+sample_data/README.md
+sample_data/download_links


=====================================
debian/unicycler-data.install
=====================================
@@ -1 +1,2 @@
-sample_data /usr/share/unicycler-data
+sample_data/reference.fasta /usr/share/unicycler-data/sample_data
+sample_data/*.fastq.gz /usr/share/unicycler-data/sample_data


=====================================
debian/unicycler.docs
=====================================
@@ -0,0 +1,2 @@
+debian/tests/run-unit-test
+debian/README.test


=====================================
debian/upstream/metadata
=====================================
@@ -1,23 +1,39 @@
 Reference:
- - Author: Ryan R. Wick and Louise M. Judd and Claire L. Gorrie and Kathryn E. Holt
-   Title: "Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads"
-   Journal: PLOS Computational Biology
-   Year: 2017
-   Volume: 13
-   Number: 6
-   Pages: e1005595
-   DOI: 10.1371/journal.pcbi.1005595
-   PMID: 28594827
-   URL: http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005595
-   eprint: http://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005595&type=printable
- - Author: Ryan R. Wick and Louise M. Judd and Claire L. Gorrie and Kathryn E. Holt
-   Title: Completing bacterial genome assemblies with multiplex MinION sequencing
-   Journal: Microbial Genomics
-   Year: 2017
-   Volume: 3
-   Number: 10
-   Pages: e000132
-   DOI: 10.1099/mgen.0.000132
-   PMID: 29177090
-   URL: http://mgen.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000132
-   eprint: http://mgen.microbiologyresearch.org/deliver/fulltext/mgen/3/10/mgen000132.pdf?itemId=/content/journal/mgen/10.1099/mgen.0.000132&mimeType=pdf&isFastTrackArticle=
+- Author: >
+    Ryan R. Wick and Louise M. Judd and Claire L. Gorrie and Kathryn
+    E. Holt
+  Title: >
+    Unicycler: Resolving bacterial genome assemblies from short and long
+    sequencing reads
+  Journal: PLOS Computational Biology
+  Year: 2017
+  Volume: 13
+  Number: 6
+  Pages: e1005595
+  DOI: 10.1371/journal.pcbi.1005595
+  PMID: 28594827
+  URL: "http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005595"
+  eprint: "http://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005595&type=printable"
+- Author: >
+    Ryan R. Wick and Louise M. Judd and Claire L. Gorrie and Kathryn E. Holt
+  Title: >
+    Completing bacterial genome assemblies with multiplex MinION sequencing
+  Journal: Microbial Genomics
+  Year: 2017
+  Volume: 3
+  Number: 10
+  Pages: e000132
+  DOI: 10.1099/mgen.0.000132
+  PMID: 29177090
+  URL: "http://mgen.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000132"
+  eprint: "http://mgen.microbiologyresearch.org/deliver/fulltext/mgen/3/10/mgen000132.pdf?itemId=/content/journal/mgen/10.1099/mgen.0.000132&mimeType=pdf&isFastTrackArticle="
+Registry:
+- Name: OMICtools
+  Entry: OMICS_14591
+- Name: bio.tools
+  Entry: unicycler
+- Name: conda:bioconda
+  Entry: unicycler
+- Name: SciCrunch
+  Entry: NA
+Repository: https://github.com/rrwick/Unicycler.git


=====================================
unicycler/assembly_graph.py
=====================================
@@ -423,6 +423,7 @@ class AssemblyGraph(object):
           3) deleting the segment would not create any dead ends
         """
         segment_nums_to_remove = []
+        total_length_removed = 0
         ten_longest_contigs = sorted(self.segments.values(), reverse=True,
                                      key=lambda x: x.get_length())[:10]
         whole_graph_cutoff = self.get_median_read_depth(ten_longest_contigs) * relative_depth_cutoff
@@ -437,7 +438,9 @@ class AssemblyGraph(object):
                             self.all_segments_below_depth(component, whole_graph_cutoff) or \
                             self.dead_end_change_if_deleted(seg_num) <= 0:
                         segment_nums_to_remove.append(seg_num)
+                        total_length_removed += segment.get_length()
         self.remove_segments(segment_nums_to_remove)
+        return len(segment_nums_to_remove), total_length_removed
 
     def filter_homopolymer_loops(self):
         """
@@ -455,6 +458,28 @@ class AssemblyGraph(object):
             log.log('Removed homopolymer loops:', 3)
             log.log_number_list(segment_nums_to_remove, 3)
 
+    def choose_largest_component(self):
+        """
+        Special logic: throw out all of the graph's connected components except for the largest one.
+        """
+        largest_component_length = None
+        connected_components = self.get_connected_components()
+        for component_nums in connected_components:
+            component_segments = [self.segments[x] for x in component_nums]
+            component_length = sum(x.get_length() for x in component_segments)
+            if largest_component_length is None or component_length > largest_component_length:
+                largest_component_length = component_length
+        segment_nums_to_remove = []
+        for component_nums in connected_components:
+            component_segments = [self.segments[x] for x in component_nums]
+            component_length = sum(x.get_length() for x in component_segments)
+            if component_length < largest_component_length:
+                segment_nums_to_remove += component_nums
+        self.remove_segments(segment_nums_to_remove)
+        if segment_nums_to_remove:
+            log.log('\nRemoved not-largest components:', 3)
+            log.log_number_list(segment_nums_to_remove, 3)
+
     def remove_segments(self, nums_to_remove):
         """
         This function deletes all segments in the nums_to_remove list, along with their links. It
@@ -923,7 +948,7 @@ class AssemblyGraph(object):
             dead_ends += 1
         return potential_dead_ends - dead_ends
 
-    def clean(self, read_depth_filter):
+    def clean(self, read_depth_filter, largest_component):
         """
         This function does various graph repairs, filters and normalisations to make it a bit
         nicer.
@@ -931,9 +956,12 @@ class AssemblyGraph(object):
         log.log('Repair multi way junctions  ' + get_dim_timestamp(), 3)
         self.repair_multi_way_junctions()
         log.log('Filter by read depth        ' + get_dim_timestamp(), 3)
-        self.filter_by_read_depth(read_depth_filter)
+        removed_count, removed_length = self.filter_by_read_depth(read_depth_filter)
         log.log('Filter homopolymer loops    ' + get_dim_timestamp(), 3)
         self.filter_homopolymer_loops()
+        if largest_component:
+            log.log('Keep largest component      ' + get_dim_timestamp(), 3)
+            self.choose_largest_component()
         log.log('Merge all possible          ' + get_dim_timestamp(), 3)
         self.merge_all_possible(None, 2)
         log.log('Normalise read depths       ' + get_dim_timestamp(), 3)
@@ -943,6 +971,7 @@ class AssemblyGraph(object):
         log.log('Sort link order             ' + get_dim_timestamp(), 3)
         self.sort_link_order()
         log.log('Graph cleaning finished     ' + get_dim_timestamp(), 3)
+        return removed_count, removed_length
 
     def final_clean(self):
         """

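Note on choose_largest_component: the hunk above measures each connected component by the summed length of its segments and removes every component that is strictly shorter than the longest one, so components tied for the largest total are all kept. A minimal standalone sketch of that selection rule follows. The function name and the plain list/dict inputs are illustrative assumptions, not part of Unicycler's API, and unlike the patch it computes each component total only once.

```python
def segments_outside_largest_component(components, segment_lengths):
    """Return the segment numbers of every component shorter than the longest.

    components:      list of lists of segment numbers (hypothetical input shape)
    segment_lengths: dict mapping segment number -> length in bp (hypothetical)
    """
    if not components:
        return []
    # Total length of each connected component, computed once.
    totals = [sum(segment_lengths[n] for n in comp) for comp in components]
    largest = max(totals)
    to_remove = []
    for comp, total in zip(components, totals):
        # Strict comparison, as in the patch: ties with the largest are kept.
        if total < largest:
            to_remove.extend(comp)
    return to_remove


components = [[1, 2, 3], [4], [5, 6]]
segment_lengths = {1: 1000, 2: 500, 3: 250, 4: 300, 5: 100, 6: 50}
print(segments_outside_largest_component(components, segment_lengths))
```

With these made-up lengths the first component totals 1750 bp, so the segments of the other two components are the ones selected for removal.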

=====================================
unicycler/bridge_long_read_simple.py
=====================================
@@ -490,7 +490,7 @@ def get_read_loop_vote(start, end, middle, repeat, strand, minimap_alignments, r
                 best_score = test_seq_score
                 best_count = loop_count
 
-        # Break when we've hit the max loop count. But if the max isn't our best, then we keep
+        # Break when we've hit the max loop count. But if the max is our best, then we keep
         # trying higher.
         if loop_count >= max_tested_loop_count and loop_count != best_count:
             break


=====================================
unicycler/settings.py
=====================================
@@ -30,8 +30,8 @@ SIMPLE_REPEAT_BRIDGING_BAND_SIZE = 50
 CONTIG_READ_QSCORE = 40
 
 # This is the maximum number of times an assembly will be Racon polished
-RACON_POLISH_LOOP_COUNT_HYBRID = 5
-RACON_POLISH_LOOP_COUNT_LONG_ONLY = 10
+RACON_POLISH_LOOP_COUNT_HYBRID = 2
+RACON_POLISH_LOOP_COUNT_LONG_ONLY = 4
 
 
 # This is the number of times assembly graph contigs are included in the Racon polish reads. E.g.


=====================================
unicycler/spades_func.py
=====================================
@@ -30,7 +30,8 @@ class BadFastq(Exception):
 
 def get_best_spades_graph(short1, short2, short_unpaired, out_dir, read_depth_filter, verbosity,
                           spades_path, threads, keep, kmer_count, min_k_frac, max_k_frac, kmers,
-                          no_spades_correct, expected_linear_seqs, spades_tmp_dir):
+                          no_spades_correct, expected_linear_seqs, spades_tmp_dir,
+                          largest_component):
     """
     This function tries a SPAdes assembly at different k-mers and returns the best.
     'The best' is defined as the smallest dead-end count after low-depth filtering.  If multiple
@@ -126,7 +127,7 @@ def get_best_spades_graph(short1, short2, short_unpaired, out_dir, read_depth_fi
             continue
 
         log.log('\nCleaning k{} graph'.format(kmer), 2)
-        assembly_graph.clean(read_depth_filter)
+        assembly_graph.clean(read_depth_filter, largest_component)
         clean_graph_filename = os.path.join(spades_dir, ('k%03d' % kmer) + '_assembly_graph.gfa')
         assembly_graph.save_to_gfa(clean_graph_filename, verbosity=2)
 
@@ -183,7 +184,7 @@ def get_best_spades_graph(short1, short2, short_unpaired, out_dir, read_depth_fi
     assembly_graph = AssemblyGraph(best_graph_filename, best_kmer, paths_file=paths_file,
                                    insert_size_mean=insert_size_mean,
                                    insert_size_deviation=insert_size_deviation)
-    assembly_graph.clean(read_depth_filter)
+    removed_count, removed_length = assembly_graph.clean(read_depth_filter, largest_component)
     clean_graph_filename = os.path.join(spades_dir, 'k' + str(best_kmer) + '_assembly_graph.gfa')
     assembly_graph.save_to_gfa(clean_graph_filename, verbosity=2)
 
@@ -197,9 +198,14 @@ def get_best_spades_graph(short1, short2, short_unpaired, out_dir, read_depth_fi
                 row_colour={best_kmer_row: 'green'},
                 row_extra_text={best_kmer_row: ' ' + get_left_arrow() + 'best'})
 
+    # Report on the results of the read depth filter (can help with identifying levels of
+    # contamination).
+    log.log('\nRead depth filter: removed {} contigs totalling {} bp'.format(removed_count,
+                                                                           removed_length))
+
     # Clean up.
     if keep < 3 and os.path.isdir(spades_dir):
-        log.log('\nDeleting ' + spades_dir + '/')
+        log.log('Deleting ' + spades_dir + '/')
         shutil.rmtree(spades_dir, ignore_errors=True)
     if keep < 3 and spades_tmp_dir is not None and os.path.isdir(spades_tmp_dir):
         log.log('Deleting ' + spades_tmp_dir + '/')

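Note on the read depth report: filter_by_read_depth now returns a (count, total length) pair, clean() passes it through, and get_best_spades_graph formats it into the log line added above. The sketch below reproduces only that bookkeeping. The helper name and the list of removed lengths are assumptions made for illustration; in the patch the totals are accumulated inside the filtering loop.

```python
def summarise_removed(removed_lengths):
    """Return (number of removed contigs, their combined length in bp)."""
    return len(removed_lengths), sum(removed_lengths)


# Hypothetical lengths of segments dropped by a depth filter.
removed_count, removed_length = summarise_removed([1200, 340, 87])

# Same format string as the log call in the patch.
message = 'Read depth filter: removed {} contigs totalling {} bp'.format(
    removed_count, removed_length)
print(message)
```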

=====================================
unicycler/src/miniasm/hit.cpp
=====================================
@@ -8,6 +8,7 @@
 #include <limits>
 
 #pragma GCC diagnostic ignored "-Wpragmas"
+#pragma GCC diagnostic ignored "-Wunknown-warning-option"
 #pragma GCC diagnostic ignored "-Wvla"
 #pragma GCC diagnostic ignored "-Wvla-extension"
 #pragma GCC diagnostic ignored "-Wmaybe-uninitialized"


=====================================
unicycler/unicycler.py
=====================================
@@ -85,7 +85,7 @@ def main():
                                           args.spades_path, args.threads, args.keep,
                                           args.kmer_count, args.min_kmer_frac, args.max_kmer_frac,
                                           args.kmers, args.no_correct, args.linear_seqs,
-                                          args.spades_tmp_dir)
+                                          args.spades_tmp_dir, args.largest_component)
         determine_copy_depth(graph)
         if args.keep > 0 and not os.path.isfile(best_spades_graph):
             graph.save_to_gfa(best_spades_graph, save_copy_depth_info=True, newline=True,
@@ -346,6 +346,10 @@ def get_arguments():
                               help='Filter out contigs lower than this fraction of the chromosomal '
                                    'depth, if doing so does not result in graph dead ends'
                                    if show_all_args else argparse.SUPPRESS)
+    spades_group.add_argument('--largest_component', action='store_true',
+                              help='Only keep the largest connected component of the assembly '
+                                   'graph (default: keep all connected components)'
+                                   if show_all_args else argparse.SUPPRESS)
     spades_group.add_argument('--spades_tmp_dir', type=str, default=None,
                               help="Specify SPAdes temporary directory using the SPAdes --tmp-dir "
                                    "option (default: make a temporary directory in the output "
@@ -912,10 +916,10 @@ def rotate_completed_replicons(graph, args, counter):
         log.log_section_header('Rotating completed replicons')
         log.log_explanation('Any completed circular contigs (i.e. single contigs which have one '
                             'link connecting end to start) can have their start position changed '
-                            'with altering the sequence. For consistency, Unicycler now searches '
-                            'for a starting gene (dnaA or repA) in each such contig, and if one '
-                            'is found, the contig is rotated to start with that gene on the '
-                            'forward strand.')
+                            'without altering the sequence. For consistency, Unicycler now '
+                            'searches for a starting gene (dnaA or repA) in each such contig, and '
+                            'if one is found, the contig is rotated to start with that gene on '
+                            'the forward strand.')
 
         rotation_result_table = [['Segment', 'Length', 'Depth', 'Starting gene', 'Position',
                                   'Strand', 'Identity', 'Coverage']]

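Note on the new option: --largest_component is a store_true flag, so it defaults to False and every connected component is kept unless the flag is given. Its help text is hidden with argparse.SUPPRESS when show_all_args is false. A small self-contained sketch of the same argparse pattern follows. The parser, the prog name and the group title are stand-ins rather than Unicycler's real get_arguments().

```python
import argparse


def build_parser(show_all_args=True):
    # Stand-in parser; only the flag definition mirrors the patch.
    parser = argparse.ArgumentParser(prog='example')
    group = parser.add_argument_group('SPAdes assembly')
    group.add_argument('--largest_component', action='store_true',
                       help='Only keep the largest connected component of the assembly '
                            'graph (default: keep all connected components)'
                            if show_all_args else argparse.SUPPRESS)
    return parser


default_args = build_parser().parse_args([])
flagged_args = build_parser().parse_args(['--largest_component'])
print(default_args.largest_component, flagged_args.largest_component)
```

The parsed value is what the patch passes on as args.largest_component to get_best_spades_graph.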

=====================================
unicycler/version.py
=====================================
@@ -13,4 +13,4 @@ details. You should have received a copy of the GNU General Public License along
 not, see <http://www.gnu.org/licenses/>.
 """
 
-__version__ = '0.4.7'
+__version__ = '0.4.8'



View it on GitLab: https://salsa.debian.org/med-team/unicycler/compare/4b2870466642da6aa54914ddb3a5976bde465131...e23f2a660408876f73516bfd836379a2e05c4a8d


