[med-svn] [Git][med-team/unicycler][master] 5 commits: d/watch: Fix watch regex

Nilesh Patra (@nilesh) gitlab at salsa.debian.org
Sat Aug 7 20:51:24 BST 2021



Nilesh Patra pushed to branch master at Debian Med / unicycler


Commits:
ae1b45ca by Nilesh Patra at 2021-08-08T00:49:24+05:30
d/watch: Fix watch regex

- - - - -
4ffd09ab by Nilesh Patra at 2021-08-08T00:50:16+05:30
New upstream version 0.4.9+dfsg
- - - - -
098a4d74 by Nilesh Patra at 2021-08-08T00:50:32+05:30
Update upstream source from tag 'upstream/0.4.9+dfsg'

Update to upstream version '0.4.9+dfsg'
with Debian dir 84a461f8ce953a86d6b644be308ab43782a3c66e
- - - - -
aa01f79b by Nilesh Patra at 2021-08-08T00:54:44+05:30
d/p/{append_flags,spades.patch}: Refresh patches

- - - - -
50a1cc57 by Nilesh Patra at 2021-08-08T01:21:06+05:30
[skip ci] Interim changelog entry

- - - - -


12 changed files:

- README.md
- debian/changelog
- debian/patches/append_flags
- debian/patches/spades.patch
- debian/watch
- test/test_misc.py
- unicycler/assembly_graph.py
- unicycler/misc.py
- unicycler/settings.py
- unicycler/spades_func.py
- unicycler/unicycler.py
- unicycler/version.py


Changes:

=====================================
README.md
=====================================
@@ -10,6 +10,17 @@ And read about how we use it to complete bacterial genomes here:
 
 
 
+# A note on Trycycler
+
+[Trycycler](https://github.com/rrwick/Trycycler/wiki) is a newer tool that in many cases is a better choice than Unicycler. Here is a quick guide on whether you should use Unicycler or Trycycler to assemble your bacterial genome:
+* If you only have short reads, use Unicycler (Trycycler does not do short-read assembly).
+* If you only have long reads, Trycycler is a better choice. While Unicycler can do long-read-only assembly, its approach is somewhat out-of-date. Something else to consider is the depth of your long reads, as Trycycler prefers deeper read sets. If your long-read set is particularly shallow (~25× or less), then [Flye](https://github.com/fenderglass/Flye) might be your best option.
+* If you have both short and long reads (i.e. are doing a hybrid assembly), then Unicycler and [Trycycler+polishing](https://github.com/rrwick/Trycycler/wiki/Polishing-after-Trycycler) are both viable options. If you have lots of long reads (~100× depth or more), use Trycycler+polishing. If you have sparse long reads (~25× or less), use Unicycler. If your long-read depth falls between those values, it might be worth trying both approaches.
+
+You can read more on Trycycler's FAQ page: [Should I use Unicycler or Trycycler to assemble my bacterial genome?](https://github.com/rrwick/Trycycler/wiki/FAQ-and-miscellaneous-tips#should-i-use-unicycler-or-trycycler-to-assemble-my-bacterial-genome)
+
+
+
 # Table of contents
 
 * [Introduction](#introduction)
@@ -94,7 +105,7 @@ Reasons to __not__ use Unicycler:
     * [ICC](https://software.intel.com/en-us/c-compilers) also works (though I don't know the minimum required version number)
 * [setuptools](https://packaging.python.org/installing/#install-pip-setuptools-and-wheel) (only required for installation of Unicycler)
 * For short-read or hybrid assembly:
-  * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 or later (`spades.py`)
+  * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 – v3.13.0 (`spades.py`)
 * For long-read or hybrid assembly:
   * [Racon](https://github.com/isovic/racon) (`racon`)
 * For polishing


=====================================
debian/changelog
=====================================
@@ -1,3 +1,11 @@
+unicycler (0.4.9+dfsg-1) UNRELEASED; urgency=medium
+
+  * d/watch: Fix watch regex
+  * New upstream version 0.4.9+dfsg
+TODO: Error: Unicycler requires SPAdes v3.6.2 - v3.13.0
+
+ -- Nilesh Patra <nilesh at debian.org>  Sun, 08 Aug 2021 00:50:46 +0530
+
 unicycler (0.4.8+dfsg-2) unstable; urgency=medium
 
   * Autopkgtest: Restrictions: skip-not-installable


=====================================
debian/patches/append_flags
=====================================
@@ -1,7 +1,7 @@
 From: Michael R. Crusoe <michael.crusoe at gmail.com>
 Subject: Inherit and use LDFLAGS and CPPFLAGS
---- unicycler.orig/Makefile
-+++ unicycler/Makefile
+--- a/Makefile
++++ b/Makefile
 @@ -66,7 +66,7 @@
  
  # These flags are required for the build to work.


=====================================
debian/patches/spades.patch
=====================================
@@ -4,7 +4,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
 
 --- a/test/test_dependencies.py
 +++ b/test/test_dependencies.py
-@@ -42,7 +42,7 @@ class TestDependencies(unittest.TestCase
+@@ -42,7 +42,7 @@
  
      def test_spades_not_found(self):
          stdout, stderr, ret_code = self.run_unicycler(['--spades_path', 'not_a_real_path'])
@@ -13,7 +13,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
          self.assertTrue('could not find SPAdes' in stderr)
          self.assertEqual(ret_code, 1)
  
-@@ -102,14 +102,14 @@ class TestDependencies(unittest.TestCase
+@@ -102,14 +102,14 @@
      def test_no_rotate(self):
          stdout, stderr, ret_code = self.run_unicycler(['--spades_path', 'not_a_real_path',
                                                         '--no_rotate'])
@@ -30,7 +30,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
          self.assertTrue(bool(re.search(r'bowtie2-build\s+not used', stdout)))
          self.assertTrue(bool(re.search(r'bowtie2\s+not used', stdout)))
          self.assertTrue(bool(re.search(r'samtools\s+not used', stdout)))
-@@ -119,12 +119,12 @@ class TestDependencies(unittest.TestCase
+@@ -119,12 +119,12 @@
      def test_verbosity_1(self):
          stdout, stderr, ret_code = self.run_unicycler(['--spades_path', 'not_a_real_path',
                                                         '--verbosity', '1'])
@@ -47,7 +47,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
          self.assertTrue(bool(re.search(r'Program\s+Version\s+Status\s+Path', stdout)))
 --- a/test/overlap_removal_test.py
 +++ b/test/overlap_removal_test.py
-@@ -94,7 +94,7 @@ def run_spades(out_dir):
+@@ -94,7 +94,7 @@
      reads_2 = os.path.join(out_dir, 'reads_2.fastq')
      reads_unpaired = os.path.join(out_dir, 'reads_unpaired.fastq')
  
@@ -58,7 +58,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
          spades_cmd += ['-1', reads_1, '-2', reads_2]
 --- a/test/test_misc.py
 +++ b/test/test_misc.py
-@@ -391,14 +391,14 @@ class TestMiscFunctions(unittest.TestCas
+@@ -391,14 +391,14 @@
  
      def test_spades_version_parsing_3(self):
          spades_version_output = 'option -v not recognized\nSPAdes genome assembler v.3.5.0\n\n' \
@@ -77,16 +77,16 @@ Description: SPAdes is in Debian at /usr/bin/spades
          self.assertEqual(version, '2.4.0')
 --- a/README.md
 +++ b/README.md
-@@ -94,7 +94,7 @@ Reasons to __not__ use Unicycler:
+@@ -105,7 +105,7 @@
      * [ICC](https://software.intel.com/en-us/c-compilers) also works (though I don't know the minimum required version number)
  * [setuptools](https://packaging.python.org/installing/#install-pip-setuptools-and-wheel) (only required for installation of Unicycler)
  * For short-read or hybrid assembly:
--  * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 or later (`spades.py`)
-+  * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 or later (`spades`)
+-  * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 – v3.13.0 (`spades.py`)
++  * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 – v3.13.0 (`spades`)
  * For long-read or hybrid assembly:
    * [Racon](https://github.com/isovic/racon) (`racon`)
  * For polishing
-@@ -415,7 +415,7 @@ SPAdes assembly:
+@@ -426,7 +426,7 @@
    These options control the short-read SPAdes assembly at the beginning of the Unicycler
    pipeline.
  
@@ -97,7 +97,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
    --min_kmer_frac MIN_KMER_FRAC  Lowest k-mer size for SPAdes assembly, expressed as a fraction of
 --- a/setup.py
 +++ b/setup.py
-@@ -63,7 +63,7 @@ def missing_tool(tool_name):
+@@ -63,7 +63,7 @@
  
  def tool_check():
      # Check for required programs.
@@ -108,7 +108,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
      for tool in tools:
 --- a/unicycler/misc.py
 +++ b/unicycler/misc.py
-@@ -124,7 +124,7 @@ def check_spades(spades_path):
+@@ -124,7 +124,7 @@
  
      if not err.decode():
          quit_with_error('SPAdes was found but does not produce output (make sure to use '
@@ -117,7 +117,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
  
  
  def find_pilon(pilon_path, java_path, args):
-@@ -916,7 +916,7 @@ def spades_path_and_version(spades_path)
+@@ -902,7 +902,7 @@
  
  def spades_version_from_spades_output(spades_output):
      """
@@ -128,7 +128,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
          return re.search(r'v(\d+\.\d+\.\d+)', spades_output).group(1)
 --- a/unicycler/unicycler.py
 +++ b/unicycler/unicycler.py
-@@ -320,7 +320,7 @@ def get_arguments():
+@@ -320,7 +320,7 @@
                                               'These options control the short-read SPAdes '
                                               'assembly at the beginning of the Unicycler pipeline.'
                                               if show_all_args else argparse.SUPPRESS)
@@ -137,7 +137,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
                                help='Path to the SPAdes executable'
                                     if show_all_args else argparse.SUPPRESS)
      spades_group.add_argument('--no_correct', action='store_true',
-@@ -756,7 +756,7 @@ def check_dependencies(args, short_reads
+@@ -764,7 +764,7 @@
          spades_path, spades_version, spades_status = '', '', 'not used'
      else:
          spades_path, spades_version, spades_status = spades_path_and_version(args.spades_path)
@@ -146,7 +146,7 @@ Description: SPAdes is in Debian at /usr/bin/spades
      if args.verbosity > 1:
          spades_row.append(spades_path)
      program_table.append(spades_row)
-@@ -864,7 +864,7 @@ def quit_if_dependency_problem(spades_st
+@@ -873,7 +873,7 @@
          quit_with_error('SPAdes cannot run due to an incompatible Python version')
      if spades_status == 'bad':
          quit_with_error('SPAdes was found but does not produce output (make sure to use '


=====================================
debian/watch
=====================================
@@ -1,4 +1,4 @@
 version=4
 
 opts="repacksuffix=+dfsg,dversionmangle=s/\+dfsg//g,repack,compression=xz" \
-  https://github.com/rrwick/Unicycler/releases .*/archive/v?@ANY_VERSION@@ARCHIVE_EXT@
+  https://github.com/rrwick/Unicycler/releases .*/archive/.*/v?@ANY_VERSION@@ARCHIVE_EXT@
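
As a rough illustration of the d/watch change above, the sketch below compares the old and new patterns against two download links. The regex fragments standing in for `@ANY_VERSION@` and `@ARCHIVE_EXT@` are simplified approximations of uscan's substitutions, the two href values are hypothetical examples, and whole-link matching is an assumption. None of this is taken from the commit.

```python
import re

# Simplified stand-ins for uscan's substitution strings (approximations).
ANY_VERSION = r"[-_]?(\d[-+.:~\da-zA-Z]*)"
ARCHIVE_EXT = r"\.(?:tar\.xz|tar\.bz2|tar\.gz|zip|tgz|tbz|txz)"

OLD_PATTERN = r".*/archive/v?" + ANY_VERSION + ARCHIVE_EXT
NEW_PATTERN = r".*/archive/.*/v?" + ANY_VERSION + ARCHIVE_EXT


def matched_version(pattern, href):
    """Return the captured version if the whole href matches, else None."""
    match = re.fullmatch(pattern, href)
    return match.group(1) if match else None


# Hypothetical hrefs, for illustration only.
flat_href = "/rrwick/Unicycler/archive/v0.4.8.tar.gz"
nested_href = "/rrwick/Unicycler/archive/refs/tags/v0.4.9.tar.gz"

print(matched_version(OLD_PATTERN, flat_href))    # tarball directly under archive/
print(matched_version(OLD_PATTERN, nested_href))  # extra path components in between
print(matched_version(NEW_PATTERN, nested_href))
```

Under these assumptions the old pattern only accepts a tarball sitting directly under `archive/`, while the new one tolerates intermediate path components before the version.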


=====================================
test/test_misc.py
=====================================
@@ -402,3 +402,36 @@ class TestMiscFunctions(unittest.TestCase):
                                 '-o <output_dir>\n\nBasic options:'
         version = unicycler.misc.spades_version_from_spades_output(spades_version_output)
         self.assertEqual(version, '2.4.0')
+
+    def test_spades_version_status_1(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('2.4.0'), 'too old')
+
+    def test_spades_version_status_2(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.4.0'), 'too old')
+
+    def test_spades_version_status_3(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.6.0'), 'too old')
+
+    def test_spades_version_status_4(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.6.1'), 'too old')
+
+    def test_spades_version_status_5(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.6.2'), 'good')
+
+    def test_spades_version_status_6(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.7.0'), 'good')
+
+    def test_spades_version_status_7(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.9.9'), 'good')
+
+    def test_spades_version_status_8(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.13.0'), 'good')
+
+    def test_spades_version_status_9(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.13.1'), 'too new')
+
+    def test_spades_version_status_10(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('3.14.1'), 'too new')
+
+    def test_spades_version_status_11(self):
+        self.assertEqual(unicycler.misc.spades_status_from_version('4.0.0'), 'too new')


=====================================
unicycler/assembly_graph.py
=====================================
@@ -1643,6 +1643,8 @@ class AssemblyGraph(object):
 
         # We'll search specifically for the middle segments as they should be easy to spot.
         for middle in self.segments:
+            if self.segments[middle].get_length() > settings.MAX_SIMPLE_LOOP_SIZE:
+                continue
 
             # A middle segment will always have exactly one connection on each end which connect
             # to the same segment (the repeat segment).


=====================================
unicycler/misc.py
=====================================
@@ -891,23 +891,9 @@ def spades_path_and_version(spades_path):
     if 'python version' in out and 'is not supported' in out:
         return found_spades_path, '', 'Python problem'
 
-    # Make sure SPAdes is 3.6.2+
+    # Make sure SPAdes is 3.6.2 - 3.13.0
     try:
-        major_version = int(version.split('.')[0])
-        if major_version < 3:
-            status = 'too old'
-        else:
-            minor_version = int(version.split('.')[1])
-            if minor_version < 6:
-                status = 'too old'
-            elif minor_version > 6:
-                status = 'good'
-            else:  # minor_version == 6
-                patch_version = int(version.split('.')[2])
-                if patch_version < 2:
-                    status = 'too old'
-                else:
-                    status = 'good'
+        status = spades_status_from_version(version)
     except (ValueError, IndexError):
         version, status = '?', 'too old'
 
@@ -933,6 +919,36 @@ def spades_version_from_spades_output(spades_output):
     return ''
 
 
+def spades_status_from_version(version):
+    major_version = int(version.split('.')[0])
+    if major_version < 3:
+        return 'too old'
+    if major_version >= 4:
+        return 'too new'
+
+    minor_version = int(version.split('.')[1])
+    if minor_version < 6:
+        return 'too old'
+    if minor_version > 13:
+        return 'too new'
+    assert 6 <= minor_version <= 13
+
+    patch_version = int(version.split('.')[2])
+    if 6 < minor_version < 13:
+        return 'good'
+    assert minor_version == 6 or minor_version == 13
+    if minor_version == 6:
+        if patch_version < 2:
+            return 'too old'
+        else:
+            return 'good'
+    if minor_version == 13:
+        if patch_version > 0:
+            return 'too new'
+        else:
+            return 'good'
+
+
 def racon_path_and_version(racon_path):
     found_racon_path = shutil.which(racon_path)
     if found_racon_path is None:
@@ -941,7 +957,7 @@ def racon_path_and_version(racon_path):
     process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
     out, _ = process.communicate()
     out = out.decode().lower()
-    if 'racon' not in out or  'options' not in out:
+    if 'racon' not in out or 'options' not in out:
         return found_racon_path, '-', 'bad'
 
     return found_racon_path, racon_version(found_racon_path), 'good'
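
For comparison, the range check that `spades_status_from_version` performs above could also be written with tuple comparison. This is only a sketch and not part of the commit: the name `spades_status_by_tuple` and the two constants are hypothetical, and unlike the upstream function it always requires all three version components.

```python
MIN_SPADES = (3, 6, 2)   # lowest accepted version, as stated in the diff above
MAX_SPADES = (3, 13, 0)  # highest accepted version, as stated in the diff above


def spades_status_by_tuple(version):
    """Classify a 'major.minor.patch' string as 'too old', 'good' or 'too new'."""
    # int() raises ValueError on non-numeric parts, like the upstream code.
    parts = tuple(int(p) for p in version.split('.')[:3])
    if len(parts) < 3:
        # Mirror the IndexError the caller already catches for short versions.
        raise IndexError('expected major.minor.patch, got ' + version)
    # Tuples compare element by element, so (3, 6, 1) < (3, 6, 2) < (3, 13, 0).
    if parts < MIN_SPADES:
        return 'too old'
    if parts > MAX_SPADES:
        return 'too new'
    return 'good'


for v in ('3.6.1', '3.6.2', '3.13.0', '3.13.1'):
    print(v, spades_status_by_tuple(v))
```

The sketch is meant to agree with the cases listed in test/test_misc.py; it has not been checked against any other inputs.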


=====================================
unicycler/settings.py
=====================================
@@ -181,3 +181,5 @@ REQUIRED_MINIASM_ASSEMBLY_SIZE_FOR_BRIDGING = 0.5
 # limits the amount of trimming it's willing to do. I.e. if miniasm trimmed more than this from a
 # contig, Unicycler won't.
 MAX_MINIASM_DEAD_END_TRIM_SIZE = 100
+
+MAX_SIMPLE_LOOP_SIZE = 10000


=====================================
unicycler/spades_func.py
=====================================
@@ -263,7 +263,8 @@ def spades_read_correction(short1, short2, unpaired, spades_dir, threads, spades
         command += ['-1', short1, '-2', short2]
     if using_unpaired_reads:
         command += ['-s', unpaired]
-    command += ['-o', read_correction_dir, '--threads', str(threads), '--only-error-correction']
+    command += ['-o', read_correction_dir, '--threads', str(threads), '--only-error-correction',
+                '--phred-offset', '33']
     if spades_tmp_dir is not None:
         command += ['--tmp-dir', spades_tmp_dir]
     process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
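
The command assembly in the hunk above can be sketched in isolation as follows. The function name `build_correction_command`, its parameters, and the assumption that the list starts with the SPAdes executable path are all hypothetical; only the option strings come from the diff. The sketch builds the argument list and does not run SPAdes.

```python
def build_correction_command(spades_path, out_dir, threads,
                             short1=None, short2=None, unpaired=None,
                             tmp_dir=None):
    """Build a SPAdes read-correction argument list (illustrative only)."""
    command = [spades_path]  # assumption: the list starts with the executable
    if short1 is not None and short2 is not None:
        command += ['-1', short1, '-2', short2]
    if unpaired is not None:
        command += ['-s', unpaired]
    # The explicit Phred offset is the part added by this upstream release.
    command += ['-o', out_dir, '--threads', str(threads),
                '--only-error-correction', '--phred-offset', '33']
    if tmp_dir is not None:
        command += ['--tmp-dir', tmp_dir]
    return command


print(build_correction_command('spades', 'corrected', 4,
                               short1='r1.fastq', short2='r2.fastq'))
```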


=====================================
unicycler/unicycler.py
=====================================
@@ -162,16 +162,16 @@ def main():
         bridges += create_simple_long_read_bridges(graph, args.out, args.keep, args.threads,
                                                    read_dict, long_read_filename, scoring_scheme,
                                                    anchor_segments)
+        if not args.no_long_read_alignment:
+            read_names, min_scaled_score, min_alignment_length = \
+                align_long_reads_to_assembly_graph(graph, anchor_segments, args, full_command,
+                                                   read_dict, read_names, long_read_filename)
 
-        read_names, min_scaled_score, min_alignment_length = \
-            align_long_reads_to_assembly_graph(graph, anchor_segments, args, full_command,
-                                               read_dict, read_names, long_read_filename)
-
-        expected_linear_seqs = args.linear_seqs > 0
-        bridges += create_long_read_bridges(graph, read_dict, read_names, anchor_segments,
-                                            args.verbosity, min_scaled_score, args.threads,
-                                            scoring_scheme, min_alignment_length,
-                                            expected_linear_seqs, args.min_bridge_qual)
+            expected_linear_seqs = args.linear_seqs > 0
+            bridges += create_long_read_bridges(graph, read_dict, read_names, anchor_segments,
+                                                args.verbosity, min_scaled_score, args.threads,
+                                                scoring_scheme, min_alignment_length,
+                                                expected_linear_seqs, args.min_bridge_qual)
 
     if short_reads_available:
         seg_nums_used_in_bridges = graph.apply_bridges(bridges, args.verbosity,
@@ -370,8 +370,8 @@ def get_arguments():
                                     if show_all_args else argparse.SUPPRESS)
     miniasm_group.add_argument('--existing_long_read_assembly', type=str, default=None,
                                help='A pre-prepared long read assembly for the sample in GFA '
-                                    'format. If this option is used, Unicycler will skip the '
-                                    'miniasm/Racon steps and instead use the given assembly '
+                                    'or FASTA format. If this option is used, Unicycler will skip '
+                                    'the miniasm/Racon steps and instead use the given assembly '
                                     '(default: perform long read assembly using miniasm/Racon)'
                                     if show_all_args else argparse.SUPPRESS)
 
@@ -466,6 +466,10 @@ def get_arguments():
                                             'These options control the alignment of long reads to '
                                             'the assembly graph.'
                                             if show_all_args else argparse.SUPPRESS)
+    align_group.add_argument('--no_long_read_alignment', action='store_true',
+                             help='Skip long-read-alignment-bases bridging (default: use '
+                                  'long-read alignments to produce bridges)'
+                                  if show_all_args else argparse.SUPPRESS)
     add_aligning_arguments(align_group, show_all_args)
 
     # If no arguments were used, print the entire help (argparse default is to just give an error
@@ -839,7 +843,8 @@ def check_dependencies(args, short_reads_available, long_reads_available):
     for i, row in enumerate(program_table):
         if 'not used' in row:
             row_colours[i] = 'dim'
-        elif 'too old' in row or 'not found' in row or 'bad' in row or 'Python problem' in row:
+        elif ('too old' in row or 'too new' in row or 'not found' in row or 'bad' in row or
+              'Python problem' in row):
             row_colours[i] = 'red'
 
     print_table(program_table, alignments='LLLL', row_colour=row_colours, max_col_width=60,
@@ -862,8 +867,8 @@ def quit_if_dependency_problem(spades_status, racon_status, makeblastdb_status,
     log.log('')
     if spades_status == 'not found':
         quit_with_error('could not find SPAdes at ' + args.spades_path)
-    if spades_status == 'too old':
-        quit_with_error('Unicycler requires SPAdes v3.6.2 or higher')
+    if spades_status == 'too old' or spades_status == 'too new':
+        quit_with_error('Unicycler requires SPAdes v3.6.2 - v3.13.0')
     if spades_status == 'Python problem':
         quit_with_error('SPAdes cannot run due to an incompatible Python version')
     if spades_status == 'bad':
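
The new `--no_long_read_alignment` option in the diff above is a plain `store_true` flag that gates the alignment-based bridging. A minimal sketch of that pattern follows, assuming hypothetical helper names (`parse_args`, `bridging_steps`) and step labels that merely stand in for the real function calls in unicycler.py.

```python
import argparse


def parse_args(argv):
    """Parse only the flag of interest (illustrative, not Unicycler's parser)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--no_long_read_alignment', action='store_true',
                        help='Skip long-read-alignment-based bridging')
    return parser.parse_args(argv)


def bridging_steps(args):
    """Return the bridging steps that would run, as hypothetical labels."""
    steps = ['simple long-read bridges']
    if not args.no_long_read_alignment:
        # Both the alignment and the bridges built from it are skipped together.
        steps += ['long-read alignment', 'long-read bridges']
    return steps


print(bridging_steps(parse_args([])))
print(bridging_steps(parse_args(['--no_long_read_alignment'])))
```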


=====================================
unicycler/version.py
=====================================
@@ -13,4 +13,4 @@ details. You should have received a copy of the GNU General Public License along
 not, see <http://www.gnu.org/licenses/>.
 """
 
-__version__ = '0.4.8'
+__version__ = '0.4.9'



View it on GitLab: https://salsa.debian.org/med-team/unicycler/-/compare/8c22dd061ff0c4bff44216b4ebb8234ac55884b6...50a1cc570d350eefe4ebd1fe44137a2c2f520817


