[med-svn] [Git][med-team/unicycler][upstream] New upstream version 0.4.9+dfsg
Nilesh Patra (@nilesh)
gitlab at salsa.debian.org
Sat Aug 7 20:51:28 BST 2021
Nilesh Patra pushed to branch upstream at Debian Med / unicycler
Commits:
4ffd09ab by Nilesh Patra at 2021-08-08T00:50:16+05:30
New upstream version 0.4.9+dfsg
- - - - -
8 changed files:
- README.md
- test/test_misc.py
- unicycler/assembly_graph.py
- unicycler/misc.py
- unicycler/settings.py
- unicycler/spades_func.py
- unicycler/unicycler.py
- unicycler/version.py
Changes:
=====================================
README.md
=====================================
@@ -10,6 +10,17 @@ And read about how we use it to complete bacterial genomes here:
+# A note on Trycycler
+
+[Trycycler](https://github.com/rrwick/Trycycler/wiki) is a newer tool that in many cases is a better choice than Unicycler. Here is a quick guide on whether you should use Unicycler or Trycycler to assemble your bacterial genome:
+* If you only have short reads, use Unicycler (Trycycler does not do short-read assembly).
+* If you only have long reads, Trycycler is a better choice. While Unicycler can do long-read-only assembly, its approach is somewhat out-of-date. Something else to consider is the depth of your long reads, as Trycycler prefers deeper read sets. If your long-read set is particularly shallow (~25× or less), then [Flye](https://github.com/fenderglass/Flye) might be your best option.
+* If you have both short and long reads (i.e. are doing a hybrid assembly), then Unicycler and [Trycycler+polishing](https://github.com/rrwick/Trycycler/wiki/Polishing-after-Trycycler) are both viable options. If you have lots of long reads (~100× depth or more), use Trycycler+polishing. If you have sparse long reads (~25× or less), use Unicycler. If your long-read depth falls between those values, it might be worth trying both approaches.
+
+You can read more on Trycycler's FAQ page: [Should I use Unicycler or Trycycler to assemble my bacterial genome?](https://github.com/rrwick/Trycycler/wiki/FAQ-and-miscellaneous-tips#should-i-use-unicycler-or-trycycler-to-assemble-my-bacterial-genome)
+
+
+
# Table of contents
* [Introduction](#introduction)
@@ -94,7 +105,7 @@ Reasons to __not__ use Unicycler:
* [ICC](https://software.intel.com/en-us/c-compilers) also works (though I don't know the minimum required version number)
* [setuptools](https://packaging.python.org/installing/#install-pip-setuptools-and-wheel) (only required for installation of Unicycler)
* For short-read or hybrid assembly:
- * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 or later (`spades.py`)
+ * [SPAdes](http://bioinf.spbau.ru/spades) v3.6.2 – v3.13.0 (`spades.py`)
* For long-read or hybrid assembly:
* [Racon](https://github.com/isovic/racon) (`racon`)
* For polishing
=====================================
test/test_misc.py
=====================================
@@ -402,3 +402,36 @@ class TestMiscFunctions(unittest.TestCase):
'-o <output_dir>\n\nBasic options:'
version = unicycler.misc.spades_version_from_spades_output(spades_version_output)
self.assertEqual(version, '2.4.0')
+
+ def test_spades_version_status_1(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('2.4.0'), 'too old')
+
+ def test_spades_version_status_2(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.4.0'), 'too old')
+
+ def test_spades_version_status_3(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.6.0'), 'too old')
+
+ def test_spades_version_status_4(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.6.1'), 'too old')
+
+ def test_spades_version_status_5(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.6.2'), 'good')
+
+ def test_spades_version_status_6(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.7.0'), 'good')
+
+ def test_spades_version_status_7(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.9.9'), 'good')
+
+ def test_spades_version_status_8(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.13.0'), 'good')
+
+ def test_spades_version_status_9(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.13.1'), 'too new')
+
+ def test_spades_version_status_10(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('3.14.1'), 'too new')
+
+ def test_spades_version_status_11(self):
+ self.assertEqual(unicycler.misc.spades_status_from_version('4.0.0'), 'too new')
=====================================
unicycler/assembly_graph.py
=====================================
@@ -1643,6 +1643,8 @@ class AssemblyGraph(object):
# We'll search specifically for the middle segments as they should be easy to spot.
for middle in self.segments:
+ if self.segments[middle].get_length() > settings.MAX_SIMPLE_LOOP_SIZE:
+ continue
# A middle segment will always have exactly one connection on each end which connect
# to the same segment (the repeat segment).
=====================================
unicycler/misc.py
=====================================
@@ -891,23 +891,9 @@ def spades_path_and_version(spades_path):
if 'python version' in out and 'is not supported' in out:
return found_spades_path, '', 'Python problem'
- # Make sure SPAdes is 3.6.2+
+ # Make sure SPAdes is 3.6.2 - 3.13.0
try:
- major_version = int(version.split('.')[0])
- if major_version < 3:
- status = 'too old'
- else:
- minor_version = int(version.split('.')[1])
- if minor_version < 6:
- status = 'too old'
- elif minor_version > 6:
- status = 'good'
- else: # minor_version == 6
- patch_version = int(version.split('.')[2])
- if patch_version < 2:
- status = 'too old'
- else:
- status = 'good'
+ status = spades_status_from_version(version)
except (ValueError, IndexError):
version, status = '?', 'too old'
@@ -933,6 +919,36 @@ def spades_version_from_spades_output(spades_output):
return ''
+def spades_status_from_version(version):
+    major_version = int(version.split('.')[0])
+    if major_version < 3:
+        return 'too old'
+    if major_version >= 4:
+        return 'too new'
+
+    minor_version = int(version.split('.')[1])
+    if minor_version < 6:
+        return 'too old'
+    if minor_version > 13:
+        return 'too new'
+    assert 6 <= minor_version <= 13
+
+    patch_version = int(version.split('.')[2])
+    if 6 < minor_version < 13:
+        return 'good'
+    assert minor_version == 6 or minor_version == 13
+    if minor_version == 6:
+        if patch_version < 2:
+            return 'too old'
+        else:
+            return 'good'
+    if minor_version == 13:
+        if patch_version > 0:
+            return 'too new'
+        else:
+            return 'good'
+
+
def racon_path_and_version(racon_path):
found_racon_path = shutil.which(racon_path)
if found_racon_path is None:
@@ -941,7 +957,7 @@ def racon_path_and_version(racon_path):
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
out, _ = process.communicate()
out = out.decode().lower()
- if 'racon' not in out or 'options' not in out:
+ if 'racon' not in out or 'options' not in out:
return found_racon_path, '-', 'bad'
return found_racon_path, racon_version(found_racon_path), 'good'
=====================================
unicycler/settings.py
=====================================
@@ -181,3 +181,5 @@ REQUIRED_MINIASM_ASSEMBLY_SIZE_FOR_BRIDGING = 0.5
# limits the amount of trimming it's willing to do. I.e. if miniasm trimmed more than this from a
# contig, Unicycler won't.
MAX_MINIASM_DEAD_END_TRIM_SIZE = 100
+
+MAX_SIMPLE_LOOP_SIZE = 10000
=====================================
unicycler/spades_func.py
=====================================
@@ -263,7 +263,8 @@ def spades_read_correction(short1, short2, unpaired, spades_dir, threads, spades
command += ['-1', short1, '-2', short2]
if using_unpaired_reads:
command += ['-s', unpaired]
- command += ['-o', read_correction_dir, '--threads', str(threads), '--only-error-correction']
+ command += ['-o', read_correction_dir, '--threads', str(threads), '--only-error-correction',
+ '--phred-offset', '33']
if spades_tmp_dir is not None:
command += ['--tmp-dir', spades_tmp_dir]
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
=====================================
unicycler/unicycler.py
=====================================
@@ -162,16 +162,16 @@ def main():
bridges += create_simple_long_read_bridges(graph, args.out, args.keep, args.threads,
read_dict, long_read_filename, scoring_scheme,
anchor_segments)
+ if not args.no_long_read_alignment:
+ read_names, min_scaled_score, min_alignment_length = \
+ align_long_reads_to_assembly_graph(graph, anchor_segments, args, full_command,
+ read_dict, read_names, long_read_filename)
- read_names, min_scaled_score, min_alignment_length = \
- align_long_reads_to_assembly_graph(graph, anchor_segments, args, full_command,
- read_dict, read_names, long_read_filename)
-
- expected_linear_seqs = args.linear_seqs > 0
- bridges += create_long_read_bridges(graph, read_dict, read_names, anchor_segments,
- args.verbosity, min_scaled_score, args.threads,
- scoring_scheme, min_alignment_length,
- expected_linear_seqs, args.min_bridge_qual)
+ expected_linear_seqs = args.linear_seqs > 0
+ bridges += create_long_read_bridges(graph, read_dict, read_names, anchor_segments,
+ args.verbosity, min_scaled_score, args.threads,
+ scoring_scheme, min_alignment_length,
+ expected_linear_seqs, args.min_bridge_qual)
if short_reads_available:
seg_nums_used_in_bridges = graph.apply_bridges(bridges, args.verbosity,
@@ -370,8 +370,8 @@ def get_arguments():
if show_all_args else argparse.SUPPRESS)
miniasm_group.add_argument('--existing_long_read_assembly', type=str, default=None,
help='A pre-prepared long read assembly for the sample in GFA '
- 'format. If this option is used, Unicycler will skip the '
- 'miniasm/Racon steps and instead use the given assembly '
+ 'or FASTA format. If this option is used, Unicycler will skip '
+ 'the miniasm/Racon steps and instead use the given assembly '
'(default: perform long read assembly using miniasm/Racon)'
if show_all_args else argparse.SUPPRESS)
@@ -466,6 +466,10 @@ def get_arguments():
'These options control the alignment of long reads to '
'the assembly graph.'
if show_all_args else argparse.SUPPRESS)
+ align_group.add_argument('--no_long_read_alignment', action='store_true',
+ help='Skip long-read-alignment-based bridging (default: use '
+ 'long-read alignments to produce bridges)'
+ if show_all_args else argparse.SUPPRESS)
add_aligning_arguments(align_group, show_all_args)
# If no arguments were used, print the entire help (argparse default is to just give an error
@@ -839,7 +843,8 @@ def check_dependencies(args, short_reads_available, long_reads_available):
for i, row in enumerate(program_table):
if 'not used' in row:
row_colours[i] = 'dim'
- elif 'too old' in row or 'not found' in row or 'bad' in row or 'Python problem' in row:
+ elif ('too old' in row or 'too new' in row or 'not found' in row or 'bad' in row or
+ 'Python problem' in row):
row_colours[i] = 'red'
print_table(program_table, alignments='LLLL', row_colour=row_colours, max_col_width=60,
@@ -862,8 +867,8 @@ def quit_if_dependency_problem(spades_status, racon_status, makeblastdb_status,
log.log('')
if spades_status == 'not found':
quit_with_error('could not find SPAdes at ' + args.spades_path)
- if spades_status == 'too old':
- quit_with_error('Unicycler requires SPAdes v3.6.2 or higher')
+ if spades_status == 'too old' or spades_status == 'too new':
+ quit_with_error('Unicycler requires SPAdes v3.6.2 - v3.13.0')
if spades_status == 'Python problem':
quit_with_error('SPAdes cannot run due to an incompatible Python version')
if spades_status == 'bad':
=====================================
unicycler/version.py
=====================================
@@ -13,4 +13,4 @@ details. You should have received a copy of the GNU General Public License along
not, see <http://www.gnu.org/licenses/>.
"""
-__version__ = '0.4.8'
+__version__ = '0.4.9'
View it on GitLab: https://salsa.debian.org/med-team/unicycler/-/commit/4ffd09abedb596e0fa6e919dfbd93551082f842f
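
The core of this release is the new SPAdes version gate (v3.6.2 to v3.13.0 inclusive) in unicycler/misc.py. As a reading aid only, the same range check can be sketched with tuple comparison. This is not the upstream implementation: the names parse_version, MIN_SPADES, MAX_SPADES and spades_status_sketch are hypothetical and exist only for this illustration. The bounds and the status strings are taken from the diff above.

```python
# Illustrative sketch only; the names below are hypothetical, not Unicycler's API.
MIN_SPADES = (3, 6, 2)   # oldest accepted version, from the diff above
MAX_SPADES = (3, 13, 0)  # newest accepted version, from the diff above


def parse_version(version):
    """Turn a dotted version string such as '3.13.0' into a tuple of three ints.

    Raises ValueError if there are fewer than three parts or a part is not an
    integer, so a caller can treat an unparseable version as a failure.
    """
    major, minor, patch = (int(part) for part in version.split('.')[:3])
    return major, minor, patch


def spades_status_sketch(version):
    """Return 'too old', 'good' or 'too new' for a SPAdes version string."""
    parsed = parse_version(version)
    # Python compares tuples element by element, which matches
    # major/minor/patch ordering.
    if parsed < MIN_SPADES:
        return 'too old'
    if parsed > MAX_SPADES:
        return 'too new'
    return 'good'
```

For the versions exercised in test/test_misc.py, this sketch is intended to give the same answers as the function in the diff, for example 'too old' for 3.6.1, 'good' for 3.6.2 and 3.13.0, and 'too new' for 3.13.1.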