[med-svn] [Git][med-team/flye][master] 6 commits: routine-update: New upstream version
Alexandre Detiste (@detiste-guest)
gitlab at salsa.debian.org
Wed May 22 14:32:29 BST 2024
Alexandre Detiste pushed to branch master at Debian Med / flye
Commits:
533c6422 by Alexandre Detiste at 2024-05-22T14:18:46+02:00
routine-update: New upstream version
- - - - -
ebaf557e by Alexandre Detiste at 2024-05-22T14:18:47+02:00
New upstream version 2.9.4+dfsg
- - - - -
93796a3f by Alexandre Detiste at 2024-05-22T14:19:09+02:00
Update upstream source from tag 'upstream/2.9.4+dfsg'
Update to upstream version '2.9.4+dfsg'
with Debian dir 138944cd37f3ce82baf8095768a3e830ff178c7a
- - - - -
6e953a1b by Alexandre Detiste at 2024-05-22T14:19:10+02:00
routine-update: Standards-Version: 4.7.0
- - - - -
6640e6a4 by Alexandre Detiste at 2024-05-22T14:22:15+02:00
refresh patch
- - - - -
8bddd17f by Alexandre Detiste at 2024-05-22T15:31:27+02:00
release
- - - - -
16 changed files:
- README.md
- debian/changelog
- debian/control
- debian/patches/python3.12.patch
- docs/NEWS.md
- docs/USAGE.md
- flye/__build__.py
- flye/__version__.py
- flye/main.py
- flye/polishing/bubbles.py
- flye/polishing/polish.py
- flye/utils/sam_parser.py
- src/polishing/bubble_processor.cpp
- src/polishing/bubble_processor.h
- src/polishing/subs_matrix.cpp
- src/polishing/subs_matrix.h
Changes:
=====================================
README.md
=====================================
@@ -3,7 +3,7 @@ Flye assembler
[![BioConda Install](https://img.shields.io/conda/dn/bioconda/flye.svg?style=flag&label=BioConda%20install)](https://anaconda.org/bioconda/flye)
-### Version: 2.9.3
+### Version: 2.9.4
Flye is a de novo assembler for single-molecule sequencing reads,
such as those produced by PacBio and Oxford Nanopore Technologies.
@@ -178,7 +178,7 @@ Publications
Mikhail Kolmogorov, Derek M. Bickhart, Bahar Behsaz, Alexey Gurevich, Mikhail Rayko, Sung Bong
Shin, Kristen Kuhn, Jeffrey Yuan, Evgeny Polevikov, Timothy P. L. Smith and Pavel A. Pevzner
"metaFlye: scalable long-read metagenome assembly using repeat graphs", Nature Methods, 2020
-[doi:s41592-020-00971-x](https://doi.org/10.1038/s41592-020-00971-x)
+[doi:10.1038/s41592-020-00971-x](https://doi.org/10.1038/s41592-020-00971-x)
Mikhail Kolmogorov, Jeffrey Yuan, Yu Lin and Pavel Pevzner,
"Assembly of Long Error-Prone Reads Using Repeat Graphs", Nature Biotechnology, 2019
=====================================
debian/changelog
=====================================
@@ -1,3 +1,11 @@
+flye (2.9.4+dfsg-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version
+  * Standards-Version: 4.7.0 (routine-update)
+
+ -- Alexandre Detiste <tchet at debian.org>  Wed, 22 May 2024 14:21:21 +0200
+
 flye (2.9.3+dfsg2-1) unstable; urgency=medium
   [ Étienne Mollier ]
=====================================
debian/control
=====================================
@@ -12,7 +12,7 @@ Build-Depends: debhelper-compat (= 13),
libminimap2-dev,
samtools,
zlib1g-dev
-Standards-Version: 4.6.2
+Standards-Version: 4.7.0
Vcs-Browser: https://salsa.debian.org/med-team/flye
Vcs-Git: https://salsa.debian.org/med-team/flye.git
Homepage: https://github.com/fenderglass/Flye
=====================================
debian/patches/python3.12.patch
=====================================
@@ -4,11 +4,9 @@ Author: Andreas Tille <tille at debian.org>
Last-Update: Fri, 02 Feb 2024 10:40:30 +0100
-diff --git a/flye/assembly/scaffolder.py b/flye/assembly/scaffolder.py
-index f52a70d..5d957b0 100644
--- a/flye/assembly/scaffolder.py
+++ b/flye/assembly/scaffolder.py
-@@ -12,7 +12,7 @@ import logging
+@@ -12,7 +12,7 @@
import flye.utils.fasta_parser as fp
import flye.config.py_cfg as cfg
@@ -17,11 +15,9 @@ index f52a70d..5d957b0 100644
logger = logging.getLogger()
-diff --git a/flye/config/configurator.py b/flye/config/configurator.py
-index 3df64cb..28a7e1f 100644
--- a/flye/config/configurator.py
+++ b/flye/config/configurator.py
-@@ -12,7 +12,7 @@ import logging
+@@ -12,7 +12,7 @@
import flye.utils.fasta_parser as fp
import flye.config.py_cfg as cfg
@@ -30,11 +26,9 @@ index 3df64cb..28a7e1f 100644
logger = logging.getLogger()
-diff --git a/flye/main.py b/flye/main.py
-index d069808..afd0c60 100644
--- a/flye/main.py
+++ b/flye/main.py
-@@ -33,7 +33,7 @@ import flye.utils.fasta_parser as fp
+@@ -33,7 +33,7 @@
#import flye.trestle.trestle as tres
#import flye.trestle.graph_resolver as tres_graph
from flye.repeat_graph.repeat_graph import RepeatGraph
@@ -43,11 +37,9 @@ index d069808..afd0c60 100644
logger = logging.getLogger()
-diff --git a/flye/polishing/alignment.py b/flye/polishing/alignment.py
-index c7ad442..09cfc65 100644
--- a/flye/polishing/alignment.py
+++ b/flye/polishing/alignment.py
-@@ -18,8 +18,8 @@ from copy import copy
+@@ -18,8 +18,8 @@
import flye.utils.fasta_parser as fp
from flye.utils.utils import which, get_median
from flye.utils.sam_parser import AlignmentException
@@ -58,20 +50,18 @@ index c7ad442..09cfc65 100644
logger = logging.getLogger()
-diff --git a/flye/polishing/bubbles.py b/flye/polishing/bubbles.py
-index 4a04bf9..e13c623 100644
--- a/flye/polishing/bubbles.py
+++ b/flye/polishing/bubbles.py
-@@ -10,7 +10,7 @@ from __future__ import absolute_import
+@@ -10,7 +10,7 @@
from __future__ import division
import logging
from bisect import bisect
-from flye.six.moves import range
+from six.moves import range
from collections import defaultdict
+ from queue import Queue
- import multiprocessing
-@@ -21,7 +21,7 @@ import flye.config.py_cfg as cfg
+@@ -22,7 +22,7 @@
from flye.polishing.alignment import shift_gaps, get_uniform_alignments
from flye.utils.sam_parser import SynchronizedSamReader, SynchonizedChunkManager
from flye.utils.utils import process_in_parallel, get_median
@@ -80,11 +70,9 @@ index 4a04bf9..e13c623 100644
logger = logging.getLogger()
-diff --git a/flye/polishing/consensus.py b/flye/polishing/consensus.py
-index 0e0befc..2aa3499 100644
--- a/flye/polishing/consensus.py
+++ b/flye/polishing/consensus.py
-@@ -10,8 +10,8 @@ from __future__ import absolute_import
+@@ -10,8 +10,8 @@
from __future__ import division
import logging
from collections import defaultdict
@@ -95,7 +83,7 @@ index 0e0befc..2aa3499 100644
import multiprocessing
import traceback
-@@ -21,7 +21,7 @@ from flye.utils.sam_parser import SynchronizedSamReader, SynchonizedChunkManager
+@@ -21,7 +21,7 @@
import flye.config.py_cfg as cfg
import flye.utils.fasta_parser as fp
from flye.utils.utils import process_in_parallel
@@ -104,11 +92,9 @@ index 0e0befc..2aa3499 100644
logger = logging.getLogger()
-diff --git a/flye/polishing/polish.py b/flye/polishing/polish.py
-index 78060c1..c55e7b0 100644
--- a/flye/polishing/polish.py
+++ b/flye/polishing/polish.py
-@@ -21,8 +21,8 @@ from flye.polishing.bubbles import make_bubbles
+@@ -21,8 +21,8 @@
import flye.utils.fasta_parser as fp
from flye.utils.utils import which
import flye.config.py_cfg as cfg
@@ -119,11 +105,9 @@ index 78060c1..c55e7b0 100644
POLISH_BIN = "flye-modules"
-diff --git a/flye/short_plasmids/circular_sequences.py b/flye/short_plasmids/circular_sequences.py
-index 92a448b..c87ac86 100644
--- a/flye/short_plasmids/circular_sequences.py
+++ b/flye/short_plasmids/circular_sequences.py
-@@ -9,8 +9,8 @@ import flye.short_plasmids.unmapped_reads as unmapped
+@@ -9,8 +9,8 @@
import flye.utils.fasta_parser as fp
from flye.utils.sam_parser import read_paf, read_paf_grouped
import logging
@@ -134,11 +118,9 @@ index 92a448b..c87ac86 100644
logger = logging.getLogger()
-diff --git a/flye/short_plasmids/unmapped_reads.py b/flye/short_plasmids/unmapped_reads.py
-index fd218e5..bdbab3c 100644
--- a/flye/short_plasmids/unmapped_reads.py
+++ b/flye/short_plasmids/unmapped_reads.py
-@@ -9,8 +9,8 @@ import flye.utils.fasta_parser as fp
+@@ -9,8 +9,8 @@
from flye.utils.sam_parser import read_paf_grouped
import logging
from collections import defaultdict
@@ -149,8 +131,6 @@ index fd218e5..bdbab3c 100644
logger = logging.getLogger()
-diff --git a/flye/short_plasmids/utils.py b/flye/short_plasmids/utils.py
-index 0d1f817..75a0a3d 100644
--- a/flye/short_plasmids/utils.py
+++ b/flye/short_plasmids/utils.py
@@ -2,7 +2,7 @@
@@ -162,11 +142,9 @@ index 0d1f817..75a0a3d 100644
def find_connected_components(graph):
def dfs(start_vertex, connected_components_counter):
-diff --git a/flye/trestle/divergence.py b/flye/trestle/divergence.py
-index e3b2644..f9c9862 100644
--- a/flye/trestle/divergence.py
+++ b/flye/trestle/divergence.py
-@@ -12,7 +12,7 @@ from __future__ import absolute_import
+@@ -12,7 +12,7 @@
from __future__ import division
import logging
from collections import defaultdict
@@ -175,7 +153,7 @@ index e3b2644..f9c9862 100644
import multiprocessing
import os.path
-@@ -22,7 +22,7 @@ from flye.utils.sam_parser import SynchronizedSamReader, SynchonizedChunkManager
+@@ -22,7 +22,7 @@
import flye.utils.fasta_parser as fp
from flye.utils.utils import process_in_parallel
import flye.config.py_cfg as config
@@ -184,11 +162,9 @@ index e3b2644..f9c9862 100644
logger = logging.getLogger()
-diff --git a/flye/trestle/graph_resolver.py b/flye/trestle/graph_resolver.py
-index 9944831..b6ad6d7 100644
--- a/flye/trestle/graph_resolver.py
+++ b/flye/trestle/graph_resolver.py
-@@ -13,8 +13,8 @@ from collections import defaultdict
+@@ -13,8 +13,8 @@
import flye.utils.fasta_parser as fp
from flye.repeat_graph.graph_alignment import iter_alignments
@@ -199,11 +175,9 @@ index 9944831..b6ad6d7 100644
logger = logging.getLogger()
-diff --git a/flye/trestle/trestle.py b/flye/trestle/trestle.py
-index 244e4ad..677090d 100644
--- a/flye/trestle/trestle.py
+++ b/flye/trestle/trestle.py
-@@ -25,8 +25,8 @@ import flye.polishing.polish as pol
+@@ -25,8 +25,8 @@
import flye.trestle.divergence as div
import flye.trestle.trestle_config as trestle_config
@@ -214,11 +188,9 @@ index 244e4ad..677090d 100644
logger = logging.getLogger()
-diff --git a/flye/utils/fasta_parser.py b/flye/utils/fasta_parser.py
-index 66c0b15..54f7dca 100644
--- a/flye/utils/fasta_parser.py
+++ b/flye/utils/fasta_parser.py
-@@ -23,7 +23,7 @@ else:
+@@ -23,7 +23,7 @@
_STR = bytes.decode
_BYTES = str.encode
@@ -227,11 +199,9 @@ index 66c0b15..54f7dca 100644
logger = logging.getLogger()
-diff --git a/flye/utils/sam_parser.py b/flye/utils/sam_parser.py
-index 0db41f0..a16bb6b 100644
--- a/flye/utils/sam_parser.py
+++ b/flye/utils/sam_parser.py
-@@ -32,8 +32,8 @@ else:
+@@ -32,8 +32,8 @@
_STR = bytes.decode
_BYTES = str.encode
=====================================
docs/NEWS.md
=====================================
@@ -1,3 +1,7 @@
+Flye 2.9.4 release (14 May 2024)
+===============================
+* Minor technical changes
+
Flye 2.9.3 release (28 November 2023)
====================================
* Disjointig step speedup for `--nano-hq` mode
=====================================
docs/USAGE.md
=====================================
@@ -316,7 +316,7 @@ Scaffold gaps are marked with `??` symbols, and `*` symbol denotes a
terminal graph node.
Alternative contigs (representing alternative haplotypes) will have the same
-alt. group ID. Primary contigs are marked by `*`. Note that the ouptut of
+alt. group ID. Primary contigs are marked by `*`. Note that the outptut of
alternative contigs could be disabled via the `--no-alt-contigs` option.
## <a name="graph"></a> Repeat graph
=====================================
flye/__build__.py
=====================================
@@ -1 +1 @@
-__build__ = 1797
+__build__ = 1799
=====================================
flye/__version__.py
=====================================
@@ -1 +1 @@
-__version__ = "2.9.3"
+__version__ = "2.9.4"
=====================================
flye/main.py
=====================================
@@ -406,12 +406,13 @@ def _set_genome_size(args):
args.genome_size = human2bytes(args.genome_size.upper())
-def _run_polisher_only(args):
+def _run_polisher_only(args, output_progress=True):
     """
     Runs standalone polisher
     """
-    logger.info("Running Flye polisher")
-    logger.debug("Cmd: %s", " ".join(sys.argv))
+    if output_progress:
+        logger.info("Running Flye polisher")
+        logger.debug("Cmd: %s", " ".join(sys.argv))
     bam_input = False
     for read_file in args.reads:
@@ -434,8 +435,9 @@ def _run_polisher_only(args):
     pol.polish(args.polish_target, args.reads, args.out_dir,
                args.num_iters, args.threads, args.platform,
-               args.read_type, output_progress=True)
-    logger.info("Done!")
+               args.read_type, output_progress)
+    if output_progress:
+        logger.info("Done!")
 def _run(args):
=====================================
flye/polishing/bubbles.py
=====================================
@@ -12,6 +12,7 @@ import logging
 from bisect import bisect
 from flye.six.moves import range
 from collections import defaultdict
+from queue import Queue
 import multiprocessing
 import traceback
@@ -93,11 +94,16 @@ def _thread_worker(aln_reader, chunk_feeder, contigs_info, err_mode,
             for b in ctg_bubbles:
                 b.position += ctg_region.start
-            with bubbles_file_lock:
-                _output_bubbles(ctg_bubbles, open(bubbles_file, "a"))
+            if bubbles_file_lock:
+                bubbles_file_lock.acquire()
+
+            _output_bubbles(ctg_bubbles, open(bubbles_file, "a"))
             results_queue.put((ctg_id, len(ctg_bubbles), num_long_bubbles,
                                num_empty, num_long_branch, aln_errors,
                                mean_cov))
+
+            if bubbles_file_lock:
+                bubbles_file_lock.release()
             del profile
             del ctg_bubbles
@@ -116,20 +122,26 @@ def make_bubbles(alignment_path, contigs_info, contigs_path,
     CHUNK_SIZE = 1000000
     contigs_fasta = fp.read_sequence_dict(contigs_path)
-    manager = multiprocessing.Manager()
+    manager = None if num_proc == 1 else multiprocessing.Manager()
     aln_reader = SynchronizedSamReader(alignment_path, contigs_fasta, manager,
                                        cfg.vals["max_read_coverage"], use_secondary=True)
     chunk_feeder = SynchonizedChunkManager(contigs_fasta, manager, chunk_size=CHUNK_SIZE)
-    results_queue = manager.Queue()
-    error_queue = manager.Queue()
-    bubbles_out_lock = multiprocessing.Lock()
-    #bubbles_out_handle = open(bubbles_out, "w")
+    if manager:
+        results_queue = manager.Queue()
+        error_queue = manager.Queue()
+        bubbles_out_lock = multiprocessing.Lock()
-    process_in_parallel(_thread_worker, (aln_reader, chunk_feeder, contigs_info, err_mode,
+        process_in_parallel(_thread_worker, (aln_reader, chunk_feeder, contigs_info, err_mode,
                             results_queue, error_queue, bubbles_out, bubbles_out_lock), num_proc)
-    #_thread_worker(aln_reader, chunk_feeder, contigs_info, err_mode,
-    #               results_queue, error_queue, bubbles_out, bubbles_out_lock)
+    else:
+        results_queue = Queue()
+        error_queue = Queue()
+        bubbles_out_lock = None
+
+        _thread_worker(aln_reader, chunk_feeder, contigs_info, err_mode,
+                       results_queue, error_queue, bubbles_out, bubbles_out_lock)
+
     if not error_queue.empty():
         raise error_queue.get()
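Note on the hunk above: the single-process path passes `None` in place of a lock and guards each `acquire()` and `release()` with an `if`. The following is a minimal sketch of an alternative, not what upstream does: substituting `contextlib.nullcontext()` when no lock is supplied keeps the `with` form, so the lock is also released if the body raises. `append_record` and its parameters are hypothetical names used only for illustration.

```python
import contextlib
import os
import tempfile
import threading


def append_record(path, text, lock=None):
    """Append one line to `path`, holding `lock` only if one was given.

    Hypothetical helper, not part of Flye.
    """
    # nullcontext() is a no-op context manager, so the same `with`
    # statement serves both the locked and the lock-free case.
    guard = lock if lock is not None else contextlib.nullcontext()
    with guard:
        with open(path, "a") as handle:
            handle.write(text + "\n")


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    append_record(demo_path, "single-process write")            # no lock
    append_record(demo_path, "locked write", threading.Lock())  # with lock
    with open(demo_path) as handle:
        print(handle.read(), end="")
    os.remove(demo_path)
```

The same helper accepts a `multiprocessing.Lock()`, since that object is also a context manager.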
=====================================
flye/polishing/polish.py
=====================================
@@ -104,6 +104,7 @@ def polish(contig_seqs, read_seqs, work_dir, num_iters, num_threads, read_platfo
logger.disabled = logger_state
open(stats_file, "w").write("#seq_name\tlength\tcoverage\n")
open(polished_file, "w")
+ gzip.open(bed_coverage, "wt")
return polished_file, stats_file
#####
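Note on the hunk above: the added line opens the gzip-compressed coverage file in text-write mode on the early-return path, which creates the file even though nothing is written to it. A minimal sketch of the same idea with an explicit `with` block, assuming the goal is an empty but well-formed gzip file that later readers can open; `write_empty_gzip` is a hypothetical name, not a Flye function.

```python
import gzip
import os
import tempfile


def write_empty_gzip(path):
    """Create a valid gzip file with no payload (hypothetical helper)."""
    # Closing the handle writes the gzip header and trailer, so the
    # result can be read back as an empty text stream.
    with gzip.open(path, "wt"):
        pass


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "coverage.bed.gz")
    write_empty_gzip(demo_path)
    with gzip.open(demo_path, "rt") as handle:
        print(repr(handle.read()))
```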
=====================================
flye/utils/sam_parser.py
=====================================
@@ -137,8 +137,12 @@ class SynchonizedChunkManager(object):
         #will be shared between processes
         #self.shared_manager = multiprocessing.Manager()
         self.shared_num_jobs = multiprocessing.Value(ctypes.c_int, 0)
-        self.shared_lock = multiproc_manager.Lock()
-        self.shared_eof = multiprocessing.Value(ctypes.c_bool, False)
+        if multiproc_manager:
+            self.shared_lock = multiproc_manager.Lock()
+            self.shared_eof = multiprocessing.Value(ctypes.c_bool, False)
+        else:
+            self.shared_lock = None
+            self.shared_eof = ctypes.c_bool(False)
         for ctg_id in reference_fasta:
@@ -161,15 +165,22 @@ class SynchonizedChunkManager(object):
     def get_chunk(self):
         job_id = None
         while True:
-            with self.shared_lock:
-                if self.shared_eof.value:
-                    return None
-
-                job_id = self.shared_num_jobs.value
-                self.shared_num_jobs.value = self.shared_num_jobs.value + 1
-                if self.shared_num_jobs.value == len(self.fetch_list):
-                    self.shared_eof.value = True
-                break
+            if self.shared_lock:
+                self.shared_lock.acquire()
+
+            if self.shared_eof.value:
+                if self.shared_lock:
+                    self.shared_lock.release()
+                return None
+
+            job_id = self.shared_num_jobs.value
+            self.shared_num_jobs.value = self.shared_num_jobs.value + 1
+            if self.shared_num_jobs.value == len(self.fetch_list):
+                self.shared_eof.value = True
+
+            if self.shared_lock:
+                self.shared_lock.release()
+            break
             time.sleep(0.01)
@@ -197,7 +208,7 @@ class SynchronizedSamReader(object):
         self.cigar_parser = re.compile(b"[0-9]+[MIDNSHP=X]")
         #self.shared_manager = multiprocessing.Manager()
-        self.ref_fasta = multiproc_manager.dict()
+        self.ref_fasta = dict() if multiproc_manager == None else multiproc_manager.dict()
         for (h, s) in iteritems(reference_fasta):
             self.ref_fasta[_BYTES(h)] = _BYTES(s)
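Note on the hunks above: the single-process branch swaps `multiprocessing.Value` for a plain `ctypes` object, which works because both expose the same `.value` attribute. A small sketch of that interchangeability, under the assumption that callers only ever touch `.value`; `make_counter` and `next_job` are hypothetical names, not Flye functions, and `next_job` deliberately does no locking.

```python
import ctypes
import multiprocessing


def make_counter(shared):
    """Return an int holder exposing `.value`.

    The holder is process-shared only when `shared` is true.
    """
    if shared:
        return multiprocessing.Value(ctypes.c_int, 0)
    return ctypes.c_int(0)


def next_job(counter, num_jobs):
    """Hand out job ids 0..num_jobs-1, then None (no synchronization)."""
    if counter.value >= num_jobs:
        return None
    job_id = counter.value
    counter.value = job_id + 1
    return job_id


if __name__ == "__main__":
    for shared in (False, True):
        counter = make_counter(shared)
        print(shared, [next_job(counter, 2) for _ in range(3)])
```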
=====================================
src/polishing/bubble_processor.cpp
=====================================
@@ -21,14 +21,14 @@ namespace
BubbleProcessor::BubbleProcessor(const std::string& subsMatPath,
const std::string& hopoMatrixPath,
bool showProgress, bool hopoEnabled):
+ _hopoEnabled(hopoEnabled),
_subsMatrix(subsMatPath),
- _hopoMatrix(hopoMatrixPath),
+ _hopoMatrix(hopoMatrixPath, _hopoEnabled),
_generalPolisher(_subsMatrix),
_homoPolisher(_subsMatrix, _hopoMatrix),
_dinucFixer(_subsMatrix),
_verbose(false),
- _showProgress(showProgress),
- _hopoEnabled(hopoEnabled)
+ _showProgress(showProgress)
{
}
=====================================
src/polishing/bubble_processor.h
=====================================
@@ -37,6 +37,10 @@ private:
const int BUBBLES_CACHE = 100;
+ bool _verbose;
+ bool _showProgress;
+ bool _hopoEnabled;
+
const SubstitutionMatrix _subsMatrix;
const HopoMatrix _hopoMatrix;
const GeneralPolisher _generalPolisher;
@@ -50,7 +54,4 @@ private:
std::ifstream _bubblesFile;
std::ofstream _consensusFile;
std::ofstream _logFile;
- bool _verbose;
- bool _showProgress;
- bool _hopoEnabled;
};
=====================================
src/polishing/subs_matrix.cpp
=====================================
@@ -215,8 +215,12 @@ std::string HopoMatrix::obsToStr(HopoMatrix::Observation obs)
return result;
}*/
-HopoMatrix::HopoMatrix(const std::string& fileName)
+HopoMatrix::HopoMatrix(const std::string& fileName, bool hopoEnabled = true)
{
+ if (!hopoEnabled)
+ {
+ return;
+ }
for (size_t i = 0; i < NUM_HOPO_STATES; ++i)
{
_observationProbs.emplace_back(NUM_HOPO_OBS, probToScore(MIN_HOPO_PROB));
@@ -256,7 +260,7 @@ void HopoMatrix::loadMatrix(const std::string& fileName)
{
observationsFreq.push_back(std::vector<size_t>(NUM_HOPO_OBS, 0));
}
-
+
while (std::getline(fin, buffer))
{
if (buffer.empty()) continue;
=====================================
src/polishing/subs_matrix.h
=====================================
@@ -68,7 +68,7 @@ public:
};
typedef std::vector<Observation> ObsVector;
- HopoMatrix(const std::string& fileName);
+ HopoMatrix(const std::string& fileName, bool hopoEnabled);
AlnScoreType getObsProb(State state, Observation observ) const
{return _observationProbs[state.id][observ.id];}
AlnScoreType getGenomeProb(State state) const
View it on GitLab: https://salsa.debian.org/med-team/flye/-/compare/07ad497f1d2c7f7e038e33152e35ecccff95a188...8bddd17f26b33c0c81ce847fdea6fe088fed0ca4