[Debian-med-packaging] Bug#843621: mummer: serious bug?
Andreas Tille
tille at debian.org
Tue Nov 8 12:36:28 UTC 2016
Hi Ivan,
On Tue, Nov 08, 2016 at 01:20:23PM +0100, Andreas Tille wrote:
>
> When aligning and comparing two sequences using dnadiff, where one is an
> identical (or very similar) subset of another, we get strange results
> (however not on all genomes).
> For example, taking only the first chromosome of the S. cerevisiae S288c
> reference and comparing it to the entire reference:
> dnadiff saccharomyces_cerevisiae-S288C.fa chr1.fa
>
> Results (in version 3.23~dfsg-3):
> ...
> [Bases]
> TotalBases 12071326 230218
> AlignedBases 572185(4.74%) 51864(22.53%)
> UnalignedBases 11499141(95.26%) 178354(77.47%)
> ...
> AvgIdentity 97.65 97.65
> ...
>
> Whereas in version 3.23~dfsg-2 we get something that's more expected:
> ...
> [Bases]
> TotalBases 12071326 230218
> AlignedBases 572185(4.74%) 230218(100.00%)
> UnalignedBases 11499141(95.26%) 0(0.00%)
> ...
> AvgIdentity 100.00 100.00
> ...
That's a very helpful observation. My first suspicion is that the
patches I took over from mugsy, which ships a code copy of mummer, are
responsible for the difference. I'd like to wait for comments from the
Debian Med team. In any case, I can confirm that the version currently
in Debian testing (3.23+dfsg-1), which is the release candidate for the
next stable distribution, also reproduces the issue:
$ dnadiff saccharomyces_cerevisiae-S288C.fa chr1.fa
$ grep -A3 '^\[Bases\]' out.report
[Bases]
TotalBases 12071326 230218
AlignedBases 572185(4.74%) 51931(22.56%)
UnalignedBases 11499141(95.26%) 178287(77.44%)
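To compare reports from several package versions without eyeballing the grep output, the [Bases] block can be parsed programmatically. The following is only a minimal sketch: the function name `parse_bases_section` is made up for illustration, and it assumes the report keeps the whitespace-separated `count(percent%)` layout shown above.

```python
import re

# One report cell: an integer count, optionally followed by "(NN.NN%)".
_FIELD = re.compile(r"^(\d+)(?:\((\d+(?:\.\d+)?)%\))?$")


def parse_bases_section(report_text):
    """Return {row_name: [(count, percent_or_None), ...]} for the [Bases] block.

    Hypothetical helper, not part of mummer; assumes the layout quoted above.
    """
    rows = {}
    in_bases = False
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new section header starts; only "[Bases]" is of interest.
            in_bases = stripped == "[Bases]"
            continue
        if not in_bases or not stripped:
            continue
        name, *values = stripped.split()
        parsed = []
        for value in values:
            match = _FIELD.match(value)
            if match is None:
                break  # unexpected cell format: skip this row entirely
            percent = float(match.group(2)) if match.group(2) else None
            parsed.append((int(match.group(1)), percent))
        else:
            rows[name] = parsed
    return rows


# The numbers reported above for 3.23+dfsg-1.
sample = """[Bases]
TotalBases 12071326 230218
AlignedBases 572185(4.74%) 51931(22.56%)
UnalignedBases 11499141(95.26%) 178287(77.44%)
"""

bases = parse_bases_section(sample)
query_total = bases["TotalBases"][1][0]
query_aligned = bases["AlignedBases"][1][0]
# chr1 is a subset of the reference, so anything short of the full
# length indicates the problem described in this report.
fully_aligned = query_aligned == query_total
```

With the numbers above, `fully_aligned` is false, whereas the 3.23~dfsg-2 output quoted earlier would yield true.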
> We found this to happen in de novo assemblies of the W303 PacBio dataset,
> as well as on a C. elegans dataset, but not on E. coli.
>
> I'm attaching the sample reference I used in the example above.
That's very helpful for reproducing the issue.
> P.S. The problem does not happen with the official Mummer 3.23 SourceForge
> code.
That is also a helpful hint. It points in the same direction as my
assumption that the code diff between 3.23~dfsg-2 and 3.23~dfsg-3 might
be responsible and should most probably be reverted.
Kind regards
Andreas.
--
http://fam-tille.de