[Debian-med-packaging] Bug#889623: kraken: failing autopkgtest, possibly broken package
Andreas Tille
tille at debian.org
Wed Feb 7 12:23:59 UTC 2018
Hi again,
the issue is most probably caused by a broken FASTA parser. With the
following simplification of the header lines in the test data
$ git diff
diff --git a/debian/tests/test_data/Acartia_tonsa.fasta b/debian/tests/test_data/Acartia_tonsa.fasta
index abcda65..011076b 100644
--- a/debian/tests/test_data/Acartia_tonsa.fasta
+++ b/debian/tests/test_data/Acartia_tonsa.fasta
@@ -1,4 +1,4 @@
->gi|441431932| Acartia tonsa copepod circovirus isolate 154_D11, complete genome
+>441431932
ACCTCGGCAACATCCGATCATCATATGATCAATTATGACTCATCCCGCGTGGAATTATGTAGCCAATGAA
ATCGCTCCATATTTCAAAAATTGAGTTTTCACAGTGGCCGCAATCTATAAAGCCGCGAGCGAAGCGAGCG
GTAGGCATTTTCAGTTTGACCAAAATGCCTAGCAACGCAACAACCAACCGAGCCCGTGGGTGGTGCTTCA
diff --git a/debian/tests/test_data/Acinetobacter_phage.fasta b/debian/tests/test_data/Acinetobacter_phage.fasta
index 317a94c..591f2e0 100644
--- a/debian/tests/test_data/Acinetobacter_phage.fasta
+++ b/debian/tests/test_data/Acinetobacter_phage.fasta
@@ -1,4 +1,4 @@
->gi|28173057| Acinetobacter phage AP205, complete genome
+>28173057
GGAGTGAACCCCGGAGGGGGTTCGCTGAAAGCCGAATCGAATTCGACTTTGCGTGATTCACATCACGTCT
TACTCACGATACTAGTACCGCGAGTTATCTTGTGGTAATTAAAAACTACCAGGAGATAACTTTATGAAGA
AAAGGACAAAAGCCTTGCTTCCCTATGCGGTTTTCATCATACTCAGCTTTCAACTAACATTGTTGACTGC
the test works again as expected:
$ sh /usr/share/doc/kraken/run-unit-test
Added "Acartia_tonsa.fasta" to library (test_db)
Added "Acinetobacter_phage.fasta" to library (test_db)
Kraken build set to minimize disk writes.
Creating k-mer set (step 1 of 6)...
Found jellyfish v1.1.11
Hash size not specified, using '5120'
K-mer set created. [0.029s]
Skipping step 2, no database reduction requested.
Sorting k-mer set (step 3 of 6)...
K-mer set sorted. [0.030s]
Skipping step 4, GI number to seqID map now obsolete.
Creating seqID to taxID map (step 5 of 6)...
3 sequences mapped to taxa. [0.009s]
Setting LCAs in database (step 6 of 6)...
Finished processing 3 sequences
Database LCAs set. [0.020s]
Database construction complete. [Total: 0.105s]
2 sequences (0.00 Mbp) processed in 0.000s (983.6 Kseq/m, 94.43 Mbp/m).
2 sequences classified (100.00%)
0 sequences unclassified (0.00%)
0.00 0 0 U 0 unclassified
Since we want to make sure our users are able to work with ordinary
FASTA files, I'll open an issue upstream and leave this bug open.
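As a stopgap until this is fixed, the header simplification from the diff
above could be scripted roughly as follows. This is only a sketch: the names
GI_HEADER, simplify_header and simplify_fasta are made up for illustration
and are not part of kraken or of the package, and the pattern only covers
the ">gi|NUMBER| description" form seen in the two test files.

```python
import re

# Legacy NCBI-style header as seen in the test data, e.g.
# ">gi|441431932| Acartia tonsa copepod circovirus ..."
# (hypothetical helper, not part of kraken)
GI_HEADER = re.compile(r"^>gi\|(\d+)\|")


def simplify_header(line):
    """Rewrite '>gi|NUMBER|...' to '>NUMBER'; return other lines unchanged."""
    match = GI_HEADER.match(line)
    if match is None:
        return line
    # Preserve the trailing newline if the input line had one.
    newline = "\n" if line.endswith("\n") else ""
    return ">" + match.group(1) + newline


def simplify_fasta(lines):
    """Apply simplify_header to every line of a FASTA file."""
    return [simplify_header(line) for line in lines]
```

Sequence lines and headers in any other format pass through untouched, so
running it over files that are already simplified should be harmless.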
Kind regards
Andreas.
--
http://fam-tille.de