[Debian-med-packaging] Bug#1042769: provean: incompatible with cd-hit >= 4.8.1-4
Andrius Merkys
merkys at debian.org
Wed Apr 2 09:14:20 BST 2025
A note for posterity:
I solved this bug by comparing the outputs of each tool used in provean
(cdhit and blastdbcmd) between the last known working environment
(Ubuntu Xenial) and Ubuntu Focal (where it is broken). After preparing
a patch, I also checked it on Debian Sid.
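For anyone repeating this kind of comparison, here is a minimal sketch of how the captured output of one tool from two environments could be diffed. It is not the exact procedure used for this bug: the function name, the labels and the sample strings are placeholders, and it assumes the output of each tool was saved as text beforehand.

```python
import difflib


def diff_tool_outputs(working_text, broken_text,
                      working_label="xenial", broken_label="focal"):
    """Return unified diff lines between two captured tool outputs.

    The labels are hypothetical names for the two environments; an
    empty list means the outputs are identical.
    """
    return list(difflib.unified_diff(
        working_text.splitlines(),
        broken_text.splitlines(),
        fromfile=working_label,
        tofile=broken_label,
        lineterm="",
    ))


if __name__ == "__main__":
    # Placeholder strings, not real cdhit or blastdbcmd output.
    old = "first line\nsecond line\n"
    new = "first line\nsecond line changed\n"
    for line in diff_tool_outputs(old, new):
        print(line)
```

Running this once per tool narrows down which step changed its output format between releases.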
Correct output with nr [1]:
$ provean -q ~/debian-packages/provean/examples/P04637.fasta \
    -v ~/debian-packages/provean/examples/P04637.var \
    --psiblast psiblast --cdhit cdhit --blastdbcmd blastdbcmd -d nr
## PROVEAN v1.1 output ##
## Input Arguments ##
/home/andrius/debian-packages/provean/src/provean -q /home/andrius/debian-packages/provean/examples/P04637.fasta -v /home/andrius/debian-packages/provean/examples/P04637.var --psiblast psiblast --cdhit cdhit --blastdbcmd blastdbcmd -d nr
## Parameters ##
# Query sequence file: /home/andrius/debian-packages/provean/examples/P04637.fasta
# Variation file: /home/andrius/debian-packages/provean/examples/P04637.var
# Protein database: nr
# Supporting sequence set file (optional): Not provided
# Supporting sequence set file for storing (optional): Not provided
# Substitution matrix: BLOSUM62
# Gap costs: 10, 1
# Clustering threshold: 0.750
# Maximum number of clusters: 30
[11:05:16] loading query sequence from a FASTA file...
[11:05:16] loading variations...
[11:05:16] searching related sequences...
[11:07:20] retrieving subject sequence information...
[11:07:20] clustering subject sequences...
[11:07:20] selecting clusters...
[11:07:20] 363 subject sequences in 30 clusters were selected for supporting sequences.
[11:07:20] loading subject sequences from a FASTA file...
# Number of clusters: 30
# Number of supporting sequences used: 363
[11:07:20] computing delta alignment scores...
[11:07:22] printing PROVEAN scores...
## PROVEAN scores ##
# VARIATION        SCORE
P72R               0.155
G105C              -8.019
K370del            -1.545
H178_H179insPHP    -10.293
L22_W23delinsQS    -10.399
Note on autopkgtest:
Maybe we can build a smaller database or take a subset of nr?
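One way the subset idea could look, as a rough sketch only: filter a FASTA dump of the database down to a chosen set of sequence IDs. The function name and the sample records are made up for illustration, and I have not checked whether provean gives meaningful scores against such a reduced database.

```python
def subset_fasta(lines, wanted_ids):
    """Yield the lines of FASTA records whose ID is in wanted_ids.

    The ID is taken to be the first whitespace-delimited token after
    '>' on the header line (an assumption about the header layout).
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            keep = bool(header) and header[0] in wanted_ids
        if keep:
            yield line


if __name__ == "__main__":
    # Made-up records, not taken from nr.
    records = [">seq1 example\n", "MEEP\n", ">seq2\n", "QSDP\n"]
    print("".join(subset_fasta(records, {"seq2"})), end="")
```

The filtered FASTA would then still need to be turned into a BLAST database (for example with makeblastdb) before provean could use it via -d.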
Andrius
[1] ftp://ftp.jcvi.org/data/provean/nr_Aug_2011/