[med-svn] [Git][med-team/prank][upstream] New upstream version 250331

Fri Sep 5 05:52:19 BST 2025


Charles Plessy pushed to branch upstream at Debian Med / prank


Commits:
6c01e227 by Charles Plessy at 2025-09-05T11:44:58+09:00
New upstream version 250331
- - - - -


12 changed files:

- + VERSION_HISTORY
- src/Makefile
- src/ancestralsequence.cpp
- src/ancestralsequence.h
- src/boolmatrix.h
- src/dbmatrix.h
- src/flmatrix.h
- src/intmatrix.h
- src/mafft_alignment.cpp
- + src/prank.1
- src/prank.cpp
- src/progressivealignment.h


Changes:

=====================================
VERSION_HISTORY
=====================================
@@ -0,0 +1,142 @@
+v.250331
+Fixes by Michael Hiller
+- nuc vs. aa detection and information passed to Mafft
+Fixes by Martin Larralde
+- speed ups by removal of unnecessary calls
+
+
+v.170703
+Many changes not previously uploaded to GitHub, for example:
+- guide tree inference with FastTree 
+- ancestral states with RAxML
+- possible re-estimation of branch lengths
+ 
+
+v.170427
+Changes by Nikolai Hecker
+- make the temp files thread safe
+- fix a compile issue with GCC 6.2.0
+- reduce the information written to stdout; disable with "-verbose"
+
+
+v.150803
+- force linear order of alignment anchors
+- convert multifurcating trees to bifurcating ones
+- finish despite BppAncestor failing
+
+
+v.140603
+- now avoids crashes due to weird sequence names
+- now (codon) ancestral reconstruction also for translated alignments
+- new option '-njgaps' to consider gaps as mismatches for NJ distances
+
+
+v.140110
+- a fix for the external tool paths
+
+
+v.131211
+- search order for external tools changed: use own one first
+- '-nobbpa' to run without BppAncestors (even when available)
+- a fix for rare crashes due to overly long branch lengths
+
+
+v.131119
+- ancestor inference for translated alignments of DNA sequences
+- new option '-treeonly'
+
+
+v.130820
+- disabled computation of alignment score for the guide tree alignment
+- more information about iteration with a user-provided tree
+- workaround for a BppAncestor bug causing incomplete last codon
+- path to other binaries no more affected by renaming of the program file
+
+
+v.130708
+- Significant bug fixes -- update *strongly* recommended:
+ * option -F was mistakenly turned off by the new iterative approach
+ * without option -F, ancestor reconstruction (and scoring) were incorrect
+
+
+v.130410
+- More information about optimization score and fix for last alignment
+- Minor fixes on alignment conversion and use of external models
+
+
+v.130129
+- Introduced alignment score and automatic iteration to maximise the score
+- Changed interface for the analysis input and output
+- New output: inferred evolutionary events per branch
+
+
+v.121218
+- More detailed information about unmatching names. New option "-prunedata".
+
+
+v.121212
+- Support for some NHX tags.
+
+
+v.121210
+- Fixed underflow errors affecting ancestral reconstruction of large
+  alignments.
+
+
+v.121018
+- Ancestral sequences can differently indicate insertions and deletions.
+- Can update an alignment, recomputing nodes with tag "[&&NHX:XN=realign]"
+
+
+v.121002
+- Alignment merge now accepts trees such as "(t1:#.#,t2:#.#);".
+  Provided with "-t=filename" or "-tree=tree-string".
+
+
+v.120827
+- All files now under the GPL licence.
+
+
+v.120814
+- Can now also merge "alignments" of one sequence
+- New option '-mergedist=#' to define the distance for two alignments
+
+
+v.120717
+- All input data now converted to upper-case.
+
+
+v.120716
+- Fixed the translated alignment (been broken in recent clean up)
+- Fixed the output order of ancestral sequences
+
+
+v.120712
+- For codon alignment, MAFFT guide tree now with protein sequences
+  (fixes several issues with codon alignment)
+
+
+v.120626
+- Guide tree estimation from a MAFFT alignment
+- Merge of two pre-defined alignments
+- Support for Exonerate and MAFFT on Windows
+- Clean up of some code
+
+
+v.111130
+- Exonerate anchoring now also for guidetree computation. Experimental!
+
+
+v.111129
+- Allow guide trees with no branch lengths. Default branch length is 0.1;
+  use -fixedbranches=# to change.
+- Removed the dependency to boost libraries.
+
+
+v.111013
+- First update in Google Code
+- Alignment speed ups with Exonerate anchoring.
+
+
+v.101018
+- Last version before migration to Google Code


=====================================
src/Makefile
=====================================
@@ -6,7 +6,7 @@
 
 CC            = gcc
 CXX           = g++
-DEFINES       = 
+DEFINES       = -DNDEBUG
 CFLAGS        = -m64 -pipe -O3 $(DEFINES)
 CXXFLAGS      = -m64 -pipe -O3 $(DEFINES)
 INCPATH       = -I. -I/usr/include
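
Note on the DEFINES change above: passing -DNDEBUG on the compiler command line, rather than through a per-header #define, applies it to every translation unit. Assuming the build uses these flags, the assert() bounds checks in the matrix accessors then compile to nothing. The sketch below is a simplified, hypothetical accessor showing the pattern. The class name CheckedMatrix and its members are illustrative and are not PRANK's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D matrix with assert-guarded access (illustrative only).
// Built without -DNDEBUG, an out-of-range index aborts at the assert;
// built with -DNDEBUG, the asserts expand to nothing and no check is made.
class CheckedMatrix {
public:
    CheckedMatrix(int rows, int cols)
        : x(rows), y(cols),
          data(static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols), 0.0) {}

    // Getter, named g() after the accessors in the headers of this commit.
    double g(int xa, int ya = 0) const {
        assert(xa >= 0 && xa < x);
        assert(ya >= 0 && ya < y);
        return data[index(xa, ya)];
    }

    // Setter (hypothetical name s()).
    void s(double v, int xa, int ya = 0) {
        assert(xa >= 0 && xa < x);
        assert(ya >= 0 && ya < y);
        data[index(xa, ya)] = v;
    }

private:
    // Row-major offset of element (xa, ya).
    std::size_t index(int xa, int ya) const {
        return static_cast<std::size_t>(xa) * static_cast<std::size_t>(y)
             + static_cast<std::size_t>(ya);
    }

    int x, y;
    std::vector<double> data;
};
```

The trade-off is speed against safety: a release build skips the checks on every access, while a debug build (DEFINES left empty) keeps them.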


=====================================
src/ancestralsequence.cpp
=====================================
@@ -574,23 +574,6 @@ void AncestralSequence::cleanSpace()
 }
 
 
-// Get the probability of characters in different structures given the tree.
-// Takes into account the phylogeny.
-//
-double AncestralSequence::mlCharProbAt(int j,int i,int k)
-{
-    return mlCharProb->g(k,j,i);
-}
-
-// Same for insertions skipped
-//
-double AncestralSequence::mlCharProbAtF(int j,int i,int k)
-{
-    realIndex->g(i);
-
-    return mlCharProb->g(k,j,realIndex->g(i));
-}
-
 void AncestralSequence::writeSequence(string name)
 {
     char str[10];


=====================================
src/ancestralsequence.h
=====================================
@@ -177,9 +177,20 @@ public:
     void setChildGaps(Sequence *l,Sequence *r);
     void setRealIndex(bool left);
 
-    double mlCharProbAt(int j,int i,int k);
-    double mlCharProbAtF(int j,int i,int k);
+    // Get the probability of characters in different structures given the tree.
+    // Takes into account the phylogeny.
+    //
+    double mlCharProbAt(int j,int i,int k)
+    {
+        return mlCharProb->g(k,j,i);
+    }
 
+    // Same for insertions skipped
+    //
+    double mlCharProbAtF(int j,int i,int k)
+    {
+        return mlCharProb->g(k,j,realIndex->g(i));
+    }
 
     void cleanSpace();
 


=====================================
src/boolmatrix.h
=====================================
@@ -58,7 +58,7 @@ public:
     void initialise(int v = 0);
 
     int g(int xa, int ya=0, int za = 0, int wa = 0)
-    { /**/
+    { /*
         if (!(xa>=0&&ya>=0&&za>=0&&wa>=0&&xa<x&&ya<y&&za<z&&wa<w))std::cout<<name<<" "<<xa<<" "<<ya<<" "<<za<<" "<<wa<<std::endl;/**/
         assert(xa>=0);
         assert(xa<x);


=====================================
src/dbmatrix.h
=====================================
@@ -20,8 +20,6 @@
 #ifndef DBMATRIX_H
 #define DBMATRIX_H
 
-#define NDEBUG
-
 #ifndef RFOR
 #define RFOR(i,n) for(i=n; i>=0; i--)
 #endif
@@ -61,7 +59,7 @@ public:
     void initialise(double v = 0);
 
     double g(int xa, int ya=0, int za = 0, int wa = 0)
-    { /**/
+    { /*
         if (!(xa>=0&&ya>=0&&za>=0&&wa>=0&&xa<x&&ya<y&&za<z&&wa<w))std::cout<<name<<" "<<xa<<" "<<ya<<" "<<za<<" "<<wa<<std::endl;/**/
         assert(xa>=0);
         assert(xa<x);


=====================================
src/flmatrix.h
=====================================
@@ -20,8 +20,6 @@
 #ifndef FLMATRIX_H
 #define FLMATRIX_H
 
-// #define NDEBUG
-
 #ifndef RFOR
 #define RFOR(i,n) for(i=n; i>=0; i--)
 #endif
@@ -61,7 +59,7 @@ public:
     void initialise(float v = 0);
 
     float g(int xa, int ya=0, int za = 0, int wa = 0)
-    { /**/
+    { /*
         if (!(xa>=0&&ya>=0&&za>=0&&wa>=0&&xa<x&&ya<y&&za<z&&wa<w))std::cout<<name<<" "<<xa<<" "<<ya<<" "<<za<<" "<<wa<<std::endl;/**/
         assert(xa>=0);
         assert(xa<x);


=====================================
src/intmatrix.h
=====================================
@@ -20,8 +20,6 @@
 #ifndef INTMATRIX_H
 #define INTMATRIX_H
 
-// #define NDEBUG
-
 #ifndef RFOR
 #define RFOR(i,n) for(i=n; i>=0; i--)
 #endif
@@ -60,7 +58,7 @@ public:
     void initialise(int v = 0);
 
     int g(int xa, int ya=0, int za = 0, int wa = 0)
-    { /**/
+    { /*
         if (!(xa>=0&&ya>=0&&za>=0&&wa>=0&&xa<x&&ya<y&&za<z&&wa<w))std::cout<<name<<" "<<xa<<" "<<ya<<" "<<za<<" "<<wa<<std::endl;/**/
         assert(xa>=0);
         assert(xa<x);


=====================================
src/mafft_alignment.cpp
=====================================
@@ -97,7 +97,11 @@ void Mafft_alignment::align_sequences(vector<string> *names,vector<string> *sequ
     m_output.close();
 
     stringstream command;
-    command << mafftpath<<"mafft "<<tmp_dir<<"/m"<<r<<".fas 2> /dev/null";
+    if (PROTEIN) {
+        command << mafftpath<<"mafft --amino "<<tmp_dir<<"/m"<<r<<".fas 2> /dev/null";
+    }else{
+        command << mafftpath<<"mafft "<<tmp_dir<<"/m"<<r<<".fas 2> /dev/null";
+    }
     if(NOISE>0)
         cout<<"cmd: "<<command.str()<<endl;
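
Note on the hunk above: the two branches differ only in the --amino flag, so the command could also be assembled in one place. The following is a minimal sketch under that assumption. The function name buildMafftCommand and its parameters are hypothetical, and the int type of r is an assumption, since the diff does not show its declaration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of PRANK): builds the MAFFT command line,
// adding --amino only when the input is known to be protein.
std::string buildMafftCommand(const std::string &mafftpath,
                              const std::string &tmpDir,
                              int r,
                              bool protein)
{
    std::stringstream command;
    command << mafftpath << "mafft ";
    if (protein) {
        command << "--amino ";   // tell MAFFT the input is amino acid
    }
    // Same temp-file naming and stderr redirection as in the diff.
    command << tmpDir << "/m" << r << ".fas 2> /dev/null";
    return command.str();
}
```

For example, buildMafftCommand("", "/tmp", 7, true) yields "mafft --amino /tmp/m7.fas 2> /dev/null".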
 


=====================================
src/prank.1
=====================================
@@ -0,0 +1,230 @@
+.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.29)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" Set up some character translations and predefined strings.  \*(-- will
+.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
+.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
+.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
+.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
+.\" nothing in troff, for use with C<>.
+.tr \(*W-
+.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
+.ie n \{\
+.    ds -- \(*W-
+.    ds PI pi
+.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
+.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
+.    ds L" ""
+.    ds R" ""
+.    ds C` ""
+.    ds C' ""
+'br\}
+.el\{\
+.    ds -- \|\(em\|
+.    ds PI \(*p
+.    ds L" ``
+.    ds R" ''
+.    ds C`
+.    ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el       .ds Aq '
+.\"
+.\" If the F register is turned on, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD.  Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{
+.    if \nF \{
+.        de IX
+.        tm Index:\\$1\t\\n%\t"\\$2"
+..
+.        if !\nF==2 \{
+.            nr % 0
+.            nr F 2
+.        \}
+.    \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "PRANK 1"
+.TH PRANK 1 "2017-04-27" "v.121211" "The Probabilistic Alignment Kit"
+.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH "NAME"
+prank \- Computes probabilistic multiple sequence alignments
+.SH "SYNOPSIS"
+.IX Header "SYNOPSIS"
+\&\fBprank\fR \fIsequence_file\fR
+.PP
+\&\fBprank\fR [optional parameters] \-d=\fIsequence_file\fR [optional parameters]
+.SH "DESCRIPTION"
+.IX Header "DESCRIPTION"
+The Probabilistic Alignment Kit (\s-1PRANK\s0) is a probabilistic multiple alignment
+program for \s-1DNA,\s0 codon and amino-acid sequences. It's based on a novel algorithm
+that treats insertions correctly and avoids over-estimation of the number of
+deletion events.
+.PP
+In addition, \s-1PRANK\s0 borrows ideas from maximum likelihood methods used in
+phylogenetics and correctly takes into account the evolutionary distances
+between sequences. Lastly, \s-1PRANK\s0 allows for defining a potential structure for
+sequences to be aligned and then, simultaneously with the alignment, predicts
+the locations of structural units in the sequences.
+.SH "OPTIONS"
+.IX Header "OPTIONS"
+.SS "\s-1INPUT/OUTPUT PARAMETERS\s0"
+.IX Subsection "INPUT/OUTPUT PARAMETERS"
+.IP "\fB\-d=\f(BIsequence_file\fB\fR" 8
+.IX Item "-d=sequence_file"
+The input sequence file in \s-1FASTA\s0 format.
+.IP "\fB\-t=\f(BItree_file\fB\fR" 8
+.IX Item "-t=tree_file"
+The tree file to use. If unset, an approximated \s-1NJ\s0 tree is generated.
+.IP "\fB\-o=\f(BIoutput_file\fB\fR" 8
+.IX Item "-o=output_file"
+Set the name of the output file. If unset, \fIoutput_file\fR is set to \fBoutput\fR.
+.IP "\fB\-f=\f(BIoutput_format\fB\fR" 8
+.IX Item "-f=output_format"
+Set the output format. \fIoutput_format\fR can be one of \fBfasta\fR (default),
+\&\fBphylipi\fR, \fBphylips\fR, \fBpaml\fR, or \fBnexus\fR.
+.IP "\fB\-m=\f(BImodel_file\fB\fR" 8
+.IX Item "-m=model_file"
+The model file to use. If unset, \fImodel_file\fR is set to \fB\s-1HKY2/WAG\s0\fR.
+.IP "\fB\-support\fR" 8
+.IX Item "-support"
+Compute posterior support.
+.IP "\fB\-showxml\fR" 8
+.IX Item "-showxml"
+Output alignment xml-file.
+.IP "\fB\-showtree\fR" 8
+.IX Item "-showtree"
+Output alignment guidetree.
+.IP "\fB\-showanc\fR" 8
+.IX Item "-showanc"
+Output ancestral sequences.
+.IP "\fB\-showall\fR" 8
+.IX Item "-showall"
+Output all of these.
+.IP "\fB\-noanchors\fR" 8
+.IX Item "-noanchors"
+Do not use Exonerate anchoring. (Exonerate to be installed separately.)
+.IP "\fB\-nomafft\fR" 8
+.IX Item "-nomafft"
+Do not use \s-1MAFFT\s0 for guide tree. (\s-1MAFFT\s0 to be installed separately.)
+.IP "\fB\-njtree\fR" 8
+.IX Item "-njtree"
+Estimate tree from an input alignment (and realign).
+.IP "\fB\-shortnames\fR" 8
+.IX Item "-shortnames"
+Truncate names at first space character.
+.IP "\fB\-quiet\fR" 8
+.IX Item "-quiet"
+Reduce output.
+.SS "\s-1ALIGNMENT MERGE\s0"
+.IX Subsection "ALIGNMENT MERGE"
+.IP "\fB\-d1=\f(BIalignment_file\fB\fR" 8
+.IX Item "-d1=alignment_file"
+The first input alignment file in \s-1FASTA\s0 format.
+.IP "\fB\-d2=\f(BIalignment_file\fB\fR" 8
+.IX Item "-d2=alignment_file"
+The second input alignment file in \s-1FASTA\s0 format.
+.IP "\fB\-t1=\f(BItree_file\fB\fR" 8
+.IX Item "-t1=tree_file"
+The tree file for the first alignment. If unset, an approximated \s-1NJ\s0 tree is generated.
+.IP "\fB\-t2=\f(BItree_file\fB\fR" 8
+.IX Item "-t2=tree_file"
+The tree file for the second alignment. If unset, an approximated \s-1NJ\s0 tree is generated.
+.SS "\s-1MODEL PARAMETERS\s0"
+.IX Subsection "MODEL PARAMETERS"
+.IP "\fB\-F\fR, \fB+F\fR" 8
+.IX Item "-F, +F"
+Force insertions to be always skipped.
+.IP "\fB\-gaprate=\f(BI#\fB\fR" 8
+.IX Item "-gaprate=#"
+Set the gap opening rate. The default is \fB0.025\fR for \s-1DNA\s0 and \fB0.005\fR for
+proteins.
+.IP "\fB\-gapext=\f(BI#\fB\fR" 8
+.IX Item "-gapext=#"
+Set the gap extension probability. The default is \fB0.75\fR for \s-1DNA\s0 and \fB0.5\fR for
+proteins.
+.IP "\fB\-codon\fR" 8
+.IX Item "-codon"
+Use empirical codon model for coding \s-1DNA.\s0
+.IP "\fB\-DNA\fR, \fB\-protein\fR" 8
+.IX Item "-DNA, -protein"
+Use \s-1DNA\s0 or protein model, respectively. Disables auto-detection of model.
+.IP "\fB\-termgap\fR" 8
+.IX Item "-termgap"
+Penalise terminal gaps normally.
+.IP "\fB\-nomissing\fR" 8
+.IX Item "-nomissing"
+No missing data. Use \fB\-F\fR for terminal gaps.
+.IP "\fB\-keep\fR" 8
+.IX Item "-keep"
+Do not remove gaps from pre-aligned sequences.
+.SS "\s-1OTHER PARAMETERS\s0"
+.IX Subsection "OTHER PARAMETERS"
+.IP "\fB\-iterate=#\fR" 8
+.IX Item "-iterate=#"
+Rounds of re-alignment iteration; by default, iterate five times and keep the best result.
+.IP "\fB\-once\fR" 8
+.IX Item "-once"
+Run only once. Same as \-iterate=1.
+.IP "\fB\-prunetree\fR" 8
+.IX Item "-prunetree"
+Prune guide tree branches with no sequence data.
+.IP "\fB\-prunedata\fR" 8
+.IX Item "-prunedata"
+Prune sequence data with no guide tree leaves.
+.IP "\fB\-uselogs\fR" 8
+.IX Item "-uselogs"
+Slower but should work for a greater number of sequences.
+.IP "\fB\-translate\fR" 8
+.IX Item "-translate"
+Translate input data to protein sequences.
+.IP "\fB\-mttranslate\fR" 8
+.IX Item "-mttranslate"
+Translate input data to protein sequences using mt table.
+.IP "\fB\-convert\fR" 8
+.IX Item "-convert"
+Do not align, just convert to a different format.
+.IP "\fB\-dna=\f(BIdna_sequence_file\fB\fR" 8
+.IX Item "-dna=dna_sequence_file"
+\&\s-1DNA\s0 sequence file for backtranslation of protein alignment.
+.IP "\fB\-help\fR" 8
+.IX Item "-help"
+Show an extended help page with more options.
+.IP "\fB\-version\fR" 8
+.IX Item "-version"
+Show version and check for updates.
+.SH "AUTHORS"
+.IX Header "AUTHORS"
+\&\fBprank\fR was written by Ari Loytynoja.
+.PP
+This manual page was originally written by Manuel Prinz <manuel at debian.org> for
+the Debian project (and may be used by others).


=====================================
src/prank.cpp
=====================================
@@ -38,7 +38,7 @@ bool verbose = false;
 
 int main(int argc, char *argv[])
 {
-    version = 170703;
+    version = 250331;
 
     readArguments(argc, argv);
     int time1 = time(0);


=====================================
src/progressivealignment.h
=====================================
@@ -562,23 +562,23 @@ private:
         }
 
 
-        if (DNA && !isDna)
+        if (DNA && !(*isDna))
         {
             cout<<"Warning autodetection suggests protein but DNA model forced.\n";
             *isDna = true;
         }
-        if (CODON && !isDna)
+        if (CODON && !(*isDna))
         {
             cout<<"Warning autodetection suggests protein but codon model forced.\n";
             *isDna = true;
         }
-        if (PROTEIN && isDna)
+        if (PROTEIN && *isDna)
         {
             cout<<"Warning autodetection suggests DNA but protein model forced.\n";
             *isDna = false;
         }
 
-        if(isDna)
+        if(*isDna)
         {
             DNA = true;
             PROTEIN = false;



View it on GitLab: https://salsa.debian.org/med-team/prank/-/commit/6c01e227dd0fbcfe31d778019107386e57dc75e8
