[med-svn] [Git][med-team/paml][upstream] New upstream version 4.9j+dfsg

Steffen Möller gitlab at salsa.debian.org
Sun Nov 10 18:02:55 GMT 2019



Steffen Möller pushed to branch upstream at Debian Med / paml


Commits:
cdd16ab7 by Steffen Moeller at 2019-11-10T17:54:43Z
New upstream version 4.9j+dfsg
- - - - -


9 changed files:

- doc/pamlHistory.txt
- − doc/pamlHistory.txt~
- − paml4.9i.tgz
- src/mcmctree.c
- src/paml.h
- − src/pll_avx.c
- − src/pll_sse.c
- src/tools.c
- src/treesub.c


Changes:

=====================================
doc/pamlHistory.txt
=====================================
@@ -8,6 +8,31 @@ discussion site
 https://groups.google.com/forum/#!forum/pamlsoftware.
 
 
+
+Version 4.9j, October 2019
+
+(*) mcmctree, initial values have ages that are too old, exceeding max
+bounds.  In theory bounds in mcmctree are always soft so that the node
+ages will move to the area of posterior mode when burnin is long
+enough.  In practice the poor starting values are problematic and
+requires long burn-in.  I have rewritten the code for generating
+initial values to respect the min and max bounds specified in the
+fossil calibrations.
+
+(*) codeml, clade labels in the tree file.  A bug introduced in 4.9i
+caused the clade labels ($) to be ignored.  This affects the branch,
+branch-site and clade models.  If your tree has branch labels (#) only
+and no clade models, everything will be fine.  If you have used the
+clade label ($) in the tree with or without the branch label (#),
+either the program will abort or the results will be incorrect.  The
+clade label ($) is supposed to label all branches in the clade as well
+as the branch itself, but all clade labels in the tree are ignored in
+4.9i.  Earlier versions are correct.
+
+Thanks to Yonghua Wu for pointing out the bug.
+
+
+
 Version 4.9i, March 2019
 
 (*) mcmctree: I have added an option (duplication = 1) for dating a


=====================================
doc/pamlHistory.txt~ deleted
=====================================
@@ -1,1969 +0,0 @@
-History and bug fixes of PAML (Phylogenetic Analysis by Maximum Likelihood)
-
-Ziheng Yang
-
-
-If you find problems or have questions, please visit the PAML
-discussion site
-https://groups.google.com/forum/#!forum/pamlsoftware.
-
-
-Version 4.9i, February 2019
-
-(*) mcmctree: I have added an option (duplication = 1) for dating a
-tree with both speciations and gene duplications, so that some nodes
-on the tree share divergence times.  Nodes sharing ages are identified
-using labels in the tree file: #1, #2, ....  I have yet to update the
-document about specification of the model.
-
-(*) mcmctree: The TipDate option was written for one locus or
-partition and never worked for more than two loci/partitions.  I have
-edited the code so that it works for multiple partitions, some of
-which may be molecular and the others morphological.
-
-(*) codeml: The option estFreq = 0 when codonFreq = 6 (FMutSel0) and 7
-(FMutSel) is not working in versions 4.9g and 4.9h.  This is fixed
-now.  This option uses the observed codon or amino acid frequencies
-for the mutation-selection models of codon usage.  Instead the program
-estimates the frequencies using maximum likeihood, which is what the
-option estFreq = 1 does.  Look at the README file in the
-examples/mtCDNAape/ folder.
-
-(*) codeml clade model D: The bounds for the w (dN/dS) ratios in the
-first site classes are set tp (0.0001, 0.5) for w0 and (0.5, 1.5) for
-w1, in versions 4.9b,c,d,e,f,g, since I added the BEB calculation for
-clade model D in 4.9b.  The motivation for the bounds is that site
-class 0 represents strong purifying selection with a small w0, while
-site class 1 should include sites under weak purifying selection with
-a larger w1.  However the bounds are arbitrary.  In some datasets, the
-MLEs are found to be at the bounds, making the interpretation awkward.
-I have changed the bounds to the following: 
-w0b[]={0.0001, 1.0}, w1b[]={0.01, 1.5}.
-This means that the user should swap the estimates of w0 and w2 if w0 > w1.
-
-
-
-Version 4.9h, March 2018
-
-(*) mcmctree: gamma-Dirichlet versus conditional i.i.d. priors for
-rates for loci.  Since 4.9d, the program and the documentation are
-inconsistent about the two priors, and which value (0 or 1) means
-which prior.  I have now checked the program and the documentation to
-make sure that they are consistent:
-
-prior = 0: gamma-Dirichlet (dos Reis 2014).  This is the default.
-prior = 1: conditional i.i.d. prior (Zhu et al. 2015).
-
-I believe these two are similar especially if the number of loci
-(partitions) is large, but no serious comparisons between the two
-priors have been published.
-
-Thanks to Adnan Moussalli for pointing out the errors.
-
-(*) codeml.  It was discovered that the mechanistic amino acid
-substitution model implemented in Yang et al. (1998; see table 3),
-specified by seqtype = 2 model = 6, has been broken for a long time,
-since version 3.0 (2000) at least.  Version 2.0 (1999) seems to be
-correct.  This means that the model become broken soon since it was
-published.  I have now fixed this.
-
-This model of amino acid substitution starts from a Markov chain for
-codons and then aggregate the states and merge the synonymous codons
-into one state (the coded amino acid).  This is an approximate
-formulation since the process after state aggregation is not Markovian
-anymore.
-
-I have now added another codon-based amino acid substitution model
-that treats amino acids as ambiguities codons.  The model is specified
-by seqtype = 2 model = 5.  This is an exact formulation.
-
-
-(*) codeml.  The number of categories in the BEB calculation under M2
-and M8 is unintentionally set to 4 rather than 10.  I have changed
-this back to 10.  The details of this calculation are in Yang et
-al. 2005 MBE.
-
-
-Version 4.9g, December 2017
-
-(*) codeml.  A bug caused the BEB calculation under the site model M8
-(NSsites = 8) to be incorrect, with the program printing out warming
-messages like "strange: f[ 5] = -0.0587063 very small."  This bug was
-introduced in version 4.9b and affects versions 4.9b-f.  A different
-bug was introduced in version 4.9f that causes the log likelihood
-function under the site model M8 (NSsites = 8) to be calculated
-incorrectly.  These are now fixed.
-
-
-
-Version 4.9f, October 2017
-
-(*) baseml, nonhomogeneous models (nhomo & fix_kappa).  Those models
-allow different branches on the tree to have different Q matrices.
-Roughly nhomo controls the base frequency parameters while gix_kappa
-controls kappa or the exchangeability parameters (a b c d e in
-GTR/REV, for example).  I added the option (nhomo = 5, fix_kappa = 2),
-which lets the user to define branch types, so that branches of the
-same type have the same exchangeability parameters (a b c d e for GTR)
-and base composition parameters, while branches of different types
-have different parameters.  Branch types are labeled (using # and $),
-0, 1, 2, ....  The labels should be consecutive positive integers.
-The old options nhomo = 3 or 4 work for some models like GTR, but not
-some other models which also have base composition parameters.  In
-this update, I think those options should work with all those models.
-I have also edited the documentation (look for option variable nhomo
-for baseml).
-
-
-(*) baseml & codeml.  i added an option fix_blength = 3
-(proportional), which means that branch lengths will be proporational
-to those given in the tree file, and the proportionality factor is
-estimated by ML.
-
-(*) codeml.  The program does not count the parameters correctly for
-model M0 when fix_kappa = 1.  The bug was introduced in version 4.9c
-and affects versions 4.9c-e.  This is now fixed.
-
-(*) codeml (seqtype = 2 model = 2).  If you are analyzing multiple
-protein data sets (ndata > 1) under the empirical models such as wag,
-jtt, dayhoff.  The results for the first data set are correct, but all
-later data sets are analyzed incorrectly under the corresponding +F
-models, that is, wag+F, jtt+F, dayhoff+F, etc.  A bug in the program
-means that for the second and later data sets, the equilibrium amino
-acid frequencies are taken from the real data and not correctly set to
-those specified by the empirical models.  I note that this bug was
-recorded in the update Version 3.14b, April 2005, but it was somehow
-not fixed, even in that version.  This is now fixed.  Thanks to Nick
-Goldman for reporting this again.
-
-(*) evolver (options 5, 6, 7 for simulating nucleotide, codon and
-amino acid alignments).  If you choose the option of printing out the
-site pattern counts instead of the sequences (specified at the
-beginning of the control file such as MCbase.dat), and if you are
-simulating two or more alignments, the program crashes after finishing
-the first alignment.  This is now fixed.
-
-(*) mcmctree.  The program crashes if you have a mixture of
-morphological loci and molecular loci, if not all the morphological
-loci are before the molecular loci.  I have now fixed this.  
-I think this was never described anyway.
-
-
-
-Version 4.9e, March 2017
-
-(*) Edited the readme files to change the license to GPL.
-
-(*) mcmctree.  A bug was introduced in version 4.9b which causes the
-program to read the fossil calibration information in the tree file
-incorrectly, if joint (minimum and maximum) bounds are specified using
-the symbol '<' and '>'.  If you use the notation "B()", "L()", and
-'U()', the information is read correctly.  This bug was introduced in
-version 4.9b and exists in 4.9c and 4.9d.  Versions 4.9a and earlier
-were correct.
-
-
-
-Version 4.9d, February 2017
-
-(*) mcmctree.  Changed the default prior for rates for loci to
-gamma-Dirichlet (dos Reis 2014), and updated the documentation as
-well.  It was set to the conditional i.i.d. prior (Zhu et al. 2015).
-
-(*) mcmctree.  Added Bayes factor calculation.  A program called
-BFdriver is included in the release, as well as a pdf document in the
-folder examples/DatingSoftBound/BFdriverDOC.pdf.  We suggest that you 
-use the exact likelihood calculation when you use this option, since the 
-normal approximation is unreliable when the power posterior is close to 
-the prior (when beta is small).
-
-
-
-Version 4.9c, September 2016
-
-(*) Added GPL license statement in various places.
-
-(*) codeml.  When two or more trees are in the tree file and the
-specified model is the branch model, the program counts the number of
-parameters correctly only for the first tree.  For the second or later
-trees, it overcounts the parameters in the model.  The lnL and
-parameter estimates are still correct, but the printed values for the
-extra parameters are more or less random numbers and vary among
-different runs of the same analysis.  The bug was introduced in
-version 4.7a and affects the versions since then until 4.9b.  It is
-now fixed.  Thanks to Casey Bergman.
-
-
-
-Version 4.9b, March 2016
-
-(*) codeml.  I have added the BEB calculation for clade model D.  I
-deleted the option of having two site classes under model D so that
-both clade models C and D now assume three site classes.  The
-difference between models C and D is that in C, w1 = 1 is fixed, while
-in D, w1 is a free parameter, just like w0.  Both the ML iteration and
-the BEB calculation under models C and D allow two or more clades
-(branch types), but note that an increase in the number of clades by 1
-roughly increases the computation by 10 times.  Also the BEB
-calculation under model D is about 10 times more expensive than that
-under model C.  I updated the codeml section of the documentation to
-remove the description of the old models (like M1 and M2) and focus on
-the new models that are in the program (like M1a and M2a).
-
-baseml & codeml.  If the tree has labels using the symbol #,
-the program will crash.  This is caused by a bug in the routine for
-reading trees from the tree file.  This is now fixed.
-
-
-
-Version 4.9a, September 2015
-
-codeml (seqtype = 2 for amino acid sequences) crashes when the dataset
-is large, with more than 2M sites, say.  This is due to a stupid bug
-in an irrelevant part of the code.  This is fixed in 4.9a.
-
-
-Version 4.9, January 2015
-
-(*) mcmctree.  There are now two priors for locus rates (mu_i or rates
-for loci), as summarized in Zhu et al. (2014 Sys Biol).  They are
-specified as follows
-
-   rgene_gamma = alpha_u beta_u alpha prior  * gamma prior for overall rates for genes
-
-where prior = 0 (default) is the conditional i.i.d. prior and prior =
-1 is the gamma-dirichlet prior.
-
-For example the following specifies the conditional i.i.d. prior with
-alpha_mu = 100, beta_mu = 1000, and alpha = 0.5.  Here the average
-overall rate is 0100/1000 = .1 changes per site per time unit.  alpha
-= 0.5 describes the extent of rate variation among loci.
-
-   rgene_gamma = 100 1000 0.5 0  * conditional i.i.d. prior for locus rates
-  sigma2_gamma = 4 100 0.5       * prior for sigma^2     (for clock=2 or 3)
-
-The sigma2_i prior has parameters alpha_u, beta_u, alpha, specified in
-the same format.  
-
-(*) mcmctree.  I have changed the option of setting the finetune step
-lengths for the MCMC algorithms so that each parameter in the model
-has its own step lengths.  As a result, the step lengths are all
-automatically adjusted during the burn-in, and cannot be fixed or
-specified by the user.  The line 
-  finetune = 1: .0040  0.030  0.015  0.03 .55  * times, rates, mixing, paras, RateParas
-in the control file, if present, is read but ignored.  
-
-(*) mcmctree.  The calibration density for S2N (skew 2 normals) is
-incorrectly implemented.  While the correct formula should be 
-  p0 * f1(t) + (1 - p0) f2(t)
-the incorrect formula used is 
-  p0 * f1(t) + f2(t).
-Basically the two skew-normal densities was not given the correct weights.
-If you run the program without data to make sure that the effective prior 
-is sensible, then this mistake should have little impact.  The bug is now 
-fixed.  Thanks to Mario dos Reis.
-
-
-
-
-Version 4.8a, August 2014
-
-(*) baseml (nhomo = 3 or 4).  I have added the nonhomogeneous GTR
-(REV) model with a set of abcde parameters for every branch.  The
-specifications are as follows.
-
-model=7 nhomo=4 fix_kappa=0: one set of rate parameters abcde in GTR for the whole tree.
-model=7 nhomo=4 fix_kappa=1: one set of rate parameters abcde in GTR for every branch.
-
-In version 4.8, only the second option (one set of abcde parameters
-for the whole tree) is implemented, no matter whether you choose 0 or
-1 for fix_kappa.  After this change, the use of fix_kappa in the nhomo
-models is consistent between HKY and GTR.  For HKY, the specifications
-have always been as follows.
-
-model=4 nhomo=4 fix_kappa=0: one kappa in HKY for the whole tree.
-model=4 nhomo=4 fix_kappa=1: one kappa in HKY for every branch.
- 
-Note that nhomo=4 means one set of frequency parameters for every
-branch on the tree.
-
-I have also added a method for calculating the expected number of
-changes from any nucleotide i to nucleotide j along every branch under
-the nonhomogeneous models (nhomo=3 or 4).  This is called EMC, and
-accounts for the fact that the base frequencies are changing over
-time, and so does the rate of substitution.  It is invoked by using
-RateAncestor = 1 and verbose = 1.
-
-I tested only the nhomo=4 option carefully and the options nhomo=3 and
-5 are not well tested.
-
-
-
-Version 4.8, March 2014
-
-(*) mcmctree.  The prior for rates for loci (with ndata = 2 or more)
-has changed in this version, and the gamma-Dirichlet prior is used.
-The old i.i.d. prior is replaced and unavailable anymore.  The new
-gamma-Dirichlet prior is decribed in a paper 
-
-dos Reis, M., T. Zhu, and Z. Yang. 2014. The impact of the rate prior
-on Bayesian estimation of divergence times with multiple
-loci. Syst. Biol. 63:555-565.
-
-(*) codeml.  Added a Bayesian method for estimating distance t and dN/dS
-ratio w in pairwise comparisons.  The option is 
-      runmode = -3  * -3: pairwise Bayesian
-      runmode = -3 1.1 1.1 1.1 2.2   * -3: pairwise Bayesian
-The four numbers are the parameters in the gamma priors for t and w (a_t b_t a_w b_w).
-The method is described in the following paper:
-
-Angelis, K., M. dos Reis, and Z. Yang. 2014. Bayesian estimation of
-nonsynonymous/synonymous rate ratios for pairwise sequence
-comparisons. Mol. Biol. Evol. 31:1902-1913.
-
-(*) baseml, codeml, mcmctree).  In version 4.7a, I disabled the option
-of reading species indexes in the tree file (for example, the number
-19, instead of the sequence name, may be used in the tree file to mean
-the 19th sequence in the sequence alignment).  I have added this back.
-Version 4.7 does not have this problem.
-
-(*) codeml in paml 4.6 and 4.7 has a bug if your sequences have stop
-codons.  Stop codons have always been disallowed in the codon models
-implemented in codeml.  In versions 4.6-4.7, if there is a stop codon
-in any sequence, the codons in all sequences at that site (column in
-alignment) are changed into ??? and then the sequences are processed
-in the usual way.  The likelihood should then be the same as the
-likelihood if that column is removed from the alignment.  However, the
-implementation of the idea was incorrect and gives wrong results if
-the sequences do not contain any ambiguity characters and if you
-choose cleandata = 0.  I have now fixed this.  The idea is not really
-useful, so the good advice is that you delete the column of stop
-codons before the data are analyzed using codeml.
- 
-
-Version 4.7a, October 2013
-
-(*) codeml.  The mutation-selection model of Yang and Nielsen (2008)
-is noted to work for the site and branch-site models but not for the
-branch models.  This is now fixed.
-
-(*) mcmctree.  I fixed and modified the infinitesites program, which
-generates the limiting distribution when the amount of sequence data
-approaches infinity.  This works for both the clock (clock=1) and
-relaxed clock models (clock=2 or 3).  The mcmctree tutorial explains
-how to run this program.
-
-
-Version 4.7, January 2013
-
-(*) mcmctree.  This version includes a method for dating divergences
-using sampled tips, as in viral datasets.  The prior for node ages in
-the given rooted tree is generated using a
-birth-death-sequential-sampling model, described in Stadler and Yang
-(2012 Syst Biol, submitted).  The SIV/HIV-2 example dataset analyzed
-in the paper is in the examples/TipDate/ folder.  Look at the readme
-file there.
-
-(*) codeml.  Version 4.6 crashes if you have a lot of ambiguity codons
-in the alignment, or precisely if the total number of fully determined
-codons (61 in the standard code) and the ambiguous codons exceed 127.
-I was using "unsigned char" to represent the codons, which go from 0
-to 255 and can represent 256 distinct codons.  Somehow I changed this
-to "char", which on some systems goes from -128 to 127.  Can't
-remember why I made the change.  Anyway this is now fixed.  Versions
-4.5 or earlier do not have this problem.  Thanks to David Yu for
-pointing out the problem.
-
-
-
-Version 4.6, September 2012
-
-(*) baseml. The model of autocorrelated-rates among sites (Yang 1995
-Genetics 139:993-1005) is broken in paml versions 4.3 to 4.5
-inclusive.  Earlier versions were apparently fine.  When you select
-the auto-discrete-gamma model (alpha > 0, rho > 0) or the
-nonparametric autocorrelated-rates model (fix_alpha = 0, ncatG = 3,
-nparK = 4), the program aborts with the following error message:
-
-"Error: use 10, 20, 32, 64, 128, 512, 1024 for npoints for legendre.."
-
-This is now fixed.  The option of nparK = 3 seems to have problems in
-all versions, and the program prints out the following warning
-message.  
-
-"nparK==3, double stochastic, may not work.  Use nparK=4?"
-
-This option is supposed to fit a Markov-dependent model with equal
-proportions in the rate categies.  I have not got time to investigate
-this option, but a glance at table 3 in that paper indicates that this
-model was not used in the table and probably never properly
-implemented, so please follow the advice and use nparK = 4 instead.
-
-(*) codeml.  I have now added the site model M2a_rel of Weadick & Chang
-(2012 Mol. Biol. Evol.), in which w2 > 0.  This is specified using
-NSsites = 22 while model M2a is specified as NSsites = 2 (with w2 >
-1).  M2a_rel is the null model for the likelihood ratio test based on
-clade model C, suggested by Weadick & Chang (2012 Mol. Biol. Evol.).
-Note that the very old site model M2 from Nielsen & Yang (1998
-Genetics) assumes w0 = 0.  Yang, Wong, Nielsen (2005 MBE) changed the
-model so that 0 < w0 < 1 and renamed the model M2a.  My tests using
-the versions around that time reveals the following:
-
-3.13d (M2):  w0 = 0, w2 > 0
-3.14a (M2a): w0 > 0, w2 >= 1
-3.14b (M2a): w0 > 0, w2 >= 1
-
-If seems that when we changed M2 into M2a, we also changed the
-constraint on w2.  I don't have a copy of version 3.14 to check.
-
-(*) basseml & codeml.  To following option
-    bootstrap = 1000
-
-should produce 1000 bootstrap samples in the file boot.txt.  A bug is
-introduced in versions 4.4 and 4.5, so that garbiage is created
-instead of bootstrap datasets.  Versions 4.3 and earlier were correct.
-This bug is now fixed.
-
-(*) evolver.  The simulation of codon sequences (evolver 6
-MCcodon.dat) is incorrect when a genetic code different from the
-universal code is used.  This is a conceptual error I made and it has
-been in all previous versions when simulation of codon sequences is
-allowed.  I have assumed incorrectly that setting the frequencies of
-stop codons to 0 should be sufficient to get correct results.  The
-zero frequencies allow the program to identify the stop codons
-correctly, but one still needs the code table to know which changes
-are synonymous and which nonsynonymous.  evolver has been using the
-universal code to simulate codon sequences.  Simulation using the
-universal code has been correct, and simulation using the
-mitochondrial code (for example) is affcted and the results should be
-incorrect although the impact may be slight.  This bug is now fixed.
-A variable is added in the control file MCcodon.dat right below the
-codon frequencies, which tells the program which genetic code to use.
-
-1   * genetic code (0:universal; 1:mammalian mt; 2-10:see below)
-
-The program also checks that the stop codons have frequency 0 and that
-the frequencies of the sense codons sum to 1.  Thanks to Mario dos
-Reis for pointing this out.
-
-(*) mcmctree.  A bug is found in the calculation of the prior on rates
-under the correlated-rates model (clock=3).  This bug has been in
-version 4 (July 2007) to version 4.5 (December 2011), inclusive.  The
-prior is calculated by multiplying the conditional densities of
-equation (7) in Rannala & Yand (2007) over all interior nodes on the
-master tree (see also Figure 1 in that paper).  However, the rA used
-in the program was not for the current branch: it was instead for the
-ancestral branch.  The correct formula used was correct for the root
-of the master tree, but incorrect for other interior nodes.  The
-impact of the bug on posterior time estimates may be expected to be
-small, but if the clock is seriously violated and rates change over
-neighboring branches, this may be a concern.
-
-
-
-Version 4.5, December 2011
-
-baseml.  I have added the joint ancestral reconstruction under the
-nonhomogeneous models (nhomo = 3 or 4).  This works for the model of
-one rate for all sites, and does not work for the model of gamma rates
-for sites or the partition models.  Also I implemented the joint
-reconstruction only and not the marginal reconstruction.
-
-mcmctree.  The program now prints out a file called FigTree.tre in the
-current working directory that is readable by FigTree.  The branch
-lengths show the posterior means of times while the bars generated by
-FigTree will indicate the 95% CIs.  I have also added an option to
-deal with TipDate data, such as viral sequences with dates.  See
-examples/TipDate/README.txt for more notes.
-
-
-
-Version 4.4e, May 2011
-
-mcmctree.  I have added an option of automatically adjusting the step
-lengths for proposals in the MCMC algorithm.  The program will use the
-burn in to automatically adjust the step lengths, using those values
-specified in the control file as initial values.
-
-codeml: The iteration method that updates one branch length at a time
-(method = 1) is broken for the free-ratios model (model = 1) from
-versions 4.3 to 4.4d inclusive.  When codeml runs, you will see
-messages like the following on the screen: 
-Warning rounding error? b=3 cycle=0 lnL=6036.9413757 != 6070.1069284 
-Version 4.2 and earlier are correct.  This bug is now fixed.  
-Thanks to Tae-Kun Seo for reporting the bug.
-
-
-
-Version 4.4d, March 2011
-
-yn00.  The option of using weighting (weighting = 1) in the
-calculation of dN and dS by the YN00 method (Yang and Nielsen 2000)
-has been broken since Version 4.4 (February 2010).  A bug was
-introduced when I changed the character coding in Version 4.4, so that
-this bug affects Versions 4.4, 4.4a, 4.4b, and 4.4c.  The bug
-typically causes a crash.  The result for weighting = 0 is not
-affected.  Versions before 4.4 are correct.  This bug is now fixed.
-Thanks for Charlie
-(http://www.ucl.ac.uk/discussions/viewtopic.php?t=8726&highlight=yn00).
-
-
-
-Version 4.4c, August 2010
-
-mcmctree.  A bug causes the exotic calibrations (gamma, skew normal
-and skew t) to be sometimes ignored, if you have multiple loci
-(partitions) and if some species/sequences are missing at some loci.
-If all calibrations are bounds (minimum, maximum, and joint), or if
-the partitions have the same number of species/sequences, the
-calculation is still correct.  This bug is in versions 4.1-4.4b.  This
-is now fixed.  Thanks to Mike Steiper.
-
-
-
-Version 4.4b, July 2010
-
-mcmctree.  A serious bug has been discovered which affects the option
-of approximate likelihood calculation (usedata = 2).  If the first
-child of the root in the tree for any locus is a tip, the MLEs of
-branch lengths in the in.BV file are assigned to wrong branches, so
-that the results are entirely incorrect, and also very different from
-those obtained using the exact likelihood calculation (usedata = 1).
-Suppose the tree for a locus is (A, B), with a bifurcation at the
-root, and with A and B representing species or clades.  If A is a tip,
-the program is incorrect, while if A is an internal node, the program
-is correct.  This bug is in versions 4.2b - 4.4a.  Hopefully this is
-now corrected.  Thanks to Mario dos Reis.
-
-mcmctree.  I also changed the specification on L, U, and B bounds to
-allow the user to specify the tail probabilities (which used to be
-fixed at 2.5%) for each fossil calibration.  See table 8 in
-pamlDOC.pdf for details.  You will still have to run the program
-without data to get the effective prior of times used by the program.
-
-
-
-Version 4.4a, June 2010
-
-codeml (RateAncestor=1).  For ancestral reconstruction under codon
-models, the program prints out the translated amino acids when it
-prints the codons.  A bug caused the translation to be incorrect,
-so instead of the amino acid, the program prints something like h or
-H.  This bug was introduced in Version 4.4, February 2010 when I
-changed the way that the characters are coded.  This is fixed now.
-
-baseml (clock = 1, 2, or 3 Mgene option).  If you assign different
-rates for different genes (or site parations) and the model assumes
-global clock or local clock, the program fails to print out the
-relative rates for genes.  This bug seems to be introduced between
-3.13 and 3.14 and has since.  The log likelihood value and estimates
-of parameters are all correct, and the problem is only that MLEs of
-rates for genes are not printed in the output file properly.
-
-
-
-Version 4.4a, April 2010
-
-In versions 4.2, 4.3, 4.4, Clade model D does not print the w
-estimates for the site classes correctly.  For the example run in
-examples/datasets2/, the incorrect output looks like the following
-
-site class             0         1         2
-proportion        0.48590   0.10718   0.40692
-branch type 0:    0.10557   4.17186   4.17186
-branch type 1:    0.10557   4.17186   3.05340
-
-while the correct results (from version 4.1) are 
-
-site class         0         1         2
-proportion       0.48590  0.10718  0.40692
-background w     0.10557  4.17186  3.05340
-foreground w     0.10557  4.17186  0.95081
-
-The ratios w2 and w3 for site class 2 are incorrectly printed, with w2
-equal to w1.  The branch lengths and lnL values etc. are all correct.
-This bug affects versions 4.2, 4.3, and 4.4, and version 4.1 and
-earlier versions appear to be correct.  Also the bug affects clade
-model D, and clade model C appears to be fine.  This is fixed now.
-Thanks to cajawe for reporting the bug (see
-http://www.ucl.ac.uk/discussions/viewtopic.php?p=28240#28240).
-
-
-
-Version 4.4, February 2010
-
-I changed the way that the characters are coded in the programs, which
-means that essentially every program in the package is modified.  It
-is likely that some bugs are introduced during the process.  
-
-codeml.  The iteration algorithm specified by method = 1 when applied
-to amino acid model of gamma rates among sites (seqtype = 2, alpha >
-0, method = 1) is broken in versions 4.2-4.3b.  The bug causes the
-iteration to fail to converge, with messages like the following
-printed on the monitor: 
-"Warning rounding error? b=5 cycle=0 lnL=1718.5667685 != 1749.4724545".
-
-mcmctree.  If the gamma prior for rates has alpha <1/3, the initial
-values are generated incorrectly.  This is now fixed.
-
-
-
-Version 4.3b, November 2009
-
-mcmctree.  A bug was introduced in version 4.2, which causes the
-program to ignore the minimum age constraint on the root age and uses
-the maximum bound specified by RootAge in the control file, if there
-is a calibration on the root and it is a minimum-age constraint.  If
-you have no calibration on the root, or if the calibration is an
-maximum bound, both minimum and maximum bound or gamma distribution,
-there is no problem.  The error affects versions 4.2 and 4.3, and
-version 4.1 is correct.  This is now fixed.  Thanks to Mathieu
-Groussin.
-
-evolver.  I have now added back the option of printing out the site
-pattern counts instead of the sequences (specified at the beginning of
-the control file such as MCbase.dat).  Use of pattern counts takes
-less disk space.  The option was removed in version 4.2b after the
-algorithm for collaspsing sites into patterns was rewritten.
-
-
-
-Version 4.3a, October 2009
-
-mcmctree.  A bug was introduced in version 4.3, which causes the root
-age to be proposed incorrectly.  The effect of the bug appears to be
-subtle and is hard to assess.  It means that the results from all
-three clock models (clock = 0, 1, 2) are incorrect.  Here is a bit more
-detail about the bug.  The internal maximum node age was set too high,
-and caused loss of significant digits in a smart reflection algorithm
-which reflects proposed values into the feasible interval.  This
-update fixes the bug.  Please use this version to reanalyze your data
-if you have used version 4.3.  Version 4.2 does not have this bug.
-
-mcmctree.  I now supplies compiling options so that you can use hard
-minimum hard maximum, or hard minimum soft maximum.  The executables
-are called mcmctreeHH and mcmctreeHS.  See the commands for compiling
-at the beginning of the file mcmctree.c.
-
-
-
-Version 4.3, August 2009
-
-mcmctree.  The approximate likelihood calculation was unreliable,
-because the Hessian matrix was not calculated reliably, especially if
-the maximum likelihood estimates of some branch lengths are 0.  I have
-now implemented the approach of Seo, Kishino and Thorne (2004), which
-appears to be more stable.
-
-baseml & codeml: I changed these two programs to read fasta alignments
-directly, working out the number of sequences and the sequence length
-automatically.  However, this does not work when you have multiple
-alignments in one file, that is, if ndata = 2 or higher.
-
-
-
-Version 4.2b, April 2009
-
-mcmctree. There is a bug that leads to crash when usedata = 3 and when
-the data at some loci have ambiguity characters while other loci do
-not.  usedata = 3 means that the program invokes baseml or codeml to
-calculate the maximum likelihood estimates (MLEs) of branch lengths
-and the variances of those MLEs.  The temporary sequence alignment
-files for the loci (tmp?.txt) generated by mcmctree have garbiage and
-are unreadable by baseml (or codeml).  This bug is now corrected.
-(You can replace the wrong alignment file for the locus by your
-original alignment and run baseml outside mcmctree.)
-
-
-All programs.  I rewrote the code for collapsing sites into patterns.
-This should have no impact on user files, except that evolver does not
-produce data files in the "pattern-count" format anymore.  Now two
-formats (paml/phylip format and PAUP nexus format) are used, specified
-by the first line of the control data file (e.g., MCbase.dat,
-MCcodon.dat, MCaa.dat).  This pattern-count format is still accepted
-as input by baseml, codeml, etc. (See the section on "Site pattern
-counts" in the documentation).
-
-baseml & codeml.  A bug was introduced to the local clock model
-(clock=6), so that the program generates the following error message:
-"need fossils for this locus".  For example, running codeml on the
-control file /examples/MouseLemurs/codonml2.ctl will do so.  This bug
-is now fixed.
-
-codeml: The bug introduced in version 4.2 concerning the free-ratios
-model (Yang 1998 MBE) was not fixed properly in version 4.2a (see the
-notes for 4.2a below).  The results are incorrect when there are
-exactly 7 branch types (for example when you run the free-ratios model
-on a 5-species tree) but are correct for other numbers of branch
-types.  I hope this is now finally fixed.
-
-
-
-Version 4.2a, February 2009
-
-
-codeml: Two bugs were known to have been introduced in version 4.2
-when I rewrote the code for the branch-site and clade models.  The
-first is the null hypothesis in the branch-site test (model = 2
-NSsites = 2, fix_omega = 1 omega = 1).  The reference for this model
-is Yang, Wong & Nielsen (2005), and Zhang et al. (2005).  The
-alternative model (fix_omega = 0) is fine.
-
-The second bug is in the free-ratios model (Yang 1998 MBE).
-
-The results from those two models are incorrect and should be ignored
-if you used version 4.2.  Both mistakes are fixed in this update.
-
-Thanks to W.P. Zhou and BYLee.
-
-
-
-Version 4.2, December 2008
-
-MCMCtree: The implementation of the lower bound and of the upper bound
-is changed.  The improper density of equation (15) of Yang and Rannala
-(2006) for the lower bound is noted to push up divergence times, a
-feature we consider undesirable.  Now a truncated Cauchy distribution
-is used instead.  This will be described in more detail somewhere.
-
-The default file name for the MCMC samples is mcmc.out.  To change
-this, add the following line in the control file mcmctree.ctl:
-      mcmcfile = mcmc.out 
-
-codeml: Clade models C and D are extended to accommodate more than two
-branch types.  The old models accept two branch types only, referred
-to as foreground and background branches.  The structure of the new
-models are below.  You use branch/node numbers to specify the branch
-types (with #0 to be the default, followed by #1, #2, #3, ...).  As
-before, the BEB calculation is implemented for clade model C only, and
-is not available for clade model D.  The BEB calculation is expensive,
-so I expect clade model C will work if you have only a few branch
-types.  For each additional branch type, the BEB computation is 10
-times more expensive.
-
-Site class   Proportion    w ratio
-
-  0             p0         0 < w0 < 1
-  1             p1         w1 (w1 = 1 is fixed in model C)
-  2             p2         w2, w3, w4, ...
-
-I also added an option of fixing the last w to 1 in clade model C,
-specified in the control file as follows.  For example, if there are
-three branch types in the tree (see the table above), the last w fixed
-will be w4.
-
-       fix_omega = 1
-       omega = 1
-
-The two versions of the clade model can be compared to test whether
-the last w is > 1.
-
-
-codeml: branch model with amino acid chemical properties, specified
-using model = 2 and aaDist = 7.  Under this model, the types of
-branches are specified just like the branch model (by labelling
-branches).  For each branch type, the program estimates a few w's,
-depending on specifications in the file OmegaAA.dat.  This model seems
-to be broken for a long time since its implementation.  Look at the
-readme.txt file in examples/mtCDNAape/.
-
-
-
-Version 4.1, August 2008
-
-MCMCtree: A few changes have been introduced to this program.  The
-format for specifying fossil calibrations in the tree file has been
-changed, so that the gamma distribution is specified as 'G(alpha,
-beta)' and the old format '>0.6=0.7<0.8' does not work anymore.  The
-bounds can be specified using either the old form '<0.8' or the new
-form 'U(0.8)'.  Two exotic distributions, skew normal and skew t, are
-added, using the format 'SN(location, scale, shape)' and 'ST(location,
-scale, shape, df)'.  I edited the documentation pamlDOC.pdf and
-rewrote the part for mcmctree.  The format of the MCMC output file
-mcmc.out has been modified so that it is readable by Andrew Rambaut's
-Tracer program.  Note that MCMCtree starts taking samples and writing
-into the file only after burnin.
-
-codeml. The mutation-selection models of Yang and Nielsen (2008
-Mol. Biol. Evol. 25, 568-579) are specified using the options
-
-    CodonFreq = 6 or 7  * 6:FMutSel0, 7:FMutSel
-      estFreq = 1
-
-Use examples/mtCDNAape/codeml.HC.ctl to duplicate results in table 1
-in the paper.  It seems that some results are in the mlc file while
-some other results are in the rst and rst1 files.
-
-
-codeml (branch model): When codon sequences are analyzed under the
-branch model (model = 1 or 2, NSsites = 0), the main output file
-includes a tree under the line "w ratios as labels for TreeView:".
-This line was introduced in version 3.15.  However, the branch lengths
-in the tree had the dN values, while I intended them to be the
-original branch lengths (which are t, the number of nucleotide
-substitutions per codon).  This bug may also affect the results of
-ancestral reconstruction under the same branch model.  This bug is
-fixed now.
-
-
-baseml and codeml (discrete-gamma model): My intention has been to use
-the mean and not the median to represent all rates in each category
-when the discrete-gamma model is implemented.  See Yang (1994
-J. Mol. Evol. 39:306-314) for the distinction.  However, a bug caused
-a few recent versions to use the median instead.  This affects baseml
-versions 3.14c and 3.15, while version 3.14b or earlier and version 4
-or later are correct.  codeml in version 4a & 4b were incorrect while
-other versions are correct.  I hope both baseml and codeml are correct
-from version 4c.
-
-
-baseml and codeml (branch and clade labels): The local clock models
-(clock = 2) in the two programs and also the branch or branch-site
-models in codeml use labels to identify branches.  When nested clade
-label ($) are used, the program may get confused so that the branches
-are not labeled correctly.  I think this is not fixed in version 4c.
-
-
-Version 4b, January 2008
-
-mcmctree.  Changed a variable name in the control file from MaxRootAge
-to RootAge.  This should be specified in any of the following formats: 
-
-       RootAge = <1.0  * constraint on root age, used if no fossil for root.
-       RootAge = '<1.0'  * constraint on root age, used if no fossil for root.
-       RootAge = <1.0>0.8  * constraint on root age, used if no fossil for root.
-       RootAge = '<1.0>0.8'  * constraint on root age, used if no fossil for root.
-
-Added some text around the output, to make the results more comprehensible.
-
-evolver (option 8).  option 8 for calculating bipartition distances
-between trees is broken.  I changed the program to prepare for a
-figure and forgot to remove the changes afterwards.  This is now
-fixed.
-
-
-Version 4, July 2007
-
-mcmctree.  The MCMC algorithm of Rannala and Yang (2007) dealing with
-rate drift is added.  The section of manual on mcmctree is now
-rewritten.  An example data set is included in the
-examples/DatingSoftBound/ folder.
-
-codeml. This now implements the mutation-selection models of codon
-substitution of Yang and Nielsen (2008).  I should include more
-details after the models are better tested, for example, after the
-paper is written.
-
-baseml.  The log likelihood under the gamma-rates model (that is, when
-alpha > 0 in the control file) is not calculated correctly, or not as
-intended.  Version 3.14a is correct, but versions 3.14c and 3.15 are
-incorrect.  I am not sure about version 3.14b, as I have not kept a
-copy.  The bug causes the median to be used to represent all rates in
-each category of the discre-gamma model whereas the mean was intended.
-See the discussion around equation (10) and the paragraph below it in
-Yang (1994. J. Mol. Evol. 39:306-314).  The bug is not expected to
-have much effect on tree topologies or branch lengths.  Thanks to
-Simon Whelan for reporting the bug.
-
-
-
-Version 3.15, March 2006
-
-evolverNSbranches (simulation program to generate data under the
-branch models of codon substitution).  A bug was introduced after
-version 3.14a so that versions 3.14b (?), 3.14c, and 3.15 have a bug
-that makes the program abort prematurely.  This is now corrected.
-Thanks to N. Clark for reporting the bug.
-
-
-Version 3.15, January 2006
-
-evolver.  The simulation option of the evolver program in version
-3.14b (?), 3.14c, and 3.15 has a serious bug that makes the program
-generate nonsensible sequences when the user tree has 0 internal
-branch lengths.  The bug was introduced between versions 3.14a (which
-is correct) and 3.14c (which is wrong).  I don't seem to have kept
-3.14b to check whether it is in error.  In this case, the smart idea
-that led to the bug was the thought that no evolution occurs if the
-branch length is 0 (or less than 1e-20, to be more precise).  If the
-branch length is 0, there is no need to evolve the sequence along the
-branch, so I decided to copy the sequence at the start of the branch
-to the node at the end of the branch.  This would be fine, except that
-I introduced a bug at the same time that caused the program to skip
-generating sequences at all nodes that are descendents of the
-zero-length branch.  As a result, sequences at tips that are
-descendents of the zero-length branch have no data.  On some systems
-the uninitialized space is initialized as 0, and then the sequences
-will be printed out as consisting of all Ts (T is the first nucleotide
-in my program).
-
-
-At the same time, I also turned on the option of simulating data on
-random trees in evolver.  You change fixtree=1 into 0, and recompile.
-Then use MCbaseRandomTree.dat to simulate data instead of MCbase.dat:
-
-    evolver 5 MCbaseRandomTree.dat
-
-
-
-Version 3.15,  November 2005
-
-mcmctree: This implements the MCMC algorithm of Yang and Rannala (2005
-MBE) for estimating species divergence times using soft bounds.  The
-example file is in examples/SoftBound/.  Look at the readme file and
-see whether you can duplicate our results in the paper.  Note that
-this program works with DNA sequence data only, and assumes the
-molecular clock (clock = 1).  Note that assumption of the clock can
-lead to seriously biased results if the clock is violated.  Work is
-under way to relax the clock assumption.  
-
-yn00: Added three methods for estimating dS and dN: LWL85 (Li, Wu &
-Luo 1985), LPB93 (Li 1993; Pamilo and Bianchi 1993), and LWL85m (Yang
-2006 Computaional Molecular Evolution, Oxford University Press.  Eqs.
-2.12 & 2.13).
-
-codonml.  Added example files for clade model C, in folder
-examples/CladeModelC.  Thanks for Joe Bielawski for preparing these.
-
-Added a sequence data format, in which the numbers or frequencies of
-site patterns are inputed instead of the sequences.  The format is
-indicated by the option variable P on the first line of the data file,
-used in the same way as option variables I (for interleaved format)
-and S (for sequential format, which is the default).  See the section
-on sequence data format in the paml documentation for details.  The
-old control variable readpattf for baseml and codeml is now deleted.
-You can also use evolver to simulate data using this format.  Look at
-the control files MCbase.dat, MCcodon.dat, and MCaa.dat.  This format
-is useful for very large data sets with >1M sites, when collapsing
-sites into patterns can take a long time.  
-
-
-
-Version 3.14b, May 2005
-
-pamp crashes when there are more than 127 or 255 (depending on the
-system) species in the data.  The bug is due to my declaration of the
-variable visit[] in the routine PathwayMP1 in pamp.c as char rather
-than int.  Another problem is that the formula used to calculate
-distances under REV (GTR) is sometimes inapplicable and the program
-prints out messages like "err: DistanceREV".  I introduced some ad hoc
-way of dealing with the problem.
-
-
-
-Version 3.14b, April 2005
-
-aaml (codeml with seqtype = 2): 
-
-Under the REV model (model = 9), codeml failed to print out the
-estimated amino acid substitution rate matrix.  This is fixed.
-
-
-aaml (codeml with seqtype = 2 or 3):
-
-If you have multiple data sets (ndata > 1) and analysing protein data
-sets under the empirical models such as wag, jtt, dayhoff.  The
-results for the first data set are correct, but all later data sets
-are analyzed incorrectly under the corresponding +F models, that is,
-wag+F, jtt+F, dayhoff+F.  A bug in the program means that for the
-second and later data sets, the equilibrium amino acid frequencies are
-from the real data and not correctly set to those specified by the
-empirical models.  Thanks to Ben Evans.
-
-
-Version 3.14b, February 2005
-
-codonml and aaml:
-
-There is a bug that affects the joint reconstruction of ancestral
-sequences when some branch lengths are equal to each other and when
-those branches are visited one after the other in the program.  For
-example, if some branch lengths are estimated to be 0, such a problem
-might occur.  I think this bug is in the program for a long time,
-although I note that joint ancestral reconstruction is turned off in
-some versions for other reasons.  The marginal reconstruction is not
-affected by this bug.  The update fixes this bug.
-
-
-baseml and codeml:
-
-The program crashes on some systems for the local clock models (clock
-= 5 and 6) after the calculation is finished.  If you manage to run
-the model, the results should be fine.  There are bugs in the program
-when freeing memory at the end of the calculation.  They are fixed in
-this update.  See the discussion site at
-
-http://www.rannala.org/phpBB2/viewtopic.php?t=228
-
-Thanks to fjvera.
-
-
-
-Version 3.14a, December 2004
-
-codonml (codeml for codons):
-
-The mechanisctic amino acid substitution models (table 3 in Yang et
-al. 1998 MBBE 15: 1600-1611) appear to be broken in paml/codeml
-versions 3.13 and 3.14.  They were correct in version 3.12.  This is
-now fixed.  If you use those models, please run the examples in the
-folder examples/mtCDNA/ and compare results with those published in
-the paper to confirm the program.  The results in the paper are
-correct, but the program implementation may be broken due to lack of
-maintenance.
-
-codonml (codeml for codons)
-
-When you fit the branch models with three or more branch types (omega
-ratios), the program aborts with an error message saying that only two
-omega ratios are allowed.  This is due to a bug in the program and is
-now fixed.
-
-
-Version 3.14, September 2004
-
-
-1. codonml (codeml for codons): 
-
-(1a) NSsites models M1 (neutral), M2 (selection), and M8 (beta& ) have
-been changed.  The references are two manuscripts that are not yet
-published (Wong, Yang, Goldman & Nielsen, 2004 Genetics 168:
-1041-1051; Yang, Wong, and Nielsen 2005 MBE).  The details are as in the
-following table.
-
-[table missing or never written.]
-
-The old models M1 and M2 are noted to be rather unrealistic as they do
-not account for sites with 0 < w0 < 1.  Thus in the new models 0 < w0 < 1
-is estimated from the data.  Also insisting on a class of sites with 0
-= 1 appears to help avoid false positives as sites under weak
-selection will be absorbed in this neutral class rather than being
-claimed to be under positive selection.  Under M8, the constraint s >
-1 is now automatically enforced to help avoid multiple local peaks and
-also to make the model a positive-selection model.  Note that the
-newer models replace the old ones, which are available in older
-versions of paml/codeml only.
-
-After the changes, the models are nested as follows: M0 (one ratio) <
-M1a (NearlyNeutral) < M2a (PositiveSelection) < M3 (discrete), and M7
-(beta) < M8 (beta& ), where "<" means the model on the left is a
-special case of the model on the right:.  Two LRTs can be constructed
-using those models.  The first one compares M1a against M2a, using a
-chi square with df 2, and the second one compares M7 against M8, using
-a chi square with d.f. = 2.
-
-(1b) Similarly branch-site model A (Yang and Nielsen 2002) is changed.
-The old model assumes 0 = 0 and is unrealistic.  This is replaced by 0
-< 0 < 1, estimated from the data.  The new model is still called
-branch-site model A.  It can be compared with the new M1a
-(NearlyNeutral) to form a likelihood ratio test, with d.f.  2.  This
-is called test 1.  Another test, called test 2, uses a null model A
-but with 2 = 1 fixed (use fix_omega = 1 and omega = 1).  Test 2 is
-very conservative, but test 1 might mistake relaxed selective
-constraint on the foreground branch as positive selection.
-
-
-(1c) Similarly the clade model C ( see also Forsberg and Christiansen
-2003; Bielawski and Yang 2004) is changed as follows.  The new model C
-replaces 0 = 0 by 0 < 0 < 1 and has five parameters in the
-distribution: p0, p1, 0, 2, and 3.  The new model C can be compared
-with the new M1a (NearlyNeutral), with d.f.  3.
-
-(1d) The empirical Bayes procedure for calculating posterior
-probabilities of site classes under the site models, branch-site
-models, and clade models are called naïve empirical Bayes (NEB).  They
-are naïve because they use the MLEs of parameters without accounting
-for sampling errors in them.  This problem is now fixed using a
-procedure called Bayes empirical Bayes (BEB) (Yang, Wong, and Nielsen
-2005).  The BEB procedure is implemented for the site models M2a and
-M8, the (modified) branch-site model A, and the (modified) clade model
-C.  The old NEB calculation is kept in the output and the new BEB
-results follow the NEB result.  Perhaps I will remove the NEB
-calculation in later versions.
-
-
-baseml/codeml: rewrote likelihood clock and local clock models.
-Implemented models for combined analysis of multiple genes
-incorporating multiple calibration points (Yang and Yoder 2003).  The
-ad hoc rate smoothing procedure of Yang (2004) is implemented for
-nucleotide, amino acid, and codon substitution models, and the
-implementation also deals with missing species at some loci.  The
-option variable clock in the control files is now used differently
-from before.  See later in this documentation and also the readme and
-readme2 files in the folder examples/MouseLemurs/.
-
-baseml/codeml local clock models.  clock = 5 and 6 are added in
-3.14beta5.
-
-codeml: added branch-site models C and D (Bielawski and Yang in press).
-
-mcmctree: this program is disabled in this release.  The old program
-died and the new program is still under construction.
-
-The main body of this documentation may not be up to date. 
-
-
-
-Bug fixes and minor corrections
-
-
-baseml/codeml.  The output in the file rates under the gamma model
-lists inferred rate for each site.  The output is incorrect if the
-tree is large and scaling is used to avoid underflow.  The use of
-scaling is indicated by two lines of output on the monitor like the
-following:
-
-"2 node(s) used for scaling (Yang 2000 J Mol Evol 51:423-432):
- 155 350".  
-
-The results are clearly wrong as the probabilities are much greater
-than 1, and the rates are many orders of magnitude too large, etc.
-Also under the same problematic condition the expected numbers of
-sites with certain site patterns in the file lnf are incorrect.  When
-scaling is not used, the results should be o.k. and look reasonable.
-This affects versions 3.13 and 3.14beta1-3.  This is fixed in version
-3.14.  Thanks for Nick Goldman for pointing out the error.
-
-yn00.  The program crashes for large datasets with many codons due to
-a memory allocation error.  This affects versions 3.13 and some
-versions of 3.14beta1.  The problem is fixed in version 3.14.
-
-aaml (codeml for amino acids).  Model REVaa_0 is broken due to lack of
-maintenance in versions 3.1, 3.11, 3,12, 3,13, and 3,14beta1-4.  It
-seems to be working in version 3.  This is fixed in 3.14beta5.  2
-April 2004.
-
-codonml (codeml for codons) in branch-site models (model = 2 NSsites =
-2 or 3) prior to 3.14beta3 have a scaling problem, which makes the
-length of the foreground branch to be incorrectly estimated.  Other
-branch lengths and substitution parameters, such as the parameters in
-the distribution, are corrected calculated, as well as the log
-likelihood values and the posterior probabilities.  27 March 2004
-
-pamp: The gamma parameter for variable rates among sites using the
-method of Yang & Kumar (1996) is not estimated correctly in 3.14beta1,
-beta2, beta3.  Versions 3.13d and earlier seem fine.
-
-codonml (codeml for codons) in version 3.14beta and 3.14beta1 does not
-work when fix_omega = 1 is used for branch models.  The results are
-incorrect.  Please use newer versions 3.14beta3 or later.  Also
-version 3.13 is fine.
-
-baseml: Models TN93 and REV are wrong when used with Mgene = 3 or 4.
-This seems correct in version 3.12 but went wrong in 3.13 and 3.14
-since I inserted TN92 between HKY85 and TN93.  Thanks for Lee Bofkin
-for reporting the error.
-
-codonml (codeml for codons): When codon models are used to reconstruct
-ancestral sequences (with RateAncestor = 1), the program lists
-synonymous and nonsynonymous changes at each site under the heading
-"Changes at sites (syn nonsyn)."  This listing is incorrect due to a
-bug in the program.  This bug affects versions 3.12, 3.13 and 3.14,
-and version 3.11 seems correct.  Thanks to Joe Bielawski.
-
-baseml/codeml: The SE's for divergence times under the clock models
-are calculated incorrectly.  This happens when you use clock = 1 or 2,
-supply fossil date to calculate absolute times, and request standard
-errors for times.  The estimates of times themselves are correct, but
-standard errors for times are wrong.  The SEs for times and for the
-rate under the TipDate model (clock = 3) are wrong as well.  The
-programs print out the variances after the +-, instead of their square
-roots.  This error was introduced in version 3.13.  Versions prior to
-3.13 are correct.  I posted an update 3.13d to fix this bug.
-
-evolver: the simulation program can now accept species names in the
-tree.  Note that the data file formats for evolver (MCbase.dat,
-MCcodon.dat, MCaa.dat) have changed and you might have to change your
-own data files.  Look at the files included in the package.
-
-The documentation in paml v3.13 from August 2002 - 12 December 2002
-had a mistake about the critical values for the newer test using a
-modified M8.  The critical values for the test are 2.71 at the 5%
-significance level and 5.41 at the 1% level, rather than 1.95 and 3.32
-as in the documentation.
-
-
-
-
-
-Version 3.14beta2, November 2003
-
-(1) baseml: Models TN93 and REV are wrong when used with Mgene = 3 or
-    4.  This seems correct in version 3.12 but went wrong in 3.13 and
-    3.14 since I inserted TN92 between HKY85 and TN93.  Thanks for Lee
-    Bofkin for reporting the error.
-
-(2) codonml (codeml for codons): When codon models are used to
-    reconstruct ancestral sequences (with RateAncestor = 1), the
-    program lists synonymous and nonsynonymous changes at each site
-    under the heading "Changes at sites (syn nonsyn)."  This listing
-    is incorrect due to a bug in the program.  This bug affects
-    versions 3.12, 3.13 and 3.14, and version 3.11 seems correct.
-    Thanks to Joe Bielawski.
-
-
-
-Version 3.14beta, 20 July 2003
-
-(1) baseml/codeml: rewrote likelihood clock and local clock models.
-   Implemented models for combined analysis of multiple genes using
-   multiple calibration points (Yang and Yoder 2003).  The option
-   variable clock in the control files is now used differently from
-   before.  See documentation below and also the readme and example
-   files in the folder paml/examples/MouseLemurs/.
-
-(2) codeml: added branch-site models C and D (Bielawski and Yang
-   submitted).
-
-(3) baseml/codeml: The SE's for divergence times under the clock
-   models are calculated incorrectly.  This happens when you use clock
-   = 1 or 2, supply fossil date to calculate absolute times, and
-   request standard errors for times.  The estimates of times
-   themselves are correct, but standard errors for times are wrong.
-   The SEs for times and for the rate under the TipDate model (clock =
-   3) are wrong as well.  The programs print out the variances after
-   the +-, instead of their square roots.  This error was introduced
-   in version 3.13.  Versions prior to 3.13 are correct.  I posted an
-   update 3.13d to fix this bug.
-
-(4) mcmctree: this program is decommissioned in this release.  The old
-   program died and the new program is still under construction.
-
-(5) evolver: the simulation program can now accept species names in
-   the tree.  Note that the data file formats for evolver (MCbase.dat,
-   MCcodon.dat, MCaa.dat) have changed and you might have to change
-   your own data files.  Look at the files included in the package.
-
-(6) Improved the iteration algorithm a little.  No change to the
-   interface.
-
-(7) The documentation in paml v3.13 from August 2002 - 12 December
-   2002 had a mistake about the critical values for the newer test
-   using a modified M8.  The critical values for the test are 2.71 at
-   the 5% significance level and 5.41 at the 1% level, rather than
-   1.95 and 3.32 as in the documentation.
-
-(8) Edited the manual pamlDOC.pdf.
-
-
-
-
-15 May 2003
-paml3.13d
-
-baseml/codeml: The SE's for times under the clock models are
-calculated incorrectly.  This happens when you use clock = 1 or 2,
-supply fossil date to calculate absolute times, and request standard
-errors for times.  The estimates of times themselves are correct, but
-standard errors for times are wrong.  The SEs for times and for the
-rate under the TipDate model (clock = 3) are wrong as well.  The
-programs print out the variances after the +-, instead of their square
-roots.  
-
-This error was introduced in version 3.13.  Versions prior to 3.13 are
-correct.
-
-
-12 December 2002 correction of documentation
-
-The paml documentation in paml v3.13 from August 2002 - 12 December 2002
-has a mistake about the critical values for the newer test using a
-modified M8.
-
-The critical values for the test are 2.71 at the 5% significance level
-and 5.41 at the 1% level, rather than 1.95 and 3.32 as in the
-documentation.  I made a mistake and quickly discovered it.  I thought
-I corrected this a few months ago, but apparently the document
-included in the paml release had this mistake.  I am sorry.  I'll
-correct it now.
-
-
-
-Version 3.13, August 2002
-
-(1) baseml: nhomo = 1 and model = 6 (for both the one-rate and gamma-rates
-    models) in version 3.12 gives incorrect results.  The error was introduced 
-    in version 3.12 and is now fixed.  Sorry for the trouble and thanks to
-    Joshua Scott Rest.
-
-(2) baseml: Tamura's (1992) model is inserted between HKY85 and TN93.  
-    Unfortunately this means that the model code for TN93 and REV are shifted, 
-    and you have to change your control file for those two models.  I also 
-    added user-specified REV or UNREST models.
-    0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85, 5:T92, 6:TN93,
-    7:REV, 8:UNREST, 9:REVu; 10:UNRESTu
-
-(4) I implemented the nonhomogeneous models of Galtier and Gouy
-    (1998).  The options are model = 5 (T92) and nhomo = 3 (N1) or 4
-    (N2) or 5 (user).  nhomo = 5 is added for the model of Yang and
-    Roberts (1995) as well.  It means that the user specifies, in the
-    tree, how many sets of base frequencies (or how many GC content
-    parameters in the model of Galtier and Gouy 1998) should be used
-    and which branches has which sets.
-
-(5) codeml: For pairwise amino acid sequence comparison (seqtype = 2
-    and runmode = -2), the program crashes.  This is fixed now.
-    Thanks to Dennis Paul Wall.  
-
-(6) codeml: The mechanistic models of amino acid substitution (model = 6 or 7)
-    are broken for quite some time due to lack of maintenance.  I have
-    now fixed them.  See the file examples/mtCDNA/readme.txt.
-
-(7) codemlsites is now removed, which is for batch run of multiple NSsites
-    models.  This is now done by specifying several NSsites models in
-    the control file codeml.ctl for codeml, for example: NSsites = 0 1
-    7 8.
-
-(8) baseml & codeml used to ask the user what to do if the tree has branch 
-    lengths.  I have now introduced a variable fix_blength in the control 
-    file that does the same thing. 
-    fix_blength = -1  * 0: ignore, -1: random, 1: initial, 2: fixed
-
-
-
-Version 3.12, March 2002
-
-(1) baseml and codeml, Mgene=2 or 4 (different pi's) and RateAncestor
-    = 1 Ancestral reconstruction does not work properly although the
-    ML iteration is fine.  The output likelihood values (in the form
-    "Node 20: lnL = -1047.965624") are different from the optimum
-    likelihood already calculated.  The option was fine in version
-    3.0e and 3.1, and was broken in version 3.11.
-
-(2) codeml stops with an error message "pi=0" when some amino acids are 
-    missing in the sequences.  Earlier versions of the program are correct.
-
-(3) yn00 now accepts multiple data sets (specified by ndata).
-
-
-Version 3.11, September 2001
-
-(1) baseml and codeml (clock = 1 or clock = 2) sometimes print out
-    ridiculous (e.g., large negative) tree lengths and branch lengths.
-    I thought this was fixed in version 3.1, but it was not.
-    Hopefully it is now fixed.
-
-(2) baseml and codeml: Malpha (a separate alpha for each site
-    partition) does not work with Mgene = 0.  The output will give you
-    the same alpha value for each gene as you specify in the control
-    file.  This is now fixed.
-
-(3) baseml: When the REV or UNREST models (model = 6 or 7) are used
-    together with Mgene = 2, 3, 4, the program outputs one Q matrix,
-    when each site partition should have a different Q matrix.  The ML
-    parameter estimates are correct, but I did not bother to format
-    the output correctly.  The single Q matrix in the output is the
-    one for the first partition.  I have added the output for the
-    other matrices now.
-
-(4) baseml and codeml.  Ancestral reconstruction was disabled for
-    large trees with more than 100 or 200 sequences in previous
-    versions.  It is now allowed.  On large trees, the multiplication
-    of transition probabilities across branches in the tree cause the
-    product to become too small to be represented in the computer.
-    This is known as underflows.  I implemented a scaling algorithm to
-    avoid underflows some time ago but did not bother to do it with
-    the ancestral reconstruction algorithm at the time.  The algorithm
-    should now work, up to the upper limit set in the program, that
-    is, 1000 or 2000 sequences.
-
-
-
-Version 3.1, June 2001
-
-Changed the way of counting base or amino acid frequencies so that
-this version of baseml or codeml might generate different results from
-previous versions under the following model specifications.  
-
-   . baseml & aaml: if you have multiple genes and the model uses the same
-   frequencies in the model for the different site partitions.  
-
-   . codonml:
-   if you have ambiguity characters at all.  Ambiguity characters were
-   ignored when counting frequencies in previous versions but are used now.
-
-There is no difference if your data do not contain any alignment gaps
-or ambiguity characters or if you choose to remove them before any
-analysis (cleandata = 1).  The early versions used two ad hoc methods
-in dealing with such ambiguity characters in different places of the
-programs.  The new version uses another ad hoc method and in all
-places.  I think it will be fine which ever version you use, but do
-not use different versions in the same analysis.
-
-codonml (codeml with seqtype = 1) with Mgene = 1 does not work
-correctly if the sequences have alignment gaps or ambiguity
-characters.  Mgene = 1 means separate analysis.  The bug causes data
-for the genes except the first to be read out of frames, so that most
-likely you will see a lot of messages "stop codon XXX" before the
-program aborts.  This is now fixed.
-
-
-codonml (codeml with seqtype = 1) with fix_alpha=0: In the final
-output for the codon + gamma model, parameter alpha is not correctly
-identified.  If you have fix_kappa = 0, fix_omega =0, and fix_alpha=0,
-the last three numbers should be kappa, omega (dn/ds), and alpha.  The
-detailed output did not identify alpha correctly, although the
-likelihood value and the outputted rates and frequencies (r and f) are
-correct.  I believe this is now corrected.
-
-
-
-Version 3.0e, 14 May 2001
-
-codonml (codeml with seqtype = 1) with model = 1: The program crashes
-when you have a small tree with <= 5 branches.  This is a new error
-introduced in version 3.0c.  It is now fixed.
-
-baseml and codeml (clock = 1 or clock = 2) sometimes print out ridiculous
-tree lengths and branch lengths.  
-
-yn00 in version 3.0c ignores icode and uses the universal code only.
-It was correct in vesion 3.0b.  This is fixed in version 3.0d.
-
-
-Versions 3.0c-3.0d, 20 February 2001
-
-We bought a Pentium III cluster running redhat linux 6.2.  The gcc
-compiler 2.61 seems to have bugs and codeml does not run for NSsites =
-7 or 8.  The program might output "points out of order" or "Det goes
-to 0" error messages.  3.0d seems to be safer for these models.
-
-codonml (codeml with seqtype = 1) with NSsites = 3 might generate a
-message "error: sortwM3." and then stop with no output after the
-iteration.  This is now fixed.  During the iteration, there was no
-restriction on the w values, so that w0 can be greater than w1 and so
-on.  At the end of the iteration, I sort the three w values so that w0
-< w1 < w2 before outputting.  3.0c has a bug at sorting.  If you don't
-see the error message, everything is fine.
-
-
-10 November 2000
-
-The probabilities for Joint ancestral reconstruction are not correct.
-Many values are larger than 1.  Fixed in version 3.0c.
-
-
-
-Version 3.0c, 30 October 2000
-
-(1) Changed parameterization of clock models (clock = 1, 2, or 3), so that
-the output now lists node ages, and the SEs are for node ages as well.
-Node ages are measured by the expected number of nucleotide or amino
-acid substitutions per site (nucleotide, amino acid, or codon) from
-the node to the present time, and are proportional to the divergence
-times.  In earlier versions, the output list proportions.
-
-(2) codonml: problem with NSsites model with multiple gene data
-(Mgene=1) is now fixed(?).  In the earlier version, the likelihood
-calculation was o.k., but identification of sites under positive
-selection was incorrect as the program attempted to list sites in
-another gene when it was analyzing data for one gene.  (1) Mgene
-options for codons and amino acids:
-
-
-
-Version 3.0b, 6 October 2000
-
-The local clock models (clock = 2) are not implemented correctly in paml3.0 or 3.0a. 
-
-Yoder, A. D., and Z. Yang.  2000.  Estimation of primate speciation
-dates using local molecular clocks.  Molecular Biology and Evolution
-17 (7): 1081-1090.
-
-I introduced a serious error before I made the models available to the
-public, in ver3.0 (May 2000).  Two symptoms are noted right now: (1)
-crashes (which are good as they make you realise that something is
-wrong immediately) and (2) the local clock model (clock=2) having a
-worse likelihood than the global clock model (clock = 1).  The results
-in the paper are correct except for a typo ("37.30 - 48.48" in table 3
-for the cetacean-whale divergence using the C3 calibration should be
-"57.30 - 48.48").  I now include the example data sets and results
-from that paper to help you prepare your own data files.
-
-Sorry for any damages done.
-
-Many thanks to Toby Johnson and Philip Awadalla for spotting the error.
-
-
-
-Version 3, 4 February 2000
-
-The program evolver may not generate sequences correctly if branch
-lengths are very large, say >10.  
-
-Scaling by nodes to avoid underflow.  When there are many sequences in
-the data (say > 200 or 300), the probability of observing data at a
-site can become too small to represent in the machine.  This is
-because the probabilities are multiplied along branches in the tree
-and a large tree has many branches.  Since version 2.0, this underflow
-is dealt with by scaling, and it works with both the one-rate and
-gamma-rates models.  It did not work with the NSsites models in
-codonml, but probably nobody has analysed large data sets to notice
-this.  Anyway, this version hopefully fixes that problem, when I was
-analysing the data set of Bush et al. (1999 MBE 16:1457-1465), which
-has about 350 sequences.  If you use the program on very large data
-sets, watch out for anything strange and let me know.  I suppose with
-the option of using supplied branch lengths, data sets of such sizes
-will become manageable, and this scaling will become important.
-
-
-
-Version 2.0h, Oct 14, 1999
-
-Fixed the following errors, all simple errors.
-
-The RemoveIndel routine in 2.0e and 20.f has a bug that removes most
-sites in the sequence except those with all T's and C's for baseml.
-The same routine in 2.0g has a different bug that removes most sites
-except those will all T's and C's in the sequence for aaml.  In either
-case, the results will be grossly wrong.  The bugs affect the analysis
-only if you have gaps or ambiguity characters in the data and use
-runmode = 1 for aaml in 2.0e and 2.0f, and runmode = 1 for codonml for
-2.0g.
-
-The newly introduced option for amino acid reconstruction by baseml
-(icode) does not work.  The program aborts with an error message.
-This is now fixed.
-
-
-
-
-Version 2.0g, Sept 21, 1999
-
-Fixed a bug in 2.0f that causes crash in codeml when getSE = 1.  
-
-If the tree has branch lengths, codeml and baseml in 2.0f will simply
-use the branch lengths rather than estimating them by iteration.  I
-have added a question for the user to choose to ignore the branch
-lengths, use them as fixed or use them as initial values for the
-iteration.
-
-
-
-Version 2.0d, June 30, 1999
-Fixed some convergence problems in ML pairwise estimation of dS and 
-dN, in particular, in presence of data partitions (the G option).
-
-Fixed a bug in evolver.  The bug means that no sensible data are
-generated if the tree has branch lengths smaller than 1e-7.  At some
-point, I did something "clever" (taking advantage of the fact that at
-such small branch lengths, no evolution would occur), which destroyed
-the calculation.  
-
-
-
-Version 2.0c, May 12, 1999
-Corrected a few problems with the evolver program, and with running 
-multiple data sets using baseml.  Added REV in MCbase.dat.
-Thanks to John Mercer.
-
-Note--Uncomment ndata in baseml or codeml to analyse multiple
-data sets generated from evolver.
-
-
-
-April 16, 1999
-
-Although posted only a few days ago, paml2.0 has an error in the
-program codeml, which invalidates most analyses based on codon
-substitution models.  The symptom is relatively easy to spot and is
-indicated by one of the parameters (say omega) not being estimated at
-all and in the output you will have exactly the same number as you
-specified in the control file.
-
-Amino acid-based analysis or pairwise comparison of codon sequences
-are not affected.  Use paml2.0a to redo the analysis.  Apologies for
-the trouble.
-
-
-
-version 2.0, 31 March 1999
-Added back the faster calculation when there is no missing data.
-Enabled the programs/analysis that broke in the temporary versions, 
-such as pamp, joint reconstruction.
-
-Renamed listtree as evolver, and added simulation options.
-
-
-
-PAML 1.4 Temporary versions
-Dec98 & Jan99 
-
-  pamlTMP.Jan99.notes
-  ===================
-  Ziheng Yang
-  Fri Jan 15 10:01:42 GMT 1999
-
-  .Type make to compile the programs.  If you got error messages, see
-  whether you know how to change the file Makefile.  If it does not work
-  either, look at the files paml.cc, paml.gcc, etc.
-
-  .At the suggestion of David Posada, I added ML estimation of pairwise 
-  distances between amino acid sequences in the program codeml.  
-
-  The relavant variables in the control file codeml.ctl are 
-
-   runmode = -2 for pairwise distance calculation
-     aaRatefile = jones.dat * only used for aa seqs with model=empirical(_F)
-              * dayhoff.dat, jones.dat, mtmam.dat, or your own
-     model = 3 
-     alpha = 0
-
-  model specifies the amino acid substitution model.  You should choose
-  a value from 0, 1, 2, 3.  If you want to estimate the rate matrix from
-  your data, you should do that using many sequences and then move the
-  estimated substitution rate matrix (in the output file mlc) into a
-  file and change the variable aaRatefile.  
-
-  Choosing alpha>0 means using gamma distance with that specified alpha
-  parameter.  If you see anything strange, please let me know.
-
-  .I have included only three programs baseml, codeml, and listtree.
-  I have not tested the others and so they are not included.
-
-
-
-  pamlTMP.Dec98.notes
-  ===================
-  Ziheng Yang
-  Wed Dec  9 10:44:37 GMT 1998
-
-
-  This temporary version of paml has involved a number of changes, and
-  is quite likely to contain errors.  Please let me know if you notice
-  anything strange.  Some of the options do not work for now.  Details
-  follow.  I will compile the Win32 and PowerMac versions after
-  everything is fixed and tested.
-
-  . Missing data: baseml and codeml now handle missing data.  Choose
-  DelMissing = 1 in the control file to remove sites with ambiguity
-  characters or alignment gaps, as did previous versions.
-
-  . Clock: The clock is now fixed.  It may still crash, but the option
-  is not worse than the no-clock options.  The definitions of the node
-  times or branch lengths are now different from the documentation
-  (pamlDOC.html), and so check the branch lengths in the output tree
-  topology.  Thanks to Jeff Thorne.
-
-  . Nexus file formats: Paml programs now read sequence files and tree
-  structure files in the Nexus format used by paup and MacClade.  Only
-  the data or tree are read, and everything else in the file is ignored.
-
-  .  Ancestral reconstruction: Marginal ancestral reconstructions are
-  now done under the gamma-rates model as well as the constant-rate
-  model.  For protein-coding genes, reconstructions can now be done at
-  the nucleotide level with baseml (in particular, by using models that
-  account for differences at the three codon positions), at the amino
-  acid level with aaml, and at the codon level with aaml.  According to
-  Belinda Chang, those different reconstructions tend to be similar.
-  Results for ancestral reconstruction are now listed by site, and not
-  by site pattern as in earlier versions.  Missing data are now handled
-  in those analysis.  See instructions in this doc about baseml.ctl for
-  more options.  Thanks to Belinda Chang.  I would like to take this
-  opportunity to point out that the reconstructed ancestral sequences
-  are not the real observed sequences and may contain errors.
-
-  . I have disabled the option for variable rates among sites in
-  combination with variable dN/dS rate ratios among lineages in codonml.
-  This option has never been implemented in the program but was not
-  disabled.
-
-  . Distance matrices from codonml.  The program codonml outputs
-  matrices of synonymous and nonsynonymous rates in all pairwise
-  comparisons using the method of Nei & Gojobori (1986) into two files
-  named DistanceNG.dN & DistanceNG.dN.  The matrices are lower-diagonal
-  and can be used with programs, such as neighbor and fitch, in phylip
-  (and possibly paup* too) for tree reconstruction or branch length
-  estimation by distance methods.  If you choose runmode = -2 for ML
-  pairwise comparison, the program will also output two matrices of ML
-  estimates into two files named DistanceML.dS & DistanceML.dN, for
-  synonymous and nonsynonymous rates.  Since many users seem interested
-  in looking at dN/dS rate ratios among lineages, examination of the
-  tree shapes indicated by branch lengths calculated from the two rates
-  may be interesting although the analysis is intuitive.  Obviously, you
-  should NOT name your own data files with those names; otherwise they
-  will get overwritten.  If your species name has more than 10
-  characters, you will have to edit the files and cut the names short
-  before you can use Phylip programs.  I have decided to leave this work
-  to the user.  I plan to create some other distance matrix files for
-  nucleotides and amino acids as well.
-
-  . An minor prompt error when running codonml with model = 2 (variable
-  dN/dS ratios among lineages) is fixed.  The example data set
-  lysozymeSmall.nuc is included for the user to duplicate my results in
-  Yang (1998 Molecular Biology and Evolution 15: 568-573).  The control
-  file codemlLysozyme.ctl explains in more detail how to specify the
-  options.  I would like to remind you that it is not correct to use the
-  option model = 1 to estimate dN/dS ratios for branches to find out
-  which ratios are greater than one, and then to use model = 2 to test
-  whether that ratio is significantly greater than one.  The problem
-  with such an analysis is that the hypothesis being tested is derived
-  from the data which you use to test the hypothesis, and as a result,
-  you tend to get significantly results too often.  Strictly speaking,
-  the hypothesis should be formulated before the data are analysed.
-
-  Things that do not work in this version include
-
-  . All parsimony-based analyses are now broken.  The program pamp is
-  not included.  Programs baseml and codeml used to calculate the MP
-  score for each tree before doing an ML analysis on that tree.  Now a
-  -1 is used to indicate this is not possible.
-
-
-
-Version 1.4 (UCL)
-July 1998 
-
-     1. Most of the changes are in the program codeml (aaml for amino
-  acid sequences and codonml for codon sequences). A few new models of
-  codon substitution are implemented (see Yang and Nielsen 1998; Nielsen
-  and Yang 1998; Yang 1998).
-     2. Some problems relating to ancestral sequence reconstruction (in
-  baseml and codeml) are fixed. The earlier algorithm does not work
-  properly if the data contain more than several sequences. The
-  algorithm makes a guess at the likely character states at the interior
-  nodes of the tree, and then uses those to generate reconstructions
-  (joint reconstructions, see eq. 2 in Yang et al. 1995) to be
-  evaluated. Sometimes this strategy misses important reconstructions,
-  and as a result, the probabilities for reconstructed characters at
-  specific interior nodes (marginal reconstructions; see eq. 4 in Yang
-  et al. 1995) are substantially underestimated if the number of
-  sequences is not very small. I have now written separate codes to do
-  the marginal reconstruction, evaluating all possible character states
-  for each interior node. This works also under the gamma model of
-  substitution rates among sites, a feature that was not implemented
-  before. The joint reconstruction works with one rate for all sites
-  only and has the old problem of possibly missing important
-  reconstructions. However, the posterior probabilities for those joint
-  reconstructions that do get evaluated are accurate. I have added an
-  option (choose RateAncestor = 2 rather than 1) for the user to specify
-  the reconstruction to be evaluated. The two approaches are expected to
-  produce very similar results, and since the marginal reconstruction is
-  always reliable, perhaps it can always be used for data analysis.
-  (Thanks to Belinda Chang.)
-     3. The output file for estimated rates for sites under the
-  variable-rates models is now "Rates.out" instead of "rst".
-     4. I have implemented ancestral sequence reconstructions based on
-  codon-substitution models (codonml). This has not been tested
-  carefully, and I would appreciate your comments if you use it.
-     5. A number of minor changes have been made. I have fixed several
-  floating exception errors that seem to occur on DEC Alpha machines
-  only.
-     6. The documentation is now changed into the html format. 
-
-
-
-Version 1.3a,b,c (UCL)
-June 1998 
-
-  1. Basemlg was put back at the request of a user.
-
-  2. Some changes are made to codeml concerning codon-substitution
-  models (Yang and Nielsen 1998 JME; Yang 1998 MBE; Nielsen and Yang
-  1998 Genetics)
-
-
-
-PAML Version 1.3: (UC Berkeley)
-July 1997 
-
-  1. Starting with this version, the program basemlg will not be
-  included in the package.  This program implements the (continuous)
-  gamma model of substitution rate variation among nucleotide sites
-  (Yang 1993).  It involves intensive computation and has been
-  superseded by the discrete-gamma model implemented in the program
-  baseml.  If you want to use this program, you should use previous
-  versions of paml, which can be obtained from me if nowhere else.
-
-  2. The simulation program mcml.c is removed from the package.  This is
-  now superced by the program listtree.c, which does different things
-  besides simulating data sets.
-
-  3. I have implemented the stepwise addition algorithm in the programs
-  baseml and codeml.  This algorithm is faster than the
-  star-decomposition algorithm implemented before, but my implementation
-  is crude and inefficient.  The NNI perturbation is implemented too.
-  These work without the molecular clock.
-
-  4. Many changes are made with the program codonml (codeml with
-  seqtype=1).  A few new codon-based models are implemented.  One
-  assumes different nonsynonymous/synonymous (dN/dS) rate ratios among
-  branches in the phylogeny, and may be useful for detecting episodic
-  positive selection.  Another allows the dN/dS ratio to vary among
-  sites and may be useful for testing neutrality and detecting
-  positively-selected sites in the sequence.  These are based on my
-  collaborative work with Rasmus Nielsen.  I made some changes to the
-  output of estimates of dS and dN in pairwise comparisons (codonml with
-  runmode=-2), so that the results are easy to understand.  Another
-  change with this program is in the codon-substitution model of Goldman
-  and Yang (1994).  We used the amino acid distance matrix of Grantham
-  (1974) to modify the rate of nonsynonymous substitutions, but
-  unfortunately the model does not fit data well.  It does not always
-  fit the data better than if we simply assume equal distance between
-  any two amino acids.  So I have added the option of ignoring the amino
-  acid differences.  Goldman and Yang's original formula is still
-  available but not recommended for use.  As a result of these changes,
-  the formulation of the Goldman and Yang model is now somewhat
-  different from the paper and previous versions of the program.  See
-  Section 6.2 for details.
-
-  5. I have again made some changes with the user interface, based on my
-  guess of what the "user" would like to see.
-
-
-
-Version 1.1a-d: (University of California, Berkeley)
-1995-1996 
-
-1.1a No records of changes remain.
-
-1.1b November, 1995:
-
-The following line at the beginning of the file tools.h 
-      #define __cplusplus
-is removed.  I added this line to avoid some (unjustified) warning
-messages from a compiler I used to use and forgot to remove it when I
-was preparing paml1.1.  This line made some compilers unable to
-compile.
-
-A bug that caused baseml and basemlg to give wrong estimates of kappa
-under the F84 model is now fixed.  Estimates of kappa under the F84
-models listed in the paml 1.0 and 1.1 documents are all incorrect and
-are a bit too small.  For example, the estimate of kappa under the F84
-model for the mtDNA data of Brown et al. (1982) should be 4.344 while
-the wrong estimate given in the large table of the document is 3.420.
-Likelihood value, branch lengths and other parameters seem to have
-been correctly calculated by paml 1.0 and 1.1.  Versions prior to 1.0
-do not have this bug, and results in Yang (J. Mol. Evol. 1994,
-39:306-314) and Yang, Lauder, and Lin (J. Mol. Evol. 1995,
-41:587-596), where the F84 model was used, are correct.
-
-1.1c March 1996 
-
-Representation of tree topologies using branches (see the document)
-are now allowed (restored) in the tree structure file.  This is for
-coping with cases where some taxa are ancestors of others.  If this
-representation is used, the line starts with a dollar sign.  So the
-following line in the file trees.5s (the tree file for 5 species)
-
-$ 5      6 3  6 4  6 5  3 1  3 2
-
-represents the tree ((12)34), with 5 to be the ancestor of 1 and 2.
-
-Alignment gaps are now allowed when option G is used (for multiple
-genes).  The gaps are removed before analysis.
-
-A small program listtree lists all the bifurcating trees for a given
-(small) number of species.
-
-codonml (codeml.c with seqtype=1) now has an option for calculating
-the number of synonymous substitutions per synonymous site (Ks) and
-the number of nonsynonymous substitutions per nonsynonymous site (Ka)
-and their ratio for each pairwise comparison, using the method of
-Goldman and Yang (1994).  This is implemented with
-runmode=-2. Estimates of Ks and Ka are collected in the file rst and
-estimates of model parameters in the file mlc.  This option produces
-table 2 in the Goldman & Yang paper.
-
-The interior nodes were incorrectly identified when the ancestral
-sequences were listed in the file rst.  They were all one less than
-their correct numbers.  Nodes 7, 8, 9, 10 for the stewart.aa data were
-incorrectly identified as nodes 6, 7, 8, 9.  This is corrected now.
-
-1.1d. July 1996
-
-An unintended version of codeml.c was included in version 1.1c, which
-causes the program to produce different results under the codon-based
-model of Goldman and Yang (1994).  The small number in baseml.c was
-set too small, making the program do unnecessary iterations; this is
-fixed now.  I also changed the program pamp so that it will estimate
-the substitution rate matrix without using a tree.
-
-I have included a few other genetic code tables in this version of
-codeml; see the file codeml.ctl.
-
-
-PAML Version 1.2: November 1996 (UC Berkeley)
-
-  1. I have done some changes with the user interface.  Name of files
-(sequence data file, control file, tree structure file) are now
-specified in the control files (The default are baseml.ctl,
-codeml.ctl, pamp.ctl, mcmctree.ctl).  The command line for executing a
-program is now ProgramName ControlFileName with ControlFileName
-optional.  I have added some text in the output file of baseml so that
-the results are eassier to understand.  Choose verbose=0 in the
-control file if you do not want the detailed output.
-
-  2. A new program mcmctree is included, which performs Bayesian
-estimation of phylogenies using Markov chain Monte Carlo methods
-(Rannala and Yang 1996; Yang and Rannala 1997).
-
-  3. A number of minor changes are made, such as using species names
-in the parenthesis tree representation.
-
-
-
-July 1995 
-Version 1.1: (IMEG PennState)
-
-  Models for analyzing data of multiple genes (Yang 1996 JME) are
-  implemented in baseml and codeml.  Two more programs are included:
-  pamp for parsimony-based analysis implementing the methods of Yang and
-  Kumar (1996 MBE) and mcml for Monte Carlo simulation (prepared for
-  Arndt von Haeseler but apparently not used).
-
-
-February 1995 
-Version 1.0 (IMEG, PennState)
-
-  This is the first official version of paml.  Three main programs are
-  included: baseml, basemlg and codeml.  Control files are used for all
-  of them.  The program codeml is formed by merging two programs codonml
-  and aaml, for codon sequences and amino acid sequences, respectively.
-  One rate, gamma rates, and auto-discrete-gamma rates for sites are
-  implemented for nucleotide, amino acid, and codon-based models.  The
-  document is provided as postscript files (paml1_3.ps).
-
-
-October 1994 
-Version 0.12, no name for package (BAU and IMEG PennState) 
-
-  The bracket notation of tree topologies is introduced.  Tree
-  search by star decomposition is implemented in baseml and codonml.
-  The options for baseml are moved into an option file baseml.ctl.  The
-  molecular clock seems to work with the automatic tree search model
-  (runmode=1 or 2).  The old readme file is renamed manual.
-
-
-March 1994 
-Version 0.10, no name for package: (NHM, England)
-
-  Programs tripml and comptree are renamed codonml and rell.  The readme
-  file is expanded and the numerical optimization routine, ming1, used
-  by dnaml and codonml, is improved.  The discrete-gamma model of Yang
-  (1994 JME) is implemented in both baseml and codonml.
-
-  Preliminary version 0.11, no name for package: April 1994 (NHM,
-  England) Programs dnaml and dnamls are renamed baseml and basemlg to
-  avoid confusion with Joe Felsenstein's programs.  The
-  auto-discrete-gamma model and the nonparametric models of rate
-  variation and correlation mong sites of Yang (1995 Genetics) are added
-  in baseml.  Nonhomogeneous-process models of Yang and Roberts (1995
-  MBE) are implemented in baseml.
-
-
-April 1993 
-Version 0.01, no name for package (Cambridge).  Three programs are included in 
-the package:
-
-  1. dnamls implements the model of gamma rates among sites of Yang
-  (1993) and works with arbitrary topologies and substitution models
-  (JC69, K80, F81, HKY85).
-
-  2. dnaml implements the model of a single rate for all sites. (The
-  same name as Joe Felsenstein's program was used.)
-
-  3. tripml implements the codon-based model of Goldman and Yang (1994).
-  Trees are represented by the "branch notation".
-
-// end of file


=====================================
paml4.9i.tgz deleted
=====================================
Binary files a/paml4.9i.tgz and /dev/null differ


=====================================
src/mcmctree.c
=====================================
@@ -272,7 +272,7 @@ int main(int argc, char *argv[])
       if ((k = stree.nodes[i].fossil) == 0) continue;
       printf("Node %3d: %3s ( ", i + 1, fossils[k]);
       for (j = 0; j < npfossils[k]; j++) {
-         printf("%6.4f", stree.nodes[i].pfossil[j + (k == UPPER_F)]);
+         printf("%7.4f", stree.nodes[i].pfossil[j + (k == UPPER_F)]);
          printf("%s", (j == npfossils[k] - 1 ? " )\n" : ", "));
       }
    }
@@ -1038,7 +1038,7 @@ int GetOptions(char *ctlf)
 {
    int  transform0 = ARCSIN_B; /* default transform: SQRT_B, LOG_B, ARCSIN_B */
    int  iopt, i, j, nopt = 31, lline = 4096;
-   char line[4096], *pline, *peq, opt[33], *comment = "*#";
+   char line[4096], *pline, opt[33], *comment = "*#";
    char *optstr[] = { "seed", "seqfile","treefile", "outfile", "mcmcfile", "BayesFactorBeta",
         "seqtype", "aaRatefile", "icode", "noisy", "usedata", "ndata", "duplication", "model", "clock",
         "TipDate", "RootAge", "fossilerror", "alpha", "ncatG", "cleandata",
@@ -1595,128 +1595,213 @@ double Infinitesites(FILE *fout)
 
 int GetInitialsTimes(void)
 {
-/* This sets calibration node ages and root age first, and then populates other node ages, by
-   a break-stick strategy.
-   It ensures that each node age is younger than its parent's age, but does not check 
-   the consistency of ages against the fossil constraints.  The use of soft bounds means that 
-   any ages are possible, even if the chain may start from a poor place.
-   In the case of shared/mirrored node ages due to gene duplication, it is not so clear whether 
-   conflicts may still arise.
-*/
-   int s = stree.nspecies, i, j, k, k1, ir, ncorrections, driver;
-   int maxdepth = 0, from, to, depth;
-   double *p, a, b, d, *r, agefrom, ageto;
+   /* This sets ages for nodes with bounds first.  Then it goes through a loop looking for nodes 
+      with the largest min calibration, and another loop looking for the longest path, each 
+      case populating node ages on the path until reaching an ancestral node with assigned age or 
+      with max bound.
+      In the case of shared/mirrored node ages due to gene duplication, conflicts may still arise,
+      so that a loop of corrections is used.
+
+      stree.duplication:
+      stree.nodes[i].label = 0:  usual node, which may and may not have calibration.
+      stree.nodes[j].label =-1:  node i is the driver node, whose age is shared by other nodes.
+                                 It may and may not have calibration.
+      stree.nodes[j].label = i:  node j shares the age of node i, and doesn't have calibration.
+      */
+   int s = stree.nspecies, s21 = s * 2 - 1, root = stree.root;
+   int i, j, k, k1, ir, ncorrections, driver, from, to, depth, maxdepth;
+   double *p, a, b, d, tbpath[2], *tmin, *tmax, *r;
+   char *pptable, *flag;
+   int debug = 1;
 
    if (com.TipDate && stree.nodes[stree.root].fossil != BOUND_F)
       error2("\nTipDate model requires bounds on the root age..\n");
 
-   /* set up ages of calibration nodes */
-   for (i = s; i < s * 2 - 1; i++)  stree.nodes[i].age = -1;   
-   for (i = s; i < s*2-1; i++) {
-      if (stree.nodes[i].fossil == 0) continue;
-      p = stree.nodes[i].pfossil;
+   /* set up pop-pop table of ancestor-descendent relationships. */
+   pptable = (char*)malloc(s21*(s21 + 1) * sizeof(char));
+   if (pptable == NULL) error2("oom pptable");
+   flag = pptable + s21*s21;
+   memset(pptable, 0, s21*(s21 + 1) * sizeof(char));
+   tmin = (double*)malloc(s21 * 2 * sizeof(double));
+   if (tmin == NULL) error2("oom tmin");
+   memset(tmin, 0, s21 * 2 * sizeof(double));
+   tmax = tmin + s21;
+   r = (double*)malloc(s * sizeof(double));
+   if (r == NULL) error2("oom r");
 
-      switch(stree.nodes[i].fossil) {
-      case (LOWER_F):  stree.nodes[i].age = p[0] * (1.05 + 0.1*rndu());               break;
-      case (UPPER_F):  stree.nodes[i].age = p[1] * (0.6 + 0.4*rndu());                break;
-      case (BOUND_F):  stree.nodes[i].age = p[0] + (p[1] - p[0])*(0.2 + rndu()*1.6);  break;
-      case (GAMMA_F):  stree.nodes[i].age = p[0] / p[1] * (0.7 + rndu()*0.6);         break;
-      case (SKEWN_F): case(SKEWT_F): 
-         d = p[2] / sqrt(1 + p[2] * p[2]);
-         a = p[0] + p[1] * d*sqrt(2 / Pi);
-         stree.nodes[i].age = a * (0.6 + 0.4*rndu());
-         break;
-      case (S2N_F): 
-         d = p[3] / sqrt(1 + p[3] * p[3]);
-         a = (p[1] + p[2] * d*sqrt(2 / Pi));   /* mean of SN 1 */
-         d = p[6] / sqrt(1 + p[6] * p[6]);
-         b = (p[4] + p[5] * d*sqrt(2 / Pi));   /* mean of SN 2 */
-         stree.nodes[i].age = (p[0] * a + b) * (0.6 + 0.4*rndu());
-         break;
+   for (i = 0; i < s21; i++) pptable[i*s21 + i] = pptable[i*s21 + root] = (char)1;
+   for (i = 0; i < s21; i++) {
+      for (j = i; j != root; j = stree.nodes[j].father)
+         pptable[i*s21 + j] = (char)1;
+   }
+   for (i = s; i < s21; i++) {
+      if (stree.nodes[i].label != -1) continue;    /* -1 means driver */
+      for (j = s; j < s21; j++) {
+         if (stree.nodes[j].label == i) /* if j mirrors i, ancestors of j are also ancestors of i. */
+            for (k = j; k != root && pptable[i*s21 + k] == 0; k = stree.nodes[k].father) {
+               pptable[i*s21 + k] = (char)1;
+            }
       }
-
-      /* copy nodes[i].age to all mirrored nodes */
-      driver = (stree.nodes[i].label >= s ? stree.nodes[i].label : (stree.nodes[i].label == -1 ? i : 0));
-      if (driver) {
-         for (k = s; k < s * 2 - 1; k++) {
-            if (k!=i && (k==driver || stree.nodes[k].label == driver))
-               stree.nodes[k].age = stree.nodes[i].age;
+      /* if j mirrors i, copy ancestors of i to j */
+      for (j = s; j < s21; j++) {
+         if (stree.nodes[j].label == i)
+            memcpy(pptable + j*s21 + s, pptable + i*s21 + s, (s - 1) * sizeof(char));
+      }
+   }
+   if (debug && s<20) {
+      printf("\n\nSetting up initial node ages...\npop-pop table:\n     ");
+      for (j = 0; j < s21; j++) printf(" %2d", j);
+      for (i = 0; i < s21; i++) {
+         printf("\n%2d:  ", i);
+         for (j = 0; j < s21; j++) printf(" %2d", pptable[i*s21 + j]);
+         if (i >= s) {
+            printf(" %3s ", (stree.nodes[i].fossil ? fossils[stree.nodes[i].fossil] : ""));
+            printf("  label = %2d, ", stree.nodes[i].label);
+            if (stree.nodes[i].label == -1)      printf("driver");
+            else if (stree.nodes[i].label > 0)   printf("mirror");
          }
       }
    }
 
-   /* check for consistency among calibration and mirrored nodes, and correct if needed. */
-   for (ir = 0; ir < 100; ir++) {
-      ncorrections = 0;
-      for (i = 0; i < s * 2 - 1; i++) {
-         if (stree.nodes[i].age == -1) continue;
-         /* j is calibration node ancestral to i. */
-         for (j = stree.nodes[i].father; j != -1; j = stree.nodes[j].father) 
-            if (stree.nodes[j].age > 0) break;
-         if (j != -1 && stree.nodes[j].age < stree.nodes[i].age) {
-            stree.nodes[j].age = stree.nodes[i].age * (1.001 + 0.01*rndu());
-            ncorrections++;
-            /* copy nodes[j].age to all mirrored nodes */
-            driver = (stree.nodes[j].label >= s ? stree.nodes[j].label : (stree.nodes[j].label == -1 ? j : 0));
-            if (driver) {
-               for (k = s; k < s * 2 - 1; k++) {
-                  if (k != j && (k == driver || stree.nodes[k].label == driver))
-                     stree.nodes[k].age = stree.nodes[j].age;
-               }
-            }
+   /* retrieve tmin & rmax from calibrations */
+   for (i = s; i < s21; i++) stree.nodes[i].age = -1;
+   for (i = s; i < s21; i++) {
+      if (stree.nodes[i].fossil == 0)  continue;
+      p = stree.nodes[i].pfossil;
+
+      switch (stree.nodes[i].fossil) {
+      case (LOWER_F):  tmin[i] = p[0];                  break;
+      case (UPPER_F):  tmax[i] = p[1];                  break;
+      case (BOUND_F):  tmin[i] = p[0]; tmax[i] = p[1];  break;
+      case (GAMMA_F):
+         tmin[i] = QuantileGamma(0.025, p[0], p[1]); tmax[i] = QuantileGamma(0.975, p[0], p[1]);
+         break;
+      case (SKEWN_F):
+      case (SKEWT_F):
+      case (S2N_F):
+         error2("implement SkewN, Skewt, S2N now");
+      }
+   }
+   for (i = s; i < s21; i++) {
+      for (j = s; j < s21; j++)
+         if (j != i && pptable[i*s21 + j] && tmax[j] > 0 && tmin[i] > tmax[j]) {
+            printf("\n\ndaughter node %3d minimun age %9.4f \nmother node %3d max age %9.4f\n", i + 1, tmin[i], j + 1, tmax[j]);
+            error2("strange calibration?");
+         }
+   }
+   for (i = s; i < s21; i++) {
+      if (tmin[i])   /* remove duplicated min bounds for ancestors of i */
+         for (j = s; j < s21; j++)
+            if (j != i && pptable[i*s21 + j] && tmin[j] <= tmin[i])
+               tmin[j] = 0;
+      if (tmax[i])   /* remove duplicated max bounds for descendents of i */
+         for (j = s; j < s21; j++) {
+            if (j != i && pptable[j*s21 + i] && tmax[j] >= tmax[i])
+               tmax[j] = 0;
          }
+   }
+   if (debug) {
+      printf("\n\nbounds on nodes after removing duplicated min and max bounds");
+      for (i = s; i < s21; i++) {
+         if(tmin[i] || tmax[i]) 
+            printf("\n%2d:  (%8.4f, %8.4f)", i+1, tmin[i], tmax[i]);
       }
-      if (ncorrections == 0) break;
    }
-   if (ir == 100)
-      puts("i did 100 rounds, but we need more!");
+   /* generate ages for bounded nodes if possible. */
+   for (i = s; i < s21; i++) {
+      if (tmin[i] && tmax[i])
+         stree.nodes[i].age = tmin[i] + (tmax[i] - tmin[i])*rndu();
+   }
+   if (tmax[root] == 0) error2("we need a maximum-age bound on root??");
 
-   /* look for the deepest path from a node to its ancestor, and assign ages to all nodes inbetween. */
-   r = (double*)malloc(s*sizeof(double));
-   if (r == NULL) error2("oom r");
+   /* look for largest unused tmin, and assign ages on path from node to ancestor with age or with tmax. */
+   for (ir = 0; ir < s; ir++) {
+      tbpath[0] = 0;
+      for (i = s; i < s21; i++) {  /* find max tmin and store in tbpath[0]. */
+         if (stree.nodes[i].age == -1 && tbpath[0] < tmin[i]) {
+            tbpath[0] = tmin[i];  from = i;
+         }
+      }
+      if (tbpath[0] == 0) break;
+      maxdepth = 1;
+      for (j = stree.nodes[from].father; j != -1; j = stree.nodes[j].father, maxdepth++) {
+         if (stree.nodes[j].age > 0) 
+            break;
+         else if (tmax[j] > 0) {
+            maxdepth++;
+            break;
+         }
+      }
+      to = j;
+      tbpath[1] = (stree.nodes[to].age > 0 ? stree.nodes[to].age : tmax[to]);
 
-   for (ir = 0; ir < s * 2 - 1; ir++) {
-      for (i = 0, maxdepth = 0; i < s * 2 - 1; i++) {
-         if (stree.nodes[i].age == -1) continue;
-         depth = 0;
-         for (j = stree.nodes[i].father; j != -1; j = stree.nodes[j].father) {
-            if (stree.nodes[j].age > 0) break;
-            depth++;
+      for (j = 0; j < maxdepth; j++) r[j] = rndu();
+      if (maxdepth>1) qsort(r, (size_t)maxdepth, sizeof(double), comparedouble);
+      for (j = from, k = 0; k<maxdepth; j = stree.nodes[j].father, k++) {
+         stree.nodes[j].age = tbpath[0] + (tbpath[1] - tbpath[0])*r[k];
+         driver = (stree.nodes[j].label >= s ? stree.nodes[j].label : (stree.nodes[j].label == -1 ? j : 0));
+         if (driver) {
+            for (k1 = s; k1 < s21; k1++) {
+               if (k1 != j && (k1 == driver || stree.nodes[k1].label == driver))
+                  stree.nodes[k1].age = stree.nodes[j].age;
+            }
+         }
+      }
+      printf("\n*** min-age round %2d: from nodes %3d (%9.5f) to %3d, %3d nodes ", ir+1, from+1, stree.nodes[from].age, to+1, maxdepth);
+      if(debug && s<200) printStree();
+   }
+
+   /* look for longest path, and assign ages on path from node to ancestor with age or with tmax. */
+   for (ir = 0; ir < s; ir++) {
+      maxdepth = 0;
+      for (i = s; i < s21; i++) {
+         if (stree.nodes[i].age > 0) continue;
+         for (j = stree.nodes[i].father, depth = 1; j != -1; j = stree.nodes[j].father, depth++) {
+            if (stree.nodes[j].age > 0) 
+               break;
+            else if (tmax[j] > 0) {
+               depth++;
+               break;
+            }
          }
          if (maxdepth < depth) {
             maxdepth = depth;  from = i;  to = j;
          }
       }
       if (maxdepth == 0) break;  /* all ages are assigned. */
+      int *sons = stree.nodes[from].sons;
+      tbpath[0] = max2(stree.nodes[sons[0]].age, stree.nodes[sons[1]].age);
+      tbpath[1] = (stree.nodes[to].age > 0 ? stree.nodes[to].age : tmax[to]);
+
       for (j = 0; j < maxdepth; j++) r[j] = rndu();
-      qsort(r, (size_t)maxdepth, sizeof(double), comparedouble);
-      agefrom = stree.nodes[from].age;
-      ageto = stree.nodes[to].age;
-      for (j = stree.nodes[from].father, k = 0; j != to; j = stree.nodes[j].father, k++) {
-         stree.nodes[j].age = agefrom + (ageto - agefrom)*r[k];
+      if(maxdepth>1) qsort(r, (size_t)maxdepth, sizeof(double), comparedouble);
+      for (j = from, k = 0; k<maxdepth; j = stree.nodes[j].father, k++) {
+         stree.nodes[j].age = tbpath[0] + (tbpath[1] - tbpath[0])*r[k];
          driver = (stree.nodes[j].label >= s ? stree.nodes[j].label : (stree.nodes[j].label == -1 ? j : 0));
          if (driver) {
-            for (k1 = s; k1 < s * 2 - 1; k1++) {
+            for (k1 = s; k1 < s21; k1++) {
                if (k1 != j && (k1 == driver || stree.nodes[k1].label == driver))
                   stree.nodes[k1].age = stree.nodes[j].age;
             }
          }
       }
+      printf("\n*** longest-path round %2d: from nodes %3d (%9.5f) to %3d, %3d nodes ", ir + 1, from + 1, stree.nodes[from].age, to + 1, maxdepth);
+      if (debug && s<200) printStree();
    }
 
-
    /* check for consistency for all nodes, and correct if needed. */
    for (ir = 0; ir < 100; ir++) {
       ncorrections = 0;
-      for (i = 0; i < s * 2 - 1; i++) {
+      for (i = 0; i < s21; i++) {
          j = stree.nodes[i].father;
          if (j == -1 || stree.nodes[i].age < stree.nodes[j].age) continue;
+         printf("\n*** correction round %2d nodes %2d & %2d, ages %9.5f %9.5f", ir+1, i+1, j+1, stree.nodes[i].age, stree.nodes[j].age);
          ncorrections++;
          stree.nodes[j].age = stree.nodes[i].age * (1.001 + 0.01*rndu());
          /* copy nodes[j].age to all mirrored nodes */
          driver = (stree.nodes[j].label >= s ? stree.nodes[j].label : (stree.nodes[j].label == -1 ? j : 0));
          if (driver) {
-            for (k = s; k < s * 2 - 1; k++) {
+            for (k = s; k < s21; k++) {
                if (k != j && (k == driver || stree.nodes[k].label == driver))
                   stree.nodes[k].age = stree.nodes[j].age;
             }
@@ -1724,10 +1809,8 @@ int GetInitialsTimes(void)
       }
       if (ncorrections == 0) break;
    }
-   if (ir == 100)
-      puts("i did 100 rounds, but we need more!");
 
-   free(r);
+   free(pptable);  free(tmin);  free(r);
    return (0);
 }
 
@@ -3808,14 +3891,14 @@ int DescriptiveStatisticsSimpleMCMCTREE(FILE *fout, char infile[], int nbin)
       }
    }
 
-   printf("\n\nPosterior mean (95%% Equal-tail CI) (95%% HPD CI) HPD-CI-width\n\n");
+   printf("\n\nPosterior means (95%% Equal-tail CI) (95%% HPD CI) HPD-CI-width\n\n");
    for (j = SkipC1; j < p; j++) {
       printf("%-10s ", varstr[j]);
-      printf("%10.4f (%6.4f, %6.4f) (%6.4f, %6.4f) %6.4f", mean[j], x025[j], x975[j], xHPD025[j], xHPD975[j], xHPD975[j] - xHPD025[j]);
+      printf("%10.4f (%7.4f, %7.4f) (%7.4f, %7.4f) %7.4f", mean[j], x025[j], x975[j], xHPD025[j], xHPD975[j], xHPD975[j] - xHPD025[j]);
       if (j < SkipC1 + stree.nspecies - 1) {
          printf("  (Jnode %2d)", 2 * stree.nspecies - 1 - 1 - j + SkipC1);
          if (com.TipDate)
-            printf(" time: %6.2f (%5.2f, %5.2f)", com.TipDate - mean[j] * com.TipDate_TimeUnit,
+            printf(" time: %7.3f (%6.3f, %6.3f)", com.TipDate - mean[j] * com.TipDate_TimeUnit,
                com.TipDate - x975[j] * com.TipDate_TimeUnit, com.TipDate - x025[j] * com.TipDate_TimeUnit);
       }
       printf("\n");
@@ -3827,11 +3910,11 @@ int DescriptiveStatisticsSimpleMCMCTREE(FILE *fout, char infile[], int nbin)
       fprintf(fout, "\n\nPosterior mean (95%% Equal-tail CI) (95%% HPD CI) HPD-CI-width\n\n");
       for (j = SkipC1; j < p; j++) {
          fprintf(fout, "%-10s ", varstr[j]);
-         fprintf(fout, "%10.4f (%6.4f, %6.4f) (%6.4f, %6.4f) %6.4f", mean[j], x025[j], x975[j], xHPD025[j], xHPD975[j], xHPD975[j] - xHPD025[j]);
+         fprintf(fout, "%10.4f (%7.4f, %7.4f) (%7.4f, %7.4f) %7.4f", mean[j], x025[j], x975[j], xHPD025[j], xHPD975[j], xHPD975[j] - xHPD025[j]);
          if (j < SkipC1 + stree.nspecies - 1) {
             fprintf(fout, " (Jnode %2d)", 2 * stree.nspecies - 1 - 1 - j + SkipC1);
             if (com.TipDate)
-               fprintf(fout, " time: %6.2f (%5.2f, %5.2f)", com.TipDate - mean[j] * com.TipDate_TimeUnit,
+               fprintf(fout, " time: %7.3f (%6.3f, %6.3f)", com.TipDate - mean[j] * com.TipDate_TimeUnit,
                   com.TipDate - x975[j] * com.TipDate_TimeUnit, com.TipDate - x025[j] * com.TipDate_TimeUnit);
          }
          fprintf(fout, "\n");
@@ -3867,7 +3950,7 @@ int MCMC(FILE* fout)
    printf("\n%d burnin, sampled every %d, %d samples\n",
       mcmc.burnin, mcmc.sampfreq, mcmc.nsample);
    if (mcmc.usedata) puts("Approximating posterior");
-   else             puts("Approximating prior");
+   else              puts("Approximating prior");
 
    printf("(Settings: cleandata=%d  print=%d  saveconP=%d)\n",
       com.cleandata, mcmc.print, mcmc.saveconP);
@@ -3877,9 +3960,10 @@ int MCMC(FILE* fout)
    for (j = 0; j < mcmc.nsteplength; j++)
       mcmc.steplength[j] = 0.001 + 0.1*rndu();
 
+   /*
    printf("\nsteplength values: ");
    matout2(F0, mcmc.steplength, 1, mcmc.nsteplength, 9, 5);
-
+   */
    printStree();
 
    x = (double*)malloc(com.np * 2 * sizeof(double));


=====================================
src/paml.h
=====================================
@@ -19,6 +19,14 @@
 #define FOR(i,n) for(i=0; i<n; i++)
 #define FPN(file) fputc('\n', file)
 #define F0 stdout
+#if !defined(MAX)
+#define MAX(a,b)                            (((a) > (b)) ? (a) : (b))
+#endif
+#if !defined(MIN)
+#define MIN(a,b)                            (((a) < (b)) ? (a) : (b))
+#endif
+#define SQUARE(a)                           ((a)*(a))
+
 #define min2(a,b) ((a)<(b)?(a):(b))
 #define max2(a,b) ((a)>(b)?(a):(b))
 #define swap2(a,b,y) { y=a; a=b; b=y; }
@@ -386,6 +394,6 @@ enum {PrBranch=1, PrNodeNum=2, PrLabel=4, PrNodeStr=8, PrAge=16, PrOmega=32} Out
 
 #define PAML_RELEASE      0
 
-#define pamlVerStr "paml version 4.9i, September 2018"
+#define pamlVerStr "paml version 4.9j, October 2019"
 
 #endif


=====================================
src/pll_avx.c deleted
=====================================
@@ -1,444 +0,0 @@
-/*
-    Copyright (C) 2015 Tomas Flouri
-
-    This program is free software: you can redistribute it and/or modify
-    it under the terms of the GNU Affero General Public License as
-    published by the Free Software Foundation, either version 3 of the
-    License, or (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU Affero General Public License for more details.
-
-    You should have received a copy of the GNU Affero General Public License
-    along with this program.  If not, see <http://www.gnu.org/licenses/>.
-
-    Contact: Tomas Flouri <Tomas.Flouri at h-its.org>,
-    Exelixis Lab, Heidelberg Instutute for Theoretical Studies
-    Schloss-Wolfsbrunnenweg 35, D-69118 Heidelberg, Germany
-*/
-
-#if defined(_MSC_VER)
-static __declspec(align(32)) double lmatrix[4*4];
-static __declspec(align(32)) double rmatrix[4*4];
-#else
-static double lmatrix[4*4] __attribute__ ((aligned (32)));
-static double rmatrix[4*4] __attribute__ ((aligned (32)));
-#endif
-
-void pmat_jc69(double * pmatrix, double t)
-{
-  if (t < -0.0001)
-    printf ("\nt = %.5f in pijJC69", t);
-  
-  if (t < 1e-100)
-  {
-    pmatrix[0]  = 1;
-    pmatrix[1]  = 0;
-    pmatrix[2]  = 0;
-    pmatrix[3]  = 0;
-
-    pmatrix[4]  = 0;
-    pmatrix[5]  = 1;
-    pmatrix[6]  = 0;
-    pmatrix[7]  = 0;
-
-    pmatrix[8]  = 0;
-    pmatrix[9]  = 0;
-    pmatrix[10] = 1;
-    pmatrix[11] = 0;
-
-    pmatrix[12] = 0;
-    pmatrix[13] = 0;
-    pmatrix[14] = 0;
-    pmatrix[15] = 1;
-  }
-  else
-  {
-    double a =  (1 + 3*exp(-4*t/3) ) / 4;
-    double b = (1 - a) / 3;
-
-    pmatrix[0]  = a;
-    pmatrix[1]  = b;
-    pmatrix[2]  = b;
-    pmatrix[3]  = b;
-
-    pmatrix[4]  = b;
-    pmatrix[5]  = a;
-    pmatrix[6]  = b;
-    pmatrix[7]  = b;
-
-    pmatrix[8]  = b;
-    pmatrix[9]  = b;
-    pmatrix[10] = a;
-    pmatrix[11] = b;
-
-    pmatrix[12] = b;
-    pmatrix[13] = b;
-    pmatrix[14] = b;
-    pmatrix[15] = a;
-  }
-}
-
-static void tip_tip(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int n,i;
-
-  __m256d ymm0,ymm1,ymm2;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  for (n = 0; n < sites; ++n)
-  {
-    double * lmat = lmatrix + com.z[lchild][n] * states;
-    double * rmat = rmatrix + com.z[rchild][n] * states;
-
-    // for (i = 0; i < states; ++i)
-    //   clv[i] = lmat[i] * rmat[i];
-    
-    ymm0 = _mm256_load_pd(lmat);
-    ymm1 = _mm256_load_pd(rmat);
-    ymm2 = _mm256_mul_pd(ymm0,ymm1);
-
-    _mm256_store_pd(clv,ymm2);
-
-    clv += states;
-  }
-}
-
-static void tip_tip_amb(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int n,k;
-
-  double lvec[4];
-  double rvec[4];
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-  
-  __m256d xmm0,xmm1,ymm0,ymm1;
-
-  for (n = 0; n < sites; ++n)
-  {
-    double * lmat = lmatrix + (CharaMap[com.z[lchild][n]][0] << 2); // *states;
-    double * rmat = rmatrix + (CharaMap[com.z[rchild][n]][0] << 2); // *states;
-    
-    xmm0 = _mm256_load_pd(lmat);
-    for (k = 1; k < nChara[com.z[lchild][n]]; ++k)
-    {
-      lmat = lmatrix + (CharaMap[com.z[lchild][n]][k] << 2); // *states;
-      xmm1 = _mm256_load_pd(lmat);
-      xmm0 = _mm256_add_pd(xmm0,xmm1);
-    }
-
-    ymm0 = _mm256_load_pd(rmat);
-    for (k = 1; k < nChara[com.z[rchild][n]]; ++k)
-    {
-      rmat = rmatrix + (CharaMap[com.z[rchild][n]][k] << 2); // *states;
-      ymm1 = _mm256_load_pd(rmat);
-      ymm0 = _mm256_add_pd(ymm0,ymm1);
-    }
-
-    xmm1 = _mm256_mul_pd(xmm0,ymm0);
-    _mm256_store_pd(clv,xmm1);
-
-    clv += states;
-  }
-}
-
-static void tip_inner(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int h;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  double * rclv = nodes[rchild].conP;
-  
-  __m256d ymm0,ymm1,ymm2,ymm3,ymm4,ymm5,ymm6,ymm7;
-  __m256d xmm0,xmm4;
-
-  for (h = 0; h < sites; ++h)
-  {
-    double * lmat = lmatrix + com.z[lchild][h] * states;
-    double * rmat = rmatrix;
-
-    xmm4 = _mm256_load_pd(lmat);
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm5 = _mm256_load_pd(rclv);
-    ymm0 = _mm256_mul_pd(ymm4,ymm5);
-
-    rmat += states;
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm1 = _mm256_mul_pd(ymm4,ymm5);
-
-    rmat += states;
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm2 = _mm256_mul_pd(ymm4,ymm5);
-
-    rmat += states;
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm3 = _mm256_mul_pd(ymm4,ymm5);
-
-    /* compute y */
-    ymm4 = _mm256_unpackhi_pd(ymm0,ymm1);
-    ymm5 = _mm256_unpacklo_pd(ymm0,ymm1);
-
-    ymm6 = _mm256_unpackhi_pd(ymm2,ymm3);
-    ymm7 = _mm256_unpacklo_pd(ymm2,ymm3);
-
-    ymm0 = _mm256_add_pd(ymm4,ymm5);
-    ymm1 = _mm256_add_pd(ymm6,ymm7);
-
-    ymm2 = _mm256_permute2f128_pd(ymm0,ymm1, _MM_SHUFFLE(0,2,0,1));
-    ymm3 = _mm256_blend_pd(ymm0,ymm1,12);
-    ymm4 = _mm256_add_pd(ymm2,ymm3);
-
-    /* compute x*y */
-    xmm0 = _mm256_mul_pd(xmm4,ymm4);
-
-    _mm256_store_pd(clv, xmm0);
-
-    clv += states;
-    rclv += states;
-  }
-}
-
-static void tip_inner_amb(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int h,j,k;
-  double y;
-  double x;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  __m256d ymm0,ymm1,ymm2,ymm3,ymm4,ymm5,ymm6,ymm7;
-  __m256d xmm0,xmm1;
-
-  double * rclv = nodes[rchild].conP;
-
-  for (h = 0; h < sites; ++h)
-  {
-    double * lmat = lmatrix + (CharaMap[com.z[lchild][h]][0] << 2); // *states;
-    double * rmat = rmatrix;
-
-    /* compute term for left child (tip) */
-    xmm0 = _mm256_load_pd(lmat);
-    for (k = 1; k < nChara[com.z[lchild][h]]; ++k)
-    {
-      lmat = lmatrix + (CharaMap[com.z[lchild][h]][k] << 2); // *states;
-      xmm1 = _mm256_load_pd(lmat);
-      xmm0 = _mm256_add_pd(xmm0,xmm1);
-    }
-
-    /* compute term for right child (inner) */
-    ymm4 = _mm256_load_pd(rmat);
-    ymm5 = _mm256_load_pd(rclv);
-    ymm0 = _mm256_mul_pd(ymm4,ymm5);
-
-    rmat += states;
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm1 = _mm256_mul_pd(ymm4,ymm5);
-
-    rmat += states;
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm2 = _mm256_mul_pd(ymm4,ymm5);
-
-    rmat += states;
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm3 = _mm256_mul_pd(ymm4,ymm5);
-
-    ymm4 = _mm256_unpackhi_pd(ymm0,ymm1);
-    ymm5 = _mm256_unpacklo_pd(ymm0,ymm1);
-
-    ymm6 = _mm256_unpackhi_pd(ymm2,ymm3);
-    ymm7 = _mm256_unpacklo_pd(ymm2,ymm3);
-
-    ymm0 = _mm256_add_pd(ymm4,ymm5);
-    ymm1 = _mm256_add_pd(ymm6,ymm7);
-
-    ymm2 = _mm256_permute2f128_pd(ymm0,ymm1, _MM_SHUFFLE(0,2,0,1));
-    ymm3 = _mm256_blend_pd(ymm0,ymm1,12);
-    ymm4 = _mm256_add_pd(ymm2,ymm3);
-
-    ymm0 = _mm256_mul_pd(xmm0,ymm4);
-
-    _mm256_store_pd(clv,ymm0);
-
-    clv += states;
-    rclv += states;
-  }
-  
-}
-
-static void inner_inner(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int h;
-
-  __m256d ymm0,ymm1,ymm2,ymm3,ymm4,ymm5,ymm6,ymm7;
-  __m256d xmm0,xmm1,xmm2,xmm3,xmm4,xmm5,xmm6,xmm7;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  double * lclv = nodes[lchild].conP;
-  double * rclv = nodes[rchild].conP;
-
-  for (h = 0; h < sites; ++h)
-  {
-    double * lmat = lmatrix;
-    double * rmat = rmatrix;
-
-    /* compute vector of x */
-    xmm4 = _mm256_load_pd(lmat);
-    xmm5 = _mm256_load_pd(lclv);
-    xmm0 = _mm256_mul_pd(xmm4,xmm5);
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm5 = _mm256_load_pd(rclv);
-    ymm0 = _mm256_mul_pd(ymm4,ymm5);
-
-    lmat += states;
-    rmat += states;
-
-    xmm4 = _mm256_load_pd(lmat);
-    xmm1 = _mm256_mul_pd(xmm4,xmm5);
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm1 = _mm256_mul_pd(ymm4,ymm5);
-
-    lmat += states;
-    rmat += states;
-
-    xmm4 = _mm256_load_pd(lmat);
-    xmm2 = _mm256_mul_pd(xmm4,xmm5);
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm2 = _mm256_mul_pd(ymm4,ymm5);
-
-    lmat += states;
-    rmat += states;
-
-    xmm4 = _mm256_load_pd(lmat);
-    xmm3 = _mm256_mul_pd(xmm4,xmm5);
-
-    ymm4 = _mm256_load_pd(rmat);
-    ymm3 = _mm256_mul_pd(ymm4,ymm5);
-
-    /* compute x */
-    xmm4 = _mm256_unpackhi_pd(xmm0,xmm1);
-    xmm5 = _mm256_unpacklo_pd(xmm0,xmm1);
-
-    xmm6 = _mm256_unpackhi_pd(xmm2,xmm3);
-    xmm7 = _mm256_unpacklo_pd(xmm2,xmm3);
-
-    xmm0 = _mm256_add_pd(xmm4,xmm5);
-    xmm1 = _mm256_add_pd(xmm6,xmm7);
-
-    xmm2 = _mm256_permute2f128_pd(xmm0,xmm1, _MM_SHUFFLE(0,2,0,1));
-    xmm3 = _mm256_blend_pd(xmm0,xmm1,12);
-    xmm4 = _mm256_add_pd(xmm2,xmm3);
-
-    /* compute y */
-    ymm4 = _mm256_unpackhi_pd(ymm0,ymm1);
-    ymm5 = _mm256_unpacklo_pd(ymm0,ymm1);
-
-    ymm6 = _mm256_unpackhi_pd(ymm2,ymm3);
-    ymm7 = _mm256_unpacklo_pd(ymm2,ymm3);
-
-    ymm0 = _mm256_add_pd(ymm4,ymm5);
-    ymm1 = _mm256_add_pd(ymm6,ymm7);
-
-    ymm2 = _mm256_permute2f128_pd(ymm0,ymm1, _MM_SHUFFLE(0,2,0,1));
-    ymm3 = _mm256_blend_pd(ymm0,ymm1,12);
-    ymm4 = _mm256_add_pd(ymm2,ymm3);
-
-    /* compute x*y */
-    xmm0 = _mm256_mul_pd(xmm4,ymm4);
-
-    _mm256_store_pd(clv, xmm0);
-
-    clv += states;
-    lclv += states;
-    rclv += states;
-  }
-}
-
-int ConditonalPNode (int inode)
-{
-  int states = 4;
-  int sites = com.npatt;
-  int i;
-  int child, lchild, rchild;
-
-  /* recursive call the ConditionalPNode on all children of the current node */
-  for (i = 0; i < nodes[inode].nson; ++i)
-  {
-    /* get id of i-th child of current node */
-    child = nodes[inode].sons[i];
-
-    /* if child is an inner node then recursively compute its conditional
-       probabilities vector */
-    if (nodes[child].nson > 0 && (!mcmc.saveconP || !com.oldconP[child]))
-      ConditonalPNode(nodes[inode].sons[i]);
-  }
-
-  /* initialize CLV entries of current node to 1 */
-  int n = sites * states;
-  for (i = 0; i < n; ++i)
-    nodes[inode].conP[i] = 1;
-
-  if (nodes[inode].nson == 0) return (0);
-
-  lchild = nodes[inode].sons[0];
-  rchild = nodes[inode].sons[1];
-
-  int ltip = (nodes[lchild].nson == 0);
-  int rtip = (nodes[rchild].nson == 0);
-
-  if (ltip && rtip)
-  {
-    if (com.cleandata)
-      tip_tip(lchild,rchild,nodes[inode].conP);
-    else
-      tip_tip_amb(lchild,rchild,nodes[inode].conP);
-  }
-  else if (ltip)
-  {
-    if (com.cleandata)
-      tip_inner(lchild,rchild,nodes[inode].conP);
-    else
-      tip_inner_amb(lchild,rchild,nodes[inode].conP);
-  }
-  else if (rtip)
-  {
-    if (com.cleandata)
-      tip_inner(rchild,lchild,nodes[inode].conP);
-    else
-      tip_inner_amb(rchild,lchild,nodes[inode].conP);
-  }
-  else
-    inner_inner(lchild,rchild,nodes[inode].conP);
-
-  return (0);
-}


=====================================
src/pll_sse.c deleted
=====================================
@@ -1,608 +0,0 @@
-/*
-    Copyright (C) 2015 Tomas Flouri
-
-    This program is free software: you can redistribute it and/or modify
-    it under the terms of the GNU Affero General Public License as
-    published by the Free Software Foundation, either version 3 of the
-    License, or (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU Affero General Public License for more details.
-
-    You should have received a copy of the GNU Affero General Public License
-    along with this program.  If not, see <http://www.gnu.org/licenses/>.
-
-    Contact: Tomas Flouri <Tomas.Flouri at h-its.org>,
-    Exelixis Lab, Heidelberg Instutute for Theoretical Studies
-    Schloss-Wolfsbrunnenweg 35, D-69118 Heidelberg, Germany
-*/
-
-#if defined(_MSC_VER)
-static __declspec(align(16)) double lmatrix[4*4];
-static __declspec(align(16)) double rmatrix[4*4];
-#else
-static double lmatrix[4*4] __attribute__ ((aligned (16)));
-static double rmatrix[4*4] __attribute__ ((aligned (16)));
-#endif
-
-void pmat_jc69(double * pmatrix, double t)
-{
-  if (t < -0.0001)
-    printf ("\nt = %.5f in pijJC69", t);
-  
-  if (t < 1e-100)
-  {
-    pmatrix[0]  = 1;
-    pmatrix[1]  = 0;
-    pmatrix[2]  = 0;
-    pmatrix[3]  = 0;
-
-    pmatrix[4]  = 0;
-    pmatrix[5]  = 1;
-    pmatrix[6]  = 0;
-    pmatrix[7]  = 0;
-
-    pmatrix[8]  = 0;
-    pmatrix[9]  = 0;
-    pmatrix[10] = 1;
-    pmatrix[11] = 0;
-
-    pmatrix[12] = 0;
-    pmatrix[13] = 0;
-    pmatrix[14] = 0;
-    pmatrix[15] = 1;
-  }
-  else
-  {
-    double a =  (1 + 3*exp(-4*t/3) ) / 4;
-    double b = (1 - a) / 3;
-
-    pmatrix[0]  = a;
-    pmatrix[1]  = b;
-    pmatrix[2]  = b;
-    pmatrix[3]  = b;
-
-    pmatrix[4]  = b;
-    pmatrix[5]  = a;
-    pmatrix[6]  = b;
-    pmatrix[7]  = b;
-
-    pmatrix[8]  = b;
-    pmatrix[9]  = b;
-    pmatrix[10] = a;
-    pmatrix[11] = b;
-
-    pmatrix[12] = b;
-    pmatrix[13] = b;
-    pmatrix[14] = b;
-    pmatrix[15] = a;
-  }
-}
-
-static void tip_tip(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int n,i;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  __m128d xmm0,xmm1,xmm2;
-
-  for (n = 0; n < sites; ++n)
-  {
-    double * lmat = lmatrix + (com.z[lchild][n] << 2); // *states;
-    double * rmat = rmatrix + (com.z[rchild][n] << 2); // *states;
-
-    /*
-
-    Vectorization of the following code snippet
-    (only applicable to JC69)
-
-    for (i = 0; i < states; ++i)
-      clv[i] = lmat[i] * rmat[i];
-    
-    */
-
-    xmm0 = _mm_load_pd(lmat);
-    xmm1 = _mm_load_pd(rmat);
-    xmm2 = _mm_mul_pd(xmm0,xmm1);
-
-    _mm_store_pd(clv,xmm2);
-
-    xmm0 = _mm_load_pd(lmat+2);
-    xmm1 = _mm_load_pd(rmat+2);
-    xmm2 = _mm_mul_pd(xmm0,xmm1);
-
-    _mm_store_pd(clv+2,xmm2);
-
-    clv += states;
-  }
-}
-
-static void tip_tip_amb(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int n,k;
-
-  double lvec[4];
-  double rvec[4];
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-  
-  __m128d xmm0,xmm1,xmm2,xmm3;
-  __m128d ymm0,ymm1,ymm2,ymm3;
-
-  for (n = 0; n < sites; ++n)
-  {
-    double * lmat = lmatrix + (CharaMap[com.z[lchild][n]][0] << 2); // *states;
-    double * rmat = rmatrix + (CharaMap[com.z[rchild][n]][0] << 2); // *states;
-    
-    xmm0 = _mm_load_pd(lmat);
-    xmm1 = _mm_load_pd(lmat+2);
-    for (k = 1; k < nChara[com.z[lchild][n]]; ++k)
-    {
-      lmat = lmatrix + (CharaMap[com.z[lchild][n]][k] << 2); // *states;
-      xmm2 = _mm_load_pd(lmat);
-      xmm3 = _mm_load_pd(lmat+2);
-      xmm0 = _mm_add_pd(xmm0,xmm2);
-      xmm1 = _mm_add_pd(xmm1,xmm3);
-    }
-
-    ymm0 = _mm_load_pd(rmat);
-    ymm1 = _mm_load_pd(rmat+2);
-    for (k = 1; k < nChara[com.z[rchild][n]]; ++k)
-    {
-      rmat = rmatrix + (CharaMap[com.z[rchild][n]][k] << 2); // *states;
-      ymm2 = _mm_load_pd(rmat);
-      ymm3 = _mm_load_pd(rmat+2);
-      ymm0 = _mm_add_pd(ymm0,ymm2);
-      ymm1 = _mm_add_pd(ymm1,ymm3);
-    }
-
-    xmm2 = _mm_mul_pd(xmm0,ymm0);
-    xmm3 = _mm_mul_pd(xmm1,ymm1);
-
-    _mm_store_pd(clv,xmm2);
-    _mm_store_pd(clv+2,xmm3);
-
-    clv += states;
-  }
-}
-
-static void tip_inner(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int h,j,k;
-  double y;
-
-  __m128d xmm0,xmm1,xmm2,xmm3,xmm4;
-  __m128d xmm5,xmm6,xmm7,xmm8,xmm9;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  double * rclv = nodes[rchild].conP;
-
-  for (h = 0; h < sites; ++h)
-  {
-    double * lmat = lmatrix + (com.z[lchild][h] << 2); // *states;
-    double * rmat = rmatrix;
-
-    /* 
-    
-    Vectorization of the following code snippet
-
-    for (j = 0; j < states; ++j)
-    {
-      y = 0;
-      for (k = 0; k < states; ++k)
-        y += rmat[k] * rclv[k];
-
-      rmat += states;
-
-      clv[j] = y*lmat[j];
-    }
-    */
-
-    xmm0 = _mm_load_pd(rclv);           /* needed */
-    xmm2 = _mm_load_pd(rmat);
-    xmm3 = _mm_mul_pd(xmm0,xmm2);
-
-    xmm1 = _mm_load_pd(rclv+2);         /* needed */
-    xmm2 = _mm_load_pd(rmat+2);
-    xmm4 = _mm_mul_pd(xmm1,xmm2);
-
-    /* calculate (b1*d1 + b2*d2 | b3*d3 + b4*d4) */
-    xmm2 = _mm_hadd_pd(xmm3,xmm4);      /* needed (1) */
-
-    rmat += states;
-
-    xmm3 = _mm_load_pd(rmat);
-    xmm4 = _mm_mul_pd(xmm0,xmm3);
-
-    xmm5 = _mm_load_pd(rmat+2);
-    xmm6 = _mm_mul_pd(xmm1,xmm5);
-
-    /* calculate (b5*d1 + b6*d2 | b7*d3 + b8*d4) */
-    xmm3 = _mm_hadd_pd(xmm4,xmm6);     /* needed (2) */
-
-    rmat += states;
-
-    xmm4 = _mm_load_pd(rmat);
-    xmm5 = _mm_mul_pd(xmm0,xmm4);
-
-    xmm6 = _mm_load_pd(rmat+2);
-    xmm7 = _mm_mul_pd(xmm1,xmm6);
-
-    /* calculate (b9*d1 + b10*d2 | b11*d3 + b12*d4) */
-    xmm4 = _mm_hadd_pd(xmm5,xmm7);     /* needed (3) */
-
-    rmat += states;
-
-    xmm5 = _mm_load_pd(rmat);
-    xmm6 = _mm_mul_pd(xmm0,xmm5);
-
-    xmm7 = _mm_load_pd(rmat+2);
-    xmm8 = _mm_mul_pd(xmm1,xmm7);
-
-    /* calculate (b13*d1 + b14*d2 | b15*d3 + b16*d4) */
-    xmm5 = _mm_hadd_pd(xmm6,xmm8);     /* needed (4) */
-    
-    /* calculate (b1*d1 + b2*d2 + b3*d3 + b4*d3 |
-                  b5*d1 + b6*d2 + b7*d3 + b8*d4 ) */
-    xmm0 = _mm_hadd_pd(xmm2,xmm3);
-
-    /* calculate (b9*d1 + b10*d2 + b11*d3 + b12*d3 |
-                  b13*d1 + b14*d2 + b15*d3 + b16*d4 ) */
-    xmm1 = _mm_hadd_pd(xmm4,xmm5);
-
-    xmm2 = _mm_load_pd(lmat);
-    xmm3 = _mm_load_pd(lmat+2);
-    xmm4 = _mm_mul_pd(xmm0,xmm2);
-    xmm5 = _mm_mul_pd(xmm1,xmm3);
-
-    _mm_store_pd(clv,xmm4);
-    _mm_store_pd(clv+2,xmm5);
-
-    clv += states;
-    rclv += states;
-  }
-}
-
-static void tip_inner_amb(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int h,k;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  __m128d xmm0,xmm1,xmm2,xmm3;
-  __m128d ymm2,ymm3,ymm4,ymm5,ymm6,ymm7,ymm8,ymm9;
-
-  double * rclv = nodes[rchild].conP;
-
-  for (h = 0; h < sites; ++h)
-  {
-    double * lmat = lmatrix + (CharaMap[com.z[lchild][h]][0] << 2); // *states;
-    double * rmat = rmatrix;
-
-    /* compute term for left child (tip) */
-    xmm0 = _mm_load_pd(lmat);
-    xmm1 = _mm_load_pd(lmat+2);
-    for (k = 1; k < nChara[com.z[lchild][h]]; ++k)
-    {
-      lmat = lmatrix + (CharaMap[com.z[lchild][h]][k] << 2); // *states;
-      xmm2 = _mm_load_pd(lmat);
-      xmm3 = _mm_load_pd(lmat+2);
-      xmm0 = _mm_add_pd(xmm0,xmm2);
-      xmm1 = _mm_add_pd(xmm1,xmm3);
-    }
-
-    /* compute term for right child (inner) */
-    ymm3 = _mm_load_pd(rclv);              /* needed */
-    ymm4 = _mm_load_pd(rmat);
-    ymm6 = _mm_mul_pd(ymm3,ymm4);
-
-    ymm4 = _mm_load_pd(rclv+2);            /* needed */
-    ymm7 = _mm_load_pd(rmat+2);
-    ymm8 = _mm_mul_pd(ymm4,ymm7);
-
-    /* calculate (b1*d1 + b2*d2 | b3*d3 + b4*d4) */
-    ymm5 = _mm_hadd_pd(ymm6,ymm8);         /* needed (2) */
-
-    rmat += states;
-
-    ymm7 = _mm_load_pd(rmat);
-    ymm8 = _mm_mul_pd(ymm3,ymm7);
-
-    ymm7 = _mm_load_pd(rmat+2);
-    ymm9 = _mm_mul_pd(ymm4,ymm7);
-    
-    /* calculate (b5*d1 + b6*d2 | b7*d3 + b8*d4) */
-    ymm7 = _mm_hadd_pd(ymm8,ymm9);         /* needed (4) */
-
-    ymm2 = _mm_hadd_pd(ymm5,ymm7);
-    ymm9 = _mm_mul_pd(xmm0,ymm2);
-    _mm_store_pd(clv, ymm9);
-
-    rmat += states;
-
-    ymm6 = _mm_load_pd(rmat);
-    ymm7 = _mm_mul_pd(ymm3,ymm6);
-
-    ymm6 = _mm_load_pd(rmat+2);
-    ymm8 = _mm_mul_pd(ymm4,ymm6);
-    
-    /* calculate (b9*d1 + b10*d2 | b11*d3 + b12*d4) */
-    ymm5 = _mm_hadd_pd(ymm7,ymm8);         /* needed (2) */
-    
-    rmat += states;
-
-    ymm7 = _mm_load_pd(rmat);
-    ymm8 = _mm_mul_pd(ymm3,ymm7);
-
-    ymm7 = _mm_load_pd(rmat+2);
-    ymm9 = _mm_mul_pd(ymm4,ymm7);
-    
-    /* calculate (b13*d1 + b14*d2 | b15*d3 + b16*d4) */
-    ymm7 = _mm_hadd_pd(ymm8,ymm9);         /* needed (4) */
-
-    ymm2 = _mm_hadd_pd(ymm5,ymm7);
-    ymm9 = _mm_mul_pd(xmm1,ymm2);
-    _mm_store_pd(clv+2, ymm9);
-
-    clv += states;
-    rclv += states;
-  }
-}
-
-static void inner_inner(int lchild, int rchild, double * clv)
-{
-  int sites = com.npatt;
-  int states = 4;
-  int h,j,k;
-  double x,y;
-
-  pmat_jc69(lmatrix, nodes[lchild].branch);
-  pmat_jc69(rmatrix, nodes[rchild].branch);
-
-  double * lclv = nodes[lchild].conP;
-  double * rclv = nodes[rchild].conP;
-
-  __m128d xmm0,xmm1,xmm2,xmm3,xmm4;
-  __m128d xmm5,xmm6,xmm7,xmm8,xmm9;
-
-  /* 
-
-  +-----+-----+-----+-----+     +-----+-----+-----+-----+
-  | a1  | a2  | a3  | a4  |     | b1  | b2  | b3  | b4  |
-  +-----+-----+-----+-----+     +-----+-----+-----+-----+
-  | a5  | a6  | a7  | a8  |     | b5  | b6  | b7  | b8  |
-  +-----+-----+-----+-----+     +-----+-----+-----+-----+
-  | a9  | a10 | a11 | a12 |     | b9  | b10 | b11 | b12 |
-  +-----+-----+-----+-----+     +-----+-----+-----+-----+
-  | a13 | a14 | a15 | a16 |     | b13 | b14 | b15 | b16 |
-  +-----+-----+-----+-----+     +-----+-----+-----+-----+
-    
-              x                             x
-
-    +----+----+----+----+         +----+----+----+----+
-    | c1 | c2 | c3 | c4 |         | d1 | d2 | d3 | d4 |
-    +----+----+----+----+         +----+----+----+----+
-
-  */
-
-  for (h = 0; h < sites; ++h)
-  {
-    double * lmat = lmatrix;
-    double * rmat = rmatrix;
-
-
-    /* 
-    
-    Vectorization of the following code snippet
-
-    for (j = 0; j < states; ++j)
-    {
-      x = 0;
-      y = 0;
-      for (k = 0; k < states; ++k)
-      {
-        x += lmat[k] * lclv[k];
-        y += rmat[k] * rclv[k];
-      }
-
-      lmat += states;
-      rmat += states;
-
-      clv[j] = x*y;
-    }
-
-    */
-
-    /* compute left */
-    xmm0 = _mm_load_pd(lclv);              /* needed */
-    xmm2 = _mm_load_pd(lmat);
-    xmm3 = _mm_mul_pd(xmm0,xmm2);
-
-    xmm1 = _mm_load_pd(lclv+2);            /* needed */
-    xmm2 = _mm_load_pd(lmat+2);
-    xmm4 = _mm_mul_pd(xmm1,xmm2);
-
-    /* calculate (a1*c1 + a2*c2 | a3*c3 + a4*c4) */
-    xmm2 = _mm_hadd_pd(xmm3,xmm4);         /* needed (1) */
-
-    /* compute right */
-    xmm3 = _mm_load_pd(rclv);              /* needed */
-    xmm4 = _mm_load_pd(rmat);
-    xmm6 = _mm_mul_pd(xmm3,xmm4);
-
-    xmm4 = _mm_load_pd(rclv+2);            /* needed */
-    xmm7 = _mm_load_pd(rmat+2);
-    xmm8 = _mm_mul_pd(xmm4,xmm7);
-
-    /* calculate (b1*d1 + b2*d2 | b3*d3 + b4*d4) */
-    xmm5 = _mm_hadd_pd(xmm6,xmm8);         /* needed (2) */
-
-    rmat += states;
-    lmat += states;
-
-    /* compute left */
-    xmm6 = _mm_load_pd(lmat);
-    xmm7 = _mm_mul_pd(xmm0,xmm6);
-
-    xmm6 = _mm_load_pd(lmat+2);
-    xmm8 = _mm_mul_pd(xmm1,xmm6);
-
-    /* calculate (a5*c1 + a6*c2 | a7*c3 + a8*c4) */
-    xmm6 = _mm_hadd_pd(xmm7,xmm8);         /* needed (3) */
-
-    /* compute right */
-    xmm7 = _mm_load_pd(rmat);
-    xmm8 = _mm_mul_pd(xmm3,xmm7);
-
-    xmm7 = _mm_load_pd(rmat+2);
-    xmm9 = _mm_mul_pd(xmm4,xmm7);
-    
-    /* calculate (b5*d1 + b6*d2 | b7*d3 + b8*d4) */
-    xmm7 = _mm_hadd_pd(xmm8,xmm9);         /* needed (4) */
-    
-    xmm8 = _mm_hadd_pd(xmm2,xmm6);
-    xmm2 = _mm_hadd_pd(xmm5,xmm7);
-    xmm9 = _mm_mul_pd(xmm8,xmm2);
-    _mm_store_pd(clv, xmm9);
-
-    rmat += states;
-    lmat += states;
-
-    /* compute left */
-    xmm5 = _mm_load_pd(lmat);
-    xmm6 = _mm_mul_pd(xmm0,xmm5);
-
-    xmm5 = _mm_load_pd(lmat+2);
-    xmm7 = _mm_mul_pd(xmm1,xmm5);
-
-    /* calculate (a9*c1 + a10*c2 | a11*c3 + a12*c4) */
-    xmm2 = _mm_hadd_pd(xmm6,xmm7);         /* needed (1) */
-
-    /* compute right */
-    xmm6 = _mm_load_pd(rmat);
-    xmm7 = _mm_mul_pd(xmm3,xmm6);
-
-    xmm6 = _mm_load_pd(rmat+2);
-    xmm8 = _mm_mul_pd(xmm4,xmm6);
-    
-    /* calculate (b9*d1 + b10*d2 | b11*d3 + b12*d4) */
-    xmm5 = _mm_hadd_pd(xmm7,xmm8);         /* needed (2) */
-    
-    rmat += states;
-    lmat += states;
-
-    /* compute left */
-    xmm6 = _mm_load_pd(lmat);
-    xmm7 = _mm_mul_pd(xmm0,xmm6);
-
-    xmm6 = _mm_load_pd(lmat+2);
-    xmm8 = _mm_mul_pd(xmm1,xmm6);
-
-    /* calculate (a13*c1 + a14*c2 | a15*c3 + a16*c4) */
-    xmm6 = _mm_hadd_pd(xmm7,xmm8);         /* needed (3) */
-
-    /* compute right */
-    xmm7 = _mm_load_pd(rmat);
-    xmm8 = _mm_mul_pd(xmm3,xmm7);
-
-    xmm7 = _mm_load_pd(rmat+2);
-    xmm9 = _mm_mul_pd(xmm4,xmm7);
-    
-    /* calculate (b13*d1 + b14*d2 | b15*d3 + b16*d4) */
-    xmm7 = _mm_hadd_pd(xmm8,xmm9);         /* needed (4) */
-    
-    xmm8 = _mm_hadd_pd(xmm2,xmm6);
-    xmm2 = _mm_hadd_pd(xmm5,xmm7);
-    xmm9 = _mm_mul_pd(xmm8,xmm2);
-    _mm_store_pd(clv+2, xmm9);
-
-    clv  += states;
-    lclv += states;
-    rclv += states;
-  }
-}
-
-int ConditonalPNode (int inode)
-{
-  int states = 4;
-  int sites = com.npatt;
-  int i;
-  int child, lchild, rchild;
-
-  /* recursive call the ConditionalPNode on all children of the current node */
-  for (i = 0; i < nodes[inode].nson; ++i)
-  {
-    /* get id of i-th child of current node */
-    child = nodes[inode].sons[i];
-
-    /* if child is an inner node then recursively compute its conditional
-       probabilities vector */
-    if (nodes[child].nson > 0 && (!mcmc.saveconP || !com.oldconP[child]))
-      ConditonalPNode(nodes[inode].sons[i]);
-  }
-
-  /* initialize CLV entries of current node to 1 */
-  int n = sites * states;
-  for (i = 0; i < n; ++i)
-    nodes[inode].conP[i] = 1;
-
-  if (nodes[inode].nson == 0) return (0);
-
-  lchild = nodes[inode].sons[0];
-  rchild = nodes[inode].sons[1];
-
-  int ltip = (nodes[lchild].nson == 0);
-  int rtip = (nodes[rchild].nson == 0);
-
-  if (ltip && rtip)
-  {
-    if (com.cleandata)
-      tip_tip(lchild,rchild,nodes[inode].conP);
-    else
-    {
-      //assert(0);
-      tip_tip_amb(lchild,rchild,nodes[inode].conP);
-    }
-  }
-  else if (ltip)
-  {
-    if (com.cleandata)
-      tip_inner(lchild,rchild,nodes[inode].conP);
-    else
-    {
-      tip_inner_amb(lchild,rchild,nodes[inode].conP);
-      //assert(0);
-    }
-  }
-  else if (rtip)
-  {
-    if (com.cleandata)
-      tip_inner(rchild,lchild,nodes[inode].conP);
-    else
-    {
-      //assert(0);
-      tip_inner_amb(rchild,lchild,nodes[inode].conP);
-    }
-  }
-
-  else
-    inner_inner(lchild,rchild,nodes[inode].conP);
-
-  return (0);
-}


=====================================
src/tools.c
=====================================
@@ -1756,7 +1756,8 @@ double rndLaplace(void) {
 
 double rndt2(void)
 {
-   /* Standard Student's t_2 variate, with d.f. = 2.  t2 has mean 0 and variance infinity. */
+   /* Standard Student's t_2 variate, with d.f. = 2.  t2 has mean 0 and variance infinity. 
+   */
    double u, t2;
 
    u = 2 * rndu() - 1;
@@ -1772,9 +1773,9 @@ double rndt4(void)
       This has variance 1, and is the standard t4 variate divided by sqrt(2).
       The standard t4 variate has variance 2.
    */
-   double u, v, w, c2, r2, t4, sqrt2 = 0.707106781;
+   double u, v, w, c2, r2, t4, sqrt2 = 0.7071067811865475244;
 
-   for (; ; ) {
+   for ( ; ; ) {
       u = 2 * rndu() - 1;
       v = 2 * rndu() - 1;
       w = u*u + v*v;
@@ -4525,7 +4526,7 @@ double Binomial(double n, int k, double *scale)
    /* calculates (n choose k), where n is any real number, and k is integer.
       If(*scale!=0) the result should be c+exp(*scale).
    */
-   double c = 1, i, large = 1e99;
+   double c = 1, i, large = 1e200;
 
    *scale = 0;
    if ((int)k != k)
@@ -5676,12 +5677,12 @@ int DescriptiveStatistics(FILE *fout, char infile[], int nbin, int propternary,
       for (j = 0; j < p; j++)
          fscanf(fin, "%lf", &data[j*n + i]);
    fclose(fin);
-
+/*
    if(p>1) {
       printf("\nGreat offer!  I can smooth a few 2-D densities for free.  How many do you want? ");
       scanf("%d", &nf2d);
    }
-   
+*/   
    if (nf2d > MAXNF2D) error2("I don't want to do that many!");
    for (i = 0; i < nf2d; i++) {
       printf("pair #%d (e.g., type  1 3  to use variables #1 and #3)? ", i + 1);
@@ -5735,6 +5736,8 @@ int DescriptiveStatistics(FILE *fout, char infile[], int nbin, int propternary,
    fprintf(fout, "\n\n");
    fflush(fout);
 
+return(0);
+
    fprintf(fout, "\nCorrelation matrix");
    for (j = SkipColumns; j < p; j++) {
       fprintf(fout, "\n%-8s ", varstr[j]);


=====================================
src/treesub.c
=====================================
@@ -936,7 +936,7 @@ int ReadSeq(FILE *fout, FILE *fseq, int cleandata, int locus)
 
 
       /***** GIBBON, 2017.10.8 **/
-      //return(0);
+      /* return(0); */
 
       if (!com.readpattern) {
          PatternWeight();
@@ -2907,9 +2907,9 @@ int GetTreeFileType(FILE *ftree, int *ntree, int *pauptree, int shortform)
 int PopPaupTree(FILE *ftree);
 int PopPaupTree(FILE *ftree)
 {
-   /* This reads out the string in front of the tree in the nexus format,
-    typically "tree PAUP_1 = [&U]" with "[&U]" optional
-    */
+/* This reads out the string in front of the tree in the nexus format,
+   typically "tree PAUP_1 = [&U]" with "[&U]" optional
+*/
    int ch;
 
    for (; ;) {
@@ -2927,12 +2927,12 @@ int PopPaupTree(FILE *ftree)
 
 void DownTreeCladeLabel(int inode, int label)
 {
-   /* This goes down the tree to change $ labels (stored in nodes[].label2) into
-      # labels (stored in nodes[].label).  To deal with nested clade labels,
-      branches within a clade are labeled by negative numbers initially, and
-      converted to positive labels at the end of the algorithm.
-      nodes[].label and nodes[].label2 are initialized to -1 before this routine is called.
-   */
+/* This goes down the tree to change $ labels (stored in nodes[].label2) into
+   # labels (stored in nodes[].label).  To deal with nested clade labels,
+   branches within a clade are labeled by negative numbers initially, and
+   converted to positive labels at the end of the algorithm.
+   nodes[].label and nodes[].label2 are initialized to -1 before this routine is called.
+*/
    int i;
 
    if (nodes[inode].label2 != -1)
@@ -3130,7 +3130,7 @@ int ReadTreeN(FILE *ftree, int *haslength, int copyname, int popline)
       else {     /* label or annotation (node can be tip or internal node) */
          if (strchr(quote[0], ch)) {
             k = ReadUntil(ftree, line, quote[1], lline);
-            ch = fgetc(ftree);   /* pop off '"  */
+            ch = fgetc(ftree);   /* pop off ' "  */
          }
          else {
             ungetc(ch, ftree);
@@ -3140,8 +3140,10 @@ int ReadTreeN(FILE *ftree, int *haslength, int copyname, int popline)
          if (nodes[inode].annotation == NULL) error2("oom annotation");
          strcpy(nodes[inode].annotation, line);
          /* This line is used in MCcoal.  ProcessNodeAnnotation is used to process node annotation elsewhere. */
-         if (line[0] == '#')
+         if (line[0] == '#')        /* branch labels used in baseml and codeml */
             sscanf(line + 1, "%lf", &nodes[inode].label);
+         else if (line[0] == '$')   /* clade labels used to label branches in baseml and codeml */
+            sscanf(line + 1, "%lf", &nodes[inode].label2);
       }
    }   /* for( ; ; ) */
 
@@ -3195,9 +3197,15 @@ int OutSubTreeN(FILE *fout, int inode, int spnames, int printopt)
    }
    if ((printopt & PrNodeNum) && nodes[inode].nson)
       fprintf(fout, " %d ", inode + 1);
+
+#if (defined BPP)
+   if ((printopt & PrNodeStr) && inode < com.ns && nodes[inode].label > 0)
+      fprintf(fout, " [&theta=%.6f]", nodes[inode].label);
+#else
    if ((printopt & PrLabel) && nodes[inode].label > 0)
       fprintf(fout, " #%.6f", nodes[inode].label);
    /* fprintf(fout, " [&label=%.6f]", nodes[inode].label); */
+#endif
 
    if ((printopt & PrAge) && nodes[inode].age)
       fprintf(fout, " @%.6f", nodes[inode].age);
@@ -8544,10 +8552,9 @@ int ProcessNodeAnnotation(int *haslabel)
          stree.nodes[i].fossil = j;
          pch2 = strchr(pch, braces[0]);
          if (pch2 == NULL) pch2 = strchr(pch, braces[1]);
-         if (pch2)
-           pch = pch2 + 1;
-         else
-            error2("did not find left brace");
+         if (pch2 == NULL)
+           error2("did not find left brace");
+         pch = pch2 + 1;
 
          switch (j) {
          case (LOWER_F):
@@ -8693,7 +8700,8 @@ int ProcessNodeAnnotation(int *haslabel)
    for (i = 0, com.nbtype = 0; i < tree.nnode; i++) {
       if (i == tree.root) continue;
       j = (int)nodes[i].label;
-      if (j + 1 > com.nbtype)  com.nbtype = j + 1;
+      if (j + 1 > com.nbtype) 
+         com.nbtype = j + 1;
       if (j<0 || j>tree.nbranch - 1)
          error2("branch label in the tree (note labels start from 0 and are consecutive)");
    }
@@ -8953,13 +8961,17 @@ void copySptree(void)
 
 void printStree(void)
 {
-   int i, j, k;
+   int i, j, k, maxlname=10;
+
+   for (i = 0; i < stree.nspecies; i++)
+      if ((k = (int)strlen(stree.nodes[i].name)) > maxlname) 
+         maxlname = k;
 
    printf("\n************\nSpecies tree\nns = %d  nnode = %d", stree.nspecies, stree.nnode);
-   printf("\n%7s%7s  %-16s %8s %8s%16s\n", "father", "node", "name", "time", "sons", "fossil");
+   printf("\n%7s%7s  %-*s %9s %8s%16s\n", "father", "node", maxlname+2, "name", "time", "sons", "fossil");
    for (i = 0; i < stree.nnode; i++) {
-      printf("%7d%7d  %-20s %7.5f",
-         stree.nodes[i].father + 1, i + 1, stree.nodes[i].name, stree.nodes[i].age);
+      printf("%7d%7d  %-*s %8.5f",
+         stree.nodes[i].father + 1, i + 1, maxlname+6, stree.nodes[i].name, stree.nodes[i].age);
       if (stree.nodes[i].nson)
          printf("  (%2d %2d)", stree.nodes[i].sons[0] + 1, stree.nodes[i].sons[1] + 1);
 



View it on GitLab: https://salsa.debian.org/med-team/paml/commit/cdd16ab790640866f7c86c763d62de228be2bf99

-- 
View it on GitLab: https://salsa.debian.org/med-team/paml/commit/cdd16ab790640866f7c86c763d62de228be2bf99
You're receiving this email because of your account on salsa.debian.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20191110/331fadab/attachment-0001.html>


More information about the debian-med-commit mailing list