[med-svn] [iqtree] 03/04: Add latex source of manual provided via mail by upstream

Andreas Tille tille at debian.org
Wed Sep 2 11:51:53 UTC 2015


This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository iqtree.

commit b26862ae170b41c11dc70b39e7ac638b6e2bba4d
Author: Andreas Tille <tille at debian.org>
Date:   Wed Sep 2 13:29:43 2015 +0200

    Add latex source of manual provided via mail by upstream
---
 debian/Documents_source/iqtree-manual-1.0.tex | 993 ++++++++++++++++++++++++++
 1 file changed, 993 insertions(+)

diff --git a/debian/Documents_source/iqtree-manual-1.0.tex b/debian/Documents_source/iqtree-manual-1.0.tex
new file mode 100644
index 0000000..b1b8411
--- /dev/null
+++ b/debian/Documents_source/iqtree-manual-1.0.tex
@@ -0,0 +1,993 @@
+
+\documentclass[a4paper,11pt]{article}
+%\documentclass{article}
+
+  \usepackage{graphicx}
+  \usepackage{color}
+  \usepackage[round]{natbib}
+  %\usepackage{url}
+  \usepackage{hyperref}
+
+  %\newcommand{\xxx}{\rule{10mm}{1ex}}
+  %\hyphenation{IN-FILE-NAME PUZZLE}
+
+  %\sloppy
+
+\hoffset        -1in %% Initialization of documents is with a horizontal and  
+\voffset        -1in %% a vertical offset of one inch. %%
+%\raggedright %% Prevents horizontal block format. %%
+\setlength{\parindent}{0cm} %% Paragraphs are not indented. %%
+\setlength{\parskip}{0.3cm} %% Vertical space of 0.3 cm between paragraphs. %%
+\setlength{\oddsidemargin}{1.1in} %% Defines the left margin on odd-numbered pages. %%
+\setlength{\evensidemargin}{1.1in} %% Defines the left margin on even-numbered pages. %% 
+\setlength{\topmargin}{1mm} %% Space between top of page and header. %%
+\setlength{\headheight}{30mm} %% Height of header. %%
+\setlength{\textwidth}{154mm} %% Width of text. %%
+\setlength{\textheight}{215mm} %% Height of text. %%
+
+\newcommand{\iqtree}{$\mathcal{IQ-TREE}$}
+
+\begin{document}
+\begin{titlepage}
+
+\noindent
+\hfill
+\begin{center}
+\begin{LARGE}\textbf{IQ-TREE version 1.0 (July 2014)\\[2ex]Fast phylogenetic inference and ultrafast bootstrap analysis by maximum likelihood.}
+\end{LARGE}
+\end{center}
+%\hfill~
+\vfill
+
+\begin{center}
+\begin{LARGE}User Manual and Tutorial
+\end{LARGE}
+
+
+\vfill
+
+\begin{tabular}{ll}
+%\small Copyright (C) 2012-2013 by & \small Bui Quang Minh, Lam-Tung Nguyen, Heiko A. Schmidt, \\ 
+%								& and \small Arndt von Haeseler \\
+\end{tabular}
+\end{center}
+
+\begin{LARGE}Please read carefully before using IQ-TREE the first time!
+\end{LARGE}
+
+\vfill
+
+\begin{description}
+\item[Project managers:] ~\\
+Bui Quang Minh - \texttt{minh.bui(at)mfpl.ac.at}
+
+Arndt von Haeseler - \texttt{arndt.von.haeseler(at)mfpl.ac.at}
+
+\item [Core developers:] ~\\
+Lam-Tung Nguyen - \texttt{tung.nguyen(at)mfpl.ac.at}
+
+Olga Chernomor - \texttt{olga.chernomor(at)mfpl.ac.at}
+
+Diep Thi Hoang - \texttt{diep.thi.hoang(at)gmail.com}
+
+\item [Support:] ~\\
+Heiko A. Schmidt - \texttt{heiko.schmidt(at)mfpl.ac.at}
+
+\item [Contact address:] ~\\
+Center for Integrative Bioinformatics Vienna (CIBIV)\\
+Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna\\
+   Dr. Bohr-Gasse 9, A-1030 Vienna, Austria\\
+
+\end{description}
+
+
+
+\vfill
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\textbf{License Agreement}
+\label{Legal Stuff}
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or   
+(at your option) any later version. 
+
+This program is distributed in the hope that it will be useful, but    
+WITHOUT ANY WARRANTY; without even the implied warranty of             
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU       
+General Public License for more details.                               
+
+
+\vfill
+
+\end{titlepage}
+
+
+\tableofcontents
+\clearpage
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Introduction}
+\label{introduction}
+
+
+IQ-TREE is an efficient program for reconstructing large maximum likelihood trees and
+ assessing branch supports with the ultrafast bootstrap approximation.
+ IQ-TREE extends the IQPNNI algorithm with many enhancements.
+IQ-TREE is open-source and available free of charge from
+
+\url{http://www.cibiv.at/software/iqtree/}
+
+IQ-TREE has been tested on Unix, Mac OS X, and Windows.
+IQ-TREE is written in standard C/C++, so it may also compile 
+on other platforms.
+Please read the \emph{Installation} section (Section~\ref{Installation}) for more 
+details.
+We suggest reading this documentation before using IQ-TREE
+the first time!
+
+For impatient users we established a very user-friendly web server:
+
+\url{http://iqtree.cibiv.univie.ac.at}
+
+Its intuitive web interface allows users to perform online tree reconstruction within a few clicks.
+Note that this online service allows at most 12 CPU hours and 1 GB of memory per job.
+If your job exceeds these limits, you can copy and paste the displayed command line to 
+run the analysis on your local machine.
+
+To cite IQ-TREE please use the following paper:
+
+\textbf{Bui Quang Minh, Minh Anh Thi Nguyen, and Arndt von Haeseler} (2013) Ultrafast approximation for phylogenetic bootstrap. \emph{Mol. Biol. Evol.}, 30:1188-1195.
+
+
+%A manuscript was submitted:
+
+%\textbf{Lam-Tung Nguyen, Heiko A. Schmidt, Arndt von Haeseler, and Bui Quang Minh} (2014) IQ-TREE: A fast and
+%effective stochastic algorithm for estimating maximum likelihood phylogenies.
+    
+Further readings on the methods developed:
+
+\begin{itemize}
+\item \textbf{Heiko A. Schmidt and Arndt von Haeseler} (2009) Phylogenetic Inference Using Maximum Likelihood Methods. In P. Lemey, M. Salemi, A.M. Vandamme (eds.) \emph{The Phylogenetic Handbook: a Practical Approach to Phylogenetic Analysis and Hypothesis Testing}, 2nd Edition, 181-209, Cambridge University Press, Cambridge.
+
+\item \textbf{Bui Quang Minh, Le Sy Vinh, Arndt von Haeseler and Heiko A. Schmidt} (2005) pIQPNNI: Parallel reconstruction of large maximum likelihood phylogenies. \emph{Bioinformatics}, 21(19):3794-6. 
+
+\item \textbf{Le Sy Vinh and Arndt von Haeseler} (2004) IQPNNI: Moving fast through tree space and stopping in time. \emph{Mol. Biol. Evol.}, 21(8):1565-1571.
+
+\end{itemize}    
+    
+%\item Tung-Lam Nguyen, Heiko A. Schmidt, Bui Quang Minh, and Arndt von Haeseler (2012) IQ-TREE: Efficient algorithm
+%for phylogenetic inference by maximum likelihood and important quartet puzzling. \emph{In prep.}
+
+If you encounter bugs please send the \texttt{.log} file of the run and possibly the alignment to: \texttt{tung.nguyen(AT)univie.ac.at}  and \texttt{minh.bui(AT)univie.ac.at}.
+
+%============================================%
+\subsection{What's new in version 1.0?}
+\label{whatnews}
+
+Version 1.0 is a major release of the IQ-TREE software. We are happy to announce the following new features:
+\begin{itemize}
+\item Integration of the phylogenetic likelihood library \citep[PLL; ][]{tomas2014} for fast likelihood computation. This is enabled via the \texttt{-pll} option and gives a speedup of 2X to 8X.
+\item A novel fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. It outperforms RAxML and PhyML in terms of log-likelihoods while requiring a similar amount of computing time. A manuscript describing the new method has been submitted. See Section \ref{sec.new-tree-search} for more details.
+\item Codon models: GY (Goldman \& Yang 1994), MG (Muse \& Gaut 1994), and ECM (Kosiol et al. 2007).
+\item Morphological models: MK and ORDERED (Lewis 2001).
+\item Ascertainment bias correction model (+ASC), e.g., for morphological or SNP data (Lewis 2001).
+\item Nearest neighbor interchange with optimization of five branches instead of one (\texttt{-nni5}) is now the default because of its higher
+accuracy.
+\item The SH-aLRT branch test now also works for partition models.
+\end{itemize}
+
+%============================================%
+\subsection{Key features}
+\label{features}
+
+IQ-TREE provides a lot of options for phylogenetic reconstruction. The main features include:
+
+\begin{itemize}
+\item Reconstruction of the maximum likelihood tree from sequence alignments \citep{vinh2004,minh2005a}.
+\item Ultrafast bootstrap approximation for assessing branch supports \citep{minh2013}.
+\item Various substitution models for binary, nucleotide, and amino-acid data, with or without rate heterogeneity.
+\item Partition models for phylogenomic data.
+\item Automatic selection of best-fit models, similar to ModelTest \citep{posada1998}.
+\item Standard non-parametric bootstrap \citep{felsenstein1985}.
+\item Single branch tests \citep[LBP, SH-aLRT; ][]{adachi1996b,guindon2010}.
+\item Test of model homogeneity assumption along the tree \citep{weiss2003}.
+\item Site-specific rate model \citep{meyer2003}.
+\item Fast consensus tree reconstruction, Robinson-Foulds distance computation.
+\end{itemize}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Installation}
+\label{Installation}
+
+  See below for information on how to install or build the different
+  versions of the IQ-TREE software. Executable versions of the sequential
+  (that is, non-parallel) program are provided for a number of operating 
+  systems. 
+
+\subsection{Binary release}
+
+\begin{enumerate}
+\item Download the executable version of IQ-TREE
+       for your operating system if it is available (\texttt{iqtree-XXX-OS.tar.gz}
+       or \texttt{iqtree-XXX-OS.zip}, where \texttt{XXX} is the current version number and 
+       OS the operating system) from\\
+       \url{http://www.cibiv.at/software/iqtree}
+\item Extract the files (e.g., with \texttt{tar xvzf iqtree-XXX-OS.tar.gz} under Unix).
+       This should create a directory \texttt{iqtree-XXX-OS}.
+\item You will find the executable in \texttt{iqtree-XXX-OS/}. 
+       Rename this executable to \texttt{iqtree} (or \texttt{iqtree.exe}
+       on Windows systems) and copy it to a directory in your system search path
+       so that your system can find it.
+\end{enumerate}
+
+\textbf{Note on the multi-core (OpenMP) version:} The executable is named \texttt{iqtree-omp} 
+(or \texttt{iqtree-omp.exe} on Windows). Please also copy other files needed for OpenMP (e.g., \texttt{*.dll} on Windows)
+to the same folder that you copied \texttt{iqtree-omp} to. Finally, for Mac OS X
+you have to install MacPorts and the associated gcc47 to run \texttt{iqtree-omp} properly (see a how-to in section \ref{sec:build-openmp}).
+
+    If you encounter problems, please ask your local administrator for help.
+
+\subsection{Building source package}
+
+    To build IQ-TREE from the sources you need a C++ compiler (e.g., gcc) and the CMake tool
+    installed. This is usually the case on UNIX/Linux systems; for 
+    Windows you might want to obtain Cygwin, MinGW, or MS Visual C++, and for Mac OS X, Xcode. 
+    Then you can follow the procedure below:
+
+\begin{enumerate}
+\item Download the current version of the software (\texttt{iqtree-XXX-Source.tar.gz} or\\
+       \texttt{iqtree-XXX-Source.zip}, where \texttt{XXX} is the current version number) from\\ 
+       \url{http://www.cibiv.at/software/iqtree}
+\item  Extract the files (e.g., with \texttt{tar xvzf iqtree-XXX-Source.tar.gz} under Unix).
+       This should create a directory \texttt{iqtree-XXX-Source}.
+\item  Change into this directory.
+\item  Create a sub-directory \texttt{build} and go into this sub-directory by entering:
+\begin{verbatim}
+         mkdir build
+         cd build
+\end{verbatim}
+
+\item  Configure the source code using CMake:
+
+\begin{verbatim}
+         cmake ..
+\end{verbatim}
+
+\item Compile and build the source code:
+\begin{verbatim}
+         make
+\end{verbatim}
+
+       This creates an executable \texttt{iqtree}
+       (or \texttt{iqtree.exe} on Windows systems).  This executable can be copied to a directory in your system search path
+       so that your system can find it.
+\end{enumerate}
+
+    If you encounter problems, please ask your local administrator for help.
+
+\subsection{Building multi-core parallel version (\textcolor{red}{Update!})}
+\label{sec:build-openmp}
+
+To build the multi-core version you need a compiler that supports the OpenMP standard (e.g., gcc).
+For Linux and Windows the gcc and MinGW compilers work just fine.
+However, in our test on Mac OS X, IQ-TREE was successfully compiled with the default Xcode gcc
+but the example run crashed for an unknown reason.
+Therefore, we employed MacPorts (with gcc47 or later) and successfully ran IQ-TREE compiled with MacPorts gcc. To this end, please first install
+MacPorts, then gcc in MacPorts (\texttt{sudo port install gcc47}), and configure gcc to point to the MacPorts gcc 
+version (\texttt{sudo port select --set gcc mp-gcc47}). 
+
+The compilation then follows the same route with a slightly changed command line for cmake:
+
+\begin{verbatim}
+         cmake .. -DIQTREE_FLAGS="omp"
+\end{verbatim}
+
+All other commands remain the same. It is recommended to copy the executable file \textcolor{red}{\texttt{iqtree-omp}}
+(or \texttt{iqtree-omp.exe} on Windows)
+ to the system search path such that one can simply run \texttt{iqtree-omp} from the command-line.
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Tutorial}
+\label{tutorial}
+
+This section gives users a quick starting guide. You can download either the binary
+for your platform or the source code from the IQ-TREE website. In the latter case,
+you will need to compile the source code (see \emph{Installation}, Section \ref{Installation}). For the next steps, the \texttt{iqtree}
+executable should then be copied into the \texttt{bin} folder such that
+IQ-TREE can be invoked by simply entering \texttt{'iqtree'} at the command-line.
+You can run \texttt{'iqtree -h'} to see a list of options available in IQ-TREE.
+
+%============================================%
+\subsection{First running example (\textcolor{red}{Update!})}
+
+The download includes an example alignment called \texttt{example.phy}
+ in PHYLIP format (IQ-TREE also supports FASTA and NEXUS files). You can now start to reconstruct a maximum-likelihood tree
+from this alignment by typing (assuming that you are now in the same folder as \texttt{example.phy}):
+\begin{verbatim}
+  iqtree -s example.phy
+\end{verbatim}
+\texttt{'-s'} is the option to specify the name of the alignment file, which IQ-TREE always
+requires. At the end of the run IQ-TREE will write several output files:
+
+\begin{itemize}
+\item \texttt{example.phy.iqtree}: the main report file, which is human-readable. You
+should look at this file to see the results.
+\item \texttt{example.phy.treefile}: the ML tree in NEWICK format, which can be visualized
+by tree viewer tools such as FigTree or iTOL. Note that this NEWICK tree is also embedded in 
+\texttt{example.phy.iqtree}.
+%\item \texttt{example.phy.bionj}: the BIONJ tree in NEWICK format, which is used internally
+%by IQ-TREE as a starting tree for the tree search procedure.
+%\item \texttt{example.phy.jcdist}: the Juke-Cantor corrected distance matrix.
+%\item \texttt{example.phy.mldist}: the ML distance matrix (based on the given substitution model).
+\item \texttt{example.phy.log}: log file of the entire run (also printed on the screen). To report
+bugs, please send this log file and the original alignment file to the authors.
+\end{itemize}
+
+By default, all output files use the alignment file name as their prefix. You can always 
+change the prefix using the \texttt{'-pre'} option, e.g.:
+\begin{verbatim}
+  iqtree -s example.phy -pre myprefix
+\end{verbatim}
+Then IQ-TREE will write output files \texttt{myprefix.iqtree, myprefix.treefile}, etc. This is
+ helpful when you do several runs for the same input.
+
+\textcolor{red}{******** NEW IN VERSION 1.0 ********}
+
+Since version 1.0 IQ-TREE by default offers a more accurate tree search and bootstrap by 
+optimizing five branches around the nearest neighbor interchanges (NNIs). This comes with a trade-off
+of approximately 2X longer running time than version 0.9.X. To switch back to the
+old behaviour of optimizing one branch around NNIs, simply use the \texttt{-nni1} option:
+
+\begin{verbatim}
+  iqtree -s example.phy -nni1
+\end{verbatim}
+
+
+%============================================%
+\subsection{Choosing the substitution model}
+
+IQ-TREE supports numerous substitution models for binary, DNA, and protein data as well as the Gamma rate 
+heterogeneity model. If you do not specify a model, IQ-TREE will use the default HKY, WAG, and JC models for DNA, protein,
+and binary alignments, respectively. For most data sets these models are too simple.
+If you have no idea which model is appropriate for your data, let IQ-TREE automatically determine the best-fit model 
+for your alignment with:
+\begin{verbatim}
+  iqtree -s example.phy -m TEST
+\end{verbatim}
+\texttt{'-m'} is the option to specify the model name to use during the analysis. \texttt{'TEST'}
+is a keyword telling IQ-TREE to first select the best-fit model. The remaining analysis
+will be done using the selected model. More specifically, IQ-TREE computes the log-likelihoods
+of the initial BIONJ tree for many different models, together with the Akaike information criterion (AIC), 
+the corrected Akaike information criterion (AICc), and the Bayesian information criterion (BIC).
+Then IQ-TREE chooses the model that minimizes the BIC score (you can also change to AIC or AICc by 
+adding the option \texttt{'-AIC'} or \texttt{'-AICc'}, respectively).
+Moreover, IQ-TREE will write an additional file:
+
+\begin{itemize}
+ \item \texttt{example.phy.model}: log-likelihoods for all models tested.
+\end{itemize}
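+
+As a sketch of how these criteria are commonly defined (the manual does not spell out the formulas, so the notation below is an assumption): let $\ln L$ be the maximized log-likelihood of a model, $k$ its number of free parameters, and $n$ the sample size, here assumed to be the number of alignment sites. Then
+\begin{eqnarray*}
+\mathrm{AIC}  & = & -2 \ln L + 2k, \\
+\mathrm{AICc} & = & \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}, \\
+\mathrm{BIC}  & = & -2 \ln L + k \ln n.
+\end{eqnarray*}
+Lower scores indicate a better trade-off between fit and model complexity. Because $\ln n > 2$ as soon as $n \geq 8$, BIC penalizes additional parameters more strongly than AIC for virtually all alignments and therefore tends to select simpler models.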
+
+If you now look at \texttt{example.phy.iqtree} you will see that IQ-TREE selected the model \texttt{'TIM'}
+with \texttt{'Invar+Gamma'} rate heterogeneity. So \texttt{'TIM+I+G'} is the best-fit model
+for this example data. From now on you can run with e.g.:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G
+\end{verbatim}
+If you only want to find the best-fit model without doing tree reconstruction, run:
+\begin{verbatim}
+  iqtree -s example.phy -m TESTONLY
+\end{verbatim}
+Here, IQ-TREE will stop after finishing the model selection. The name of the best-fit model will be printed on the screen.
+Finally, note that IQ-TREE will check if the file \texttt{'*.model'} exists and is correct.
+If so, it will automatically reuse the log-likelihoods computed to speed up the model selection procedure.
+
+\textcolor{red}{********NEW********}
+
+Since version 0.9.6 IQ-TREE offers the partition model selection for multi-gene analysis. 
+See Section \ref{sec.partition-model-selection} for more details.
+
+
+%============================================%
+\subsection{Support for phylogenetic likelihood library (\textcolor{red}{New in version 1.0!})}
+
+In the major release 1.0, we added support for PLL \citep{tomas2014}, which helps to speed 
+up IQ-TREE by a factor of 2X to 8X. To test the new feature simply use the option \texttt{'-pll'}, for example:
+
+\begin{verbatim}
+  iqtree -s example.phy -pll -m GTR+G
+\end{verbatim}
+
+Here, we also specify the model GTR+G, as it is currently supported by PLL. Note that this
+option does not yet work with all other options (such as \texttt{'-m TEST'}). In such cases, an error message will be displayed.
+
+%============================================%
+\subsection{Novel tree search algorithm (\textcolor{red}{New in version 1.0!})}
+\label{sec.new-tree-search}
+
+IQ-TREE 1.0 implements a new tree search algorithm, which explores the tree space much more
+efficiently than version 0.9.X. Here, IQ-TREE combines parsimony analysis to provide better
+starting trees, a new stochastic algorithm to escape local optima, and a new stopping rule. The new search strategies
+come with a few parameters whose default values were tested to work well for many different data sets.
+You can change the default parameters with the following options:
+
+\begin{verbatim}
+  -numpars <number>    Number of initial parsimony trees (default: 100)
+  -toppars <number>    Number of top initial parsimony trees (default: 20)
+  -numcand <number>    Number of candidate trees during search (default: 5)
+  -pers <perturbation> Perturbation strength for stochastic NNI (default: 0.5)
+  -numstop <number>    Number of unsuccessful iterations to stop (default: 100)
+\end{verbatim}
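+
+For illustration, a more thorough search could be requested by combining these options as follows. The values are arbitrary choices for this sketch, not recommendations from the authors:
+\begin{verbatim}
+  iqtree -s example.phy -numpars 200 -numstop 200
+\end{verbatim}
+Assuming the option descriptions above, this doubles the number of initial parsimony trees and the number of unsuccessful iterations tolerated before the search stops, which will likely increase the running time.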
+
+Finally, you can still switch back to the old algorithm of 0.9.X with the options:
+
+\begin{verbatim}
+  -iqp                 Use IQP tree perturbation (default: sNNI)
+  -iqpnni              Switch entirely to old IQPNNI algorithm
+\end{verbatim}
+
+%============================================%
+\subsection{Codon models (\textcolor{red}{New in version 1.0!})}
+
+IQ-TREE 1.0 supports basic codon models (GY, MG, and ECM). You need to input a protein-coding DNA alignment and specify codon data with the option \texttt{'-st CODON'} (otherwise, IQ-TREE applies a DNA model because it detects that your alignment contains DNA sequences):
+
+\begin{verbatim}
+  iqtree -s coding_gene.phy -st CODON 
+\end{verbatim}
+
+If your alignment length is not divisible by 3, an error message will be displayed. IQ-TREE will group sites 1,2,3 into codon site 1; sites 4,5,6 into codon site 2; etc. Moreover, any codon that has at least one gap/unknown/ambiguous nucleotide will be treated as an unknown codon character.
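+
+The grouping can be written compactly as follows (a sketch; the symbols $i$, $c$, and $n$ are introduced here for illustration only). Codon site $c$ consists of the nucleotide sites
+\[
+  3c-2, \quad 3c-1, \quad 3c ,
+\]
+and conversely nucleotide site $i$ belongs to codon site $c = \lceil i/3 \rceil$. A hypothetical alignment of $n = 300$ nucleotide sites thus yields $300/3 = 100$ codon sites, whereas an alignment of $n = 301$ sites is rejected because $301$ is not divisible by $3$.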
+
+If you are not sure which model to use, simply add \texttt{'-m TEST'}, which also works for codon alignments: 
+
+\begin{verbatim}
+  iqtree -s coding_gene.phy -st CODON -m TEST
+\end{verbatim}
+
+By default IQ-TREE uses the standard genetic code.
+You can change to another genetic code (see \url{http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi}) with the following options:
+
+\begin{tabular}{ll}
+\hline
+Option & Genetic code\\
+\hline
+\texttt{-st CODON1} & The Standard Code (same as \texttt{-st CODON})\\
+\texttt{-st CODON2} & The Vertebrate Mitochondrial Code\\
+\texttt{-st CODON3} & The Yeast Mitochondrial Code\\
+\texttt{-st CODON4} & The Mold, Protozoan, and Coelenterate Mitochondrial Code and \\
+   & the Mycoplasma/Spiroplasma Code\\
+\texttt{-st CODON5} & The Invertebrate Mitochondrial Code\\
+\texttt{-st CODON6} & The Ciliate, Dasycladacean and Hexamita Nuclear Code\\
+\texttt{-st CODON9} & The Echinoderm and Flatworm Mitochondrial Code\\
+\texttt{-st CODON10} & The Euplotid Nuclear Code\\
+\texttt{-st CODON11} & The Bacterial, Archaeal and Plant Plastid Code\\
+\texttt{-st CODON12} & The Alternative Yeast Nuclear Code\\
+\texttt{-st CODON13} & The Ascidian Mitochondrial Code\\
+\texttt{-st CODON14} & The Alternative Flatworm Mitochondrial Code\\
+\texttt{-st CODON16} & Chlorophycean Mitochondrial Code\\
+\texttt{-st CODON21} & Trematode Mitochondrial Code\\
+\texttt{-st CODON22} & Scenedesmus obliquus Mitochondrial Code\\
+\texttt{-st CODON23} & Thraustochytrium Mitochondrial Code\\
+\texttt{-st CODON24} & Pterobranchia Mitochondrial Code\\
+\texttt{-st CODON25} & Candidate Division SR1 and Gracilibacteria Code\\
+\hline
+\end{tabular}
+
+%============================================%
+\subsection{Morphological and SNP data (\textcolor{red}{New in version 1.0!})}
+
+IQ-TREE 1.0 supports discrete morphological alignments via the \texttt{'-st MORPH'} option:
+
+\begin{verbatim}
+  iqtree -s morphology.phy -st MORPH
+\end{verbatim}
+
+IQ-TREE implements two morphological ML models (MK and ORDERED; see Lewis 2001), where MK is the default model.
+MK is a Jukes-Cantor-like model. The ORDERED model considers only transitions between states $i\rightarrow i-1$, $i\rightarrow i$, and $i \rightarrow i+1$. Morphological data typically do not have constant (uninformative) sites. 
+In such cases, you should apply the ascertainment bias correction model, e.g.:
+ 
+\begin{verbatim}
+  iqtree -s morphology.phy -st MORPH -m MK+ASC
+\end{verbatim}
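+
+As a sketch of what the correction does, following the conditional likelihood of Lewis (2001) (the notation is introduced here for illustration and is not taken from the IQ-TREE source code): let $L(D_i)$ be the ordinary likelihood of site pattern $D_i$ and let $c_s$ denote the constant pattern in which all sequences show state $s$. The corrected likelihood conditions on the site being variable:
+\[
+  L_{\mathrm{ASC}}(D_i) = \frac{L(D_i)}{1 - \sum_{s} L(c_s)} .
+\]
+The denominator is the probability of observing a variable site, so branch lengths are no longer overestimated merely because constant sites were never sampled.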
+
+You can again select the best-fit model with \texttt{'-m TEST'} (which also considers +G):
+
+\begin{verbatim}
+  iqtree -s morphology.phy -st MORPH -m TEST
+\end{verbatim}
+
+For SNP data (DNA), which typically do not contain constant sites, you can explicitly tell the model to include
+ascertainment bias correction:
+
+\begin{verbatim}
+  iqtree -s SNP_data.phy -m GTR+ASC
+\end{verbatim}
+
+You can explicitly tell model testing to include only \texttt{'+ASC'} models with:
+\begin{verbatim}
+  iqtree -s SNP_data.phy -m TEST+ASC
+\end{verbatim}
+
+%============================================%
+\subsection{Assessing branch supports with ultrafast bootstrap approximation}
+
+The ultrafast bootstrap approximation is the most valuable feature available in IQ-TREE. Simply run:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G -bb 1000
+\end{verbatim}
+\texttt{'-bb'}  specifies the number of bootstrap replicates, where $1000$
+is the minimum number recommended. When you now look at the section \texttt{'MAXIMUM LIKELIHOOD TREE'}
+in \texttt{example.phy.iqtree}, you will see that every internal node of the tree figure
+is associated with a support value in percent. Branch supports are assigned to the ML tree and printed in 
+\texttt{example.phy.treefile}, which can again be viewed in FigTree. 
+In addition, IQ-TREE writes the following files:
+\begin{itemize}
+\item \texttt{example.phy.contree}: the consensus tree with assigned branch supports where branch lengths 
+are optimized  on the original alignment.
+ \item \texttt{example.phy.splits}: support values in percent for all splits (bipartitions),
+computed as the occurrence frequencies in the bootstrap trees. This file is in ``star-dot'' format.
+\item \texttt{example.phy.splits.nex}: has the same information as \texttt{example.phy.splits}
+but in NEXUS format, which can be viewed with the SplitsTree program. 
+\end{itemize}
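+
+As a small worked example of these split supports (the numbers are hypothetical): if a split occurs in $b$ of the $B$ bootstrap trees, its support in percent is
+\[
+  \mathrm{support} = 100 \cdot \frac{b}{B} ,
+\]
+so a split found in $b = 953$ of $B = 1000$ bootstrap trees has an occurrence frequency of $95.3\%$.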
+
+%============================================%
+\subsubsection{Assessing branch supports with  standard nonparametric bootstrap}
+
+The standard nonparametric bootstrap can be invoked by:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G -b 100
+\end{verbatim}
+\texttt{'-b'} specifies the number of bootstrap replicates, where $100$
+is the minimum number recommended. IQ-TREE will additionally write the following files:
+
+\begin{itemize}
+ \item \texttt{example.phy.boottrees}: the set of bootstrap trees reconstructed.
+\item \texttt{example.phy.contree}: the bootstrap consensus tree with assigned branch supports where branch lengths 
+are optimized  on the original alignment.
+\end{itemize}
+
+%============================================%
+\subsubsection{Assessing branch supports with single branch tests}
+
+IQ-TREE provides an implementation of the SH-like approximate likelihood ratio test \citep[SH-aLRT; ][]{guindon2010}.
+To perform this test, simply run:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G -alrt 1000
+\end{verbatim}
+\texttt{'-alrt'} specifies the number of bootstrap replicates for SH-aLRT where $1000$ is
+the minimal number recommended. IQ-TREE will perform SH-aLRT at the end of the tree reconstruction process
+and assign support values onto the ML tree. The support values will be reflected in the tree file \texttt{example.phy.treefile}.
+
+IQ-TREE also provides a fast implementation of the local bootstrap probabilities method \citep{adachi1996b}, 
+which we call Fast-LBP. Fast-LBP computes the branch support by comparing the tree log-likelihood
+with the log-likelihoods of the two alternative nearest-neighbor-interchange (NNI) trees around the branch of interest.
+However, Fast-LBP differs from LBP in that we compute the log-likelihoods of the two alternative NNI trees
+by only reoptimizing five branches around the branch of interest (a similar idea is used in the SH-aLRT test).
+To perform Fast-LBP, simply run:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G -lbp 1000
+\end{verbatim}
+
+You can also perform both tests:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G -alrt 1000 -lbp 1000
+\end{verbatim}
+The branches of the resulting ML tree will be assigned both SH-aLRT and Fast-LBP support values.
+Finally, you can also combine the ultrafast bootstrap approximation with single branch tests within one single run:
+\begin{verbatim}
+  iqtree -s example.phy -m TIM+I+G -bb 1000 -alrt 1000 -lbp 1000
+\end{verbatim}
+
+%============================================%
+\subsection{Partitioned analysis for multi-gene alignments}
+\label{sec.partition-model}
+
+In the partition model, you can specify a substitution model for each gene/character set individually. 
+IQ-TREE will then estimate the model parameters and branch lengths separately for every partition.
+To this end, you have to first prepare a NEXUS file including a \texttt{SETS} block with
+\texttt{CharSet} and \texttt{CharPartition} commands to specify individual genes and the partition, respectively.
+For example:
+\begin{verbatim}
+#nexus
+begin sets;
+        charset part1 = 1-100;
+        charset part2 = 101-384;
+        charpartition mine = HKY+G:part1, GTR+I+G:part2;
+end;
+\end{verbatim}
+
+Now save this into a file \texttt{example.nex} and run:
+\begin{verbatim}
+  iqtree -s example.phy -sp example.nex
+\end{verbatim}
+This means that IQ-TREE will partition the alignment \texttt{example.phy} into 2 subsets named \texttt{part1} and \texttt{part2}
+containing sites (columns) 1-100 and 101-384, respectively. Moreover, IQ-TREE applies the
+substitution models \texttt{HKY+G} and \texttt{GTR+I+G} to \texttt{part1} and \texttt{part2}, respectively.
+After the run has finished, the \texttt{example.nex.iqtree} file will contain the substitution model 
+parameters and the trees with branch lengths for all subsets in the partition.
+
+
+Moreover, the \texttt{CharSet} command allows you to specify non-consecutive sites using a space-separated list of ranges, e.g.:
+\begin{verbatim}
+        charset part1 = 1-100 200-384;
+\end{verbatim}
+That means, \texttt{part1} contains sites 1-100 and 200-384 of the alignment. Another example is:
+\begin{verbatim}
+        charset part1 = 1-100\3;
+\end{verbatim}
+for extracting sites 1,4,7,...,100 from the alignment. This is useful for getting codon positions from a protein-coding alignment.
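+
+In general, a range \verb|a-b\k| selects the sites $a, a+k, a+2k, \ldots$ up to at most $b$. As a sketch, the three codon positions of a protein-coding alignment of 384 sites that starts in frame could be separated as follows (the set names and the assignment of models are hypothetical):
+\begin{verbatim}
+#nexus
+begin sets;
+        charset pos1 = 1-384\3;
+        charset pos2 = 2-384\3;
+        charset pos3 = 3-384\3;
+        charpartition codonpos = HKY+G:pos1, HKY+G:pos2, GTR+I+G:pos3;
+end;
+\end{verbatim}
+Here \texttt{pos1} contains sites 1,4,...,382, \texttt{pos2} sites 2,5,...,383, and \texttt{pos3} sites 3,6,...,384.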
+
+Moreover, IQ-TREE offers a more advanced feature than other programs: 
+different subsets may come from different alignments.
+For example:
+\begin{verbatim}
+#nexus
+begin sets;
+        charset part1 = part1.phy: 1-100\3 201-300\3;
+        charset part2 = part2.phy: 101-300;
+        charpartition mine = HKY:part1, GTR+G:part2;
+end;
+\end{verbatim}
+Here, \texttt{part1} and \texttt{part2} are read from alignment files \texttt{part1.phy} and \texttt{part2.phy}, respectively
+(a ':' is needed to separate the alignment file name and site specification). Because the alignment file names
+were embedded in this NEXUS file, you can simply run:
+\begin{verbatim}
+  iqtree -sp example.nex
+\end{verbatim}
+
+Note that 
+\texttt{part1.phy} and \texttt{part2.phy} need not contain the same set of sequence names. That is, if some sequence occurs
+in  \texttt{part1.phy} but not in  \texttt{part2.phy}, IQ-TREE will treat the corresponding part of the sequence
+in \texttt{part2.phy} as missing data. For your convenience IQ-TREE writes the concatenated alignment
+into the file \texttt{example.nex.conaln}.
+
+\textcolor{red}{********NEW EXPERIMENTAL FEATURE********}
+
+Since version 0.9.6 IQ-TREE supports partition models with joint and proportional branch lengths between genes. This is
+to reduce the number of parameters in case of model overfitting for the full partition model. For example:
+
+\begin{verbatim}
+  iqtree -spp example.nex
+\end{verbatim}
+
+applies a proportional partition model. That is, we have only one set of branch lengths for the species tree 
+but allow each gene to evolve under a specific rate (scaling factor) normalized to an average of 1.
+
+A partition model with joint branch lengths is specified by:
+
+\begin{verbatim}
+  iqtree -spj example.nex
+\end{verbatim}
+ 
+(i.e., all gene-specific rates are equal to 1). 
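+
+The reduction in parameters can be made explicit with a short count (a sketch; the symbols $n$, $G$, $b_i$, and $r_g$ are introduced here, and it is assumed that the full partition model estimates a separate set of branch lengths for every gene). An unrooted binary tree with $n$ taxa has $2n-3$ branches, so for $G$ genes the number of branch-length parameters is
+\begin{center}
+\begin{tabular}{ll}
+\hline
+Model & Branch-length parameters\\
+\hline
+Full partition (\texttt{-sp}) & $G(2n-3)$\\
+Proportional (\texttt{-spp}) & $(2n-3) + (G-1)$\\
+Joint (\texttt{-spj}) & $2n-3$\\
+\hline
+\end{tabular}
+\end{center}
+In the proportional model, branch $i$ has length $r_g b_i$ in gene $g$; because the rates $r_g$ are normalized, only $G-1$ of them are free. The joint model is the special case $r_g = 1$ for all $g$. For a hypothetical data set with $n = 50$ and $G = 10$ this gives 970, 106, and 97 parameters, respectively.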
+ 
+ 
+%============================================%
+\subsubsection{Choosing the right partitioning scheme}
+\label{sec.partition-model-selection}
+
+Since version 0.9.6, IQ-TREE implements a greedy strategy \citep{lanfear2012} that starts with the full partition model and sequentially
+merges pairs of genes until the model fit no longer improves:
+
+\begin{verbatim}
+  iqtree -sp example.nex -m TESTLINK
+\end{verbatim}
+
+After the best partition is found, IQ-TREE immediately starts the tree reconstruction under the best-fit partition model.
+If you only want to find the best-fit partition model without reconstructing a tree, run:
+
+
+\begin{verbatim}
+  iqtree -sp example.nex -m TESTONLYLINK
+\end{verbatim}
+
+
+%============================================%
+\subsubsection{Bootstrapping with partition model}
+
+IQ-TREE can perform the ultrafast bootstrap with partition models, e.g.:
+\begin{verbatim}
+  iqtree -sp example.nex -bb 1000
+\end{verbatim}
+Here, IQ-TREE will resample the sites \emph{within} subsets of the partitions (i.e., 
+the bootstrap replicates are generated per subset separately and then concatenated together).
+The same holds true if you do the standard nonparametric bootstrap. 
+
+\textcolor{red}{********NEW********}
+
+Since version 0.9.6 IQ-TREE supports the gene-resampling strategy: 
+
+\begin{verbatim}
+  iqtree -sp example.nex -bb 1000 -bspec GENE
+\end{verbatim}
+
+This resamples genes instead of sites. IQ-TREE also allows a more complex
+strategy that resamples genes and then sites within each resampled gene:
+
+\begin{verbatim}
+  iqtree -sp example.nex -bb 1000 -bspec GENESITE
+\end{verbatim}
+
+
+%============================================%
+\subsection{Utilizing multi-core CPUs}
+
+A specialized version of IQ-TREE can use multiple cores
+during a run (made possible by the OpenMP library).
+You can download the binary from the software website or compile the source code
+yourself (see the \emph{Installation} section, \ref{Installation}). For the following, please
+copy the binary \texttt{iqtree-omp} and the other files in the package's \texttt{bin} folder into the system \texttt{bin} folder so that it can be
+invoked from the command line by simply running \texttt{iqtree-omp}.
+
+If you now run, e.g.:
+\begin{verbatim}
+  iqtree-omp -s example.phy
+\end{verbatim}
+then IQ-TREE will use all available cores of your CPU.
+This may not be good practice because our parallelization technique only works well on long alignments.
+For a very short alignment, this IQ-TREE version is not recommended.
+Because the speedup depends on the alignment length,
+it is good practice to run this version with an increasing number of cores, e.g.:
+\begin{verbatim}
+  iqtree-omp -s example.phy -omp 2
+\end{verbatim}
+Here, \texttt{-omp} is the option to specify the number of cores that IQ-TREE will use.
+If you see that the wall-clock time reduction is substantial compared with the sequential IQ-TREE version,
+then you can try:
+\begin{verbatim}
+  iqtree-omp -s example.phy -omp 3
+\end{verbatim}
+and so on, until no substantial reduction of running time is observed. The remaining
+analysis can then be carried out with that number of cores.
+
+For example, on my computer (Linux, Intel Core i5-2500K, 3.3 GHz, quad-core) I observed the following
+wall-clock running times for this example alignment:
+\begin{center}
+\begin{tabular}{cc}
+\hline
+No. cores & Wall-clock time\\
+\hline
+1 & 21.465 sec.\\
+2 & 13.627 sec.\\
+3 & 11.119 sec.\\
+4 & 10.807 sec.\\
+\hline
+\end{tabular}
+\end{center}
+Therefore, I would only use 2 cores for this specific alignment (\texttt{"-omp 2"} option).
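+
+The same measurements can also be read as speedup $S_p = T_1/T_p$ and parallel efficiency $E_p = S_p/p$, where $T_p$ is the wall-clock time on $p$ cores (standard definitions, computed here from the table above; IQ-TREE itself does not print these quantities):
+\begin{center}
+\begin{tabular}{ccc}
+\hline
+No. cores $p$ & Speedup $S_p$ & Efficiency $E_p$\\
+\hline
+1 & 1.00 & 1.00\\
+2 & 1.58 & 0.79\\
+3 & 1.93 & 0.64\\
+4 & 1.99 & 0.50\\
+\hline
+\end{tabular}
+\end{center}
+Going from 2 to 3 cores shortens the run by about 18\%, whereas going from 3 to 4 cores gains less than 3\%.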
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Advanced tutorial}
+\label{sec.advanced-tutorial}
+
+This section gives an advanced tutorial for more experienced users. It covers several advanced features
+such as tree topology tests and user-defined substitution models.
+
+
+%============================================%
+\subsection{Tree topology tests}
+
+IQ-TREE can compute the log-likelihoods of user-defined trees passed via the \texttt{-z} option:
+
+\begin{verbatim}
+  iqtree -s example.phy -z example.treels
+\end{verbatim}
+
+assuming that \texttt{example.treels} contains the trees in NEWICK format.
+At the end of the usual run, IQ-TREE will additionally evaluate all trees in that file using the estimated model parameters.
+The file \texttt{example.phy.iqtree}
+will then contain a section \texttt{USER TREES} that lists the tree IDs and the corresponding log-likelihoods.
+Moreover, IQ-TREE will additionally write a file:
+\begin{itemize}
+\item \texttt{example.phy.treels.trees}: the trees with optimized branch lengths.
+\end{itemize}
+
+If you only want to evaluate the trees without reconstructing the ML tree, you can run:
+\begin{verbatim}
+  iqtree -s example.phy -z example.treels -n 1
+\end{verbatim}
+
+Here, IQ-TREE will only reconstruct the BIONJ+NNI tree and use that tree to estimate the model parameters,
+which are normally accurate enough for our purpose.
+
+IQ-TREE also supports several tree topology tests using the RELL approximation \citep{kishino1990},
+including: bootstrap proportion (BP), Kishino-Hasegawa test \citep[KH; ][]{kishino1989}, Shimodaira-Hasegawa test \citep[SH; ][]{shimodaira1999}, expected likelihood weights \citep[ELW; ][]{strimmer2002}, weighted-KH (WKH), and weighted-SH (WSH) tests.
+The trees are passed via the \texttt{-z} option, so you can run:
+
+\begin{verbatim}
+  iqtree -s example.phy -z example.treels -n 1 -zb 1000
+\end{verbatim}
+
+Here, \texttt{-zb} specifies the number of RELL replicates, where 1000 is the minimum number recommended.
+The \texttt{USER TREES} section of \texttt{example.phy.iqtree} will list the results of the BP, KH, SH, and ELW methods. If you also want to
+perform the WKH and WSH tests, simply add the \texttt{-zw} option:
+
+\begin{verbatim}
+  iqtree -s example.phy -z example.treels -n 1 -zb 1000 -zw
+\end{verbatim}
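+
+For orientation, the RELL idea can be summarized as follows (a sketch of the general method of \citet{kishino1990} in notation introduced here; the details of the IQ-TREE implementation are not described in this manual). Let $\ell_i(T)$ be the log-likelihood of site $i$ ($i = 1, \ldots, m$) under tree $T$. Each replicate $r$ draws site counts $(n^{(r)}_1, \ldots, n^{(r)}_m)$ from a multinomial distribution with $m$ trials and equal probabilities $1/m$, and the resampled log-likelihood is
+\[ \ell^{(r)}(T) = \sum_{i=1}^{m} n^{(r)}_i \, \ell_i(T), \]
+without re-optimizing any parameters. The bootstrap proportion (BP) of a tree is then the fraction of replicates in which that tree attains the highest $\ell^{(r)}$ among all evaluated trees.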
+
+Finally, note that IQ-TREE will automatically detect duplicated tree topologies and omit them during the evaluation.
+
+
+%============================================%
+\subsection{User-defined substitution models}
+
+Users can specify arbitrary DNA models using a 6-letter specification that constrains which rates are equal.
+For example, \texttt{010010} corresponds to the HKY model and \texttt{012345} to the GTR model.
+In fact, the IQ-TREE source code internally uses this specification to simplify the coding. The 6-letter code is specified
+via the \texttt{-m} option, e.g.:
+
+\begin{verbatim}
+  iqtree -s example.phy -m 010010+G
+\end{verbatim}
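+
+As a reading aid (assuming the six positions of the code follow the rate order A-C, A-G, A-T, C-G, C-T, G-T; positions carrying the same symbol share one rate parameter):
+\begin{center}
+\begin{tabular}{lcccccc}
+\hline
+Rate & A-C & A-G & A-T & C-G & C-T & G-T\\
+\hline
+HKY (\texttt{010010}) & 0 & 1 & 0 & 0 & 1 & 0\\
+GTR (\texttt{012345}) & 0 & 1 & 2 & 3 & 4 & 5\\
+\hline
+\end{tabular}
+\end{center}
+In \texttt{010010} the two transitions (A-G and C-T) share one rate and the four transversions share another, which is the HKY pattern. Under the same assumption, \texttt{010020} would give the two transitions separate rates, which is the Tamura-Nei pattern.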
+
+Moreover, the \texttt{-m} option also accepts the name of a file that contains the 6 rates (A-C, A-G, A-T, C-G, C-T, G-T)
+and 4 base frequencies (A, C, G, T), e.g.:
+
+\begin{verbatim}
+  iqtree -s example.phy -m mymodel+G
+\end{verbatim}
+
+where \texttt{mymodel} is a file containing the 10 entries described above. One can even specify the rates directly within the \texttt{-m} option, e.g.:
+
+\begin{verbatim}
+  iqtree -s example.phy -m 'TN{2.0,3.0}+G8{0.5}+I{0.15}'
+\end{verbatim}
+
+This specifies the Tamura-Nei model with a fixed transition/transversion rate ratio of 2.0 and a purine/pyrimidine rate ratio of 3.0, together with
+8-category Gamma-distributed site rates with a shape parameter (alpha) of 0.5 and a proportion of invariable sites p-inv = 0.15.
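+
+Written out (a sketch in notation introduced here, not taken from the IQ-TREE source), a model with $K$ equally probable discrete Gamma categories and a proportion $p_{\mathrm{inv}}$ of invariable sites gives the likelihood of site $i$ as the mixture
+\[ L_i = p_{\mathrm{inv}} \, L_i^{(0)} + (1 - p_{\mathrm{inv}}) \, \frac{1}{K} \sum_{k=1}^{K} L_i(r_k), \]
+where $r_k$ is the relative rate of category $k$ (determined by the shape parameter alpha) and $L_i^{(0)}$ is the likelihood of the site at rate zero, which is non-zero only if the site is constant. In the example above, $K = 8$, alpha $= 0.5$, and $p_{\mathrm{inv}} = 0.15$.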
+
+Note that by default IQ-TREE computes empirical state frequencies from the alignment, but one can also optimize the frequencies by maximum likelihood
+with \texttt{+Fo} in the model name:
+
+\begin{verbatim}
+  iqtree -s example.phy -m GTR+G+Fo
+\end{verbatim}
+
+For amino-acid alignments, if one wants to use the frequencies of the empirical protein model, then use \texttt{+Fu}, for example:
+
+\begin{verbatim}
+  iqtree -s myprotein_alignment -m WAG+G+Fu
+\end{verbatim}
+
+Finally, note that all model specifications above can be used in the partition model NEXUS file.
+
+%============================================%
+\subsection{Consensus construction and bootstrap value assignment}
+
+IQ-TREE can construct an extended majority-rule consensus tree from a set of trees written in NEWICK or NEXUS format (e.g., produced
+by MrBayes):
+
+\begin{verbatim}
+  iqtree -con mytrees
+\end{verbatim}
+
+To build a majority-rule consensus tree, simply set the minimum support threshold to 0.5:
+
+\begin{verbatim}
+  iqtree -con mytrees -t 0.5
+\end{verbatim}
+
+If you want to specify a burn-in (the number of trees at the beginning of the trees file to ignore), use the \texttt{-bi} option:
+
+\begin{verbatim}
+  iqtree -con mytrees -t 0.5 -bi 100
+\end{verbatim}
+
+to skip the first 100 trees in the file.
+
+IQ-TREE can also compute a consensus network and print it into a NEXUS file by:
+
+\begin{verbatim}
+  iqtree -net mytrees
+\end{verbatim}
+
+Finally, a useful feature is to read in an input tree and a set of trees; IQ-TREE then assigns
+support values to the input tree (the number of times each branch of the input tree occurs in the set of trees):
+
+\begin{verbatim}
+  iqtree -sup input_tree set_of_trees
+\end{verbatim}
+
+
+%============================================%
+\subsection{Computing Robinson-Foulds distance between trees}
+
+IQ-TREE implements a very fast Robinson-Foulds (RF) distance computation using a hash table, which is a lot faster than the PHYLIP package. For example, you can run:
+
+\begin{verbatim}
+  iqtree -rf tree_set1 tree_set2
+\end{verbatim}
+
+to compute the pairwise RF distances between 2 sets of trees. If you want to compute the all-to-all RF distances
+of a set of trees, use:
+
+\begin{verbatim}
+  iqtree -rf_all tree_set
+\end{verbatim}
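+
+For reference, the RF distance counts the splits that occur in exactly one of the two trees (the standard definition; the notation and the small example are added here for illustration):
+\[ d_{\mathrm{RF}}(T_1, T_2) = |S(T_1) \setminus S(T_2)| + |S(T_2) \setminus S(T_1)|, \]
+where $S(T)$ is the set of non-trivial splits (bipartitions of the taxa induced by the internal branches) of $T$. For example, \texttt{((A,B),(C,D),E);} has the splits AB$|$CDE and CD$|$ABE, whereas \texttt{((A,C),(B,D),E);} has AC$|$BDE and BD$|$ACE. No split is shared, so $d_{\mathrm{RF}} = 2 + 2 = 4$. Storing the splits of one tree in a hash table allows each split of the other tree to be looked up in expected constant time.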
+
+%============================================%
+\subsection{Generating random trees}
+
+IQ-TREE provides several random tree generation models. For example,
+
+\begin{verbatim}
+    iqtree -r 100 100.tree 
+\end{verbatim}
+
+generates a 100-taxon random tree under the Yule-Harding model and writes it to the file \texttt{100.tree};
+the branch lengths follow an exponential distribution with a mean of 0.1.
+If you want to change the branch length distribution, run e.g.:
+
+\begin{verbatim}
+    iqtree -r 100 -rlen 0.05 0.2 0.3 100.tree 
+\end{verbatim}
+
+to set the minimum, mean, and maximum branch lengths to 0.05, 0.2, and 0.3, respectively.
+If you want to generate trees under the uniform model instead, use the \texttt{-ru} option:
+
+\begin{verbatim}
+    iqtree -ru 100 100.tree 
+\end{verbatim}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Frequently asked questions (FAQ)}
+
+\subsection{How does IQ-TREE treat gap/missing characters?}
+
+Gaps (-) and missing characters (? or N for DNA alignments) are treated in the same way as \emph{unknown} characters,
+which carry no information. Many other ML programs (RAxML, PhyML, etc.) treat them the same way. Technically,
+in Felsenstein's pruning algorithm the partial likelihood vector of such a character is set to 1 for all character states. This is equivalent to the following.
+For a site (column) of an alignment containing AC-AG-A (i.e., A for sequence 1, C for sequence 2, - for sequence 3, \ldots), the site-likelihood
+of a tree T is equal to the site-likelihood of the subtree of T restricted to those sequences containing non-gap characters:
+
+\[ \ell(T \mid \texttt{AC-AG-A}) = \ell(T_{\mathrm{sub}} \mid \texttt{ACAGA}) \]
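+
+The reason is a one-line calculation (notation introduced here for illustration). In the pruning algorithm the partial likelihood of state $x$ at an internal node $v$ is
+\[ L_v(x) = \prod_{c \in \mathrm{children}(v)} \sum_{y} P_{xy}(t_c) \, L_c(y), \]
+where $P_{xy}(t_c)$ is the probability of changing from $x$ to $y$ along the branch of length $t_c$ leading to child $c$. If $c$ is a leaf with a gap, then $L_c(y) = 1$ for all $y$, and since each row of the transition matrix sums to one,
+\[ \sum_{y} P_{xy}(t_c) \cdot 1 = 1, \]
+so this leaf contributes a factor of 1 and can be removed from the tree without changing the site-likelihood.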
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Version History}
+\label{Version History}
+
+\begin{description}
+\item \textbf{Version 0.9.6:} October 2013
+\begin{itemize}
+\item Ultrafast model selection and partitioning for phylogenomic alignments.
+\item Introduction of nearest neighbor interchange (NNI) with five-branch optimization to evaluate candidate NNIs.
+This brings higher accuracy for tree reconstruction and bootstrap, at the cost of ca.\ 2x longer running time.
+\item Introduction of joint and proportional partition models to reduce the number of parameters in case of model overfitting (experimental).
+\item Introduction of gene-resampling and gene-and-site resampling for the bootstrap on multi-gene alignments.
+\end{itemize}
+
+
+\item \textbf{Version 0.9.5:} May 2013
+\begin{itemize}
+\item Introduction of bootstrap epsilon to select equally good bootstrap trees at random to deal with polytomies.
+\end{itemize}
+
+\item \textbf{Version 0.9.4:} Easter 2013
+\begin{itemize}
+\item Tree topology tests
+\end{itemize}
+\item \textbf{Version 0.9.3:} March 2013
+\begin{itemize}
+\item New implementation of model selection that works on all data types.
+\item A tutorial about using partition models.
+\item Parallel OpenMP support to utilize multi-core CPUs.
+\end{itemize}
+\item \textbf{Version 0.9.0:} September 2012 - 
+First beta release.
+\end{description} 
+
+
+\section*{Credits and Acknowledgement}
+
+Some parts of the code were taken from the following packages/libraries: Phylogenetic likelihood library \citep{tomas2014}, TREE-PUZZLE  \citep{schmidt2002}, 
+BIONJ \citep{gascuel1997}, Nexus Class Library \citep{lewis2003}, Eigen library \citep{guennebaud2010},
+SPRNG library \citep{mascagni2000}, Zlib library (\url{http://www.zlib.net}).
+
+Financial support from the Austrian Science Fund (FWF), the Vienna Science and Technology Fund (WWTF), and the University of Vienna is gratefully acknowledged.
+
+\bibliographystyle{bioinformatics}
+\bibliography{genephylo,heiko} %%%%%%%%%%%
+
+
+\end{document}
