[med-svn] [Git][med-team/clonalframeml][upstream] New upstream version 1.12

Andreas Tille gitlab at salsa.debian.org
Thu Feb 27 10:42:31 GMT 2020



Andreas Tille pushed to branch upstream at Debian Med / clonalframeml


Commits:
81343d66 by Andreas Tille at 2020-02-27T11:28:26+01:00
New upstream version 1.12
- - - - -


20 changed files:

- .gitignore
- README.md
- − src/README.txt
- src/brent.h
- src/coalesce/coalescent_record.h
- src/main.cpp
- src/main.h
- src/make.sh
- − src/make_win.bat
- src/makefile
- src/myutils/DNA.h
- src/myutils/argumentwizard.h
- src/myutils/matrix.h
- src/myutils/mydouble.h
- src/myutils/myutils.h
- src/myutils/newick.h
- src/myutils/random.h
- src/myutils/vector.h
- src/powell.h
- src/xmfa.h


Changes:

=====================================
.gitignore
=====================================
@@ -1,8 +1,4 @@
-
 src/ClonalFrameML
-
 src/main.o
-
-src/version.h
-
+src/main
 src/.vscode/*


=====================================
README.md
=====================================
@@ -2,9 +2,9 @@
 
 # Introduction #
 
-This is the homepage of ClonalFrameML, a software package that performs efficient inference of recombination in bacterial genomes. ClonalFrameML was created by [Xavier Didelot](http://www.imperial.ac.uk/medicine/people/x.didelot/) and [Daniel Wilson](http://www.danielwilson.me.uk/). ClonalFrameML can be applied to any type of aligned sequence data, but is especially aimed at analysis of whole genome sequences. It is able to compare hundreds of whole genomes in a matter of hours on a standard Desktop computer. There are three main outputs from a run of ClonalFrameML: a phylogeny with branch lengths corrected to account for recombination, an estimation of the key parameters of the recombination process, and a genomic map of where recombination took place for each branch of the phylogeny.
+This is the homepage of ClonalFrameML, a software package that performs efficient inference of recombination in bacterial genomes. ClonalFrameML was created by [Xavier Didelot](http://xavierdidelot.github.io) and [Daniel Wilson](http://www.danielwilson.me.uk/). ClonalFrameML can be applied to any type of aligned sequence data, but is especially aimed at analysis of whole genome sequences. It is able to compare hundreds of whole genomes in a matter of hours on a standard Desktop computer. There are three main outputs from a run of ClonalFrameML: a phylogeny with branch lengths corrected to account for recombination, an estimation of the key parameters of the recombination process, and a genomic map of where recombination took place for each branch of the phylogeny.
 
-ClonalFrameML is a maximum likelihood implementation of the Bayesian software [ClonalFrame](http://www.xavierdidelot.xtreemhost.com/clonalframe.htm) which was previously described by [Didelot and Falush (2007)](http://www.genetics.org/cgi/content/abstract/175/3/1251). The recombination model underpinning ClonalFrameML is exactly the same as for ClonalFrame, but this new implementation is a lot faster, is able to deal with much larger genomic dataset, and does not suffer from MCMC convergence issues. A scientific paper describing ClonalFrameML in detail has been published, see [Didelot X, Wilson DJ (2015) ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes. PLoS Comput Biol 11(2): e1004041. doi:10.1371/journal.pcbi.1004041](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004041).
+ClonalFrameML is a maximum likelihood implementation of the Bayesian software [ClonalFrame](http://xavierdidelot.github.io/clonalframe.html) which was previously described by [Didelot and Falush (2007)](http://www.genetics.org/cgi/content/abstract/175/3/1251). The recombination model underpinning ClonalFrameML is exactly the same as for ClonalFrame, but this new implementation is a lot faster, is able to deal with much larger genomic dataset, and does not suffer from MCMC convergence issues. A scientific paper describing ClonalFrameML in detail has been published, see [Didelot X, Wilson DJ (2015) ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes. PLoS Comput Biol 11(2): e1004041. doi:10.1371/journal.pcbi.1004041](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004041).
 
 # Download and Installation #
 
@@ -30,4 +30,4 @@ The user guide for ClonalFrameML is available [here](https://github.com/xavierdi
 
 # Getting help #
 
-If you need assistance using ClonalFrameML, you can get in touch by emailing either [Xavier Didelot](http://www.xavierdidelot.xtreemhost.com/contact.htm) or [Daniel Wilson](http://www.danielwilson.me.uk/contact.html).
+If you need assistance using ClonalFrameML, you can get in touch by emailing either [Xavier Didelot](http://xavierdidelot.github.io/contact.html) or [Daniel Wilson](http://www.danielwilson.me.uk/contact.html).


=====================================
src/README.txt deleted
=====================================
@@ -1,45 +0,0 @@
-ClonalFrameML
-Xavier Didelot and Daniel Wilson. 2015
-
-This program reads in a Newick tree and FASTA file and, for all variable sites, reconstructs
-the joint maximum likelihood sequences at all nodes (including, for the purposes of imputation, tips)
-using the HKY85 nucleotide substitution model and an algorithm described in:
-
-    A Fast Algorithm for Joint Reconstruction of Ancestral Amino Acid Sequences
-    Tal Pupko, Itsik Peer, Ron Shamir, and Dan Graur. Mol. Biol. Evol. 17(6):890–896. 2000
-	
-Branch lengths of the tree are corrected for heterospecific horizontal gene transfer using a new maximum-
-likelihood algorithm implementing the ClonalFrame model that was described in:
-
-    Inference of Bacterial Microevolution Using Multilocus Sequence Data
-	Xavier Didelot, and Daniel Falush. Genetics 175(3):1251-1266. 2007
-
-Syntax: ClonalFrameML newick_file fasta_file output_file [OPTIONS]
-
-newick_file  The tree specified in Newick format. It must be an unrooted bifurcating tree. All
-             tips should be uniquely labelled and the internal nodes must not be labelled. Note that the
-             branch lengths must be scaled in units of expected number of substitutions per site.
-             Failure to provide appropriately scaled branch lengths will adversely affect results.
-fasta_file   The nucleotide sequences specified in FASTA format, with labels exactly matching those in
-			 the newick_file. The letter codes A, C, G and T are interpreted directly, U is converted
-			 to T, and N, -, ? and X are treated equivalently as ambiguity codes. No other codes are
-			 allowed.
-output_file  The prefix for the output files, described below.
-[OPTIONS]    Run ClonalFrameML with no arguments to see the options available.
-
-The program reports the empirical nucleotide frequencies and the joint log-likelihood of the reconstructed
-sequences for variable sites. Files are output with the following suffixes:
-
-.labelled_tree.newick          The corrected Newick tree is ouput with internal nodes labelled so that they
-                               correspond with the reconstructed ancestral sequence file.
-.ML_sequence.fasta             The reconstructed sequences (ancestral and, for the purposes of imputation,
-							   observed) in FASTA format with letter codes A, C, G and T only. The labels
-							   match exactly those in the output Newick tree.
-.position_cross_reference.txt  A vector of comma-separated values equal in length to the input FASTA file
-							   relating the positions of (variable) sites in the input FASTA file to the
-							   positions of their reconstructed sequences in the output FASTA file, starting
-							   with position 1. Sites in the input file not reconstructed are assigned a 0.
-.importation_status.txt        A FASTA file representing the inferred importation status of every site
-                               coded as 0 (unimported) 1 (imported) 2 (unimported homoplasy/multiallelic)
-							   3 (imported homoplasy/multiallelic) 4 (untested compatible) 5 (untested
-							   homoplasy).
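
Note on the status codes listed in the removed README: the digits 0 to 5 follow the rule used by the write_importation_status() function that this commit also removes from src/main.cpp. The sketch below restates that rule as a standalone function. The function name and its boolean parameters are hypothetical; they stand in for isBLC[pos], compat[pos]>0 and imported[i][k] in the removed code.

```cpp
// Hypothetical helper (not part of ClonalFrameML): returns the per-site status digit.
//   used_in_blc : the site was used in branch length correction (isBLC[pos])
//   homoplasic  : the site is homoplasic or multiallelic (compat[pos] > 0)
//   imported    : the site is inferred as imported on this branch
//                 (only consulted when used_in_blc is true)
int importation_status_code(bool used_in_blc, bool homoplasic, bool imported) {
    if (used_in_blc) {
        // 0 unimported, 1 imported, 2 unimported homoplasy, 3 imported homoplasy
        return 2 * (homoplasic ? 1 : 0) + (imported ? 1 : 0);
    }
    // Not used in branch length correction: 4 if compatible, 5 if homoplasic.
    return homoplasic ? 5 : 4;
}
```

After this commit only the interval-based writer, write_importation_status_intervals(), remains in src/main.cpp.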


=====================================
src/brent.h
=====================================
@@ -1,19 +1,19 @@
-/*  Copyright 2013 Daniel Wilson.
- *
+/*  
  *  brent.h
+ *  Part of ClonalFrameML
  *
- *  The myutils library is free software: you can redistribute it and/or modify
+ *  ClonalFrameML is free software: you can redistribute it and/or modify
  *  it under the terms of the GNU Lesser General Public License as published by
  *  the Free Software Foundation, either version 3 of the License, or
  *  (at your option) any later version.
  *  
- *  The myutils library is distributed in the hope that it will be useful,
+ *  ClonalFrameML is distributed in the hope that it will be useful,
  *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  *  GNU Lesser General Public License for more details.
  *  
  *  You should have received a copy of the GNU Lesser General Public License
- *  along with the myutils library. If not, see <http://www.gnu.org/licenses/>.
+ *  along with ClonalFrameML. If not, see <http://www.gnu.org/licenses/>..
  *
  *  Parts of this code are based on code in Numerical Recipes in C++
  *  WH Press, SA Teukolsky, WT Vetterling, BP Flannery (2002).


=====================================
src/coalesce/coalescent_record.h
=====================================
@@ -3,18 +3,18 @@
  *  coalescent_record.h
  *  Part of the coalesce library.
  *
- *  The myutils library is free software: you can redistribute it and/or modify
+ *  The coalesce library is free software: you can redistribute it and/or modify
  *  it under the terms of the GNU Lesser General Public License as published by
  *  the Free Software Foundation, either version 3 of the License, or
  *  (at your option) any later version.
  *  
- *  The myutils library is distributed in the hope that it will be useful,
+ *  The coalesce library is distributed in the hope that it will be useful,
  *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  *  GNU Lesser General Public License for more details.
  *  
  *  You should have received a copy of the GNU Lesser General Public License
- *  along with the myutils library. If not, see <http://www.gnu.org/licenses/>.
+ *  along with the coalesce library. If not, see <http://www.gnu.org/licenses/>.
  */
 #ifndef _RECORD_H_
 #define _RECORD_H_


=====================================
src/main.cpp
=====================================
@@ -1,5 +1,4 @@
-/*  Copyright 2013 Daniel Wilson and Xavier Didelot.
- *
+/*  
  *  main.cpp
  *  Part of ClonalFrameML
  *
@@ -21,7 +20,8 @@
 
 int main (const int argc, const char* argv[]) {
 	clock_t start_time = clock();
-	cout << "ClonalFrameML " << ClonalFrameML_GITRevision << endl;
+	cout << "ClonalFrameML " << ClonalFrameML_version << endl;
+	if (argc==2 && (strcmp(argv[1],"-version")==0||strcmp(argv[1],"-v")==0)) return 0;
 	// Process the command line arguments	
 	if(argc<4) {
 		stringstream errTxt;
@@ -43,7 +43,6 @@ int main (const int argc, const char* argv[]) {
 		errTxt << "-chromosome_name               name, eg \"chr\"            Output importation status file in BED format using given chromosome name." << endl;
 		errTxt << "-min_branch_length             value > 0 (default 1e-7)  Minimum branch length." << endl;
 		errTxt << "-reconstruct_invariant_sites   true or false (default)   Reconstruct the ancestral states at invariant sites." << endl;
-//		errTxt << "-compress_reconstructed_sites  true (default) or false   Reduce the number of columns in the output FASTA file." << endl;	// Alternative not currently implemented, so not optional
 		errTxt << "-label_uncorrected_tree        true or false (default)   Regurgitate the uncorrected Newick tree with internal nodes labelled." << endl;
 		errTxt << "Options affecting -em and -embranch:" << endl;
 		errTxt << "-prior_mean                    df \"0.1 0.001 0.1 0.0001\" Prior mean for R/theta, 1/delta, nu and M." << endl;
@@ -52,10 +51,12 @@ int main (const int argc, const char* argv[]) {
 		errTxt << "-guess_initial_m               true (default) or false   Initialize M and nu jointly in the EM algorithms." << endl;
 		errTxt << "-emsim                         value >= 0  (default 0)   Number of simulations to estimate uncertainty in the EM results." << endl;
 		errTxt << "-embranch_dispersion           value > 0 (default .01)   Dispersion in parameters among branches in the -embranch model." << endl;
+		errTxt << "-output_filtered               true of false (default)   Output a filtered alignment including only non-recombinant sites." << endl;
 		errTxt << "Options affecting -rescale_no_recombination:" << endl;
 		errTxt << "-brent_tolerance               tolerance (default .001)  Set the tolerance of the Brent routine for -rescale_no_recombination." << endl;
 		errTxt << "-powell_tolerance              tolerance (default .001)  Set the tolerance of the Powell routine for -rescale_no_recombination." << endl;
-		error(errTxt.str().c_str());
+		cout << errTxt.str().c_str()<<endl;
+		return 0;
 	}
 	// Process required arguments
 	const char* newick_file = argv[1];
@@ -64,6 +65,7 @@ int main (const int argc, const char* argv[]) {
 	string tree_out_file = string(out_file) + ".labelled_tree.newick";
 	string oritree_out_file = string(out_file) + ".labelled_uncorrected_tree.newick";
 	string fasta_out_file = string(out_file) + ".ML_sequence.fasta";
+	string fasta_filtered_file = string(out_file) + ".filtered.fasta";
 	string xref_out_file = string(out_file) + ".position_cross_reference.txt";
 	string import_out_file = string(out_file) + ".importation_status.txt";
 	string em_out_file = string(out_file) + ".em.txt";
@@ -73,7 +75,8 @@ int main (const int argc, const char* argv[]) {
 	arg.case_sensitive = false;
 	string fasta_file_list="false", xmfa_file="false", imputation_only="false", ignore_incomplete_sites="false", ignore_user_sites="", reconstruct_invariant_sites="false";
 	string use_incompatible_sites="true", rescale_no_recombination="false";
-	string show_progress="false", compress_reconstructed_sites="true";
+	string show_progress="false";
+	string output_filtered="false";
 	string string_prior_mean="0.1 0.001 0.1 0.0001", string_prior_sd="0.1 0.001 0.1 0.0001", string_initial_values = "0.1 0.001 0.05";
 	string guess_initial_m="true", em="true", embranch="false", label_original_tree="false", chr_name="";
 	double brent_tolerance = 1.0e-3, powell_tolerance = 1.0e-3, global_min_branch_length = 1.0e-7;
@@ -92,7 +95,6 @@ int main (const int argc, const char* argv[]) {
 	arg.add_item("powell_tolerance",			TP_DOUBLE, &powell_tolerance);
 	arg.add_item("rescale_no_recombination",	TP_STRING, &rescale_no_recombination);
 	arg.add_item("show_progress",				TP_STRING, &show_progress);
-	arg.add_item("compress_reconstructed_sites",TP_STRING, &compress_reconstructed_sites);	
 	arg.add_item("min_branch_length",			TP_DOUBLE, &global_min_branch_length);
 	arg.add_item("prior_mean",					TP_STRING, &string_prior_mean);
 	arg.add_item("prior_sd",					TP_STRING, &string_prior_sd);
@@ -104,6 +106,7 @@ int main (const int argc, const char* argv[]) {
 	arg.add_item("embranch_dispersion",			TP_DOUBLE, &embranch_dispersion);
 	arg.add_item("kappa",						TP_DOUBLE, &kappa);
 	arg.add_item("label_uncorrected_tree",		TP_STRING, &label_original_tree);
+	arg.add_item("output_filtered",				TP_STRING, &output_filtered);
 	arg.read_input(argc-3,argv+3);
 	bool FASTA_FILE_LIST				= string_to_bool(fasta_file_list,				"fasta_file_list");
 	bool XMFA_FILE						= string_to_bool(xmfa_file,						"xmfa_file");
@@ -113,11 +116,11 @@ int main (const int argc, const char* argv[]) {
 	bool USE_INCOMPATIBLE_SITES			= string_to_bool(use_incompatible_sites,		"use_incompatible_sites");
 	bool RESCALE_NO_RECOMBINATION		= string_to_bool(rescale_no_recombination,		"rescale_no_recombination");
 	bool SHOW_PROGRESS					= string_to_bool(show_progress,					"show_progress");
-	bool COMPRESS_RECONSTRUCTED_SITES	= string_to_bool(compress_reconstructed_sites,	"compress_reconstructed_sites");
 	bool GUESS_INITIAL_M				= string_to_bool(guess_initial_m,				"guess_initial_m");
 	bool EM								= string_to_bool(em,							"em");
 	bool EMBRANCH						= string_to_bool(embranch,						"embranch");
 	bool LABEL_ORIGINAL_TREE			= string_to_bool(label_original_tree,			"label_uncorrected_tree");
+	bool OUTPUT_FILTERED				= string_to_bool(output_filtered,				"output_filtered");
 	bool MULTITHREAD = false;
 	if(brent_tolerance<=0.0 || brent_tolerance>=0.1) {
 		stringstream errTxt;
@@ -328,7 +331,6 @@ int main (const int argc, const char* argv[]) {
 	// Report the ML
 	cout << "Maximum log-likelihood for imputation and ancestral state reconstruction = " << ML.LOG() << endl;
 	
-	if(!COMPRESS_RECONSTRUCTED_SITES) cout << "WARNING: -compress_reconstructed_sites=false not yet implemented, ignoring." << endl;
 	// Output the ML reconstructed sequences
 	write_ancestral_fasta(node_nuc, ctree_node_labels, fasta_out_file.c_str());
 	// For every position in the original FASTA file, output the corresponding position in the output FASTA file, or -1 (not included)
@@ -466,6 +468,11 @@ int main (const int argc, const char* argv[]) {
 			// Output the importation status
 			write_importation_status_intervals(is_imported,ctree_node_labels,isBLC,compat,import_out_file.c_str(),root_node,chr_name.c_str());
 			cout << "Wrote inferred importation status to " << import_out_file << endl;
+			if (OUTPUT_FILTERED) {			
+				// Output the filtered alignment
+				write_filtered_fasta(is_imported, &fa, ignore_site, fasta_filtered_file.c_str());
+				cout << "Wrote filtered alignment to " << fasta_filtered_file << endl;
+			}
 			
 			// If required, simulate under the point estimates and output posterior samples of the parameters
 			if(emsim>0) {
@@ -548,6 +555,11 @@ int main (const int argc, const char* argv[]) {
 			// Output the importation status
 			write_importation_status_intervals(is_imported,ctree_node_labels,isBLC,compat,import_out_file.c_str(),root_node,chr_name.c_str());
 			cout << "Wrote inferred importation status to " << import_out_file << endl;
+			if (OUTPUT_FILTERED) {
+				// Output the filtered alignment
+				write_filtered_fasta(is_imported, &fa, ignore_site, fasta_filtered_file.c_str());
+				cout << "Wrote filtered alignment to " << fasta_filtered_file << endl;
+			}
 			
 			// If required, simulate under the point estimates and output posterior samples of the parameters
 			if(emsim>0) {
@@ -1721,17 +1733,7 @@ void write_ancestral_fasta(Matrix<Nucleotide> &nuc, vector<string> &all_node_nam
 		errTxt << "write_ancestral_fasta(): could not open file " << file_name << " for writing";
 		error(errTxt.str().c_str());
 	}
-	write_ancestral_fasta(nuc,all_node_names,fout);
-	fout.close();
-}
-
-void write_ancestral_fasta(Matrix<Nucleotide> &nuc, vector<string> &all_node_names, ofstream &fout) {
 	static const char AGCTN[5] = {'A','G','C','T','N'};
-	if(!fout) {
-		stringstream errTxt;
-		errTxt << "write_ancestral_fasta(): could not open file stream for writing";
-		error(errTxt.str().c_str());
-	}
 	if(nuc.nrows()!=all_node_names.size()) {
 		stringstream errTxt;
 		errTxt << "write_ancestral_fasta(): number of sequences (" << nuc.nrows() << ") does not equal number of node labels (" << all_node_names.size() << ")";
@@ -1745,23 +1747,37 @@ void write_ancestral_fasta(Matrix<Nucleotide> &nuc, vector<string> &all_node_nam
 		}
 		fout << endl;
 	}	
+	fout.close();
 }
 
-void write_position_cross_reference(vector<bool> &iscompat, vector<int> &ipat, const char* file_name) {
+void write_filtered_fasta(vector< vector<ImportationState> > &imported, DNA * fa,vector<bool> &ignore_site, const char* file_name) {
 	ofstream fout(file_name);
 	if(!fout) {
 		stringstream errTxt;
-		errTxt << "write_position_cross_reference(): could not open file " << file_name << " for writing";
+		errTxt << "write_filtered_fasta(): could not open file " << file_name << " for writing";
 		error(errTxt.str().c_str());
 	}
-	write_position_cross_reference(iscompat,ipat,fout);
-	fout.close();
+	int n,pos;
+	vector<bool> tokeep(fa->lseq);
+	for (pos=0;pos<fa->lseq;pos++) {
+		tokeep[pos]=true;
+		if (ignore_site[pos]) tokeep[pos]=false;
+		for (n=0;n<imported.size();n++) if(imported[n][pos]==Imported) tokeep[pos]=false;
+	}
+	for(n=0;n<fa->nseq;n++)
+	{
+		fout << ">" << fa->label[n] << endl;
+		for(pos=0;pos<fa->lseq;pos++)
+			if (tokeep[pos]) fout << fa->sequence[n][pos];
+		fout << endl;
+	}	fout.close();
 }
 
-void write_position_cross_reference(vector<bool> &iscompat, vector<int> &ipat, ofstream &fout) {
+void write_position_cross_reference(vector<bool> &iscompat, vector<int> &ipat, const char* file_name) {
+	ofstream fout(file_name);
 	if(!fout) {
 		stringstream errTxt;
-		errTxt << "write_position_cross_reference(): could not open file stream for writing";
+		errTxt << "write_position_cross_reference(): could not open file " << file_name << " for writing";
 		error(errTxt.str().c_str());
 	}
 	int i,j,pat;
@@ -1780,6 +1796,7 @@ void write_position_cross_reference(vector<bool> &iscompat, vector<int> &ipat, o
 		fout << pat+1;
 	}
 	fout << endl;
+	fout.close();
 }
 
 mydouble likelihood_branch(const int dec_id, const int anc_id, const Matrix<Nucleotide> &node_nuc, const vector<int> &pat1, const vector<int> &cpat, const double kappa, const vector<double> &pinuc, const double branch_length) {
@@ -1810,55 +1827,6 @@ bool string_to_bool(const string s, const string label) {
 	return false;
 }
 
-void write_importation_status(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, const char* file_name, const int root_node) {
-	ofstream fout(file_name);
-	if(!fout) {
-		stringstream errTxt;
-		errTxt << "write_importation_status(): could not open file " << file_name << " for writing";
-		error(errTxt.str().c_str());
-	}
-	write_importation_status(imported,all_node_names,isBLC,compat,fout,root_node);
-	fout.close();
-}
-
-void write_importation_status(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, ofstream &fout, const int root_node) {
-	if(!fout) {
-		stringstream errTxt;
-		errTxt << "write_importation_status(): could not open file stream for writing";
-		error(errTxt.str().c_str());
-	}
-	if(imported.size()!=root_node) {
-		stringstream errTxt;
-		errTxt << "write_importation_status(): number of lineages (" << imported.size() << ") does not equal the number of non-root node labels (" << root_node << ")";
-		error(errTxt.str().c_str());
-	}
-	if(all_node_names.size()<root_node) {
-		stringstream errTxt;
-		errTxt << "write_importation_status(): number of non-root lineages (" << root_node << ") exceeds the number of node labels (" << all_node_names.size() << ")";
-		error(errTxt.str().c_str());
-	}
-	int i,pos;
-	for(i=0;i<root_node;i++) {
-		fout << ">" << all_node_names[i] << endl;
-		int k = 0;
-		for(pos=0;pos<isBLC.size();pos++) {
-			if(isBLC[pos]) {
-				// If used in branch length correction, 0 (unimported), 1 (imported), 2 (homoplasy/multiallelic unimported), 3 (homoplasy/multiallelic imported)
-				int out = 2*(compat[pos]>0) + (int)imported[i][k];
-				fout << out;
-				++k;
-			} else if(compat[pos]<=0) {
-				// If compatible but not used in branch length correction, 4
-				fout << 4;
-			} else {
-				// If homoplasy/multiallelic and not used in branch length correction, 5
-				fout << 5;
-			}
-		}
-		fout << endl;
-	}	
-}
-
 void write_importation_status_intervals(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, const char* file_name, const int root_node,  const char* chr_name) {
 	ofstream fout(file_name);
 	if(!fout) {
@@ -1866,16 +1834,6 @@ void write_importation_status_intervals(vector< vector<ImportationState> > &impo
 		errTxt << "write_importation_status_intervals(): could not open file " << file_name << " for writing";
 		error(errTxt.str().c_str());
 	}
-	write_importation_status_intervals(imported,all_node_names,isBLC,compat,fout,root_node, chr_name);
-	fout.close();
-}
-
-void write_importation_status_intervals(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, ofstream &fout, const int root_node,  const char* chr_name) {
-	if(!fout) {
-		stringstream errTxt;
-		errTxt << "write_importation_status_intervals(): could not open file stream for writing";
-		error(errTxt.str().c_str());
-	}
 	if(imported.size()!=root_node) {
 		stringstream errTxt;
 		errTxt << "write_importation_status_intervals(): number of lineages (" << imported.size() << ") does not equal the number of non-root node labels (" << root_node << ")";
@@ -1916,6 +1874,7 @@ void write_importation_status_intervals(vector< vector<ImportationState> > &impo
 			else fout << chr_name << tab << interval_beg+1 << tab << pos << tab << all_node_names[i] << endl;
 		}
 	}
+	fout.close();
 }
 
 mydouble maximum_likelihood_ClonalFrame_branch_allsites(const int dec_id, const int anc_id, const Matrix<Nucleotide> &node_nuc, const vector<bool> &iscompat, const vector<int> &ipat, const double kappa, const vector<double> &pinuc, const double branch_length, const double rho_over_theta, const double mean_import_length, const double import_divergence, vector<ImportationState> &is_imported) {
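
Note on the new -output_filtered option: write_filtered_fasta() drops every alignment column that is flagged in ignore_site or marked Imported on any branch, then writes the remaining columns for each input sequence. A minimal sketch of that column filter follows, using simplified stand-in types rather than the DNA and ImportationState types from the diff. The name filter_alignment and the use of vector<bool> per branch are assumptions for illustration. The sketch also assumes the import flags are indexed by alignment column; whether that holds for is_imported in the real code depends on parts of main.cpp not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the types in the diff (assumptions):
//   sequences : one string per taxon, all of equal length
//   imported  : imported[branch][site] is true if the site is imported on that branch
//   ignore    : ignore[site] is true if the site was excluded from the analysis
std::vector<std::string> filter_alignment(
        const std::vector<std::string>& sequences,
        const std::vector<std::vector<bool>>& imported,
        const std::vector<bool>& ignore) {
    const std::size_t length = sequences.empty() ? 0 : sequences[0].size();
    assert(ignore.size() == length);

    // A column is kept only if it is not ignored and not imported on any branch.
    std::vector<bool> keep(length, true);
    for (const auto& branch : imported) {
        assert(branch.size() == length);
        for (std::size_t pos = 0; pos < length; ++pos) {
            if (branch[pos]) keep[pos] = false;
        }
    }
    for (std::size_t pos = 0; pos < length; ++pos) {
        if (ignore[pos]) keep[pos] = false;
    }

    // Copy the surviving columns of every sequence, preserving their order.
    std::vector<std::string> filtered;
    for (const auto& seq : sequences) {
        assert(seq.size() == length);
        std::string out;
        for (std::size_t pos = 0; pos < length; ++pos) {
            if (keep[pos]) out.push_back(seq[pos]);
        }
        filtered.push_back(out);
    }
    return filtered;
}
```

In the commit itself the result is written to the file named by the output prefix plus ".filtered.fasta", and only when -output_filtered is set to true.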


=====================================
src/main.h
=====================================
@@ -1,5 +1,4 @@
-/*  Copyright 2013 Daniel Wilson and Xavier Didelot.
- *
+/*  
  *  main.h
  *  Part of ClonalFrameML
  *
@@ -22,23 +21,20 @@
 #include <iostream>
 #include <string.h>
 #include "myutils/newick.h"
-//#include "coalesce/coalesce.h"
 #include "coalesce/coalescent_record.h"
 #include <sstream>
-//#include "myutils/myutils.h"
 #include "xmfa.h"
 #include <fstream>
 #include <algorithm>
 #include "myutils/DNA.h"
 #include "myutils/mydouble.h"
-//#include "coalesce/mutation.h"
 #include "powell.h"
 #include "myutils/argumentwizard.h"
 #include <time.h>
 #include "myutils/random.h"
 #include <limits>
 #include <iomanip>
-#include "version.h"
+#define ClonalFrameML_version "v1.12"
 
 using std::cout;
 using myutils::NewickTree;
@@ -68,15 +64,12 @@ void write_newick(const marginal_tree &ctree, const vector<string> &all_node_nam
 void write_newick(const marginal_tree &ctree, const vector<string> &all_node_names, ofstream &fout);
 void write_newick_node(const mt_node *node, const vector<string> &all_node_names, ofstream &fout);
 void write_ancestral_fasta(Matrix<Nucleotide> &nuc, vector<string> &all_node_names, const char* file_name);
-void write_ancestral_fasta(Matrix<Nucleotide> &nuc, vector<string> &all_node_names, ofstream &fout);
+void write_filtered_fasta(vector< vector<ImportationState> > &imported, DNA * fa,vector<bool> & ignore_site, const char* file_name);
 void write_position_cross_reference(vector<bool> &iscompat, vector<int> &ipat, const char* file_name);
 void write_position_cross_reference(vector<bool> &iscompat, vector<int> &ipat, ofstream &fout);
 mydouble likelihood_branch(const int dec_id, const int anc_id, const Matrix<Nucleotide> &node_nuc, const vector<int> &pat1, const vector<int> &cpat, const double kappa, const vector<double> &pinuc, const double branch_length);
 bool string_to_bool(const string s, const string label="");
-void write_importation_status(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, const char* file_name, const int root_node);
-void write_importation_status(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, ofstream &fout, const int root_node);
 void write_importation_status_intervals(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, const char* file_name, const int root_node,const char* chr_name);
-void write_importation_status_intervals(vector< vector<ImportationState> > &imported, vector<string> &all_node_names, vector<bool> &isBLC, vector<int> &compat, ofstream &fout, const int root_node, const char* chr_name);
 double Baum_Welch(const marginal_tree &tree, const Matrix<Nucleotide> &node_nuc, const vector<double> &position, const vector<int> &ipat, const double kappa, const vector<double> &pinuc, const vector<bool> &informative, const vector<double> &prior_a, const vector<double> &prior_b, vector<double> &full_param, vector<double> &posterior_a, int &neval, const bool coutput, double &priorL);
 double Baum_Welch0(const marginal_tree &tree, const Matrix<Nucleotide> &node_nuc, const vector<double> &position, const vector<int> &ipat, const double kappa, const vector<double> &pinuc, const vector<bool> &informative, const vector<double> &prior_a, const vector<double> &prior_b, const vector<double> &full_param, const vector<double> &posterior_a, const bool coutput);
 double gamma_loglikelihood(const double x, const double a, const double b);
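
Note on the version handling: with version.h gone, the version string is the fixed macro ClonalFrameML_version ("v1.12"), and main() now returns right after printing the banner when the sole argument is -version or -v. The check can be sketched in isolation as below. The helper name is_version_request is hypothetical; in the commit the test is written inline in main().

```cpp
#include <cstring>

// Hypothetical helper: true when the only command-line argument is -version or -v,
// mirroring the inline check added to main() in this commit.
bool is_version_request(int argc, const char* argv[]) {
    return argc == 2 &&
           (std::strcmp(argv[1], "-version") == 0 || std::strcmp(argv[1], "-v") == 0);
}
```

Because the banner is printed before the check, running the program with -v prints "ClonalFrameML v1.12" and exits with status 0.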


=====================================
src/make.sh
=====================================
@@ -1,3 +1 @@
-echo "#define ClonalFrameML_GITRevision \"`git describe --tags`\"" > version.h
-g++ main.cpp -o ClonalFrameML -I ./ -I ./myutils -I ./coalesce -O3
-
+g++ main.cpp -o ClonalFrameML -O3


=====================================
src/make_win.bat deleted
=====================================
@@ -1,11 +0,0 @@
-@echo off
-rem This creates the version.h file. 
-rem You need git installed (obviously)
-rem And to be in the folder where the ".git" directory exists.
-
-FOR /F "delims=" %%i IN ('git describe --tags') DO set GITRESULT=%%i
-echo #define ClonalFrameML_GITRevision %GITRESULT% > version.h
-
-rem The linux make.sh file now compiles the code.
-rem If you're in VS, remember you need _CRT_SECURE_NO_WARNINGS in the
-rem Pre-Processor code.


=====================================
src/makefile
=====================================
@@ -1,13 +1,12 @@
-# Make file for ClonalFrameML
+# Makefile for ClonalFrameML
 CC = g++
-CFLAGS = -O3 -I./ -I./myutils -I./coalesce
-LDFLAGS = 
+CFLAGS += -O3
 OBJECTS = main.o
-HEADERS = main.h brent.h powell.h version.h
+HEADERS = main.h brent.h powell.h
 
-.PHONY: clean version
+.PHONY: clean 
 
-all: version ClonalFrameML
+all: ClonalFrameML
 
 ClonalFrameML: $(OBJECTS)
 	$(CC) $(LDFLAGS) -o ClonalFrameML $(OBJECTS)
@@ -15,8 +14,5 @@ ClonalFrameML: $(OBJECTS)
 main.o: main.cpp $(HEADERS)
 	$(CC) $(CFLAGS) -c -o main.o main.cpp
 
-version:
-	/bin/echo "#define ClonalFrameML_GITRevision \"`git describe --tags`\"" > version.h
-
 clean:
-	rm $(OBJECTS)
+	rm -f $(OBJECTS)


=====================================
src/myutils/DNA.h
=====================================
@@ -31,7 +31,7 @@
 #include <string>
 #include <fstream>
 #include <iostream>
-#include "myutils/myutils.h"
+#include "myutils.h"
 #include <map>
 #include <sstream>
 #include <algorithm>


=====================================
src/myutils/argumentwizard.h
=====================================
@@ -33,7 +33,7 @@
 #include <vector>
 #include <iostream>
 #include <ctype.h>
-#include "myutils/myerror.h"
+#include "myerror.h"
 #include <sstream>
 
 namespace myutils


=====================================
src/myutils/matrix.h
=====================================
@@ -27,8 +27,8 @@
 
 #include <stdlib.h>
 #include <stdio.h>
-#include "myutils/vector.h"
-#include "myutils/utils.h"
+#include "vector.h"
+#include "utils.h"
 
 /****************************************************************/
 /*						myutils::Matrix							*/


=====================================
src/myutils/mydouble.h
=====================================
@@ -21,7 +21,7 @@
 
 #include <limits>
 #include <math.h>
-#include "myutils/myerror.h"
+#include "myerror.h"
 
 using myutils::error;
 


=====================================
src/myutils/myutils.h
=====================================
@@ -27,26 +27,12 @@
 
 #pragma warning(disable: 4786)
 
-/*Includes all header files in the myutils directory*/
-/*#include "cmatrix.h"
+#include "myerror.h"
+#include "utils.h"
+#include "vector.h"
 #include "matrix.h"
+#include "lotri_matrix.h"
 #include "random.h"
-#include "error.h"
 #include "DNA.h"
-#include "vector.h"*/
-
-#include "myutils/myerror.h"
-#include "myutils/utils.h"
-//#include "myutils/cmatrix.h"
-#include "myutils/vector.h"
-#include "myutils/matrix.h"
-#include "myutils/lotri_matrix.h"
-#include "myutils/random.h"
-#include "myutils/DNA.h"
-//#include "myutils/pause.h"
-//#include "myutils/sort.h"
-
-//#include "controlwizard.h" /* has problems in Linux with pointers */
-//#include "pause.h"	/* removed because conio.h is not standard */
 
 #endif


=====================================
src/myutils/newick.h
=====================================
@@ -10,7 +10,7 @@
 #define _NEWICK_H_
 #include <vector>
 #include <string>
-#include "myutils/myerror.h"
+#include "myerror.h"
 #include <sstream>
 #include <iostream>
 


=====================================
src/myutils/random.h
=====================================
@@ -28,11 +28,11 @@
 #include <cmath>
 #include <time.h>
 #include <vector>
-#include "myutils/vector.h"
-#include "myutils/matrix.h"
-#include "myutils/lotri_matrix.h"
+#include "vector.h"
+#include "matrix.h"
+#include "lotri_matrix.h"
 
-#include "myutils/myerror.h"
+#include "myerror.h"
 
 namespace myutils {
 class Random {


=====================================
src/myutils/vector.h
=====================================
@@ -25,7 +25,7 @@
 #ifndef _MYUTILS_VECTOR_H_
 #define _MYUTILS_VECTOR_H_
 
-#include "myutils/myerror.h"
+#include "myerror.h"
 #include <stdlib.h>
 #include <stdio.h>
 //#include <myutils.h>


=====================================
src/powell.h
=====================================
@@ -1,19 +1,20 @@
-/*  Copyright 2013 Daniel Wilson.
- *
+/*  
  *  powell.h
+ *  Part of ClonalFrameML
+ *
  *
- *  The myutils library is free software: you can redistribute it and/or modify
+ *  ClonalFrameML is free software: you can redistribute it and/or modify
  *  it under the terms of the GNU Lesser General Public License as published by
  *  the Free Software Foundation, either version 3 of the License, or
  *  (at your option) any later version.
  *  
- *  The myutils library is distributed in the hope that it will be useful,
+ *  ClonalFrameML is distributed in the hope that it will be useful,
  *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  *  GNU Lesser General Public License for more details.
  *  
  *  You should have received a copy of the GNU Lesser General Public License
- *  along with the myutils library. If not, see <http://www.gnu.org/licenses/>.
+ *  along with ClonalFrameML. If not, see <http://www.gnu.org/licenses/>..
  *
  *  Parts of this code are based on code in Numerical Recipes in C++
  *  WH Press, SA Teukolsky, WT Vetterling, BP Flannery (2002).


=====================================
src/xmfa.h
=====================================
@@ -1,8 +1,8 @@
-/*  Copyright 2013 Daniel Wilson and Xavier Didelot.
- *
+/*  
  *  xmfa.h
  *  Part of ClonalFrameML
  *
+ *
  *  ClonalFrameML is free software: you can redistribute it and/or modify
  *  it under the terms of the GNU Lesser General Public License as published by
  *  the Free Software Foundation, either version 3 of the License, or



View it on GitLab: https://salsa.debian.org/med-team/clonalframeml/-/commit/81343d66151bae0c5c45fca53761c4eab15bbb16


