[med-svn] [bowtie2] 01/06: Imported Upstream version 2.2.1

Alex Mestiashvili malex-guest at moszumanska.debian.org
Tue Apr 15 12:40:59 UTC 2014


This is an automated email from the git hooks/post-receive script.

malex-guest pushed a commit to branch master
in repository bowtie2.

commit 89ac339b1c8a2a8335b9478033531f75e30e5860
Author: Alexandre Mestiashvili <alex at biotec.tu-dresden.de>
Date:   Fri Mar 7 08:44:16 2014 +0100

    Imported Upstream version 2.2.1
---
 MANUAL          | 24 ++++++++++-------
 MANUAL.markdown | 32 +++++++++++++++-------
 NEWS            | 18 ++++++++++---
 VERSION         |  2 +-
 bowtie2         | 17 +++++++-----
 bt2_idx.h       | 12 ++++-----
 bt2_io.cpp      | 82 ++++++++++++++++++++-------------------------------------
 bt2_search.cpp  |  6 +++--
 doc/manual.html | 12 +++++++--
 mm.h            | 15 -----------
 ref_read.h      | 13 ---------
 reference.cpp   | 33 +++++++----------------
 word_io.h       | 72 --------------------------------------------------
 13 files changed, 120 insertions(+), 218 deletions(-)

diff --git a/MANUAL b/MANUAL
index 6ab7437..f943ce4 100644
--- a/MANUAL
+++ b/MANUAL
@@ -144,13 +144,7 @@ Building from source
 Building Bowtie 2 from source requires a GNU-like environment with GCC, GNU Make
 and other basics.  It should be possible to build Bowtie 2 on most vanilla Linux
 installations or on a Mac installation with [Xcode] installed.  Bowtie 2 can
-also be built on Windows using [Cygwin] or [MinGW] (MinGW recommended). For a 
-MinGW build the choice of what compiler is to be used is important since this
-will determine if a 32 or 64 bit code can be successfully compiled using it. If 
-there is a need to generate both 32 and 64 bit on the same machine then a multilib 
-MinGW has to be properly installed. [MSYS], the [zlib] library, and depending on 
-architecture [pthreads] library are also required. We are recommending a 64 bit
-build since it has some clear advantages in real life research problems. In order 
+also be built on Windows using a 64-bit MinGW distribution and MSYS. In order 
 to simplify the MinGW setup it might be worth investigating popular MinGW personal 
 builds since these are coming already prepared with most of the toolchains needed.
 
@@ -168,10 +162,8 @@ it is possible to use pthread library on non-POSIX platform like Windows, due
 to performance reasons bowtie 2 will try to use Windows native multithreading
 if possible.
 
-[Cygwin]:   http://www.cygwin.com/
 [MinGW]:    http://www.mingw.org/
 [MSYS]:     http://www.mingw.org/wiki/msys
-[zlib]:     http://cygwin.com/packages/mingw-zlib/
 [pthreads]: http://sourceware.org/pthreads-win32/
 [GnuWin32]: http://gnuwin32.sf.net/packages/coreutils.htm
 [Download]: https://sourceforge.net/projects/bowtie-bio/files/bowtie2/
@@ -1046,6 +1038,20 @@ fragments; i.e. specifying `--nofw` causes `bowtie2` to explore only those
 paired-end configurations corresponding to fragments from the reverse-complement
 (Crick) strand.  Default: both strands enabled. 
 
+    --no-1mm-upfront
+
+By default, Bowtie 2 will attempt to find either an exact or a 1-mismatch
+end-to-end alignment for the read *before* trying the [multiseed heuristic].  Such
+alignments can be found very quickly, and many short read alignments have exact or
+near-exact end-to-end alignments.  However, this can lead to unexpected
+alignments when the user also sets options governing the [multiseed heuristic],
+like `-L` and `-N`.  For instance, if the user specifies `-N 0` and `-L` equal
+to the length of the read, the user will be surprised to find 1-mismatch alignments
+reported.  This option prevents Bowtie 2 from searching for 1-mismatch end-to-end
+alignments before using the [multiseed heuristic], which leads to the expected
+behavior when combined with options such as `-L` and `-N`.  This comes at the
+expense of speed.
+
     --end-to-end
 
 In this mode, Bowtie 2 requires that the entire read align from one end to the
diff --git a/MANUAL.markdown b/MANUAL.markdown
index 4341185..450899b 100644
--- a/MANUAL.markdown
+++ b/MANUAL.markdown
@@ -154,13 +154,7 @@ Building from source
 Building Bowtie 2 from source requires a GNU-like environment with GCC, GNU Make
 and other basics.  It should be possible to build Bowtie 2 on most vanilla Linux
 installations or on a Mac installation with [Xcode] installed.  Bowtie 2 can
-also be built on Windows using [Cygwin] or [MinGW] (MinGW recommended). For a 
-MinGW build the choice of what compiler is to be used is important since this
-will determine if a 32 or 64 bit code can be successfully compiled using it. If 
-there is a need to generate both 32 and 64 bit on the same machine then a multilib 
-MinGW has to be properly installed. [MSYS], the [zlib] library, and depending on 
-architecture [pthreads] library are also required. We are recommending a 64 bit
-build since it has some clear advantages in real life research problems. In order 
+also be built on Windows using a 64-bit MinGW distribution and MSYS. In order 
 to simplify the MinGW setup it might be worth investigating popular MinGW personal 
 builds since these are coming already prepared with most of the toolchains needed.
 
@@ -178,10 +172,8 @@ it is possible to use pthread library on non-POSIX platform like Windows, due
 to performance reasons bowtie 2 will try to use Windows native multithreading
 if possible.
 
-[Cygwin]:   http://www.cygwin.com/
 [MinGW]:    http://www.mingw.org/
 [MSYS]:     http://www.mingw.org/wiki/msys
-[zlib]:     http://cygwin.com/packages/mingw-zlib/
 [pthreads]: http://sourceware.org/pthreads-win32/
 [GnuWin32]: http://gnuwin32.sf.net/packages/coreutils.htm
 [Download]: https://sourceforge.net/projects/bowtie-bio/files/bowtie2/
@@ -1322,7 +1314,27 @@ paired-end configurations corresponding to fragments from the reverse-complement
 (Crick) strand.  Default: both strands enabled. 
 
 </td></tr>
-<tr><td id="bowtie2-options-end-to-end">
+<tr><td id="bowtie2-options-no-1mm-upfront">
+
+[`--no-1mm-upfront`]: #bowtie2-options-no-1mm-upfront
+
+    --no-1mm-upfront
+
+</td><td>
+
+By default, Bowtie 2 will attempt to find either an exact or a 1-mismatch
+end-to-end alignment for the read *before* trying the [multiseed heuristic].  Such
+alignments can be found very quickly, and many short read alignments have exact or
+near-exact end-to-end alignments.  However, this can lead to unexpected
+alignments when the user also sets options governing the [multiseed heuristic],
+like [`-L`] and [`-N`].  For instance, if the user specifies `-N 0` and `-L` equal
+to the length of the read, the user will be surprised to find 1-mismatch alignments
+reported.  This option prevents Bowtie 2 from searching for 1-mismatch end-to-end
+alignments before using the [multiseed heuristic], which leads to the expected
+behavior when combined with options such as [`-L`] and [`-N`].  This comes at the
+expense of speed.
+
+</td></tr><tr><td id="bowtie2-options-end-to-end">
 
 [`--end-to-end`]: #bowtie2-options-end-to-end
 
diff --git a/NEWS b/NEWS
index a6e65dc..677320f 100644
--- a/NEWS
+++ b/NEWS
@@ -3,7 +3,7 @@ Bowtie 2 NEWS
 
 Bowtie 2 is now available for download from the project website,
 http://bowtie-bio.sf.net/bowtie2.  2.0.0-beta1 is the first version released to
-the public and 2.2.0 is the latest version.  Bowtie 2 is licensed under
+the public and 2.2.1 is the latest version.  Bowtie 2 is licensed under
 the GPLv3 license.  See `LICENSE' file for details.
 
 Reporting Issues
@@ -16,6 +16,19 @@ Please report any issues using the Sourceforge bug tracker:
 Version Release History
 =======================
 
+Version 2.2.1 - February 28, 2014
+   * Improved way in which index files are loaded for alignment.  Should fix
+     efficiency problems on some filesystems.
+   * Fixed a bug that made older systems unable to correctly deal with bowtie 
+     relative symbolic links.
+   * Fixed a bug that, for very big indexes, could fail to determine file
+     offsets correctly.
+   * Fixed a bug where using --no-unal option incorrectly suppressed
+     --un/--un-conc output.
+   * Dropped a perl dependency that could cause problems on old systems.
+   * Added --no-1mm-upfront option and clarified documentation for parameters
+     governing the multiseed heuristic.
+
 Version 2.2.0 - February 10, 2014
    * Improved index querying efficiency using "population count" instructions
      available since SSE4.2.
@@ -33,8 +46,7 @@ Version 2.2.0 - February 10, 2014
      included in this release.  Thank you!
    * Phased out CygWin support.
    * Added the .bat generation for Windows.
-   * Fixed issue with very large one sequence reference.
-   * Fixed some issues with rare chars in fasta files.
+   * Fixed some issues with some uncommon chars in fasta files.
    * Fixed wrappers so bowtie can now be used with symlinks.
 
 Bowtie 2 on GitHub - February 4, 2014
diff --git a/VERSION b/VERSION
index ccbccc3..c043eea 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-2.2.0
+2.2.1
diff --git a/bowtie2 b/bowtie2
index 808f7e9..7f3723f 100755
--- a/bowtie2
+++ b/bowtie2
@@ -30,7 +30,7 @@
 
 use strict;
 use warnings;
-use Getopt::Long qw(GetOptionsFromArray GetOptions);
+use Getopt::Long qw(GetOptions);
 use File::Spec;
 use POSIX;
 
@@ -39,7 +39,8 @@ my ($vol,$script_path,$prog);
 $prog = File::Spec->rel2abs( __FILE__ );
 
 while (-f $prog && -l $prog){
-    $prog = File::Spec->rel2abs(readlink($prog));   
+    my (undef, $dir, undef) = File::Spec->splitpath($prog);
+    $prog = File::Spec->rel2abs(readlink($prog), $dir);
 }
 
 ($vol,$script_path,$prog) 
@@ -237,8 +238,10 @@ my $help = 0;
 my @bt2w_args_cp = (@bt2w_args>0) ? @bt2w_args : @bt2_args;
 Getopt::Long::Configure("pass_through","no_ignore_case");
 
-GetOptionsFromArray(
-    \@bt2w_args_cp,
+my @old_ARGV = @ARGV;
+@ARGV = @bt2w_args_cp;
+
+GetOptions(
 	"1=s"                           => \@mate1s,
 	"2=s"                           => \@mate2s,
 	"reads|U=s"                     => \@unps,
@@ -252,6 +255,7 @@ GetOptionsFromArray(
 	"help|h"                        => \$help
 );
 
+@ARGV = @old_ARGV;
 
 my $old_stderr;
 
@@ -500,9 +504,8 @@ if(defined($cap_out)) {
 			my $unal = ($fl & 4) != 0;
 			$filt = 1 if $no_unal && $unal;
 			if($passthru) {
-				if($filt) {
-					# Next line is read with some whitespace escaped, which we
-					# ignore b/c the record is filtered out by --no-unal
+				if(scalar(keys %read_fhs) == 0) {
+					# Next line is read with some whitespace escaped
 					my $l = <BT>;
 				} else {
 					my $mate1 = (($fl &  64) != 0);
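Note on the symlink hunk above: readlink() returns the target exactly as it is stored in the link, and a relative target is meant to be read relative to the directory that holds the link, not the current working directory. That is what the added `$dir` argument to `rel2abs` achieves. The same rule is sketched below in C++. This is only an illustration: `resolve_link_target` is a hypothetical helper, it assumes POSIX-style `/` separators, and, like `rel2abs`, it does not collapse `..` components.

```cpp
#include <string>

// Hypothetical helper (not part of Bowtie 2): interpret a symlink target
// relative to the directory containing the link, as the patched wrapper does.
// Assumes POSIX-style paths with '/' as the separator.
std::string resolve_link_target(const std::string& link_path,
                                const std::string& target) {
    // An absolute target needs no adjustment.
    if (!target.empty() && target[0] == '/') return target;
    // Keep everything up to and including the last '/' of the link path.
    std::string::size_type slash = link_path.rfind('/');
    std::string dir = (slash == std::string::npos)
                          ? std::string()
                          : link_path.substr(0, slash + 1);
    // ".." components are left in place; the OS resolves them on open.
    return dir + target;
}
```

With a link at /usr/local/bin/bowtie2 pointing to ../share/bowtie2/bowtie2, the helper yields /usr/local/bin/../share/bowtie2/bowtie2 regardless of the directory the wrapper is started from.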
diff --git a/bt2_idx.h b/bt2_idx.h
index 93330e8..030c99f 100644
--- a/bt2_idx.h
+++ b/bt2_idx.h
@@ -494,8 +494,8 @@ public:
 	    _passMemExc(passMemExc), \
 	    _sanity(sanityCheck), \
 	    fw_(fw), \
-	    _in1(MM_FILE_INIT), \
-	    _in2(MM_FILE_INIT), \
+	    _in1(NULL), \
+	    _in2(NULL), \
 	    _zOff(OFF_MASK), \
 	    _zEbwtByteOff(OFF_MASK), \
 	    _zEbwtBpOff(-1), \
@@ -1145,8 +1145,8 @@ public:
 		if(ebwt() != NULL && useShmem_) {
 			FREE_SHARED(ebwt());
 		}
-		MM_FILE_CLOSE(_in1);
-		MM_FILE_CLOSE(_in2);
+		if (_in1 != NULL) fclose(_in1);
+		if (_in2 != NULL) fclose(_in2);
 	}
 
 	/// Accessors
@@ -2367,8 +2367,8 @@ template<typename Operation>
 	bool       _passMemExc;
 	bool       _sanity;
 	bool       fw_;     // true iff this is a forward index
-	MM_FILE    _in1;    // input fd for primary index file
-	MM_FILE    _in2;    // input fd for secondary index file
+	FILE       *_in1;    // input fd for primary index file
+	FILE       *_in2;    // input fd for secondary index file
 	string     _in1Str; // filename for primary index file
 	string     _in2Str; // filename for secondary index file
 	string     _inSaStr;  // filename for suffix-array file
diff --git a/bt2_io.cpp b/bt2_io.cpp
index d8332aa..62b37c9 100644
--- a/bt2_io.cpp
+++ b/bt2_io.cpp
@@ -57,25 +57,6 @@ void Ebwt::readIntoMemory(
 			cerr << "  About to open input files: ";
 			logTime(cerr);
 		}
-#ifdef BOWTIE_MM
-		// Initialize our primary and secondary input-stream fields
-		if(_in1 != -1) close(_in1);
-		if(_verbose || startVerbose) {
-			cerr << "Opening \"" << _in1Str.c_str() << "\"" << endl;
-		}
-		if((_in1 = open(_in1Str.c_str(), O_RDONLY)) < 0) {
-			cerr << "Could not open index file " << _in1Str.c_str() << endl;
-		}
-		if(loadSASamp) {
-			if(_in2 != -1) close(_in2);
-			if(_verbose || startVerbose) {
-				cerr << "Opening \"" << _in2Str.c_str() << "\"" << endl;
-			}
-			if((_in2 = open(_in2Str.c_str(), O_RDONLY)) < 0) {
-				cerr << "Could not open index file " << _in2Str.c_str() << endl;
-			}
-		}
-#else
 		// Initialize our primary and secondary input-stream fields
 		if(_in1 != NULL) fclose(_in1);
 		if(_verbose || startVerbose) cerr << "Opening \"" << _in1Str.c_str() << "\"" << endl;
@@ -89,7 +70,6 @@ void Ebwt::readIntoMemory(
 				cerr << "Could not open index file " << _in2Str.c_str() << endl;
 			}
 		}
-#endif
 		if(_verbose || startVerbose) {
 			cerr << "  Finished opening input files: ";
 			logTime(cerr);
@@ -98,7 +78,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 		if(_useMm /*&& !justHeader*/) {
 			const char *names[] = {_in1Str.c_str(), _in2Str.c_str()};
-			int fds[] = { _in1, _in2 };
+			int fds[] = { fileno(_in1), fileno(_in2) };
 			for(int i = 0; i < (loadSASamp ? 2 : 1); i++) {
 				if(_verbose || startVerbose) {
 					cerr << "  Memory-mapping input file " << (i+1) << ": ";
@@ -267,7 +247,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 		_plen.init((TIndexOffU*)(mmFile[0] + bytesRead), _nPat, false);
 		bytesRead += _nPat*OFF_SIZE;
-		MM_SEEK(_in1, _nPat*OFF_SIZE, SEEK_CUR);
+		fseeko(_in1, _nPat*OFF_SIZE, SEEK_CUR);
 #endif
 	} else {
 		try {
@@ -281,8 +261,8 @@ void Ebwt::readIntoMemory(
 					plen()[i] = readU<TIndexOffU>(_in1, switchEndian);
 				}
 			} else {
-				MM_READ_RET r = MM_READ(_in1, (void*)(plen()), _nPat*OFF_SIZE);
-				if(r != (MM_READ_RET)(_nPat*OFF_SIZE)) {
+				size_t r = MM_READ(_in1, (void*)(plen()), _nPat*OFF_SIZE);
+				if(r != (size_t)(_nPat*OFF_SIZE)) {
 					cerr << "Error reading _plen[] array: " << r << ", " << _nPat*OFF_SIZE << endl;
 					throw 1;
 				}
@@ -315,7 +295,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 			_rstarts.init((TIndexOffU*)(mmFile[0] + bytesRead), _nFrag*3, false);
 			bytesRead += this->_nFrag*OFF_SIZE*3;
-			MM_SEEK(_in1, this->_nFrag*OFF_SIZE*3, SEEK_CUR);
+			fseeko(_in1, this->_nFrag*OFF_SIZE*3, SEEK_CUR);
 #endif
 		} else {
 			_rstarts.init(new TIndexOffU[_nFrag*3], _nFrag*3, true);
@@ -328,8 +308,8 @@ void Ebwt::readIntoMemory(
 					this->rstarts()[i+2] = readU<TIndexOffU>(_in1, switchEndian);
 				}
 			} else {
-				MM_READ_RET r = MM_READ(_in1, (void *)rstarts(), this->_nFrag*OFF_SIZE*3);
-				if(r != (MM_READ_RET)(this->_nFrag*OFF_SIZE*3)) {
+				size_t r = MM_READ(_in1, (void *)rstarts(), this->_nFrag*OFF_SIZE*3);
+				if(r != (size_t)(this->_nFrag*OFF_SIZE*3)) {
 					cerr << "Error reading _rstarts[] array: " << r << ", " << (this->_nFrag*OFF_SIZE*3) << endl;
 					throw 1;
 				}
@@ -339,7 +319,7 @@ void Ebwt::readIntoMemory(
 		// Skip em
 		assert(rstarts() == NULL);
 		bytesRead += this->_nFrag*OFF_SIZE*3;
-		MM_SEEK(_in1, this->_nFrag*OFF_SIZE*3, SEEK_CUR);
+		fseeko(_in1, this->_nFrag*OFF_SIZE*3, SEEK_CUR);
 	}
 	
 	_ebwt.reset();
@@ -347,7 +327,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 		_ebwt.init((uint8_t*)(mmFile[0] + bytesRead), eh->_ebwtTotLen, false);
 		bytesRead += eh->_ebwtTotLen;
-		MM_SEEK(_in1, eh->_ebwtTotLen, SEEK_CUR);
+		fseek(_in1, eh->_ebwtTotLen, SEEK_CUR);
 #endif
 	} else {
 		// Allocate ebwt (big allocation)
@@ -381,7 +361,7 @@ void Ebwt::readIntoMemory(
 			char *pebwt = (char*)this->ebwt();
 
 			while (bytesLeft>0){
-				MM_READ_RET r = MM_READ(_in1, (void *)pebwt, bytesLeft);
+				size_t r = MM_READ(_in1, (void *)pebwt, bytesLeft);
 				if(MM_IS_IO_ERR(_in1,r,bytesLeft)) {
 					cerr << "Error reading _ebwt[] array: " << r << ", "
 						 << bytesLeft << gLastIOErrMsg << endl;
@@ -404,7 +384,7 @@ void Ebwt::readIntoMemory(
 #endif
 		} else {
 			// Seek past the data and wait until master is finished
-			MM_SEEK(_in1, eh->_ebwtTotLen, SEEK_CUR);
+			fseeko(_in1, eh->_ebwtTotLen, SEEK_CUR);
 #ifdef BOWTIE_SHARED_MEM
 			if(useShmem_) WAIT_SHARED(ebwt(), eh->_ebwtTotLen);
 #endif
@@ -424,7 +404,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 			_fchr.init((TIndexOffU*)(mmFile[0] + bytesRead), 5, false);
 			bytesRead += 5*OFF_SIZE;
-			MM_SEEK(_in1, 5*OFF_SIZE, SEEK_CUR);
+			fseek(_in1, 5*OFF_SIZE, SEEK_CUR);
 #endif
 		} else {
 			_fchr.init(new TIndexOffU[5], 5, true);
@@ -450,7 +430,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 				_ftab.init((TIndexOffU*)(mmFile[0] + bytesRead), eh->_ftabLen, false);
 				bytesRead += eh->_ftabLen*OFF_SIZE;
-				MM_SEEK(_in1, eh->_ftabLen*OFF_SIZE, SEEK_CUR);
+				fseeko(_in1, eh->_ftabLen*OFF_SIZE, SEEK_CUR);
 #endif
 			} else {
 				_ftab.init(new TIndexOffU[eh->_ftabLen], eh->_ftabLen, true);
@@ -458,8 +438,8 @@ void Ebwt::readIntoMemory(
 					for(TIndexOffU i = 0; i < eh->_ftabLen; i++)
 						this->ftab()[i] = readU<TIndexOffU>(_in1, switchEndian);
 				} else {
-					MM_READ_RET r = MM_READ(_in1, (void *)ftab(), eh->_ftabLen*OFF_SIZE);
-					if(r != (MM_READ_RET)(eh->_ftabLen*OFF_SIZE)) {
+					size_t r = MM_READ(_in1, (void *)ftab(), eh->_ftabLen*OFF_SIZE);
+					if(r != (size_t)(eh->_ftabLen*OFF_SIZE)) {
 						cerr << "Error reading _ftab[] array: " << r << ", " << (eh->_ftabLen*OFF_SIZE) << endl;
 						throw 1;
 					}
@@ -480,7 +460,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 				_eftab.init((TIndexOffU*)(mmFile[0] + bytesRead), eh->_eftabLen, false);
 				bytesRead += eh->_eftabLen*OFF_SIZE;
-				MM_SEEK(_in1, eh->_eftabLen*OFF_SIZE, SEEK_CUR);
+				fseeko(_in1, eh->_eftabLen*OFF_SIZE, SEEK_CUR);
 #endif
 			} else {
 				_eftab.init(new TIndexOffU[eh->_eftabLen], eh->_eftabLen, true);
@@ -488,8 +468,8 @@ void Ebwt::readIntoMemory(
 					for(TIndexOffU i = 0; i < eh->_eftabLen; i++)
 						this->eftab()[i] = readU<TIndexOffU>(_in1, switchEndian);
 				} else {
-					MM_READ_RET r = MM_READ(_in1, (void *)this->eftab(), eh->_eftabLen*OFF_SIZE);
-					if(r != (MM_READ_RET)(eh->_eftabLen*OFF_SIZE)) {
+					size_t r = MM_READ(_in1, (void *)this->eftab(), eh->_eftabLen*OFF_SIZE);
+					if(r != (size_t)(eh->_eftabLen*OFF_SIZE)) {
 						cerr << "Error reading _eftab[] array: " << r << ", " << (eh->_eftabLen*OFF_SIZE) << endl;
 						throw 1;
 					}
@@ -507,10 +487,10 @@ void Ebwt::readIntoMemory(
 			assert(eftab() == NULL);
 			// Skip ftab
 			bytesRead += eh->_ftabLen*OFF_SIZE;
-			MM_SEEK(_in1, eh->_ftabLen*OFF_SIZE, SEEK_CUR);
+			fseeko(_in1, eh->_ftabLen*OFF_SIZE, SEEK_CUR);
 			// Skip eftab
 			bytesRead += eh->_eftabLen*OFF_SIZE;
-			MM_SEEK(_in1, eh->_eftabLen*OFF_SIZE, SEEK_CUR);
+			fseeko(_in1, eh->_eftabLen*OFF_SIZE, SEEK_CUR);
 		}
 	} catch(bad_alloc& e) {
 		cerr << "Out of memory allocating fchr[], ftab[] or eftab[] arrays for the Bowtie index." << endl
@@ -523,7 +503,7 @@ void Ebwt::readIntoMemory(
 	if(loadNames) {
 		while(true) {
 			char c = '\0';
-			if(MM_READ(_in1, (void *)(&c), (size_t)1) != (MM_READ_RET)1) break;
+			if(MM_READ(_in1, (void *)(&c), (size_t)1) != (size_t)1) break;
 			bytesRead++;
 			if(c == '\0') break;
 			else if(c == '\n') {
@@ -582,8 +562,8 @@ void Ebwt::readIntoMemory(
 					}
 					for(TIndexOffU i = 0; i < offsLen; i += blockMaxSzU) {
 						TIndexOffU block = min<TIndexOffU>(blockMaxSzU, offsLen - i);
-						MM_READ_RET r = MM_READ(_in2, (void *)buf, block << (OFF_SIZE/4 + 1));
-						if(r != (MM_READ_RET)(block << (OFF_SIZE/4 + 1))) {
+						size_t r = MM_READ(_in2, (void *)buf, block << (OFF_SIZE/4 + 1));
+						if(r != (size_t)(block << (OFF_SIZE/4 + 1))) {
 							cerr << "Error reading block of _offs[] array: " << r << ", " << (block << (OFF_SIZE/4 + 1)) << endl;
 							throw 1;
 						}
@@ -603,9 +583,7 @@ void Ebwt::readIntoMemory(
 #ifdef BOWTIE_MM
 						_offs.init((TIndexOffU*)(mmFile[1] + bytesRead), offsLen, false);
 						bytesRead += offsSz;
-						// Argument to lseek can be 64 bits if compiled with
-						// _FILE_OFFSET_BITS
-						MM_SEEK(_in2, offsSz, SEEK_CUR);
+						fseeko(_in2, offsSz, SEEK_CUR);
 #endif
 					} else {
 						// Workaround for small-index mode where MM_READ may
@@ -615,7 +593,7 @@ void Ebwt::readIntoMemory(
 						char *offs = (char *)this->offs();
 
 						while(bytesLeft > 0) {
-							MM_READ_RET r = MM_READ(_in2, (void*)offs, bytesLeft);
+							size_t r = MM_READ(_in2, (void*)offs, bytesLeft);
 							if(MM_IS_IO_ERR(_in2,r,bytesLeft)) {
 								cerr << "Error reading block of _offs[] array: "
 								     << r << ", " << bytesLeft << gLastIOErrMsg << endl;
@@ -631,7 +609,7 @@ void Ebwt::readIntoMemory(
 #endif
 			} else {
 				// Not the shmem leader
-				MM_SEEK(_in2, offsLenSampled*OFF_SIZE, SEEK_CUR);
+				fseeko(_in2, offsLenSampled*OFF_SIZE, SEEK_CUR);
 #ifdef BOWTIE_SHARED_MEM				
 				if(useShmem_) WAIT_SHARED(offs(), offsLenSampled*OFF_SIZE);
 #endif
@@ -650,12 +628,8 @@ done: // Exit hatch for both justHeader and !justHeader
 	
 	// Be kind
 	if(deleteEh) delete eh;
-#ifdef BOWTIE_MM
-	MM_SEEK(_in1, 0, SEEK_SET);
-	MM_SEEK(_in2, 0, SEEK_SET);
-#else
-	rewind(_in1); rewind(_in2);
-#endif
+	rewind(_in1);
+	rewind(_in2);
 }
 
 /**
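Note on the bt2_io.cpp hunks above: most seeks now go through fseeko(), whose offset argument is an off_t, while two hunks (the `_ebwtTotLen` and `5*OFF_SIZE` seeks under BOWTIE_MM) still call fseek(), whose offset is a long and may be only 32 bits wide on some platforms. The sketch below shows the two stdio patterns the patched code relies on: skipping forward with fseeko() and looping over fread() until the requested byte count has arrived. Both helper names are hypothetical and the code is not taken from Bowtie 2. It assumes a POSIX system where fseeko() is declared in <stdio.h>.

```cpp
#include <stdio.h>
#include <sys/types.h>  // off_t

// Hypothetical helpers, not Bowtie 2 code.

// Skip n bytes forward from the current position. The offset is an off_t,
// which is 64 bits wide when the build has large-file support.
bool skip_bytes(FILE* f, off_t n) {
    return fseeko(f, n, SEEK_CUR) == 0;
}

// Read exactly count bytes unless EOF or an error intervenes. fread() may
// return fewer bytes than requested, so keep reading until done.
size_t read_fully(FILE* f, void* dest, size_t count) {
    char* p = static_cast<char*>(dest);
    size_t done = 0;
    while (done < count) {
        size_t r = fread(p + done, 1, count - done, f);
        if (r == 0) break;  // EOF or error; the caller can check feof()/ferror()
        done += r;
    }
    return done;  // number of bytes actually read
}
```

A caller compares the return value of read_fully() with the requested size, which mirrors the `r != (size_t)(...)` checks in the diff.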
diff --git a/bt2_search.cpp b/bt2_search.cpp
index 5aa8684..e8d1ee5 100644
--- a/bt2_search.cpp
+++ b/bt2_search.cpp
@@ -415,7 +415,7 @@ static void resetOptions() {
 	mapqv = 2;               // MAPQ calculation version
 	tighten = 3;             // -M tightening mode
 	doExactUpFront = true;   // do exact search up front if seeds seem good enough
-	do1mmUpFront = true;     // do 1mm search up front if seeds seem good enough
+	do1mmUpFront = true;    // do 1mm search up front if seeds seem good enough
 	seedBoostThresh = 300;   // if average non-zero position has more than this many elements
 	nSeedRounds = 2;         // # rounds of seed searches to do for repetitive reads
 	do1mmMinLen = 60;        // length below which we disable 1mm search
@@ -722,7 +722,9 @@ static void printUsage(ostream& out) {
 		<< "  --ignore-quals     treat all quality values as 30 on Phred scale (off)" << endl
 	    << "  --nofw             do not align forward (original) version of read (off)" << endl
 	    << "  --norc             do not align reverse-complement version of read (off)" << endl
-		<< endl
+	    << "  --no-1mm-upfront   do not allow 1 mismatch alignments before attempting to" << endl
+	    << "                     scan for the optimal seeded alignments"
+	    << endl
 		<< "  --end-to-end       entire read must align; no clipping (on)" << endl
 		<< "   OR" << endl
 		<< "  --local            local alignment; ends might be soft clipped (off)" << endl
diff --git a/doc/manual.html b/doc/manual.html
index 9d5aee2..5f6a3f6 100644
--- a/doc/manual.html
+++ b/doc/manual.html
@@ -135,7 +135,7 @@
 <h1 id="obtaining-bowtie-2"><a href="#TOC">Obtaining Bowtie 2</a></h1>
 <p>Download Bowtie 2 sources and binaries from the <a href="https://sourceforge.net/projects/bowtie-bio/files/bowtie2/">Download</a> section of the Sourceforge site. Binaries are available for the Intel <code>x86_64</code> architecture running Linux, Mac OS X, and Windows. If you plan to compile Bowtie 2 yourself, make sure to get the source package, i.e., the filename that ends in "-source.zip".</p>
 <h2 id="building-from-source"><a href="#TOC">Building from source</a></h2>
-<p>Building Bowtie 2 from source requires a GNU-like environment with GCC, GNU Make and other basics. It should be possible to build Bowtie 2 on most vanilla Linux installations or on a Mac installation with <a href="http://developer.apple.com/xcode/">Xcode</a> installed. Bowtie 2 can also be built on Windows using <a href="http://www.cygwin.com/">Cygwin</a> or <a href="http://www.mingw.org/">MinGW</a> (MinGW recommended). For a MinGW build the choice of what compiler is to be used is im [...]
+<p>Building Bowtie 2 from source requires a GNU-like environment with GCC, GNU Make and other basics. It should be possible to build Bowtie 2 on most vanilla Linux installations or on a Mac installation with <a href="http://developer.apple.com/xcode/">Xcode</a> installed. Bowtie 2 can also be built on Windows using a 64-bit MinGW distribution and MSYS. In order to simplify the MinGW setup it might be worth investigating popular MinGW personal builds since these are coming already prepare [...]
 <p>First, download the source package from the <a href="https://sourceforge.net/projects/bowtie-bio/files/bowtie2/">sourceforge site</a>. Make sure you're getting the source package; the file downloaded should end in <code>-source.zip</code>. Unzip the file, change to the unzipped directory, and build the Bowtie 2 tools by running GNU <code>make</code> (usually with the command <code>make</code>, but sometimes with <code>gmake</code>) with no arguments. If building with MinGW, run <code> [...]
 <p>Bowtie 2 is using the multithreading software model in order to speed up execution times on SMP architectures where this is possible. On POSIX platforms (like linux, Mac OS, etc) it needs the pthread library. Although it is possible to use pthread library on non-POSIX platform like Windows, due to performance reasons bowtie 2 will try to use Windows native multithreading if possible.</p>
 <h2 id="adding-to-path"><a href="#TOC">Adding to PATH</a></h2>
@@ -648,7 +648,15 @@ Seed 4 rc:                   TTATGCATGA</code></pre>
 
 <p>If <code>--nofw</code> is specified, <code>bowtie2</code> will not attempt to align unpaired reads to the forward (Watson) reference strand. If <code>--norc</code> is specified, <code>bowtie2</code> will not attempt to align unpaired reads against the reverse-complement (Crick) reference strand. In paired-end mode, <code>--nofw</code> and <code>--norc</code> pertain to the fragments; i.e. specifying <code>--nofw</code> causes <code>bowtie2</code> to explore only those paired-end confi [...]
 </td></tr>
-<tr><td id="bowtie2-options-end-to-end">
+<tr><td id="bowtie2-options-no-1mm-upfront">
+
+
+
+<pre><code>--no-1mm-upfront</code></pre>
+</td><td>
+
+<p>By default, Bowtie 2 will attempt to find either an exact or a 1-mismatch end-to-end alignment for the read <em>before</em> trying the <a href="#multiseed-heuristic">multiseed heuristic</a>. Such alignments can be found very quickly, and many short read alignments have exact or near-exact end-to-end alignments. However, this can lead to unexpected alignments when the user also sets options governing the <a href="#multiseed-heuristic">multiseed heuristic</a>, like <a href="#bowtie2-opt [...]
+</td></tr><tr><td id="bowtie2-options-end-to-end">
 
 
 
diff --git a/mm.h b/mm.h
index a0d9301..c8afd39 100644
--- a/mm.h
+++ b/mm.h
@@ -29,22 +29,7 @@
  * and where there isn't POSIX I/O,
  */
 
-#ifdef BOWTIE_MM
-#define MM_FILE_CLOSE(x) if(x > 3) { close(x); }
-#define MM_READ_RET ssize_t
-#define MM_READ read
-#define MM_SEEK lseek
-#define MM_FILE int
-#define MM_FILE_INIT -1
-#define MM_IS_IO_ERR(fdesc, ret, count) is_read_err(fdesc, ret, count)
-#else
-#define MM_FILE_CLOSE(x) if(x != NULL) { fclose(x); }
-#define MM_READ_RET size_t
 #define MM_READ(file, dest, sz) fread(dest, 1, sz, file)
-#define MM_SEEK fseek
-#define MM_FILE FILE*
-#define MM_FILE_INIT NULL
 #define MM_IS_IO_ERR(file_hd, ret, count) is_fread_err(file_hd, ret, count)
-#endif
 
 #endif /* MM_H_ */
diff --git a/ref_read.h b/ref_read.h
index a387737..de3de3b 100644
--- a/ref_read.h
+++ b/ref_read.h
@@ -91,19 +91,6 @@ struct RefRecord {
 		first = fgetc(in) ? true : false;
 	}
 
-#ifdef BOWTIE_MM
-	RefRecord(int in, bool swap) {
-		off = readU<TIndexOffU>(in, swap);
-		len = readU<TIndexOffU>(in, swap);
-		char c;
-		if(!read(in, &c, 1)) {
-			cerr << "Error reading RefRecord 'first' flag" << endl;
-			throw 1;
-		}
-		first = (c ? true : false);
-	}
-#endif
-
 	void write(std::ostream& out, bool be) {
 		writeU<TIndexOffU>(out, off, be);
 		writeU<TIndexOffU>(out, len, be);
diff --git a/reference.cpp b/reference.cpp
index 06c76e0..6b8f215 100644
--- a/reference.cpp
+++ b/reference.cpp
@@ -50,22 +50,22 @@ BitPairReference::BitPairReference(
 	string s3 = in + ".3." + gEbwt_ext;
 	string s4 = in + ".4." + gEbwt_ext;
 	
-#ifdef BOWTIE_MM
-	int f3, f4;
-	if((f3 = open(s3.c_str(), O_RDONLY)) < 0) {
-		cerr << "Could not open reference-string index file " << s3.c_str() << " for reading." << endl;
+	FILE *f3, *f4;
+	if((f3 = fopen(s3.c_str(), "rb")) == NULL) {
+	    cerr << "Could not open reference-string index file " << s3 << " for reading." << endl;
 		cerr << "This is most likely because your index was built with an older version" << endl
 		<< "(<= 0.9.8.1) of bowtie-build.  Please re-run bowtie-build to generate a new" << endl
 		<< "index (or download one from the Bowtie website) and try again." << endl;
 		loaded_ = false;
 		return;
 	}
-	if((f4 = open(s4.c_str(), O_RDONLY)) < 0) {
-		cerr << "Could not open reference-string index file " << s4.c_str() << " for reading." << endl;
+    if((f4 = fopen(s4.c_str(), "rb"))  == NULL) {
+        cerr << "Could not open reference-string index file " << s4 << " for reading." << endl;
 		loaded_ = false;
 		return;
 	}
-	char *mmFile = NULL;
+#ifdef BOWTIE_MM
+    char *mmFile = NULL;
 	if(useMm_) {
 		if(verbose_ || startVerbose) {
 			cerr << "  Memory-mapping reference index file " << s4.c_str() << ": ";
@@ -78,7 +78,7 @@ BitPairReference::BitPairReference(
 			throw 1;
 		}
 		mmFile = (char*)mmap((void *)0, (size_t)sbuf.st_size,
-							 PROT_READ, MAP_SHARED, f4, 0);
+				     PROT_READ, MAP_SHARED, fileno(f4), 0);
 		if(mmFile == (void *)(-1) || mmFile == NULL) {
 			perror("mmap");
 			cerr << "Error: Could not memory-map the index file " << s4.c_str() << endl;
@@ -95,21 +95,6 @@ BitPairReference::BitPairReference(
 			}
 		}
 	}
-#else
-	FILE *f3, *f4;
-	if((f3 = fopen(s3.c_str(), "rb")) == NULL) {
-		cerr << "Could not open reference-string index file " << s3 << " for reading." << endl;
-		cerr << "This is most likely because your index was built with an older version" << endl
-		<< "(<= 0.9.8.1) of bowtie-build.  Please re-run bowtie-build to generate a new" << endl
-		<< "index (or download one from the Bowtie website) and try again." << endl;
-		loaded_ = false;
-		return;
-	}
-	if((f4 = fopen(s4.c_str(), "rb"))  == NULL) {
-		cerr << "Could not open reference-string index file " << s4 << " for reading." << endl;
-		loaded_ = false;
-		return;
-	}
 #endif
 	
 	// Read endianness sentinel, set 'swap'
@@ -179,7 +164,7 @@ BitPairReference::BitPairReference(
 	bufSz_ = cumsz;
 	assert_eq(nrefs_, refLens_.size());
 	assert_eq(sz, recs_.size());
-	MM_FILE_CLOSE(f3); // done with .3.gEbwt_ext file
+	if (f3 != NULL) fclose(f3); // done with .3.gEbwt_ext file
 	// Round cumsz up to nearest byte boundary
 	if((cumsz & 3) != 0) {
 		cumsz += (4 - (cumsz & 3));
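Note on the reference.cpp hunks above: the index files are now always opened with fopen(), and the memory-mapping path obtains the descriptor it needs from the stream with fileno(). A minimal sketch of that pattern follows, assuming a POSIX system. `map_stream` is a hypothetical name and the error handling is reduced to returning NULL; it does not reproduce the Bowtie 2 code.

```cpp
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

// Hypothetical helper, not Bowtie 2 code; POSIX only.
// Map the file behind an already opened stdio stream read-only.
// Returns NULL on failure; on success stores the mapped length in *len_out.
const char* map_stream(FILE* f, size_t* len_out) {
    int fd = fileno(f);  // descriptor underlying the FILE*
    struct stat sb;
    if (fd < 0 || fstat(fd, &sb) != 0 || sb.st_size == 0) return NULL;
    void* p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return NULL;
    *len_out = (size_t)sb.st_size;
    return static_cast<const char*>(p);
}
```

The caller remains responsible for munmap() on the returned region and fclose() on the stream. Keeping a single FILE* per file is what lets the memory-mapped and non-memory-mapped builds share the same open and close code.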
diff --git a/word_io.h b/word_io.h
index 48fb57b..dfb59ff 100644
--- a/word_io.h
+++ b/word_io.h
@@ -101,42 +101,6 @@ static inline T readU(std::istream& in, bool swap) {
 	}
 }
 
-/**
- * Read a 32/64 bit unsigned from a file descriptor, optionally inverting
- * endianness.
- */
-#ifdef BOWTIE_MM
-//template <typename T>
-//static inline T readU(int in, bool swap) {
-//	T x;
-//	if(read(in, (void *)&x, OFF_SIZE) != OFF_SIZE) {
-//		assert(false);
-//	}
-//	if(swap) {
-//		return endianSwapU(x);
-//	} else {
-//		return x;
-//	}
-//}
-template <typename T>
-static inline T readU(int in, bool swap) {
-	T x;
-	if(read(in, (void *)&x, sizeof(T)) != sizeof(T)) {
-		assert(false);
-	}
-	if(swap) {
-		if(sizeof(T) == 4) {
-			return endianSwapU32(x);
-		} else if(sizeof(T) == 8) {
-			return endianSwapU64(x);
-		} else {
-			assert(false);
-		}
-	} else {
-		return x;
-	}
-}
-#endif
 
 /**
  * Read a 32/64 bit unsigned from a FILE*, optionally inverting
@@ -207,42 +171,6 @@ static inline T readI(std::istream& in, bool swap) {
 	}
 }
 
-/**
- * Read a 32/64 bit unsigned from a file descriptor, optionally inverting
- * endianness.
- */
-#ifdef BOWTIE_MM
-//template <typename T>
-//static inline T readI(int in, bool swap) {
-//	T x;
-//	if(read(in, (void *)&x, OFF_SIZE) != OFF_SIZE) {
-//		assert(false);
-//	}
-//	if(swap) {
-//		return endianSwapI(x);
-//	} else {
-//		return x;
-//	}
-//}
-template <typename T>
-static inline T readI(int in, bool swap) {
-	T x;
-	if(read(in, (void *)&x, sizeof(T)) != sizeof(T)) {
-		assert(false);
-	}
-	if(swap) {
-		if(sizeof(T) == 4) {
-			return endianSwapI32(x);
-		} else if(sizeof(T) == 8) {
-			return endianSwapI64(x);
-		} else {
-			assert(false);
-		}
-	} else {
-		return x;
-	}
-}
-#endif
 
 /**
  * Read a 32/64 bit unsigned from a FILE*, optionally inverting

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/bowtie2.git


