[med-svn] [bedtools] 03/04: Sync with upstream repo at commit 6bf23c

Charles Plessy plessy at moszumanska.debian.org
Mon Nov 7 13:06:43 UTC 2016


This is an automated email from the git hooks/post-receive script.

plessy pushed a commit to branch master
in repository bedtools.

commit 1d31cd1fc318aa28039c66dd32011861ec2bb445
Author: Charles Plessy <plessy at debian.org>
Date:   Sat Nov 5 21:45:39 2016 +0900

    Sync with upstream repo at commit 6bf23c
    
    This patch contains bug fixes for upstream issues #429, #418, and #424.
    In particular, it repairs the groupby command, which was completely
    broken.  Cherry-picking a single upstream commit did not produce a
    buildable source tree, so this large patch was the simplest alternative.
    
    Closes: #831833
---
 debian/patches/series                    |    1 +
 debian/patches/v2.26.0-19-g6bf23c4.patch | 6711 ++++++++++++++++++++++++++++++
 2 files changed, 6712 insertions(+)
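
For readers who do not use the tool: groupby summarizes one column of a
tab-delimited file for every run of rows that share the same key columns,
so a typical call looks like the line below (the file name and column
numbers are only illustrative, not taken from this patch):

    bedtools groupby -i scores.bed -g 1,2,3 -c 5 -o mean

Calls of this shape are what the regression described above broke; the
synced patch below restores them.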

diff --git a/debian/patches/series b/debian/patches/series
index e0b66fe..84290f3 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -2,3 +2,4 @@ gzstream.h.patch
 fix_test_script.patch
 remove_barski_binding_site.png.patch
 reproducible_build.patch
+v2.26.0-19-g6bf23c4.patch
diff --git a/debian/patches/v2.26.0-19-g6bf23c4.patch b/debian/patches/v2.26.0-19-g6bf23c4.patch
new file mode 100644
index 0000000..150225f
--- /dev/null
+++ b/debian/patches/v2.26.0-19-g6bf23c4.patch
@@ -0,0 +1,6711 @@
+Author: Upstream
+Bug-Debian: https://bugs.debian.org/831833
+Description: Sync with upstream repo at commit 6bf23c
+ This patch contains bug fixes to upstream issues #429, #418 and #424.
+ In particular, it repairs the groupby command, which was completely
+ broken.  Cherry-picking a single commit did not result in a buildable
+ source, and this big patch was the easiest alternative.
+diff --git a/Makefile b/Makefile
+index 91cbbd5..80c5656 100644
+--- a/Makefile
++++ b/Makefile
+@@ -18,7 +18,7 @@ export SRC_DIR	= src
+ export UTIL_DIR	= src/utils
+ export CXX		= g++
+ ifeq ($(DEBUG),1)
+-export CXXFLAGS = -Wall -O0 -g -fno-inline -fkeep-inline-functions -D_FILE_OFFSET_BITS=64 -fPIC -DDEBUG -D_DEBUG
++export CXXFLAGS = -Wall -Wextra -DDEBUG -D_DEBUG -g -O0 -D_FILE_OFFSET_BITS=64 -fPIC $(INCLUDES)
+ else
+ export CXXFLAGS = -Wall -O2 -D_FILE_OFFSET_BITS=64 -fPIC $(INCLUDES)
+ endif
+diff --git a/docs/content/history.rst b/docs/content/history.rst
+index c59e84e..ac04a02 100644
+--- a/docs/content/history.rst
++++ b/docs/content/history.rst
+@@ -4,23 +4,24 @@ Release History
+ 
+ Version 2.26.0 (7-July-2016)
+ ============================
+-1. Fixed a major memory leak when using ``-sorted``. Thanks to Emily Tsang and Steohen Montgomery.
++1. Fixed a major memory leak when using ``-sorted``. Thanks to Emily Tsang and Stephen Montgomery.
+ 2. Fixed a bug for BED files containing a single record with no newline. Thanks to @jmarshall.
+-3. The ``getfasta`` tool includes name, chromosome and position in fasta headers when the ``-name`` option is used. Thanks to @rishavray.
+-4. Fixed a bug that now forces the ``coverage`` tool to process every record in the ``-a`` file.
+-5. Fixed a bug preventing proper processing of BED files with consecutive tabs.
+-6. VCF files containing structural variants now infer SV length from either the SVLEN or END INFO fields. Thanks to Zev Kronenberg.
+-7. Resolve off by one bugs when intersecting GFF or VCF files with BED files.
+-8. The ``shuffle`` tool now uses roulette wheel sampling to shuffle to ``-incl`` regions based upon the size of the interval. Thanks to Zev Kronenberg and Michael Imbeault.
+-9. Fixed a bug in ``coverage`` that prevented correct calculation of depth when using the ``-split`` option.
+-10. The ``shuffle`` tool warns when an interval exceeds the maximum chromosome length.
+-11. The ``complement`` tool better checks intervals against the chromosome lengths.
+-12. Fixes for ``stddev``, ``min``, and ``max`` operations. Thanks to @jmarshall.
+-13. Enabled ``stdev``, ``sstdev``, ``freqasc``, and ``freqdesc`` options for ``groupby``.
+-14. Allow ``-s`` and ``-w`` to be used in any order for ``makewindows``.
+-15. Added new ``-bedOut`` option to ``getfasta``.
+-16. The ``-r`` option forces the ``-F`` value for ``intersect``.
+-17. Add ``-pc`` option to the ``genomecov`` tool, allowing coverage to be calculated based upon paired-end fragments.
++3. Fixed a bug in the contingency table values for the ``fisher`` tool.
++4. The ``getfasta`` tool includes name, chromosome and position in fasta headers when the ``-name`` option is used. Thanks to @rishavray.
++5. Fixed a bug that now forces the ``coverage`` tool to process every record in the ``-a`` file.
++6. Fixed a bug preventing proper processing of BED files with consecutive tabs.
++7. VCF files containing structural variants now infer SV length from either the SVLEN or END INFO fields. Thanks to Zev Kronenberg.
++8. Resolve off by one bugs when intersecting GFF or VCF files with BED files.
++9. The ``shuffle`` tool now uses roulette wheel sampling to shuffle to ``-incl`` regions based upon the size of the interval. Thanks to Zev Kronenberg and Michael Imbeault.
++10. Fixed a bug in ``coverage`` that prevented correct calculation of depth when using the ``-split`` option.
++11. The ``shuffle`` tool warns when an interval exceeds the maximum chromosome length.
++12. The ``complement`` tool better checks intervals against the chromosome lengths.
++13. Fixes for ``stddev``, ``min``, and ``max`` operations. Thanks to @jmarshall.
++14. Enabled ``stdev``, ``sstdev``, ``freqasc``, and ``freqdesc`` options for ``groupby``.
++15. Allow ``-s`` and ``-w`` to be used in any order for ``makewindows``.
++16. Added new ``-bedOut`` option to ``getfasta``.
++17. The ``-r`` option forces the ``-F`` value for ``intersect``.
++18. Add ``-pc`` option to the ``genomecov`` tool, allowing coverage to be calculated based upon paired-end fragments.
+ 
+ 
+ Version 2.25.0 (3-Sept-2015)
+diff --git a/docs/index.rst b/docs/index.rst
+index 2d67581..67cdd91 100755
+--- a/docs/index.rst
++++ b/docs/index.rst
+@@ -11,6 +11,7 @@ genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool
+ *intersect* two interval files), quite sophisticated analyses can be conducted
+ by combining multiple bedtools operations on the UNIX command line.
+ 
++**bedtools** is developed in the `Quinlan laboratory <http://quinlanlab.org>`_ at the `University of Utah <http://www.utah.edu/>`_ and benefits from fantastic contributions made by scientists worldwide.
+ 
+ ==========================
+ Tutorial
+diff --git a/docs/templates/sidebar-intro.html b/docs/templates/sidebar-intro.html
+index 262da46..dc430e7 100644
+--- a/docs/templates/sidebar-intro.html
++++ b/docs/templates/sidebar-intro.html
+@@ -8,7 +8,7 @@
+   <li><a target="_blank" href="https://bedtools.googlecode.com">Old Releases @ Google Code</a></li>
+   <li><a target="_blank" href="http://groups.google.com/group/bedtools-discuss">Mailing list @ Google Groups</a></li>
+   <li><a target="_blank" href="http://www.biostars.org/show/tag/bedtools/">Queries @ Biostar</a></li>
+-  <li><a target="_blank" href="http://quinlanlab.org">Quinlan lab @ UVa</a></li>
++  <li><a target="_blank" href="http://quinlanlab.org">Quinlan lab @ UU</a></li>
+ 
+ </ul>
+ 
+diff --git a/src/bedtools.cpp b/src/bedtools.cpp
+index 088ea70..b03b072 100644
+--- a/src/bedtools.cpp
++++ b/src/bedtools.cpp
+@@ -34,8 +34,8 @@ using namespace std;
+ // define our parameter checking macro
+ #define PARAMETER_CHECK(param, paramLen, actualLen) (strncmp(argv[i], param, min(actualLen, paramLen))== 0) && (actualLen == paramLen)
+ 
+-bool sub_main(const QuickString &subCmd);
+-void showHelp(const QuickString &subCmd);
++bool sub_main(const string &subCmd);
++void showHelp(const string &subCmd);
+ 
+ int annotate_main(int argc, char* argv[]);//
+ int bamtobed_main(int argc, char* argv[]);//
+@@ -92,7 +92,7 @@ int main(int argc, char *argv[])
+     // make sure the user at least entered a sub_command
+     if (argc < 2) return bedtools_help();
+ 
+-    QuickString subCmd(argv[1]);
++    string subCmd(argv[1]);
+     BedtoolsDriver btDriver;
+     if (btDriver.supports(subCmd)) {
+ 
+@@ -190,8 +190,13 @@ int main(int argc, char *argv[])
+ 
+ int bedtools_help(void)
+ {
+-    cout  << PROGRAM_NAME  << ": flexible tools for genome arithmetic and DNA sequence analysis.\n";
+-    cout << "usage:    bedtools <subcommand> [options]" << endl << endl;
++    cout  << PROGRAM_NAME  << " is a powerful toolset for genome arithmetic." << endl << endl;
++    cout << "Version:   " << VERSION << endl;
++    cout << "About:     developed in the quinlanlab.org and by many contributors worldwide." << endl;
++    cout << "Docs:      http://bedtools.readthedocs.io/" << endl;
++    cout << "Code:      https://github.com/arq5x/bedtools2" << endl;
++    cout << "Mail:      https://groups.google.com/forum/#!forum/bedtools-discuss" << endl << endl;
++    cout << "Usage:     bedtools <subcommand> [options]" << endl << endl;
+ 
+     cout  << "The bedtools sub-commands include:" << endl;
+     
+@@ -287,7 +292,7 @@ int bedtools_faq(void)
+     return 0;
+ }
+ 
+-void showHelp(const QuickString &subCmd) {
++void showHelp(const string &subCmd) {
+ 	if (subCmd == "intersect") {
+ 		intersect_help();
+ 	} else if (subCmd == "map") {
+diff --git a/src/complementFile/complementFile.cpp b/src/complementFile/complementFile.cpp
+index 803b7c5..5d3b384 100644
+--- a/src/complementFile/complementFile.cpp
++++ b/src/complementFile/complementFile.cpp
+@@ -38,7 +38,7 @@ void ComplementFile::processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hi
+ 	const Record *rec = hits.getKey();
+ 
+ 	//test for chrom change.
+-	const QuickString &newChrom = rec->getChrName();
++	const string &newChrom = rec->getChrName();
+ 	if (_currChrom != newChrom) {
+ 
+ 		outPutLastRecordInPrevChrom();
+@@ -95,7 +95,7 @@ void ComplementFile::giveFinalReport(RecordOutputMgr *outputMgr) {
+ 
+ void ComplementFile::outPutLastRecordInPrevChrom()
+ {
+-	const QuickString &chrom = _outRecord.getChrName();
++	const string &chrom = _outRecord.getChrName();
+ 
+ 	//do nothing if triggered by first record in DB. At this point,
+ 	//there was no prev chrom, so nothing is stored in the output Record yet.
+@@ -106,7 +106,7 @@ void ComplementFile::outPutLastRecordInPrevChrom()
+ 	printRecord(maxChromSize);
+ }
+ 
+-bool ComplementFile::fastForward(const QuickString &newChrom) {
++bool ComplementFile::fastForward(const string &newChrom) {
+ 	if (!newChrom.empty() && !_genomeFile->hasChrom(newChrom)) return false;
+ 
+ 	int i= _currPosInGenomeList +1;
+@@ -133,14 +133,14 @@ bool ComplementFile::fastForward(const QuickString &newChrom) {
+ void ComplementFile::printRecord(int endPos)
+ {
+ 	_outRecord.setStartPos(_currStartPos);
+-	QuickString startStr;
+-	startStr.append(_currStartPos);
+-	_outRecord.setStartPosStr(startStr);
++	stringstream startStr;
++	startStr << _currStartPos;
++	_outRecord.setStartPosStr(startStr.str());
+ 
+ 	_outRecord.setEndPos(endPos);
+-	QuickString endStr;
+-	endStr.append(endPos);
+-	_outRecord.setEndPosStr(endStr);
++	stringstream endStr;
++	endStr << endPos;
++	_outRecord.setEndPosStr(endStr.str());
+ 
+ 	_outputMgr->printRecord(&_outRecord);
+ 	_outputMgr->newline();
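
The printRecord() hunk above is representative of the QuickString-to-std::string
migration running through this sync: integer coordinates formerly formatted with
QuickString::append(int) are now pushed through a stringstream. A minimal
standalone sketch of that pattern (the helper name is illustrative, not part of
the patch):

    #include <iostream>
    #include <sstream>
    #include <string>

    // Format an integer coordinate as decimal text, as the patched
    // printRecord() does with startStr and endStr.
    static std::string coordToString(int pos) {
        std::stringstream s;
        s << pos;                 // replaces QuickString::append(int)
        return s.str();
    }

    int main() {
        std::cout << coordToString(123456) << "\n";   // prints 123456
        return 0;
    }
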
+diff --git a/src/complementFile/complementFile.h b/src/complementFile/complementFile.h
+index 3382dbd..a90b6c2 100644
+--- a/src/complementFile/complementFile.h
++++ b/src/complementFile/complementFile.h
+@@ -34,17 +34,17 @@ public:
+ protected:
+ 	FileRecordMergeMgr *_frm;
+ 	Bed3Interval _outRecord;
+-	QuickString _currChrom;
++	string _currChrom;
+ 	const NewGenomeFile *_genomeFile;
+ 	int _currStartPos;
+ 	RecordOutputMgr *_outputMgr;
+-	const vector<QuickString> &_chromList;
++	const vector<string> &_chromList;
+ 	int _currPosInGenomeList;
+ 
+ 	virtual ContextComplement *upCast(ContextBase *context) { return static_cast<ContextComplement *>(context); }
+ 
+ 	void outPutLastRecordInPrevChrom();
+-	bool fastForward(const QuickString &newChrom);
++	bool fastForward(const string &newChrom);
+ 	void printRecord(int endPos);
+ 
+ };
+diff --git a/src/coverageFile/coverageFile.cpp b/src/coverageFile/coverageFile.cpp
+index 9473eeb..b01eda4 100644
+--- a/src/coverageFile/coverageFile.cpp
++++ b/src/coverageFile/coverageFile.cpp
+@@ -6,6 +6,7 @@
+  */
+ 
+ #include "coverageFile.h"
++#include <iomanip>
+ 
+ CoverageFile::CoverageFile(ContextCoverage *context)
+ : IntersectFile(context),
+@@ -13,6 +14,7 @@ CoverageFile::CoverageFile(ContextCoverage *context)
+  _depthArrayCapacity(0),
+  _queryLen(0),
+  _totalQueryLen(0),
++ _hitCount(0),
+  _queryOffset(0),
+  _floatValBuf(NULL)
+ {
+@@ -34,40 +36,38 @@ CoverageFile::~CoverageFile() {
+ }
+ 
+ 
+-void CoverageFile::processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits) {
+-	   makeDepthCount(hits);
+-	   _finalOutput.clear();
+-
+-	   switch(upCast(_context)->getCoverageType()) {
+-	   case ContextCoverage::COUNT:
+-		   doCounts(outputMgr, hits);
+-		   break;
+-
+-	   case ContextCoverage::PER_BASE:
+-		   doPerBase(outputMgr, hits);
+-		   break;
+-
+-	   case ContextCoverage::MEAN:
+-		   doMean(outputMgr, hits);
+-		   break;
+-
+-	   case ContextCoverage::HIST:
+-		   doHist(outputMgr, hits);
+-		   break;
+-
+-	   case ContextCoverage::DEFAULT:
+-	   default:
+-		   doDefault(outputMgr, hits);
+-		   break;
+-
+-	   }
+-
++void CoverageFile::processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits) 
++{
++	makeDepthCount(hits);
++	_finalOutput.clear();
++
++	switch(upCast(_context)->getCoverageType()) {
++	case ContextCoverage::COUNT:
++	   doCounts(outputMgr, hits);
++	   break;
++
++	case ContextCoverage::PER_BASE:
++	   doPerBase(outputMgr, hits);
++	   break;
++
++	case ContextCoverage::MEAN:
++	   doMean(outputMgr, hits);
++	   break;
++
++	case ContextCoverage::HIST:
++	   doHist(outputMgr, hits);
++	   break;
++
++	case ContextCoverage::DEFAULT:
++	default:
++	   doDefault(outputMgr, hits);
++	   break;
++	}
+ }
+ 
+ void CoverageFile::cleanupHits(RecordKeyVector &hits) {
+ 	IntersectFile::cleanupHits(hits);
+ 	memset(_depthArray, 0, sizeof(size_t) * _queryLen);
+-
+ }
+ 
+ void CoverageFile::giveFinalReport(RecordOutputMgr *outputMgr) {
+@@ -77,19 +77,25 @@ void CoverageFile::giveFinalReport(RecordOutputMgr *outputMgr) {
+ 		return;
+ 	}
+ 
++	
+ 	for (depthMapType::iterator iter = _finalDepthMap.begin(); iter != _finalDepthMap.end(); iter++) {
+ 		size_t depth = iter->first;
+ 		size_t basesAtDepth = iter->second;
++		//cout << "x\n";
+ 		float depthPct = (float)basesAtDepth / (float)_totalQueryLen;
+-
+-		_finalOutput = "all\t";
+-		_finalOutput.append(static_cast<uint32_t>(depth));
+-		_finalOutput.append("\t");
+-		_finalOutput.append(static_cast<uint32_t>(basesAtDepth));
+-		_finalOutput.append("\t");
+-		_finalOutput.append(static_cast<uint32_t>(_totalQueryLen));
+-		_finalOutput.append("\t");
+-		format(depthPct);
++		//cout << "y\n";
++		ostringstream s;
++		s << "all\t";
++		s << depth;
++		s << "\t";
++		s << basesAtDepth;
++		s << "\t";
++		s << _totalQueryLen;
++		s << "\t";
++		char *depthPctString;
++		asprintf(&depthPctString, "%0.7f", depthPct);
++		s << depthPctString;
++		_finalOutput = s.str();
+ 
+ 		outputMgr->printRecord(NULL, _finalOutput);
+ 	}
+@@ -101,24 +107,57 @@ void CoverageFile::makeDepthCount(RecordKeyVector &hits) {
+ 	_queryLen = (size_t)(key->getEndPos() - _queryOffset);
+ 	_totalQueryLen += _queryLen;
+ 
+-	//resize depth array if needed
++	// resize depth array if needed
+ 	if (_depthArrayCapacity < _queryLen) {
+ 		_depthArray = (size_t*)realloc(_depthArray, sizeof(size_t) * _queryLen);
+ 		_depthArrayCapacity = _queryLen;
+ 		memset(_depthArray, 0, sizeof(size_t) * _depthArrayCapacity);
+ 	}
+-
+-	//loop through hits, which may not be in sorted order, due to
+-	//potential multiple databases, and increment the depth array as needed.
+-	for (RecordKeyVector::const_iterator_type iter = hits.begin(); iter != hits.end(); iter = hits.next()) {
+-		const Record *dbRec = *iter;
+-		int dbStart = dbRec->getStartPos();
+-		int dbEnd = dbRec->getEndPos();
+-		int maxStart = max(_queryOffset, dbStart);
+-		int minEnd = min(dbEnd, key->getEndPos());
+-
+-		for (int i=maxStart; i < minEnd; i++) {
+-			_depthArray[i - _queryOffset]++;
++	_hitCount = 0;
++	// no -split
++	if (!(_context)->getObeySplits())
++	{
++		//loop through hits, which may not be in sorted order, due to
++		//potential multiple databases, and increment the depth array as needed.
++		for (RecordKeyVector::iterator_type iter = hits.begin(); iter != hits.end(); iter = hits.next()) 
++		{
++			const Record *dbRec = *iter;
++			int dbStart = dbRec->getStartPos();
++			int dbEnd = dbRec->getEndPos();
++			int maxStart = max(_queryOffset, dbStart);
++			int minEnd = min(dbEnd, key->getEndPos());
++
++			for (int i=maxStart; i < minEnd; i++) {
++				_depthArray[i - _queryOffset]++;
++			}
++			_hitCount++;
++		}
++	}
++	// -split
++	else
++	{
++		for (RecordKeyVector::iterator_type iter = hits.begin(); iter != hits.end(); iter = hits.next()) {
++			const Record *dbRec = *iter;
++			bool count_hit  = false;
++			for (size_t i = 0; i < dbRec->block_starts.size(); ++i)
++			{
++				int block_start = dbRec->block_starts[i];
++				int block_end = dbRec->block_ends[i];
++				int maxStart = max(_queryOffset, block_start);
++				int minEnd = min(block_end, key->getEndPos());
++				if ((minEnd - maxStart) > 0)
++				{
++					for (int i = maxStart; i < minEnd; i++) 
++					{
++						_depthArray[i - _queryOffset]++;
++					}
++					count_hit = true;
++				}
++			}
++			if (count_hit)
++			{
++				_hitCount++;
++			}
+ 		}
+ 	}
+ }
+@@ -135,19 +174,23 @@ size_t CoverageFile::countBasesAtDepth(size_t depth) {
+ 
+ void CoverageFile::doCounts(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ {
+-	_finalOutput = static_cast<uint32_t>(hits.size());
++	ostringstream s;
++	s << _hitCount;
++	_finalOutput.append(s.str());
+ 	outputMgr->printRecord(hits.getKey(), _finalOutput);
+ }
+ 
+ void CoverageFile::doPerBase(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ {
+ 	//loop through all bases in query, printing full record and metrics for each
+-	const Record * queryRec = hits.getKey();
++	
++	Record * queryRec = hits.getKey();
+ 	for (size_t i= 0; i < _queryLen; i++) {
+-		_finalOutput = static_cast<uint32_t>(i+1);
+-		_finalOutput.append("\t");
+-		_finalOutput.append(static_cast<uint32_t>(_depthArray[i]));
+-
++		ostringstream s;
++		s << (i+1);
++		s << "\t";
++		s << _depthArray[i];
++		_finalOutput = s.str();
+ 		outputMgr->printRecord(queryRec, _finalOutput);
+ 	}
+ }
+@@ -158,7 +201,12 @@ void CoverageFile::doMean(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ 	for (size_t i= 0; i < _queryLen; i++) {
+ 		sum += _depthArray[i];
+ 	}
+-	format((float)sum / (float)_queryLen);
++	ostringstream s;
++	float mean = ((float)sum / (float)_queryLen);
++	char *meanString;
++	asprintf(&meanString, "%0.7f", mean);
++	s << meanString;
++	_finalOutput.append(s.str());
+ 	outputMgr->printRecord(hits.getKey(), _finalOutput);
+ }
+ 
+@@ -166,7 +214,6 @@ void CoverageFile::doMean(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ void CoverageFile::doHist(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ {
+ 	//make a map of depths to num bases with that depth
+-
+ 	_currDepthMap.clear();
+ 	for (size_t i=0; i < _queryLen; i++) {
+ 		_currDepthMap[_depthArray[i]]++;
+@@ -176,40 +223,38 @@ void CoverageFile::doHist(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ 	for (depthMapType::iterator iter = _currDepthMap.begin(); iter != _currDepthMap.end(); iter++) {
+ 		size_t depth = iter->first;
+ 		size_t numBasesAtDepth = iter->second;
+-		float coveredBases = (float)numBasesAtDepth / (float)_queryLen;
+-
+-		_finalOutput = static_cast<uint32_t>(depth);
+-		_finalOutput.append("\t");
+-		_finalOutput.append(static_cast<uint32_t>(numBasesAtDepth));
+-		_finalOutput.append("\t");
+-		_finalOutput.append(static_cast<uint32_t>(_queryLen));
+-		_finalOutput.append("\t");
+-		format(coveredBases);
+-
++		float coveredFraction = (float)numBasesAtDepth / (float)_queryLen;
++
++		ostringstream s;
++		s << depth;
++		s << "\t";
++		s << numBasesAtDepth;
++		s << "\t";
++		s << _queryLen;
++		s << "\t";
++		char *coveredFractionString;
++		asprintf(&coveredFractionString, "%0.7f", coveredFraction);
++		s << coveredFractionString;
++		_finalOutput = s.str();
+ 		outputMgr->printRecord(hits.getKey(), _finalOutput);
+ 	}
+-
+ }
+ 
+ void CoverageFile::doDefault(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ {
+ 	size_t nonZeroBases = _queryLen - countBasesAtDepth(0);
+-	float coveredBases = (float)nonZeroBases / (float)_queryLen;
+-
+-	_finalOutput = static_cast<uint32_t>(hits.size());
+-	_finalOutput.append("\t");
+-	_finalOutput.append(static_cast<uint32_t>(nonZeroBases));
+-	_finalOutput.append("\t");
+-	_finalOutput.append(static_cast<uint32_t>(_queryLen));
+-	_finalOutput.append("\t");
+-	format(coveredBases);
+-
++	float coveredFraction = (float)nonZeroBases / (float)_queryLen;
++
++	ostringstream s;
++	s << _hitCount;
++	s << "\t";
++	s << nonZeroBases;
++	s << "\t";
++	s << _queryLen;
++	s << "\t";
++	char *coveredFractionString;
++	asprintf(&coveredFractionString, "%0.7f", coveredFraction);
++	s << coveredFractionString;
++	_finalOutput = s.str();
+ 	outputMgr->printRecord(hits.getKey(), _finalOutput);
+ }
+-
+-void CoverageFile::format(float val)
+-{
+-	memset(_floatValBuf, 0, floatValBufLen);
+-	sprintf(_floatValBuf, "%0.7f", val);
+-   _finalOutput.append(_floatValBuf);
+-}
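
The former CoverageFile::format() helper is dropped above in favour of asprintf(),
which allocates a buffer sized to the formatted text; this is also why
coverageFile.h below gains a <stdio.h> include. asprintf() is a GNU/BSD extension
rather than ISO C, and the buffer it hands back comes from malloc(). A hedged,
self-contained sketch of the same "%0.7f" formatting step, here with the buffer
released after use (the helper name is illustrative):

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE 1          // asprintf is a GNU extension on glibc
    #endif
    #include <cstdio>
    #include <cstdlib>             // free
    #include <string>

    // Render a fraction to seven decimal places, as the hist/mean/default
    // outputs above do, then free the buffer asprintf() allocated.
    static std::string formatFraction(float val) {
        char *buf = NULL;
        if (asprintf(&buf, "%0.7f", val) < 0) return std::string();  // allocation failed
        std::string out(buf);
        free(buf);
        return out;
    }
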
+diff --git a/src/coverageFile/coverageFile.h b/src/coverageFile/coverageFile.h
+index fa2c662..691b74f 100644
+--- a/src/coverageFile/coverageFile.h
++++ b/src/coverageFile/coverageFile.h
+@@ -8,6 +8,7 @@
+ #ifndef COVERAGEFILE_H_
+ #define COVERAGEFILE_H_
+ 
++#include <stdio.h> // for asprintf
+ #include "intersectFile.h"
+ #include "ContextCoverage.h"
+ 
+@@ -21,12 +22,13 @@ public:
+ 
+ 
+ protected:
+-	QuickString _finalOutput;
++	string _finalOutput;
+ 
+ 	size_t *_depthArray;
+ 	size_t _depthArrayCapacity;
+ 	size_t _queryLen;
+ 	size_t _totalQueryLen;
++	size_t _hitCount;
+ 	int _queryOffset;
+ 	static const int DEFAULT_DEPTH_CAPACITY = 1024;
+ 	char *_floatValBuf;
+@@ -47,9 +49,6 @@ protected:
+ 	void doMean(RecordOutputMgr *outputMgr, RecordKeyVector &hits);
+ 	void doHist(RecordOutputMgr *outputMgr, RecordKeyVector &hits);
+ 	void doDefault(RecordOutputMgr *outputMgr, RecordKeyVector &hits);
+-
+-	void format(float val);
+-
+ };
+ 
+ 
+diff --git a/src/fisher/fisher.cpp b/src/fisher/fisher.cpp
+index 7548f18..583b467 100644
+--- a/src/fisher/fisher.cpp
++++ b/src/fisher/fisher.cpp
+@@ -91,7 +91,7 @@ void Fisher::giveFinalReport(RecordOutputMgr *outputMgr)
+ unsigned long Fisher::getTotalIntersection(RecordKeyVector &recList)
+ {
+     unsigned long intersection = 0;
+-    const Record *key = recList.getKey();
++    Record *key = recList.getKey();
+     int keyStart = key->getStartPos();
+     int keyEnd = key->getEndPos();
+ 
+@@ -99,7 +99,7 @@ unsigned long Fisher::getTotalIntersection(RecordKeyVector &recList)
+     _qsizes.push_back((keyEnd - keyStart));
+ 
+     int hitIdx = 0;
+-    for (RecordKeyVector::const_iterator_type iter = recList.begin(); iter != recList.end(); iter = recList.next()) {
++    for (RecordKeyVector::iterator_type iter = recList.begin(); iter != recList.end(); iter = recList.next()) {
+         int maxStart = max((*iter)->getStartPos(), keyStart);
+         int minEnd = min((*iter)->getEndPos(), keyEnd);
+         _qsizes.push_back((int)(minEnd - maxStart));
+diff --git a/src/groupBy/Makefile b/src/groupBy/Makefile
+index 9bb141a..44cd7aa 100644
+--- a/src/groupBy/Makefile
++++ b/src/groupBy/Makefile
+@@ -10,6 +10,7 @@ INCLUDES = -I$(UTILITIES_DIR)/Contexts/ \
+ 			-I$(UTILITIES_DIR)/general/ \
+ 			-I$(UTILITIES_DIR)/fileType/ \
+ 			-I$(UTILITIES_DIR)/lineFileUtilities/ \
++			-I$(UTILITIES_DIR)/stringUtilities/ \
+ 			-I$(UTILITIES_DIR)/gzstream/ \
+            -I$(UTILITIES_DIR)/GenomeFile/ \
+            -I$(UTILITIES_DIR)/BamTools/include \
+diff --git a/src/groupBy/groupBy.cpp b/src/groupBy/groupBy.cpp
+index 867f15f..1e1dbda 100644
+--- a/src/groupBy/groupBy.cpp
++++ b/src/groupBy/groupBy.cpp
+@@ -7,6 +7,8 @@
+ #include "groupBy.h"
+ #include "Tokenizer.h"
+ #include "ParseTools.h"
++#include "stringUtilities.h"
++#include <utility>
+ 
+ GroupBy::GroupBy(ContextGroupBy *context)
+ : ToolBase(context),
+@@ -29,8 +31,8 @@ bool GroupBy::init()
+ 	for (int i=0; i < numElems; i++) {
+ 		//if the item is a range, such as 3-5,
+ 		//must split that as well.
+-		const QuickString &elem = groupColsTokens.getElem(i);
+ 
++		const string &elem = groupColsTokens.getElem(i);
+ 		if (strchr(elem.c_str(), '-')) {
+ 			Tokenizer rangeElems;
+ 			rangeElems.tokenize(elem, '-');
+@@ -59,14 +61,19 @@ bool GroupBy::findNext(RecordKeyVector &hits)
+ 	assignPrevFields();
+ 	hits.setKey(_prevRecord);
+ 	hits.push_back(_prevRecord); //key should also be part of group for calculations
+-	while (1) {
+-		const Record *newRecord = getNextRecord();
+-		if (newRecord == NULL) {
++	while (1) 
++	{
++		Record *newRecord = getNextRecord();
++		if (newRecord == NULL) 
++		{
+ 			_prevRecord = NULL;
+ 			break;
+-		} else if (canGroup(newRecord)) {
++		} else if (canGroup(newRecord)) 
++		{
+ 			hits.push_back(newRecord);
+-		} else {
++		} 
++		else 
++		{
+ 			_prevRecord = newRecord;
+ 			break;
+ 		}
+@@ -77,15 +84,19 @@ bool GroupBy::findNext(RecordKeyVector &hits)
+ void GroupBy::processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ {
+ 
+-	const Record *rec = hits.getKey();
+-	const QuickString &opVal  = _context->getColumnOpsVal(hits);
+-	if (upCast(_context)->printFullCols()) {
++	Record *rec = hits.getKey();
++	const string &opVal  = _context->getColumnOpsVal(hits);
++	if (upCast(_context)->printFullCols()) 
++	{
+ 		outputMgr->printRecord(rec, opVal);
+-	} else {
+-		QuickString outBuf;
+-		for (int i=0; i < (int)_groupCols.size(); i++) {
++	} 
++	else 
++	{
++		string outBuf;
++		for (int i = 0; i < (int)_groupCols.size(); i++) 
++		{
+ 			outBuf.append(rec->getField(_groupCols[i]));
+-			outBuf.append('\t');
++			outBuf.append("\t");
+ 		}
+ 		outBuf.append(opVal);
+ 		outputMgr->printRecord(NULL, outBuf);
+@@ -95,7 +106,7 @@ void GroupBy::processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ 
+ void GroupBy::cleanupHits(RecordKeyVector &hits)
+ {
+-	RecordKeyVector::const_iterator_type iter = hits.begin();
++	RecordKeyVector::iterator_type iter = hits.begin();
+ 	for (; iter != hits.end(); iter = hits.next()) 
+ 	{
+ 		_queryFRM->deleteRecord(*iter);	
+@@ -103,12 +114,16 @@ void GroupBy::cleanupHits(RecordKeyVector &hits)
+ 	hits.clearAll();
+ }
+ 
+-const Record *GroupBy::getNextRecord() {
+-	while (!_queryFRM->eof()) {
++Record *GroupBy::getNextRecord() {
++	while (!_queryFRM->eof()) 
++	{
+ 		Record *queryRecord = _queryFRM->getNextRecord();
+-		if (queryRecord == NULL) {
++		if (queryRecord == NULL) 
++		{
+ 			continue;
+-		} else {
++		} 
++		else 
++		{
+ 			return queryRecord;
+ 		}
+ 	}
+@@ -121,19 +136,22 @@ void GroupBy::assignPrevFields() {
+ 	}
+ }
+ 
+-bool GroupBy::canGroup(const Record *newRecord) {
+-
+-	for (int i=0; i < (int)_groupCols.size(); i++) {
++bool GroupBy::canGroup(Record *newRecord) 
++{
++	for (int i = 0; i < (int)_groupCols.size(); i++) 
++	{
+ 		int fieldNum = _groupCols[i];
+-		const QuickString &newField = newRecord->getField(fieldNum);
+-		const QuickString &oldField = _prevFields[i];
+-		if (upCast(_context)->ignoreCase()) {
+-			if (oldField.stricmp(newField)) return false;
+-		} else {
++		const string &newField = newRecord->getField(fieldNum);
++		const string &oldField = _prevFields[i];
++		if (upCast(_context)->ignoreCase()) 
++		{
++			if (toLower(oldField) != toLower(newField)) return false;
++		} 
++		else 
++		{
+ 			if (oldField != newField) return false;
+ 		}
+ 	}
+ 	return true;
+-
+ }
+ 
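
With QuickString::stricmp() gone, the case-insensitive grouping above compares
toLower() copies of the two fields, using the new stringUtilities dependency added
to the groupBy Makefile earlier in this patch. A minimal sketch of an equivalent
helper, assuming toLower() amounts to an ASCII lower-casing copy (which is all the
comparison needs):

    #include <algorithm>
    #include <cctype>
    #include <string>

    // Lower-cased copy; comparing two such copies gives the case-insensitive
    // equality test canGroup() needs when -ignorecase is set.
    static std::string toLowerCopy(const std::string &s) {
        std::string out(s);
        std::transform(out.begin(), out.end(), out.begin(),
                       [](unsigned char c) { return std::tolower(c); });
        return out;
    }

    // e.g. toLowerCopy("Chr1") == toLowerCopy("chr1")   evaluates to true
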
+diff --git a/src/groupBy/groupBy.h b/src/groupBy/groupBy.h
+index 2c96dc9..c44ba3c 100644
+--- a/src/groupBy/groupBy.h
++++ b/src/groupBy/groupBy.h
+@@ -27,11 +27,11 @@ protected:
+ 	virtual ContextGroupBy *upCast(ContextBase *context) { return static_cast<ContextGroupBy *>(context); }
+ 
+ 	vector<int> _groupCols;
+-	vector<QuickString> _prevFields;
++	vector<string> _prevFields;
+ 	FileRecordMgr *_queryFRM;
+-	const Record *_prevRecord;
+-	const Record *getNextRecord();
+-	bool canGroup(const Record *);
++	Record *_prevRecord;
++	Record *getNextRecord();
++	bool canGroup(Record *);
+ 	void assignPrevFields();
+ };
+ 
+diff --git a/src/intersectFile/intersectFile.cpp b/src/intersectFile/intersectFile.cpp
+index 01bb222..45977fc 100644
+--- a/src/intersectFile/intersectFile.cpp
++++ b/src/intersectFile/intersectFile.cpp
+@@ -67,6 +67,7 @@ bool IntersectFile::findNext(RecordKeyVector &hits)
+ 
+ void IntersectFile::processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits)
+ {
++	RecordKeyVector::iterator_type hitListIter = hits.begin();
+ 	outputMgr->printRecord(hits);
+ }
+ 
+@@ -120,19 +121,16 @@ void IntersectFile::makeSweep() {
+ void IntersectFile::checkSplits(RecordKeyVector &hitSet)
+ {
+ 	if (upCast(_context)->getObeySplits()) {
+-		RecordKeyVector keySet(hitSet.getKey());
+-		RecordKeyVector resultSet(hitSet.getKey());
+-		RecordKeyVector overlapSet(hitSet.getKey());
+-		upCast(_context)->getSplitBlockInfo()->findBlockedOverlaps(keySet, hitSet, resultSet, overlapSet);
+-		
++
+ 		// when using coverage, we need a list of the sub-intervals of coverage
+ 		// so that per-base depth can be properly calculated when obeying splits
+ 		if (_context->getProgram() == ContextBase::COVERAGE)
+ 		{
+-			hitSet.swap(overlapSet);
++			upCast(_context)->getSplitBlockInfo()->findBlockedOverlaps(hitSet, true);
+ 		}
+-		else {
+-			hitSet.swap(resultSet);
++		else
++		{
++			upCast(_context)->getSplitBlockInfo()->findBlockedOverlaps(hitSet, false);
+ 		}
+ 	}
+ }
+diff --git a/src/intersectFile/intersectFile.h b/src/intersectFile/intersectFile.h
+index f40e750..3c85f93 100644
+--- a/src/intersectFile/intersectFile.h
++++ b/src/intersectFile/intersectFile.h
+@@ -26,11 +26,11 @@ public:
+     IntersectFile(ContextIntersect *context);
+     virtual ~IntersectFile();
+ 	virtual bool init();
+-	virtual bool findNext(RecordKeyVector &hits);
+-	virtual void processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits);
+-	virtual void cleanupHits(RecordKeyVector &hits);
++	virtual bool findNext(RecordKeyVector &);
++	virtual void processHits(RecordOutputMgr *, RecordKeyVector &);
++	virtual void cleanupHits(RecordKeyVector &);
+ 	virtual bool finalizeCalculations();
+-	virtual void  giveFinalReport(RecordOutputMgr *outputMgr) {}
++	virtual void  giveFinalReport(RecordOutputMgr *) {}
+ 
+ 
+ protected:
+diff --git a/src/jaccard/jaccard.cpp b/src/jaccard/jaccard.cpp
+index c99c117..2f61b86 100644
+--- a/src/jaccard/jaccard.cpp
++++ b/src/jaccard/jaccard.cpp
+@@ -57,13 +57,13 @@ void  Jaccard::giveFinalReport(RecordOutputMgr *outputMgr) {
+ unsigned long Jaccard::getTotalIntersection(RecordKeyVector &hits)
+ {
+ 	unsigned long intersection = 0;
+-	const Record *key = hits.getKey();
++	Record *key = hits.getKey();
+ 	int keyStart = key->getStartPos();
+ 	int keyEnd = key->getEndPos();
+ 
+ 	int hitIdx = 0;
+-	for (RecordKeyVector::const_iterator_type iter = hits.begin(); iter != hits.end(); iter = hits.next()) {
+-		const Record *currRec = *iter;
++	for (RecordKeyVector::iterator_type iter = hits.begin(); iter != hits.end(); iter = hits.next()) {
++		Record *currRec = *iter;
+ 		int maxStart = max(currRec->getStartPos(), keyStart);
+ 		int minEnd = min(currRec->getEndPos(), keyEnd);
+ 		if (_context->getObeySplits()) {
+diff --git a/src/jaccard/jaccard.h b/src/jaccard/jaccard.h
+index e23ad9c..8c4f6db 100644
+--- a/src/jaccard/jaccard.h
++++ b/src/jaccard/jaccard.h
+@@ -15,11 +15,11 @@ class Jaccard : public IntersectFile {
+ 
+ public:
+ 	Jaccard(ContextJaccard *context);
+-	virtual bool findNext(RecordKeyVector &hits);
+-	virtual void processHits(RecordOutputMgr *outputMgr, RecordKeyVector &hits) {}
+-	virtual void cleanupHits(RecordKeyVector &hits);
++	virtual bool findNext(RecordKeyVector &);
++	virtual void processHits(RecordOutputMgr *, RecordKeyVector &) {}
++	virtual void cleanupHits(RecordKeyVector &);
+ 	virtual bool finalizeCalculations();
+-	virtual void  giveFinalReport(RecordOutputMgr *outputMgr);
++	virtual void  giveFinalReport(RecordOutputMgr *);
+ 
+ 
+ protected:
+diff --git a/src/nekSandbox1/nekSandboxMain.cpp b/src/nekSandbox1/nekSandboxMain.cpp
+index d228540..d9b298a 100644
+--- a/src/nekSandbox1/nekSandboxMain.cpp
++++ b/src/nekSandbox1/nekSandboxMain.cpp
+@@ -67,7 +67,7 @@ int nek_sandbox1_main(int argc,char** argv)
+ //		printf("%s", sLine);
+ //	}
+ //	return 0;
+-//	QuickString filename(argv[1]);
++//	string filename(argv[1]);
+ //	istream *inputStream = NULL;
+ //	if (filename  == "-") {
+ //		inputStream = &cin;
+@@ -91,7 +91,7 @@ int nek_sandbox1_main(int argc,char** argv)
+ ////		exit(1);
+ ////	}
+ ////	}
+-//    QuickString _bamHeader = _bamReader.GetHeaderText();
++//    string _bamHeader = _bamReader.GetHeaderText();
+ //    BamTools::RefVector _references = _bamReader.GetReferenceData();
+ //
+ //    if (_bamHeader.empty() || _references.empty()) {
+@@ -107,10 +107,10 @@ int nek_sandbox1_main(int argc,char** argv)
+ //		exit(1);
+ //	}
+ //	string sLine;
+-//	vector<QuickString> fields;
+-//	QuickString chrName;
++//	vector<string> fields;
++//	string chrName;
+ //
+-//	vector<QuickString> chroms;
++//	vector<string> chroms;
+ //	chroms.push_back("1");
+ //	chroms.push_back("2");
+ //	chroms.push_back("10");
+@@ -127,7 +127,7 @@ int nek_sandbox1_main(int argc,char** argv)
+ //			continue;
+ //		}
+ //		Tokenize(sLine.c_str(), fields);
+-//		const QuickString &currChrom = fields[2];
++//		const string &currChrom = fields[2];
+ //		if (currChrom == chroms[chromIdx]) {
+ //			cout << sLine << endl;
+ //			chromCounts[chromIdx]++;
+@@ -157,7 +157,7 @@ int nek_sandbox1_main(int argc,char** argv)
+ //	cout << "RecordType is : " << frm.getRecordType() << ", " << frm.getRecordTypeName() << "."  << endl;
+ //
+ //	bool headerFound = false;
+-//	QuickString outbuf;
++//	string outbuf;
+ //	while (!frm.eof()) {
+ //		Record *record = frm.getNextRecord();
+ //		if (!headerFound && frm.hasHeader()) {
+diff --git a/src/regressTest/regressTestMain.cpp b/src/regressTest/regressTestMain.cpp
+index 0377ca6..01a2816 100644
+--- a/src/regressTest/regressTestMain.cpp
++++ b/src/regressTest/regressTestMain.cpp
+@@ -3,7 +3,7 @@
+ #include <cstring>
+ #include <cstdlib>
+ #include <cstdio>
+-#include "QuickString.h"
++#include "string.h"
+ 
+ void usage()  {
+ 	printf("Usage: bedtools regressTest sub-prog targetVersion configFile [optionsToTest]\n");
+@@ -31,7 +31,7 @@ int regress_test_main(int argc, char **argv) {
+ 		usage();
+ 		exit(1);
+ 	}
+-	QuickString program(argv[2]);
++	string program(argv[2]);
+ 
+ 	RegressTest *regressTest = new RegressTest();
+ 
+diff --git a/src/shiftBed/shiftBed.cpp b/src/shiftBed/shiftBed.cpp
+index 81022f8..50724ce 100644
+--- a/src/shiftBed/shiftBed.cpp
++++ b/src/shiftBed/shiftBed.cpp
+@@ -51,7 +51,7 @@ void BedShift::AddShift(BED &bed) {
+ 
+   CHRPOS chromSize = (CHRPOS)_genome->getChromSize(bed.chrom);
+ 
+-  float shift;
++  double shift;
+ 
+   if (bed.strand == "-") {
+     shift = _shiftMinus;
+@@ -59,7 +59,7 @@ void BedShift::AddShift(BED &bed) {
+     shift = _shiftPlus;
+   }
+   if (_fractional == true)
+-    shift = shift * (float)bed.size();
++    shift = shift * (double)bed.size();
+ 
+   if ((bed.start + shift) < 0)
+     bed.start = 0;
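
The shift above now accumulates in a double rather than a float. A brief sketch of
why that matters at genome scale (the coordinate below is illustrative, roughly the
GRCh38 chr1 length): a float's 24-bit significand cannot represent every integer in
the hundreds of millions, so adding a fractional shift in float can land a few base
pairs away from the intended position, while double keeps it exact.

    #include <cstdio>

    int main() {
        long long start = 248956422;            // position near the end of a large chromosome
        float  fsum = (float)start + 1000.5f;   // float arithmetic at this magnitude rounds
        double dsum = (double)start + 1000.5;   // double represents the result exactly
        std::printf("float : %.1f\n", fsum);    // lands a few bases off the true value
        std::printf("double: %.1f\n", dsum);    // 248957422.5
        return 0;
    }
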
+diff --git a/src/shuffleBed/shuffleBed.cpp b/src/shuffleBed/shuffleBed.cpp
+index 9b71125..2478ccc 100644
+--- a/src/shuffleBed/shuffleBed.cpp
++++ b/src/shuffleBed/shuffleBed.cpp
+@@ -73,14 +73,15 @@ BedShuffle::BedShuffle(string &bedFile, string &genomeFile,
+         _haveExclude = true;
+     }
+     
+-    if (_haveInclude) {
++    if (_haveInclude) 
++    {
+         _include = new BedFile(includeFile);
+-	_include->loadBedFileIntoVector();
+-
+-	for(std::vector<BED>::iterator it = _include->bedList.begin();
... 5743 lines suppressed ...

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/bedtools.git


