[med-svn] [Git][med-team/bedtools][upstream] New upstream version 2.31.1+dfsg
Étienne Mollier (@emollier)
gitlab at salsa.debian.org
Fri Nov 24 11:16:53 GMT 2023
Étienne Mollier pushed to branch upstream at Debian Med / bedtools
Commits:
8c3c0378 by Étienne Mollier at 2023-11-24T12:07:56+01:00
New upstream version 2.31.1+dfsg
- - - - -
8 changed files:
- README.md
- docs/content/tools/summary.rst
- src/utils/FileRecordTools/Records/Record.h
- src/utils/NewChromsweep/NewChromsweep.cpp
- src/utils/NewChromsweep/NewChromsweep.h
- src/utils/general/ParseTools.cpp
- src/utils/general/ParseTools.h
- src/utils/version/version_release.txt
Changes:
=====================================
README.md
=====================================
@@ -35,8 +35,8 @@ Details
First created through urgency and adrenaline by Aaron Quinlan Spring 2009.
Maintained by the Quinlan Laboratory at the University of Virginia.
-1. **Lead developers**: Aaron Quinlan, Hao Hoou, Brent Pedersen, Neil Kindlon
-2. **Significant contributions**: Hao Hou, John Marshall, Assaf Gordon, Royden Clark, Brent Pedersen, Ryan Dale
+1. **Lead developers**: Aaron Quinlan, Hao Hou, Brent Pedersen, Neil Kindlon
+2. **Significant contributions**: John Marshall, Assaf Gordon, Royden Clark, Ryan Dale
3. **Repository**: https://github.com/arq5x/bedtools2
4. **Stable releases**: https://github.com/arq5x/bedtools2/releases
5. **Documentation**: http://bedtools.readthedocs.org
=====================================
docs/content/tools/summary.rst
=====================================
@@ -9,14 +9,14 @@
Genomics experiments have numerous sources of both technical and biological variation
that can confound analysis and interpretation. Therefore, one of the most important steps
in genomics data analysis is generating high-level summary stats of one's data to ask if the
-basic observations are in line with the expectation. Doing such quality control as early
+basic observations align with the expectation. Doing such quality control as early
as possible in the analysis workflow helps to head off unnecessary time spent chasing
-technical artifacts that masquerade as biological signal. **This quality control
+technical artifacts that masquerade as biological signals. **This quality control
is the motivation behind the** ``bedtools summary`` **command.**
Given an input interval file in standard formats, as well as a genome file defining
the chromosome names and lengths relevant to your data, ``bedtools summary`` will compute
-a number of summary statistics detailing, for each chromosome, the number of intervals,
+several summary statistics detailing, for each chromosome, the number of intervals,
the total number of base pairs, and the fraction of intervals and base pairs observed
in your input file. From these summary measures, one can get a quick sense of questions like:
@@ -25,20 +25,20 @@ in your input file. From these summary measures, one can get a quick sense of qu
- Which chromosomes are outliers?
For example, the following plot was generated directly from the output of ``bedtools summary``.
-It depicts, for each chromosome, the the fraction of all intervals in the RepeatMasker track from UCSC
+It depicts, for each chromosome, the fraction of all intervals in the RepeatMasker track from UCSC
**observed** on each chromosome versus the fraction of RepeatMasker intervals that are **expected**
for each chromosome, based on the fraction of the genome that each chromosome represents.
-This plot highlights that ``chr19``, ``chrM``, and ``chrMT`` (the different mitochindrial reference genomes) are outliers.
+This plot highlights that ``chr19``, ``chrM``, and ``chrMT`` (the different mitochondrial reference genomes) are outliers.
Chromosome 19 has more than 1.5 times the intervals that are expected based upon the length of
-the chromosome. ChrM has no intervals, making the observed to expected ratio be 0.
+the chromosome. ChrM has no intervals, making the observed-to-expected ratio be 0.
This former is because the repeat content of chromosome 19 "is approximately 55%, more than 10% higher
-than the genome-wide average" (``Grimwood J, et al. Nature. 2004;428:529–35``). The latter because
+than the genome-wide average" (``Grimwood J, et al. Nature. 2004;428:529–35``). The latter is because
are indeed no repeat annotations provided by UCSC for the mitochondrial genome.
-In this case, the extremes in obersved versus expected ratios make sense. However,
-**this tool provides the ability to detect cases that do not and are either artifacts or
-biological signal**.
+In this case, the extremes in observed versus expected ratios make sense. However,
+**this tool allows one to detect cases that do not and are either artifacts or
+biological signals**.
.. image:: ../images/tool-glyphs/summary.png
:width: 600pt
@@ -100,23 +100,23 @@ Now, let's make a "genome" file for GRCh38 from the `chromInfo` table at UCSC
.. code-block:: bash
- curl -s http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/chromInfo.txt.gz \
+ curl -s http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/chromInfo.txt.gz \
| gzcat \
| cut -f 1-2 \
| grep -v -E 'Un|fix|random|alt|hap' \
> grch38.genome.txt
head grch38.genome.txt
- chr1 249250621
- chr2 243199373
- chr3 198022430
- chr4 191154276
- chr5 180915260
- chr6 171115067
- chr7 159138663
- chrX 155270560
- chr8 146364022
- chr9 141213431
+ chr1 248956422
+ chr2 242193529
+ chr3 198295559
+ chr4 190214555
+ chr5 181538259
+ chr6 170805979
+ chr7 159345973
+ chrX 156040895
+ chr8 145138636
+ chr9 138394717
Now, let's run ``bedtools summary``.
@@ -124,33 +124,32 @@ Now, let's run ``bedtools summary``.
bedtools summary -i simrep.grch38.bed -g grch38.genome.txt | column -t
chrom chrom_length num_ivls total_ivl_bp chrom_frac_genome frac_all_ivls frac_all_bp min max mean
- chr1 249250621 74548 15557884 0.080514834 0.077210928 0.048725518 25 124438 208.696195740
- chr2 243199373 74474 14493548 0.078560114 0.077134284 0.045392139 25 336509 194.612186803
- chr3 198022430 56894 13946854 0.063966714 0.058926309 0.043679955 25 500000 245.137518895
- chr4 191154276 56685 10160257 0.061748110 0.058709844 0.031820766 25 136950 179.240663315
- chr5 180915260 53887 16801740 0.058440625 0.055811896 0.052621133 25 500000 311.795794904
- chr6 171115067 51802 11222841 0.055274892 0.053652418 0.035148658 25 500000 216.648797344
- chr7 159138663 55972 20054618 0.051406183 0.057971375 0.062808775 25 150228 358.297327235
- chrX 155270560 50432 27398336 0.050156679 0.052233481 0.085808462 25 500000 543.272842640
- chr8 146364022 45937 15650021 0.047279621 0.047577915 0.049014080 25 500000 340.684437382
- chr9 141213431 39329 10932158 0.045615838 0.040733870 0.034238272 25 159861 277.966843805
- chr10 135534747 45074 11407694 0.043781466 0.046684087 0.035727596 25 110000 253.088121755
- chr11 135006516 41279 14127024 0.043610833 0.042753526 0.044244227 25 500000 342.232709126
- chr12 133851895 44151 13878240 0.043237859 0.045728117 0.043465064 25 356015 314.335802134
- chr13 115169878 29907 9423815 0.037203051 0.030975307 0.029514313 25 110000 315.103989033
- chr14 107349540 27973 9245970 0.034676866 0.028972223 0.028957323 25 173523 330.531941515
- chr15 102531392 25557 9565023 0.033120471 0.026469921 0.029956561 25 110000 374.262354736
- chr16 90354753 35288 11959674 0.029187080 0.036548522 0.037456334 25 138208 338.916175470
- chr17 81195210 32093 16264416 0.026228295 0.033239393 0.050938295 25 132210 506.790141152
- chr18 78077248 23966 18684937 0.025221107 0.024822089 0.058519091 25 500000 779.643536677
- chr20 63025520 22608 12620160 0.020358983 0.023415580 0.039524901 25 500000 558.216560510
- chrY 59373566 15130 4564760 0.019179301 0.015670458 0.014296307 25 227093 301.702577660
- chr19 59128983 30854 11391752 0.019100294 0.031956135 0.035677667 25 396802 369.214753355
- chr22 51304566 16760 10691540 0.016572792 0.017358684 0.033484683 25 498537 637.920047733
- chr21 48129895 14911 9253172 0.015547285 0.015443636 0.028979879 25 499939 620.560123399
- chrM 16571 0 0 0.000005353 0.000000000 0.000000000 -1 -1 -1
- chrMT 16569 0 0 0.000005352 0.000000000 0.000000000 -1 -1 -1
- all 3095710552 965511 319296434 1.0 1.0 1.0 25 500000 330.702015824
+ chr1 248956422 74548 15557884 0.080613126 0.077210928 0.048725518 25 124438 208.696195740
+ chr2 242193529 74474 14493548 0.078423273 0.077134284 0.045392139 25 336509 194.612186803
+ chr3 198295559 56894 13946854 0.064208928 0.058926309 0.043679955 25 500000 245.137518895
+ chr4 190214555 56685 10160257 0.061592265 0.058709844 0.031820766 25 136950 179.240663315
+ chr5 181538259 53887 16801740 0.058782844 0.055811896 0.052621133 25 500000 311.795794904
+ chr6 170805979 51802 11222841 0.055307687 0.053652418 0.035148658 25 500000 216.648797344
+ chr7 159345973 55972 20054618 0.051596890 0.057971375 0.062808775 25 150228 358.297327235
+ chrX 156040895 50432 27398336 0.050526692 0.052233481 0.085808462 25 500000 543.272842640
+ chr8 145138636 45937 15650021 0.046996495 0.047577915 0.049014080 25 500000 340.684437382
+ chr9 138394717 39329 10932158 0.044812786 0.040733870 0.034238272 25 159861 277.966843805
+ chr11 135086622 41279 14127024 0.043741611 0.042753526 0.044244227 25 500000 342.232709126
+ chr10 133797422 45074 11407694 0.043324163 0.046684087 0.035727596 25 110000 253.088121755
+ chr12 133275309 44151 13878240 0.043155100 0.045728117 0.043465064 25 356015 314.335802134
+ chr13 114364328 29907 9423815 0.037031646 0.030975307 0.029514313 25 110000 315.103989033
+ chr14 107043718 27973 9245970 0.034661202 0.028972223 0.028957323 25 173523 330.531941515
+ chr15 101991189 25557 9565023 0.033025172 0.026469921 0.029956561 25 110000 374.262354736
+ chr16 90338345 35288 11959674 0.029251932 0.036548522 0.037456334 25 138208 338.916175470
+ chr17 83257441 32093 16264416 0.026959106 0.033239393 0.050938295 25 132210 506.790141152
+ chr18 80373285 23966 18684937 0.026025204 0.024822089 0.058519091 25 500000 779.643536677
+ chr20 64444167 22608 12620160 0.020867290 0.023415580 0.039524901 25 500000 558.216560510
+ chr19 58617616 30854 11391752 0.018980628 0.031956135 0.035677667 25 396802 369.214753355
+ chrY 57227415 15130 4564760 0.018530475 0.015670458 0.014296307 25 227093 301.702577660
+ chr22 50818468 16760 10691540 0.016455232 0.017358684 0.033484683 25 498537 637.920047733
+ chr21 46709983 14911 9253172 0.015124887 0.015443636 0.028979879 25 499939 620.560123399
+ chrM 16569 0 0 0.000005365 0.000000000 0.000000000 -1 -1 -1
+ all 3088286401 965511 319296434 1.0 1.0 1.0 25 500000 330.702015824
Notice the following:
=====================================
src/utils/FileRecordTools/Records/Record.h
=====================================
@@ -73,7 +73,7 @@ public:
virtual void printNull(string &) const {}
friend ostream &operator << (ostream &out, const Record &record);
- virtual const Record & operator=(const Record &);
+ const Record & operator=(const Record &);
virtual bool isZeroBased() const {return true;};
=====================================
src/utils/NewChromsweep/NewChromsweep.cpp
=====================================
@@ -26,11 +26,11 @@ NewChromSweep::NewChromSweep(ContextIntersect *context)
_wasInitialized(false),
_currQueryRec(NULL),
_runToQueryEnd(_context->getRunToQueryEnd()),
+ _runToDbEnd(false),
_lexicoDisproven(false),
_lexicoAssumed(false),
_lexicoAssumedFileIdx(-1),
- _testLastQueryRec(false),
- _runToDbEnd(false)
+ _testLastQueryRec(false)
{
_filePrevChrom.resize(_numFiles);
_runToDbEnd = context->shouldRunToDbEnd();
=====================================
src/utils/NewChromsweep/NewChromsweep.h
=====================================
@@ -90,8 +90,7 @@ protected:
string _currQueryChromName;
string _prevQueryChromName;
bool _runToQueryEnd;
- bool _runToDbEnd;
-
+ bool _runToDbEnd;
virtual void masterScan(RecordKeyVector &retList);
=====================================
src/utils/general/ParseTools.cpp
=====================================
@@ -2,6 +2,7 @@
#include <climits>
#include <cctype>
#include <cstring>
+#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <sstream>
=====================================
src/utils/general/ParseTools.h
=====================================
@@ -16,11 +16,10 @@
#include "string.h"
#include <cstdio>
#include <cstdlib>
+#include "BedtoolsTypes.h"
using namespace std;
-typedef int64_t CHRPOS;
-
bool isNumeric(const string &str);
bool isInteger(const string &str);
@@ -54,7 +53,7 @@ void int2str(U number, T& buffer, bool appendToBuf = false)
bool neg = number < 0;
if(neg) number = -number;
- uint32_t n;
+ unsigned int n;
for(n = 0; number; number /= 10)
tmp[12 - ++n] = number % 10 + '0';
if(neg) tmp[12 - ++n] = '-';
=====================================
src/utils/version/version_release.txt
=====================================
@@ -1,5 +1,5 @@
-# This file was auto-generated by running "make setversion VERSION=v2.31.0"
-# on Fri Apr 28 06:08:59 MDT 2023 .
+# This file was auto-generated by running "make setversion VERSION=v2.31.1"
+# on Tue Nov 7 16:23:24 MST 2023 .
# Please do not edit or commit this file manually.
#
-v2.31.0
+v2.31.1
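
Note on the docs/content/tools/summary.rst hunk: the documentation text compares, for each chromosome, the fraction of intervals observed with the fraction expected from the chromosome's share of the genome. Below is a minimal Python sketch of that comparison, using three rows copied from the updated table in the diff. The helper name `observed_to_expected` is hypothetical, and dividing `frac_all_ivls` by `chrom_frac_genome` is an assumed reading of "observed versus expected"; bedtools itself does not print this ratio.

```python
# Rows copied from the updated `bedtools summary` output in the diff:
# (chrom, chrom_frac_genome, frac_all_ivls)
ROWS = [
    ("chr1", 0.080613126, 0.077210928),
    ("chr19", 0.018980628, 0.031956135),
    ("chrM", 0.000005365, 0.000000000),
]


def observed_to_expected(chrom_frac_genome, frac_all_ivls):
    """Ratio of the observed interval fraction to the length-based expectation.

    Hypothetical helper for illustration; not part of bedtools.
    """
    if chrom_frac_genome == 0:
        raise ValueError("chromosome has zero share of the genome")
    return frac_all_ivls / chrom_frac_genome


# Map each chromosome name to its observed-to-expected ratio.
ratios = {chrom: observed_to_expected(exp, obs) for chrom, exp, obs in ROWS}

for chrom, ratio in ratios.items():
    print(f"{chrom}\t{ratio:.3f}")
```

With these rows, chr19 comes out above 1.5 and chrM at 0, consistent with the outliers the documentation text calls out, while chr1 sits close to 1.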
View it on GitLab: https://salsa.debian.org/med-team/bedtools/-/commit/8c3c03786c2b8f63ca6c8fe01ea2e97637fd023d