[med-svn] [Git][med-team/bcftools][upstream] New upstream version 1.10.2
Michael R. Crusoe
gitlab at salsa.debian.org
Wed Dec 25 17:09:41 GMT 2019
Michael R. Crusoe pushed to branch upstream at Debian Med / bcftools
Commits:
e439282c by Michael R. Crusoe at 2019-12-25T16:58:15Z
New upstream version 1.10.2
- - - - -
24 changed files:
- Makefile
- NEWS
- doc/bcftools.1
- doc/bcftools.html
- doc/bcftools.txt
- plugins/trio-dnm.c
- + test/norm.split.3.out
- + test/norm.split.3.vcf
- test/test.pl
- + test/trio-dnm.2.out
- + test/trio-dnm.2.vcf
- + test/view64bit.1.out
- + test/view64bit.1.vcf
- + test/view64bit.2.out
- + test/view64bit.2.vcf
- + test/view64bit.3.out
- + test/view64bit.3.vcf
- + test/view64bit.4.out
- + test/view64bit.4.vcf
- + test/view64bit.5.out
- + test/view64bit.5.vcf
- vcfisec.c
- vcfnorm.c
- version.sh
Changes:
=====================================
Makefile
=====================================
@@ -103,7 +103,7 @@ endif
include config.mk
-PACKAGE_VERSION = 1.10
+PACKAGE_VERSION = 1.10.2
# If building from a Git repository, replace $(PACKAGE_VERSION) with the Git
# description of the working tree: either a release tag with the same value
=====================================
NEWS
=====================================
@@ -1,3 +1,9 @@
+## Release 1.10.2 (19th December 2019)
+
+This is a release fix that corrects minor inconsistencies discovered in
+previous deliverables.
+
+
## Release 1.10 (6th December 2019)
=====================================
doc/bcftools.1
=====================================
@@ -2,12 +2,12 @@
.\" Title: bcftools
.\" Author: [see the "AUTHORS" section]
.\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/>
-.\" Date: 2019-12-06
+.\" Date: 2019-12-19
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
-.TH "BCFTOOLS" "1" "2019\-12\-06" "\ \&" "\ \&"
+.TH "BCFTOOLS" "1" "2019\-12\-19" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
@@ -41,7 +41,7 @@ Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatica
BCFtools is designed to work on a stream\&. It regards an input file "\-" as the standard input (stdin) and outputs to the standard output (stdout)\&. Several commands can thus be combined with Unix pipes\&.
.SS "VERSION"
.sp
-This manual page was last updated \fB2019\-12\-06\fR and refers to bcftools git version \fB1\&.10\fR\&.
+This manual page was last updated \fB2019\-12\-19\fR and refers to bcftools git version \fB1\&.10\&.2\fR\&.
.SS "BCF1"
.sp
The BCF1 format output by versions of samtools <= 0\&.1\&.19 is \fBnot\fR compatible with this version of bcftools\&. To read BCF1 files one can use the view command from old versions of bcftools packaged with samtools versions <= 0\&.1\&.19 to convert to VCF, which can then be read by this version of bcftools\&.
@@ -3130,6 +3130,13 @@ reference sequence\&. Supplying this option will turn on left\-alignment and nor
option below\&.
.RE
.PP
+\fB\-\-force\fR
+.RS 4
+try to proceed with
+\fB\-m\-\fR
+even if malformed tags with incorrect number of fields are encountered, discarding such tags\&. (Experimental, use at your own risk\&.)
+.RE
+.PP
\fB\-m, \-\-multiallelics\fR \fB\-\fR|\fB+\fR[\fIsnps\fR|\fIindels\fR|\fIboth\fR|\fIany\fR]
.RS 4
split multiallelic sites into biallelic records (\fB\-\fR) or join biallelic sites into multiallelic records (\fB+\fR)\&. An optional type string can follow which controls variant types which should be split or merged together: If only SNP records should be split or merged, specify
=====================================
doc/bcftools.html
=====================================
@@ -1,6 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>bcftools</title><link rel="stylesheet" type="text/css" href="docbook-xsl.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.76.1" /></head><body><div xml:lang="en" class="refentry" title="bcftools" lang="en"><a id="idp212896"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>bcftools — utilities for variant calling and manipulating VCFs and BCFs.</p></div><div class="refsynopsisdiv" title="Synopsis"><a id="_synopsis"></a><h2>Synopsis</h2><p><span class="strong"><strong>bcftools</strong></span> [--version|--version-only] [--help] [<span class="emphasis"><em>COMMAND</em></span>] [<span class="emphasis"><em>OPTIONS</em></span>]</p></div><div class="refsect1" title="DESCRIPTION"><a id="_description"></a><h2>DESCRIPTION</h2><p>BCFtools is a set of utilities that manipulate variant calls in the Variant
+<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>bcftools</title><link rel="stylesheet" type="text/css" href="docbook-xsl.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.76.1" /></head><body><div xml:lang="en" class="refentry" title="bcftools" lang="en"><a id="idp25229120"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>bcftools — utilities for variant calling and manipulating VCFs and BCFs.</p></div><div class="refsynopsisdiv" title="Synopsis"><a id="_synopsis"></a><h2>Synopsis</h2><p><span class="strong"><strong>bcftools</strong></span> [--version|--version-only] [--help] [<span class="emphasis"><em>COMMAND</em></span>] [<span class="emphasis"><em>OPTIONS</em></span>]</p></div><div class="refsect1" title="DESCRIPTION"><a id="_description"></a><h2>DESCRIPTION</h2><p>BCFtools is a set of utilities that manipulate variant calls in the Variant
Call Format (VCF) and its binary counterpart BCF. All commands work
transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.</p><p>Most commands accept VCF, bgzipped VCF and BCF with filetype detected
automatically even when streaming from a pipe. Indexed VCF and BCF
@@ -10,7 +10,7 @@ read simultaneously, they must be indexed and therefore also compressed.
(Note that files with non-standard index names can be accessed as e.g.
"<code class="literal">bcftools view -r X:2928329 file.vcf.gz##idx##non-standard-index-name</code>".)</p><p>BCFtools is designed to work on a stream. It regards an input file "-" as the
standard input (stdin) and outputs to the standard output (stdout). Several
-commands can thus be combined with Unix pipes.</p><div class="refsect2" title="VERSION"><a id="_version"></a><h3>VERSION</h3><p>This manual page was last updated <span class="strong"><strong>2019-12-06</strong></span> and refers to bcftools git version <span class="strong"><strong>1.10</strong></span>.</p></div><div class="refsect2" title="BCF1"><a id="_bcf1"></a><h3>BCF1</h3><p>The BCF1 format output by versions of samtools <= 0.1.19 is <span class="strong"><strong>not</strong></span>
+commands can thus be combined with Unix pipes.</p><div class="refsect2" title="VERSION"><a id="_version"></a><h3>VERSION</h3><p>This manual page was last updated <span class="strong"><strong>2019-12-19</strong></span> and refers to bcftools git version <span class="strong"><strong>1.10.2</strong></span>.</p></div><div class="refsect2" title="BCF1"><a id="_bcf1"></a><h3>BCF1</h3><p>The BCF1 format output by versions of samtools <= 0.1.19 is <span class="strong"><strong>not</strong></span>
compatible with this version of bcftools. To read BCF1 files one can use
the view command from old versions of bcftools packaged with samtools
versions <= 0.1.19 to convert to VCF, which can then be read by
@@ -1895,6 +1895,11 @@ the <span class="strong"><strong><a class="link" href="#fasta_ref">--fasta-ref</
and normalization, however, see also the <span class="strong"><strong><a class="link" href="#do_not_normalize">--do-not-normalize</a></strong></span>
option below.
</dd><dt><span class="term">
+<span class="strong"><strong>--force</strong></span>
+</span></dt><dd>
+ try to proceed with <span class="strong"><strong>-m-</strong></span> even if malformed tags with incorrect number of fields
+ are encountered, discarding such tags. (Experimental, use at your own risk.)
+</dd><dt><span class="term">
<span class="strong"><strong>-m, --multiallelics</strong></span> <span class="strong"><strong>-</strong></span>|<span class="strong"><strong>+</strong></span>[<span class="emphasis"><em>snps</em></span>|<span class="emphasis"><em>indels</em></span>|<span class="emphasis"><em>both</em></span>|<span class="emphasis"><em>any</em></span>]
</span></dt><dd>
split multiallelic sites into biallelic records (<span class="strong"><strong>-</strong></span>) or join
=====================================
doc/bcftools.txt
=====================================
@@ -1924,6 +1924,10 @@ the *<<fasta_ref,--fasta-ref>>* option is supplied.
and normalization, however, see also the *<<do_not_normalize,--do-not-normalize>>*
option below.
+*--force*::
+ try to proceed with *-m-* even if malformed tags with incorrect number of fields
+ are encountered, discarding such tags. (Experimental, use at your own risk.)
+
*-m, --multiallelics* *-*|*+*['snps'|'indels'|'both'|'any']::
split multiallelic sites into biallelic records (*-*) or join
biallelic sites into multiallelic records (*+*). An optional type string
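
Editorial sketch (not part of the upstream diff): the new --force option is about tags whose value count does not match what their declared Number implies. The helper below is hypothetical and only illustrates the counts involved at a diploid site; the names expected_fields and tag_is_well_formed are assumptions and do not appear in bcftools. For instance, test/norm.split.3.vcf carries AC=2,2,7 at a site with two ALT alleles, and AC is absent from the expected output of the -m- --force run.

```c
#include <assert.h>

/* Hypothetical helper, not bcftools code: expected number of values for a
 * tag declared with Number=A, R or G at a diploid site with n_allele
 * alleles (REF included). */
static int expected_fields(char number, int n_allele)
{
    assert(n_allele >= 1);
    switch (number)
    {
        case 'A': return n_allele - 1;                  /* one per ALT allele */
        case 'R': return n_allele;                      /* one per allele, REF included */
        case 'G': return n_allele * (n_allele + 1) / 2; /* one per diploid genotype */
        default:  return -1;                            /* other Number types are not covered here */
    }
}

/* Returns 1 when the observed count matches the declared Number, else 0. */
static int tag_is_well_formed(char number, int n_allele, int n_found)
{
    int expected = expected_fields(number, n_allele);
    return expected >= 0 && n_found == expected;
}
```

With three alleles, a Number=A tag should hold two values, so tag_is_well_formed('A', 3, 3) returns 0. That is the kind of tag the documentation says is discarded under --force.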
=====================================
plugins/trio-dnm.c
=====================================
@@ -1,6 +1,6 @@
/* The MIT License
- Copyright (c) 2018 Genome Research Ltd.
+ Copyright (c) 2018-2019 Genome Research Ltd.
Author: Petr Danecek <pd3 at sanger.ac.uk>
@@ -72,7 +72,7 @@ typedef struct
double min_score;
double *aprob; // proband's allele probabilities
double *pl3; // normalized PLs converted to probs for proband,father,mother
- int maprob, mpl3, midx, *idx;
+ int maprob, mpl3, midx, *idx, force_ad;
}
args_t;
@@ -91,6 +91,7 @@ static const char *usage_text(void)
"Usage: bcftools +trio-dnm [Plugin Options]\n"
"Plugin options:\n"
" -e, --exclude EXPR exclude sites and samples for which the expression is true\n"
+ " --force-AD calculate VAF even if the number of FMT/AD fields is incorrect. Use at your own risk!\n"
" -i, --include EXPR include sites and samples for which the expression is true\n"
" -m, --min-score NUM do not add FMT/DNM annotation if the score is smaller than NUM\n"
" -o, --output FILE output file name [stdout]\n"
@@ -293,13 +294,25 @@ static void process_record(args_t *args, bcf1_t *rec)
if ( bcf_write(args->out_fh, args->hdr_out, rec)!=0 ) error("[%s] Error: cannot write to %s\n", __func__,args->output_fname);
return;
}
- int nret, nsmpl = bcf_hdr_nsamples(args->hdr), has_fmt_ad = args->has_fmt_ad;
- if ( args->has_fmt_ad )
+ static int n_ad_warned = 0;
+ int nret, nsmpl = bcf_hdr_nsamples(args->hdr), n_ad = args->has_fmt_ad;
+ if ( n_ad )
{
nret = bcf_get_format_int32(args->hdr,rec,"AD",&args->ad,&args->mad);
- if ( nret<=0 ) has_fmt_ad = 0;
- else if ( nret != nsmpl * rec->n_allele )
- error("Incorrect number of fields for FORMAT/AD at %s:%"PRId64"\n", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1);
+ if ( nret<=0 ) n_ad = 0;
+ else
+ {
+ n_ad = nret / nsmpl;
+ if ( nret != nsmpl * rec->n_allele )
+ {
+ if ( !n_ad_warned )
+ {
+ hts_log_warning("Incorrect number of fields for FORMAT/AD at %s:%"PRId64". This warning is printed only once", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1);
+ n_ad_warned = 1;
+ }
+ if ( !args->force_ad ) n_ad = 0;
+ }
+ }
}
nret = bcf_get_format_int32(args->hdr,rec,"PL",&args->pl,&args->mpl);
if ( nret<=0 ) error("The FORMAT/PL tag not present at %s:%"PRId64"\n", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1);
@@ -307,7 +320,7 @@ static void process_record(args_t *args, bcf1_t *rec)
if ( npl1!=rec->n_allele*(rec->n_allele+1)/2 )
error("fixme: not a diploid site at %s:%"PRId64": %d alleles, %d PLs\n", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1,rec->n_allele,npl1);
hts_expand(double,3*npl1,args->mpl3,args->pl3);
- int i, j, k, al0, al1, write_dnm = 0;
+ int i, j, k, al0, al1, write_dnm = 0, ad_set = 0;
for (i=0; i<nsmpl; i++) args->dnm_qual[i] = bcf_int32_missing;
for (i=0; i<args->ntrio; i++)
{
@@ -327,23 +340,32 @@ static void process_record(args_t *args, bcf1_t *rec)
args->dnm_qual[ args->trio[i].idx[iCHILD] ] = score;
}
- if ( has_fmt_ad )
+ if ( n_ad )
{
- for (j=0; j<3; j++)
+ if ( al0 < n_ad && al1 < n_ad )
{
- int32_t *src = args->ad + rec->n_allele * args->trio[i].idx[j];
- args->vaf[ args->trio[i].idx[j] ] = src[al0]+src[al1] ? round(src[al1]*100./(src[al0]+src[al1])) : 0;
+ ad_set = 1;
+ for (j=0; j<3; j++)
+ {
+ int32_t *src = args->ad + n_ad * args->trio[i].idx[j];
+ args->vaf[ args->trio[i].idx[j] ] = src[al0]+src[al1] ? round(src[al1]*100./(src[al0]+src[al1])) : 0;
+ }
}
+ else
+ for (j=0; j<3; j++) args->vaf[ args->trio[i].idx[j] ] = bcf_int32_missing;
}
}
if ( write_dnm )
{
if ( bcf_update_format_int32(args->hdr_out,rec,"DNM",args->dnm_qual,nsmpl)!=0 )
error("Failed to write FORMAT/DNM at %s:%"PRId64"\n", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1);
- if ( has_fmt_ad && bcf_update_format_int32(args->hdr_out,rec,"VAF",args->vaf,nsmpl)!=0 )
- error("Failed to write FORMAT/VAF at %s:%"PRId64"\n", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1);
+ if ( ad_set )
+ {
+ if ( bcf_update_format_int32(args->hdr_out,rec,"VAF",args->vaf,nsmpl)!=0 )
+ error("Failed to write FORMAT/VAF at %s:%"PRId64"\n", bcf_seqname(args->hdr,rec),(int64_t) rec->pos+1);
+ }
}
- if ( bcf_write(args->out_fh, args->hdr_out, rec)!=0 ) error("[%s] Error: cannot write to %s\n", __func__,args->output_fname);
+ if ( bcf_write(args->out_fh, args->hdr_out, rec)!=0 ) error("[%s] Error: cannot write to %s at %s:%"PRId64"\n", __func__,args->output_fname,bcf_seqname(args->hdr,rec),(int64_t)rec->pos+1);
}
int run(int argc, char **argv)
@@ -353,6 +375,7 @@ int run(int argc, char **argv)
args->output_fname = "-";
static struct option loptions[] =
{
+ {"force-AD",no_argument,0,1},
{"min-score",required_argument,0,'m'},
{"include",required_argument,0,'i'},
{"exclude",required_argument,0,'e'},
@@ -372,6 +395,7 @@ int run(int argc, char **argv)
{
switch (c)
{
+ case 1 : args->force_ad = 1; break;
case 'e': args->filter_str = optarg; args->filter_logic |= FLT_EXCLUDE; break;
case 'i': args->filter_str = optarg; args->filter_logic |= FLT_INCLUDE; break;
case 't': args->targets = optarg; break;
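
Editorial sketch (not part of the upstream diff): the hunks above change the VAF calculation so that each sample's AD block is addressed with the observed per-sample count n_ad instead of rec->n_allele, and so that VAF is set to missing when an allele index falls outside that block. The standalone function below mirrors that logic under stated assumptions: VAF_MISSING is a stand-in for htslib's bcf_int32_missing, vaf_percent is a hypothetical name, and depths are assumed non-negative so that adding 0.5 and truncating gives the same result as round().

```c
#include <assert.h>
#include <stdint.h>

#define VAF_MISSING INT32_MIN  /* stand-in for htslib's bcf_int32_missing */

/* ad points at one sample's AD values and n_ad is the number of values
 * stored per sample. al0 and al1 are the indices of the two alleles.
 * Returns the percentage of reads supporting al1, rounded to the nearest
 * integer, 0 when both depths are zero, or VAF_MISSING when an index does
 * not fit into the AD block. */
static int32_t vaf_percent(const int32_t *ad, int n_ad, int al0, int al1)
{
    assert(ad != 0 && n_ad >= 0 && al0 >= 0 && al1 >= 0);
    if (al0 >= n_ad || al1 >= n_ad) return VAF_MISSING;
    int64_t depth = (int64_t)ad[al0] + ad[al1];
    if (depth <= 0) return 0;
    /* Assumes non-negative depths: +0.5 then truncation rounds to nearest. */
    return (int32_t)(ad[al1] * 100.0 / depth + 0.5);
}
```

For a flat array holding all samples, the block for sample i would start at ad + n_ad * i, which matches the pointer arithmetic in the patched loop.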
=====================================
test/norm.split.3.out
=====================================
@@ -0,0 +1,36 @@
+##fileformat=VCFv4.2
+##FILTER=<ID=PASS,Description="All filters passed">
+##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
+##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
+##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled likelihood">
+##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Depth">
+##contig=<ID=1,length=2147483647>
+##contig=<ID=2,length=2147483647>
+##contig=<ID=20,length=2147483647>
+##contig=<ID=21,length=2147483647>
+##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
+##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes">
+##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
+##INFO=<ID=XRF,Number=R,Type=Float,Description="Test Number=AGR in INFO">
+##INFO=<ID=XAF,Number=A,Type=Float,Description="Test Number=AGR in INFO">
+##INFO=<ID=XGF,Number=G,Type=Float,Description="Test Number=AGR in INFO">
+##INFO=<ID=XRI,Number=R,Type=Integer,Description="Test Number=AGR in INFO">
+##INFO=<ID=XAI,Number=A,Type=Integer,Description="Test Number=AGR in INFO">
+##INFO=<ID=XGI,Number=G,Type=Integer,Description="Test Number=AGR in INFO">
+##INFO=<ID=XRS,Number=R,Type=String,Description="Test Number=AGR in INFO">
+##INFO=<ID=XAS,Number=A,Type=String,Description="Test Number=AGR in INFO">
+##INFO=<ID=XGS,Number=G,Type=String,Description="Test Number=AGR in INFO">
+##FORMAT=<ID=FRF,Number=R,Type=Float,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FAF,Number=A,Type=Float,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FGF,Number=G,Type=Float,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FRI,Number=R,Type=Integer,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FAI,Number=A,Type=Integer,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FGI,Number=G,Type=Integer,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FRS,Number=R,Type=String,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FAS,Number=A,Type=String,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FGS,Number=G,Type=String,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FSTR,Number=1,Type=String,Description="Test String in FORMAT">
+##INFO=<ID=ISTR,Number=1,Type=String,Description="Test String in INFO">
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT XY00001 XY00002
+1 105 . TAAACCCTAAA TAA 999 PASS INDEL;AN=4;DP=19;ISTR=SomeString;XRS=AAA,BBB;XAS=AAA;XGS=A,B,C GT:DP:FRS:FAS:FGF 1/0:1:AAAA,BBB:A:1e+06,2e+06,3e+06 1/0:1:AAAA,BBB:A:1e+06,2e+06,3e+06
+1 105 . TAAACCCTAAA TAACCCTAAA 999 PASS INDEL;AN=4;DP=19;ISTR=SomeString;XRS=AAA,DDD;XAS=DDD;XGS=A,E,F GT:DP:FRS:FAS:FGF 0/1:1:AAAA,CC:BB:1e+06,500000,9e+09 0/1:1:AAAA,CC:BB:1e+06,500000,9e+09
=====================================
test/norm.split.3.vcf
=====================================
@@ -0,0 +1,35 @@
+##fileformat=VCFv4.2
+##FILTER=<ID=PASS,Description="All filters passed">
+##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
+##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
+##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled likelihood">
+##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Depth">
+##contig=<ID=1,length=2147483647>
+##contig=<ID=2,length=2147483647>
+##contig=<ID=20,length=2147483647>
+##contig=<ID=21,length=2147483647>
+##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
+##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes">
+##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
+##INFO=<ID=XRF,Number=R,Type=Float,Description="Test Number=AGR in INFO">
+##INFO=<ID=XAF,Number=A,Type=Float,Description="Test Number=AGR in INFO">
+##INFO=<ID=XGF,Number=G,Type=Float,Description="Test Number=AGR in INFO">
+##INFO=<ID=XRI,Number=R,Type=Integer,Description="Test Number=AGR in INFO">
+##INFO=<ID=XAI,Number=A,Type=Integer,Description="Test Number=AGR in INFO">
+##INFO=<ID=XGI,Number=G,Type=Integer,Description="Test Number=AGR in INFO">
+##INFO=<ID=XRS,Number=R,Type=String,Description="Test Number=AGR in INFO">
+##INFO=<ID=XAS,Number=A,Type=String,Description="Test Number=AGR in INFO">
+##INFO=<ID=XGS,Number=G,Type=String,Description="Test Number=AGR in INFO">
+##FORMAT=<ID=FRF,Number=R,Type=Float,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FAF,Number=A,Type=Float,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FGF,Number=G,Type=Float,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FRI,Number=R,Type=Integer,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FAI,Number=A,Type=Integer,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FGI,Number=G,Type=Integer,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FRS,Number=R,Type=String,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FAS,Number=A,Type=String,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FGS,Number=G,Type=String,Description="Test Number=AGR in FORMAT">
+##FORMAT=<ID=FSTR,Number=1,Type=String,Description="Test String in FORMAT">
+##INFO=<ID=ISTR,Number=1,Type=String,Description="Test String in INFO">
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT XY00001 XY00002
+1 105 . TAAACCCTAAA TAA,TAACCCTAAA 999 PASS INDEL;AN=4;AC=2,2,7;DP=19;ISTR=SomeString;XRF=1e+06,2e+06,500000,7;XRI=1111,2222,5555,7;XRS=AAA,BBB,DDD,xx;XAF=1e+06,500000,7;XAI=1111,5555,7;XAS=AAA,DDD,xx;XGF=1e+06,2e+06,3e+06,500000,.,9e+09,7;XGI=1111,2222,3333,5555,.,9999,7;XGS=A,B,C,E,.,F,x GT:PL:DP:FRF:FRI:FRS:FAF:FAI:FAS:FGF:FGI:FGS 1/2:1,2,3,4,5,6:1:1e+06,2e+06,500000:1111,2222,5555:AAAA,BBB,CC:1e+06,500000:1111,5555:A,BB:1e+06,2e+06,3e+06,500000,.,9e+09:1111,2222,3333,5555,.,9999:A,BB,CCC,EEEE,.,FFFFF 1/2:1,2,3,4,5,6,7:1:1e+06,2e+06,500000,7:1111,2222,5555,7:AAAA,BBB,CC:1e+06,500000,7:1111,5555,7:A,BB:1e+06,2e+06,3e+06,500000,.,9e+09:1111,2222,3333,5555,.,9999,7:A,BB,CCC,EEEE,.,FFFFF,XX
=====================================
test/test.pl
=====================================
@@ -175,6 +175,7 @@ test_vcf_query($opts,in=>'query.filter.10',out=>'query.74.out',args=>q[-f'%POS
test_vcf_norm($opts,in=>'norm',out=>'norm.out',fai=>'norm',args=>'-cx');
test_vcf_norm($opts,in=>'norm.split',out=>'norm.split.out',args=>'-m-');
test_vcf_norm($opts,in=>'norm.split.2',out=>'norm.split.2.out',args=>'-m-');
+test_vcf_norm($opts,in=>'norm.split.3',out=>'norm.split.3.out',args=>'-m- --force');
test_vcf_norm($opts,in=>'norm.split',fai=>'norm',out=>'norm.split.and.norm.out',args=>'-m-');
test_vcf_norm($opts,in=>'norm.merge',out=>'norm.merge.out',args=>'-m+');
test_vcf_norm($opts,in=>'norm.merge.2',out=>'norm.merge.2.out',args=>'-m+');
@@ -225,6 +226,11 @@ test_vcf_view($opts,in=>'idx.2',out=>'idx.2.out',args=>q[-H -r 1:1172777-1172804
test_vcf_view($opts,in=>'idx.2',out=>'idx.2.out',args=>q[-H -R {PATH}/idx.2.bed]);
test_vcf_view($opts,in=>'idx.3',out=>'idx.3.out',args=>q[-H -R {PATH}/idx.3.bed]);
test_vcf_view($opts,in=>'idx.4',out=>'idx.4.out',args=>q[-H -R {PATH}/idx.4.bed]);
+test_vcf_64bit($opts,in=>'view64bit.1',out=>'view64bit.1.out',do_bcf=>1);
+test_vcf_64bit($opts,in=>'view64bit.2',out=>'view64bit.2.out',do_bcf=>1);
+test_vcf_64bit($opts,in=>'view64bit.3',out=>'view64bit.3.out'); # large coordinates don't work with BCF
+test_vcf_64bit($opts,in=>'view64bit.4',out=>'view64bit.4.out',do_bcf=>1);
+test_vcf_64bit($opts,in=>'view64bit.5',out=>'view64bit.5.out',do_bcf=>1);
test_vcf_filter($opts,in=>'view.filter',out=>'view.filter.6.out',args=>q[-S. -e'TXT0="text"'],reg=>'');
test_vcf_filter($opts,in=>'view.filter',out=>'view.filter.7.out',args=>q[-S. -e'FMT/FRS[*:1]="BB"'],reg=>'');
test_vcf_filter($opts,in=>'view.filter',out=>'view.filter.8.out',args=>q[-S. -e'FMT/FGS[*:0]="AAAAAA"'],reg=>'');
@@ -402,6 +408,8 @@ test_vcf_plugin($opts,in=>'mendelian',out=>'mendelian.3.out',cmd=>'+mendelian',a
test_vcf_plugin($opts,in=>'contrast',out=>'contrast.out',cmd=>'+contrast',args=>'-0 a,b -1 c');
test_vcf_plugin($opts,in=>'contrast',out=>'contrast.out',cmd=>'+contrast',args=>'-0 {PATH}/contrast0.txt -1 {PATH}/contrast1.txt');
test_vcf_plugin($opts,in=>'trio-dnm.1',out=>'trio-dnm.1.out',cmd=>'+trio-dnm',args=>"-p proband,father,mother | $$opts{bin}/bcftools query -f'%CHROM[\\t%DNM]\\t[\\t%VAF]\\n'");
+test_vcf_plugin($opts,in=>'trio-dnm.2',out=>'trio-dnm.1.out',cmd=>'+trio-dnm',args=>"-p proband,father,mother --force-AD | $$opts{bin}/bcftools query -f'%CHROM[\\t%DNM]\\t[\\t%VAF]\\n'");
+test_vcf_plugin($opts,in=>'trio-dnm.2',out=>'trio-dnm.2.out',cmd=>'+trio-dnm',args=>"-p proband,father,mother | $$opts{bin}/bcftools query -f'%CHROM[\\t%DNM]\\t[\\t%VAF]\\n'");
test_vcf_plugin($opts,in=>'gvcfz',out=>'gvcfz.1.out',cmd=>'+gvcfz',args=>qq[-g 'PASS:GT!="alt"' -a | $$opts{bin}/bcftools query -f'%POS\\t%REF\\t%ALT\\t%END[\\t%GT][\\t%DP][\\t%GQ][\\t%RGQ]\\n']);
test_vcf_plugin($opts,in=>'gvcfz',out=>'gvcfz.2.out',cmd=>'+gvcfz',args=>qq[-g 'PASS:GQ>10; FLT:-' -a | $$opts{bin}/bcftools query -f'%POS\\t%REF\\t%ALT\\t%FILTER\\t%END[\\t%GT][\\t%DP][\\t%GQ][\\t%RGQ]\\n']);
test_vcf_plugin($opts,in=>'gvcfz.2',out=>'gvcfz.2.1.out',cmd=>'+gvcfz',args=>qq[-g 'PASS:GT!="alt"' -a | $$opts{bin}/bcftools query -f'%POS\\t%REF\\t%ALT\\t%FILTER\\t%END[\\t%GT][\\t%DP]\\n']);
@@ -956,6 +964,16 @@ sub test_vcf_view
test_cmd($opts,%args,cmd=>"$$opts{bin}/bcftools view -Ob $args{args} $$opts{tmp}/$args{in}.vcf.gz $args{reg} | $$opts{bin}/bcftools view | grep -v ^##bcftools_", exp_fix=>1);
}
}
+sub test_vcf_64bit
+{
+ my ($opts,%args) = @_;
+ test_cmd($opts,%args,cmd=>"$$opts{bin}/bcftools view $$opts{path}/$args{in}.vcf -H", exp_fix=>1);
+ test_cmd($opts,%args,cmd=>"$$opts{bin}/bcftools view $$opts{path}/$args{in}.vcf | $$opts{bin}/bcftools view -H", exp_fix=>1);
+ if ( $args{do_bcf} )
+ {
+ test_cmd($opts,%args,cmd=>"$$opts{bin}/bcftools view $$opts{path}/$args{in}.vcf -Ou | $$opts{bin}/bcftools view -H", exp_fix=>1);
+ }
+}
sub test_vcf_call
{
my ($opts,%args) = @_;
=====================================
test/trio-dnm.2.out
=====================================
@@ -0,0 +1,18 @@
+1 100 . . . . .
+1 98 . . . . .
+1 99 . . . . .
+1 98 . . . . .
+1 92 . . . . .
+1 98 . . . . .
+2 5 . . . . .
+2 6 . . . . .
+2 5 . . . . .
+3 3 . . . . .
+3 3 . . . . .
+3 3 . . . . .
+3 0 . . . . .
+3 3 . . . . .
+3 3 . . . . .
+3 3 . . . . .
+3 3 . . . . .
+3 3 . . . . .
=====================================
test/trio-dnm.2.vcf
=====================================
@@ -0,0 +1,31 @@
+##fileformat=VCFv4.2
+##FILTER=<ID=LowQual,Description="Low quality">
+##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
+##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
+##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
+##INFO=<ID=TP,Number=0,Type=Flag,Description="true positive">
+##INFO=<ID=FP,Number=0,Type=Flag,Description="false positive">
+##INFO=<ID=UN,Number=0,Type=Flag,Description="uncertain">
+##contig=<ID=1,length=249250621>
+##contig=<ID=2,length=249250621>
+##contig=<ID=3,length=249250621>
+##reference=file:///lustre/scratch113/resources/ref/Homo_sapiens/1000Genomes_hs37d5/hs37d5.fa
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT proband father mother
+1 1 . G A,T . . TP GT:AD:PL 0/2:62,0,27,0:733,919,3060,0,2141,2060 0/0:35,0,0:0,99,1485,99,1485,1485 0/0:38,0,0:0,102,1530,102,1530,1530
+1 2 . G T,A . . TP GT:AD:PL 0/1:67,14,0,0:257,0,2105,458,2146,2605 0/0:36,0,0:0,99,1485,99,1485,1485 0/0:36,0,0:0,99,1485,99,1485,1485
+1 3 . G A . . TP GT:AD:PL 0/1:71,14,0:241,0,2314 0/0:45,0:0,101,1530 0/0:43,0:0,99,1485
+1 4 . C T . . TP GT:AD:PL 0/1:111,24,0:504,0,3776 0/0:35,0:0,99,1485 0/0:36,0:0,99,1485
+1 5 . C A . . TP GT:AD:PL 0/1:30,6,0:124,0,981 0/0:37,0:0,99,1485 0/0:32,0:0,90,1350
+1 8 . A G . . TP GT:AD:PL 0/1:434,52,0:859,0,18086 0/0:38,0:0,99,1485 0/0:38,0:0,99,1485
+2 1 . A G . . UN GT:AD:PL 0/1:2,5,0:179,0,55 0/0:7,0:0,0,180 0/0:4,0:0,12,148
+2 2 . A G . . UN GT:AD:PL 0/1:4,5,0:159,0,126 0/0:1,0:0,3,39 0/0:4,0:0,9,135
+2 3 . A G . . UN GT:AD:PL 0/1:4,4,0:137,0,107 0/0:6,0:0,18,213 0/0:8,0:0,0,232
+3 1 . A G . . FP GT:AD:PL 0/1:7,9,0,0:357,0,408 0/0:15,0:0,39,585 0/1:4,3:114,0,550
+3 2 . C A . . FP GT:AD:PL 0/1:13,15,0:453,0,442 0/1:29,30:913,0,1011 0/0:39,0:0,99,1485
+3 3 . A G . . FP GT:AD:PL 0/1:11,12,0:361,0,358 0/0:21,0:0,51,765 0/1:10,15:538,0,292
+3 4 . A G,C . . FP GT:PL:AD 0/0:0,255,255,255,255,255:306,11,0,0 0/0:0,255,255,255,255,255:328,1,1 0/0:0,255,255,255,255,255:318,0,0
+3 5 . A G . . FP GT:AD:PL 0/1:33,32,0:890,0,963 0/1:56,45:1328,0,1809 0/0:36,0:0,99,1485
+3 6 . A G . . FP GT:AD:PL 0/1:19,24,0:737,0,649 0/0:48,0:0,108,1620 0/1:25,22:644,0,836
+3 7 . A G . . FP GT:AD:PL 0/1:73,90,0:2864,0,2197 0/0:42,0:0,99,1485 0/1:69,74:2395,0,2064
+3 8 . A G . . FP GT:AD:PL 0/1:115,128,0:4130,0,3542 0/0:34,0:0,99,1360 0/1:137,89:2571,0,4411
+3 9 . A G . . FP GT:AD:PL 0/1:18,11,0:311,0,627 0/1:3,3:51,0,105 0/0:19,0:0,57,764
=====================================
test/view64bit.1.out
=====================================
@@ -0,0 +1,2 @@
+chr1 1 . G C . . MPOS=.;XPOS=.,.,.;NALOD=-0.8279;NLOD=15.45;POPAF=6
+chr1 2 . G C . . MPOS=.;XPOS=.,.,.;NALOD=-0.8279;NLOD=15.45;POPAF=6
=====================================
test/view64bit.1.vcf
=====================================
@@ -0,0 +1,10 @@
+##fileformat=VCFv4.2
+##INFO=<ID=MPOS,Number=A,Type=Integer,Description="dummy">
+##INFO=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##INFO=<ID=NALOD,Number=A,Type=Float,Description="dummy">
+##INFO=<ID=NLOD,Number=A,Type=Float,Description="dummy">
+##INFO=<ID=POPAF,Number=A,Type=Float,Description="dummy">
+##contig=<ID=chr1,length=248956422>
+#CHROM POS ID REF ALT QUAL FILTER INFO
+chr1 1 . G C . . MPOS=-2147483641;XPOS=-2147483641,-2147483641,-2147483641;NALOD=-8.279e-01;NLOD=15.45;POPAF=6.00
+chr1 2 . G C . . MPOS=-2147483648;XPOS=-2147483648,-2147483648,-2147483648;NALOD=-8.279e-01;NLOD=15.45;POPAF=6.00
=====================================
test/view64bit.2.out
=====================================
@@ -0,0 +1 @@
+chr1 1 . G C . . MPOS=.;XPOS=.,.,. MPOS:XPOS .:.,.,.
=====================================
test/view64bit.2.vcf
=====================================
@@ -0,0 +1,8 @@
+##fileformat=VCFv4.2
+##INFO=<ID=MPOS,Number=A,Type=Integer,Description="dummy">
+##INFO=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##FORMAT=<ID=MPOS,Number=.,Type=Integer,Description="dummy">
+##FORMAT=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##contig=<ID=chr1,length=248956422>
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
+chr1 1 . G C . . MPOS=42949672950;XPOS=42949672950,42949672950,42949672950 MPOS:XPOS 42949672950:42949672950,42949672950,42949672950
=====================================
test/view64bit.3.out
=====================================
@@ -0,0 +1 @@
+chr1 42949672950 . G C . . MPOS=.;XPOS=.,.,. MPOS:XPOS .:.,.,.
=====================================
test/view64bit.3.vcf
=====================================
@@ -0,0 +1,8 @@
+##fileformat=VCFv4.2
+##INFO=<ID=MPOS,Number=A,Type=Integer,Description="dummy">
+##INFO=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##FORMAT=<ID=MPOS,Number=.,Type=Integer,Description="dummy">
+##FORMAT=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##contig=<ID=chr1,length=248956422>
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
+chr1 42949672950 . G C . . MPOS=42949672950;XPOS=42949672950,42949672950,42949672950 MPOS:XPOS 42949672950:42949672950,42949672950,42949672950
=====================================
test/view64bit.4.out
=====================================
@@ -0,0 +1 @@
+chr1 1 . G C . . MPOS=.;XPOS=.,.,. MPOS:XPOS .:.,.,.
=====================================
test/view64bit.4.vcf
=====================================
@@ -0,0 +1,8 @@
+##fileformat=VCFv4.2
+##INFO=<ID=MPOS,Number=A,Type=Integer,Description="dummy">
+##INFO=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##FORMAT=<ID=MPOS,Number=.,Type=Integer,Description="dummy">
+##FORMAT=<ID=XPOS,Number=.,Type=Integer,Description="dummy">
+##contig=<ID=chr1,length=248956422>
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
+chr1 1 . G C . . MPOS=.;XPOS=.,.,. MPOS:XPOS .:.,.,.
=====================================
test/view64bit.5.out
=====================================
@@ -0,0 +1,6 @@
+chr1 1 . G C . . END=42949672,1,.;MPOS=100000000 XX 1,-2147483640,.,.,.,2147483647,.
+chr1 1 . G C . . END=-2147483640 XX -2147483640
+chr1 1 . G C . . END=. XX .
+chr1 1 . G C . . END=. XX .
+chr1 1 . G C . . END=. XX .
+chr1 1 . G C . . END=. XX .
=====================================
test/view64bit.5.vcf
=====================================
@@ -0,0 +1,12 @@
+##fileformat=VCFv4.2
+##INFO=<ID=END,Number=A,Type=Integer,Description="dummy">
+##INFO=<ID=MPOS,Number=A,Type=Integer,Description="dummy">
+##FORMAT=<ID=XX,Number=A,Type=Integer,Description="dummy">
+##contig=<ID=chr1,length=248956422>
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1
+chr1 1 . G C . . END=42949672,1,5000000000;MPOS=100000000 XX 1,-2147483640,-2147483641,-2147483647,-2147483648,2147483647,2147483648
+chr1 1 . G C . . END=-2147483640 XX -2147483640
+chr1 1 . G C . . END=-2147483641 XX -2147483641
+chr1 1 . G C . . END=-2147483647 XX -2147483647
+chr1 1 . G C . . END=-2147483648 XX -2147483648
+chr1 1 . G C . . END=-2147483649 XX -2147483649
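
Editorial sketch (not part of the upstream diff): comparing test/view64bit.5.vcf with test/view64bit.5.out suggests which integer values survive a round trip. -2147483640 and 2147483647 are kept, while -2147483641, -2147483648, 2147483648 and 5000000000 come back as missing ("."). The function below encodes that observation. The constant of eight values excluded at the low end of the int32 range is an assumption inferred from these test files, and fits_bcf_int32 is a hypothetical name.

```c
#include <stdint.h>

/* Assumption inferred from the test data: the lowest eight int32 values
 * are not usable as ordinary integers. */
#define N_RESERVED_INT32 8

/* Returns 1 if v lies in the range the tests show as preserved,
 * 0 if the tests show it replaced by a missing value. */
static int fits_bcf_int32(int64_t v)
{
    return v >= (int64_t)INT32_MIN + N_RESERVED_INT32 && v <= (int64_t)INT32_MAX;
}
```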
=====================================
vcfisec.c
=====================================
@@ -1,6 +1,6 @@
/* vcfisec.c -- Create intersections, unions and complements of VCF files.
- Copyright (C) 2012-2014 Genome Research Ltd.
+ Copyright (C) 2012-2019 Genome Research Ltd.
Author: Petr Danecek <pd3 at sanger.ac.uk>
@@ -486,6 +486,9 @@ static void usage(void)
fprintf(stderr, " # Extract and write records from A shared by both A and B using exact allele match\n");
fprintf(stderr, " bcftools isec A.vcf.gz B.vcf.gz -p dir -n =2 -w 1\n");
fprintf(stderr, "\n");
+ fprintf(stderr, " # Extract and write records from C found in A and C but not in B\n");
+ fprintf(stderr, " bcftools isec A.vcf.gz B.vcf.gz C.vcf.gz -p dir -n~101 -w 3\n");
+ fprintf(stderr, "\n");
fprintf(stderr, " # Extract records private to A or B comparing by position only\n");
fprintf(stderr, " bcftools isec A.vcf.gz B.vcf.gz -p dir -n -1 -c all\n");
fprintf(stderr, "\n");
=====================================
vcfnorm.c
=====================================
@@ -97,7 +97,7 @@ typedef struct
char **argv, *output_fname, *ref_fname, *vcf_fname, *region, *targets;
int argc, rmdup, output_type, n_threads, check_ref, strict_filter, do_indels;
int nchanged, nskipped, nsplit, ntotal, mrows_op, mrows_collapse, parsimonious;
- int record_cmd_line;
+ int record_cmd_line, force, force_warned;
}
args_t;
@@ -466,23 +466,68 @@ static void split_info_numeric(args_t *args, bcf1_t *src, bcf_info_t *info, int
if ( len==BCF_VL_A ) \
{ \
if ( ret!=src->n_allele-1 ) \
+ { \
+ if ( args->force && !args->force_warned ) \
+ { \
+ fprintf(stderr, \
+ "Warning: wrong number of fields in INFO/%s at %s:%"PRId64", expected %d, found %d\n" \
+ " (This warning is printed only once.)\n", \
+ tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele-1,ret); \
+ args->force_warned = 1; \
+ } \
+ if ( args->force ) \
+ { \
+ bcf_update_info_##type(args->hdr,dst,tag,NULL,0); \
+ return; \
+ } \
error("Error: wrong number of fields in INFO/%s at %s:%"PRId64", expected %d, found %d\n", \
tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele-1,ret); \
+ } \
bcf_update_info_##type(args->hdr,dst,tag,vals+ialt,1); \
} \
else if ( len==BCF_VL_R ) \
{ \
if ( ret!=src->n_allele ) \
+ { \
+ if ( args->force && !args->force_warned ) \
+ { \
+ fprintf(stderr, \
+ "Warning: wrong number of fields in INFO/%s at %s:%"PRId64", expected %d, found %d\n" \
+ " (This warning is printed only once.)\n", \
+ tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele,ret); \
+ args->force_warned = 1; \
+ } \
+ if ( args->force ) \
+ { \
+ bcf_update_info_##type(args->hdr,dst,tag,NULL,0); \
+ return; \
+ } \
error("Error: wrong number of fields in INFO/%s at %s:%"PRId64", expected %d, found %d\n", \
tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele,ret); \
+ } \
if ( ialt!=0 ) vals[1] = vals[ialt+1]; \
bcf_update_info_##type(args->hdr,dst,tag,vals,2); \
} \
else if ( len==BCF_VL_G ) \
{ \
if ( ret!=src->n_allele*(src->n_allele+1)/2 ) \
+ { \
+ if ( args->force && !args->force_warned ) \
+ { \
+ fprintf(stderr, \
+ "Warning: wrong number of fields in INFO/%s at %s:%"PRId64", expected %d, found %d\n" \
+ " (This warning is printed only once.)\n", \
+ tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele*(src->n_allele+1)/2,ret); \
+ args->force_warned = 1; \
+ } \
+ if ( args->force ) \
+ { \
+ bcf_update_info_##type(args->hdr,dst,tag,NULL,0); \
+ return; \
+ } \
error("Error: wrong number of fields in INFO/%s at %s:%"PRId64", expected %d, found %d\n", \
tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele*(src->n_allele+1)/2,ret); \
+ } \
if ( ialt!=0 ) \
{ \
vals[1] = vals[bcf_alleles2gt(0,ialt+1)]; \
@@ -627,8 +672,23 @@ static void split_format_numeric(args_t *args, bcf1_t *src, bcf_fmt_t *fmt, int
if ( len==BCF_VL_A ) \
{ \
if ( nvals!=(src->n_allele-1)*nsmpl ) \
+ { \
+ if ( args->force && !args->force_warned ) \
+ { \
+ fprintf(stderr, \
+ "Warning: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d, found %d. Removing the field.\n" \
+ " (This warning is printed only once.)\n", \
+ tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,(src->n_allele-1)*nsmpl,nvals); \
+ args->force_warned = 1; \
+ } \
+ if ( args->force ) \
+ { \
+ bcf_update_format_##type(args->hdr,dst,tag,NULL,0); \
+ return; \
+ } \
error("Error: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d, found %d\n", \
tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,(src->n_allele-1)*nsmpl,nvals); \
+ } \
nvals /= nsmpl; \
type_t *src_vals = vals, *dst_vals = vals; \
for (i=0; i<nsmpl; i++) \
@@ -642,8 +702,23 @@ static void split_format_numeric(args_t *args, bcf1_t *src, bcf_fmt_t *fmt, int
else if ( len==BCF_VL_R ) \
{ \
if ( nvals!=src->n_allele*nsmpl ) \
+ { \
+ if ( args->force && !args->force_warned ) \
+ { \
+ fprintf(stderr, \
+ "Warning: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d, found %d. Removing the field.\n" \
+ " (This warning is printed only once.)\n", \
+ tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele*nsmpl,nvals); \
+ args->force_warned = 1; \
+ } \
+ if ( args->force ) \
+ { \
+ bcf_update_format_##type(args->hdr,dst,tag,NULL,0); \
+ return; \
+ } \
error("Error: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d, found %d\n", \
tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele*nsmpl,nvals); \
+ } \
nvals /= nsmpl; \
type_t *src_vals = vals, *dst_vals = vals; \
for (i=0; i<nsmpl; i++) \
@@ -658,7 +733,22 @@ static void split_format_numeric(args_t *args, bcf1_t *src, bcf_fmt_t *fmt, int
else if ( len==BCF_VL_G ) \
{ \
if ( nvals!=src->n_allele*(src->n_allele+1)/2*nsmpl && nvals!=src->n_allele*nsmpl ) \
+ { \
+ if ( args->force && !args->force_warned ) \
+ { \
+ fprintf(stderr, \
+ "Warning: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d, found %d. Removing the field.\n" \
+ " (This warning is printed only once.)\n", \
+ tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele*(src->n_allele+1)/2*nsmpl,nvals); \
+ args->force_warned = 1; \
+ } \
+ if ( args->force ) \
+ { \
+ bcf_update_format_##type(args->hdr,dst,tag,NULL,0); \
+ return; \
+ } \
error("Error at %s:%"PRId64", the tag %s has wrong number of fields\n", bcf_seqname(args->hdr,src),(int64_t) src->pos+1,bcf_hdr_int2id(args->hdr,BCF_DT_ID,fmt->id)); \
+ } \
nvals /= nsmpl; \
int all_haploid = nvals==src->n_allele ? 1 : 0; \
type_t *src_vals = vals, *dst_vals = vals; \
@@ -770,8 +860,23 @@ static void split_format_string(args_t *args, bcf1_t *src, bcf_fmt_t *fmt, int i
}
if ( nfields==1 && se-ptr==1 && *ptr=='.' ) continue; // missing value
if ( nfields!=src->n_allele*(src->n_allele+1)/2 && nfields!=src->n_allele )
+ {
+ if ( args->force && !args->force_warned )
+ {
+ fprintf(stderr,
+ "Warning: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d or %d, found %d. Removing the field.\n"
+ " (This warning is printed only once.)\n",
+ tag,bcf_seqname(args->hdr,src),(int64_t)src->pos+1,src->n_allele*(src->n_allele+1)/2,src->n_allele,nfields);
+ args->force_warned = 1;
+ }
+ if ( args->force )
+ {
+ bcf_update_format_char(args->hdr,dst,tag,NULL,0);
+ return;
+ }
error("Error: wrong number of fields in FMT/%s at %s:%"PRId64", expected %d or %d, found %d\n",
tag,bcf_seqname(args->hdr,src),(int64_t) src->pos+1,src->n_allele*(src->n_allele+1)/2,src->n_allele,nfields);
+ }
int len = 0;
if ( nfields==src->n_allele ) // haploid
@@ -1849,7 +1954,8 @@ static void usage(void)
fprintf(stderr, " -c, --check-ref <e|w|x|s> check REF alleles and exit (e), warn (w), exclude (x), or set (s) bad sites [e]\n");
fprintf(stderr, " -D, --remove-duplicates remove duplicate lines of the same type.\n");
fprintf(stderr, " -d, --rm-dup <type> remove duplicate snps|indels|both|all|exact\n");
- fprintf(stderr, " -f, --fasta-ref <file> reference sequence (MANDATORY)\n");
+ fprintf(stderr, " -f, --fasta-ref <file> reference sequence\n");
+ fprintf(stderr, " --force try to proceed even if malformed tags are encountered. Experimental, use at your own risk\n");
fprintf(stderr, " -m, --multiallelics <-|+>[type] split multiallelics (-) or join biallelics (+), type: snps|indels|both|any [both]\n");
fprintf(stderr, " --no-version do not append version and command line to the header\n");
fprintf(stderr, " -N, --do-not-normalize do not normalize indels (with -m or -c s)\n");
@@ -1863,6 +1969,13 @@ static void usage(void)
fprintf(stderr, " --threads <int> use multithreading with <int> worker threads [0]\n");
fprintf(stderr, " -w, --site-win <int> buffer for sorting lines which changed position during realignment [1000]\n");
fprintf(stderr, "\n");
+ fprintf(stderr, "Examples:\n");
+ fprintf(stderr, " # normalize and left-align indels\n");
+ fprintf(stderr, " bcftools norm -f ref.fa in.vcf\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " # split multi-allelic sites\n");
+ fprintf(stderr, " bcftools norm -m- in.vcf\n");
+ fprintf(stderr, "\n");
exit(1);
}
@@ -1886,6 +1999,7 @@ int main_vcfnorm(int argc, char *argv[])
static struct option loptions[] =
{
{"help",no_argument,NULL,'h'},
+ {"force",no_argument,NULL,7},
{"fasta-ref",required_argument,NULL,'f'},
{"do-not-normalize",no_argument,NULL,'N'},
{"multiallelics",required_argument,NULL,'m'},
@@ -1963,6 +2077,7 @@ int main_vcfnorm(int argc, char *argv[])
break;
case 9 : args->n_threads = strtol(optarg, 0, 0); break;
case 8 : args->record_cmd_line = 0; break;
+ case 7 : args->force = 1; break;
case 'h':
case '?': usage(); break;
default: error("Unknown argument: %s\n", optarg);
=====================================
version.sh
=====================================
@@ -1,7 +1,7 @@
#!/bin/sh
# Master version, for use in tarballs or non-git source copies
-VERSION=1.10
+VERSION=1.10.2
# If we have a git clone, then check against the current tag
if [ -e .git ]
View it on GitLab: https://salsa.debian.org/med-team/bcftools/commit/e439282c1235fa123a822dbcd7dae553cf4cb693