[med-svn] [varscan] 01/01: Enhancing packaging

Andreas Tille tille at debian.org
Wed Apr 16 17:34:24 UTC 2014


This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository varscan.

commit 30722343a0a45d073c5f7cddd8a4bd800ebaca73
Author: Andreas Tille <tille at debian.org>
Date:   Wed Apr 16 08:01:01 2014 +0200

    Enhancing packaging
---
 debian/changelog         |   2 +-
 debian/control           |   4 +-
 debian/copyright         | 231 +++++++++++++-
 debian/faq.txt           | 268 ++++++++++++++++
 debian/get-manual        |  21 ++
 debian/manifest          |   3 +
 debian/upstream/metadata |  23 +-
 debian/using-varscan.txt | 801 +++++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 1335 insertions(+), 18 deletions(-)

diff --git a/debian/changelog b/debian/changelog
index 961d122..8dbd399 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,4 +2,4 @@ varscan (2.3.6+dfsg-1) UNRELEASED; urgency=low
 
   * Initial release (Closes: #<bug>)
 
- -- DMPT <debian-med-packaging at lists.alioth.debian.org>  Thu, 24 May 2012 14:30:13 +0200
+ -- Andreas Tille <tille at debian.org>  Tue, 15 Apr 2014 13:38:37 +0200
diff --git a/debian/control b/debian/control
index 1305617..8b244ee 100644
--- a/debian/control
+++ b/debian/control
@@ -5,8 +5,8 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.
 Uploaders: Andreas Tille <tille at debian.org>
 Build-Depends: debhelper (>= 9)
 Standards-Version: 3.9.5
-Vcs-Browser: http://anonscm.debian.org/viewvc/debian-med/trunk/packages/varscan/trunk/
-Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/varscan/trunk/
+Vcs-Browser: http://anonscm.debian.org/gitweb/?p=debian-med/varscan.git
+Vcs-Git: git://anonscm.debian.org/debian-med/varscan.git
 Homepage: http://varscan.sourceforge.net/
 
 Package: varscan
diff --git a/debian/copyright b/debian/copyright
index ba82f3b..ed6cd04 100644
--- a/debian/copyright
+++ b/debian/copyright
@@ -1,14 +1,237 @@
 Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
 Upstream-Name: VarScan
+Upstream-Contact: Dan Koboldt http://genome.wustl.edu/people/individual/dan-koboldt/
 Source: http://sourceforge.net/projects/varscan/files/
 Files-Excluded:
     *.class
     META-INF
 
 Files: *
-Copyright: © 20xx-20yy <upstream>
-License: <license>
+Copyright: 2009-2014 by Washington University in St. Louis.
+License: NPOSL-3.0
+ The download page
+   http://sourceforge.net/projects/varscan/files/
+ says:
+   VarScan 2 is licensed under the Non-Profit Open Software License 3.0 (NPOSL-3.0)
+ .
+ Non-Profit Open Software License 3.0 (NPOSL-3.0)
+ .
+   This Non-Profit Open Software License ("Non-Profit OSL") version 3.0
+   (the "License") applies to any original work of authorship (the
+   "Original Work") whose owner (the "Licensor") has placed the following
+   licensing notice adjacent to the copyright notice for the Original
+   Work:
+ .
+   Licensed under the Non-Profit Open Software License version 3.0
+ .
+   1) Grant of Copyright License. Licensor grants You a worldwide,
+   royalty-free, non-exclusive, sublicensable license, for the duration of
+   the copyright, to do the following:
+ .
+   a) to reproduce the Original Work in copies, either alone or as part of
+   a collective work;
+ .
+   b) to translate, adapt, alter, transform, modify, or arrange the
+   Original Work, thereby creating derivative works ("Derivative Works")
+   based upon the Original Work;
+ .
+   c) to distribute or communicate copies of the Original Work and
+   Derivative Works to the public, with the proviso that copies of
+   Original Work or Derivative Works that You distribute or communicate
+   shall be licensed under this Non-Profit Open Software License or as
+   provided in section 17(d);
+ .
+   d) to perform the Original Work publicly; and
+ .
+   e) to display the Original Work publicly.
+ .
+   2) Grant of Patent License. Licensor grants You a worldwide,
+   royalty-free, non-exclusive, sublicensable license, under patent claims
+   owned or controlled by the Licensor that are embodied in the Original
+   Work as furnished by the Licensor, for the duration of the patents, to
+   make, use, sell, offer for sale, have made, and import the Original
+   Work and Derivative Works.
+ .
+   3) Grant of Source Code License. The term "Source Code" means the
+   preferred form of the Original Work for making modifications to it and
+   all available documentation describing how to modify the Original Work.
+   Licensor agrees to provide a machine-readable copy of the Source Code
+   of the Original Work along with each copy of the Original Work that
+   Licensor distributes. Licensor reserves the right to satisfy this
+   obligation by placing a machine-readable copy of the Source Code in an
+   information repository reasonably calculated to permit inexpensive and
+   convenient access by You for as long as Licensor continues to
+   distribute the Original Work.
+ .
+   4) Exclusions From License Grant. Neither the names of Licensor, nor
+   the names of any contributors to the Original Work, nor any of their
+   trademarks or service marks, may be used to endorse or promote products
+   derived from this Original Work without express prior permission of the
+   Licensor. Except as expressly stated herein, nothing in this License
+   grants any license to Licensor's trademarks, copyrights, patents, trade
+   secrets or any other intellectual property. No patent license is
+   granted to make, use, sell, offer for sale, have made, or import
+   embodiments of any patent claims other than the licensed claims defined
+   in Section 2. No license is granted to the trademarks of Licensor even
+   if such marks are included in the Original Work. Nothing in this
+   License shall be interpreted to prohibit Licensor from licensing under
+   terms different from this License any Original Work that Licensor
+   otherwise would have a right to license.
+ .
+   5) External Deployment. The term "External Deployment" means the use,
+   distribution, or communication of the Original Work or Derivative Works
+   in any way such that the Original Work or Derivative Works may be used
+   by anyone other than You, whether those works are distributed or
+   communicated to those persons or made available as an application
+   intended for use over a network. As an express condition for the grants
+   of license hereunder, You must treat any External Deployment by You of
+   the Original Work or a Derivative Work as a distribution under section
+   1(c).
+ .
+   6) Attribution Rights. You must retain, in the Source Code of any
+   Derivative Works that You create, all copyright, patent, or trademark
+   notices from the Source Code of the Original Work, as well as any
+   notices of licensing and any descriptive text identified therein as an
+   "Attribution Notice." You must cause the Source Code for any Derivative
+   Works that You create to carry a prominent Attribution Notice
+   reasonably calculated to inform recipients that You have modified the
+   Original Work.
+ .
+   7) Warranty of Provenance and Disclaimer of Warranty. The Original Work
+   is provided under this License on an "AS IS" BASIS and WITHOUT
+   WARRANTY, either express or implied, including, without limitation, the
+   warranties of non-infringement, merchantability or fitness for a
+   particular purpose. THE ENTIRE RISK AS TO THE QUALITY OF THE ORIGINAL
+   WORK IS WITH YOU. This DISCLAIMER OF WARRANTY constitutes an essential
+   part of this License. No license to the Original Work is granted by
+   this License except under this disclaimer.
+ .
+   8) Limitation of Liability. Under no circumstances and under no legal
+   theory, whether in tort (including negligence), contract, or otherwise,
+   shall the Licensor be liable to anyone for any direct, indirect,
+   special, incidental, or consequential damages of any character arising
+   as a result of this License or the use of the Original Work including,
+   without limitation, damages for loss of goodwill, work stoppage,
+   computer failure or malfunction, or any and all other commercial
+   damages or losses. This limitation of liability shall not apply to the
+   extent applicable law prohibits such limitation.
+ .
+   9) Acceptance and Termination. If, at any time, You expressly assented
+   to this License, that assent indicates your clear and irrevocable
+   acceptance of this License and all of its terms and conditions. If You
+   distribute or communicate copies of the Original Work or a Derivative
+   Work, You must make a reasonable effort under the circumstances to
+   obtain the express assent of recipients to the terms of this License.
+   This License conditions your rights to undertake the activities listed
+   in Section 1, including your right to create Derivative Works based
+   upon the Original Work, and doing so without honoring these terms and
+   conditions is prohibited by copyright law and international treaty.
+   Nothing in this License is intended to affect copyright exceptions and
+   limitations (including "fair use" or "fair dealing"). This License
+   shall terminate immediately and You may no longer exercise any of the
+   rights granted to You by this License upon your failure to honor the
+   conditions in Section 1(c).
+ .
+   10) Termination for Patent Action. This License shall terminate
+   automatically and You may no longer exercise any of the rights granted
+   to You by this License as of the date You commence an action, including
+   a cross-claim or counterclaim, against Licensor or any licensee
+   alleging that the Original Work infringes a patent. This termination
+   provision shall not apply for an action alleging patent infringement by
+   combinations of the Original Work with other software or hardware.
+ .
+   11) Jurisdiction, Venue and Governing Law. Any action or suit relating
+   to this License may be brought only in the courts of a jurisdiction
+   wherein the Licensor resides or in which Licensor conducts its primary
+   business, and under the laws of that jurisdiction excluding its
+   conflict-of-law provisions. The application of the United Nations
+   Convention on Contracts for the International Sale of Goods is
+   expressly excluded. Any use of the Original Work outside the scope of
+   this License or after its termination shall be subject to the
+   requirements and penalties of copyright or patent law in the
+   appropriate jurisdiction. This section shall survive the termination of
+   this License.
+ .
+   12) Attorneys' Fees. In any action to enforce the terms of this License
+   or seeking damages relating thereto, the prevailing party shall be
+   entitled to recover its costs and expenses, including, without
+   limitation, reasonable attorneys' fees and costs incurred in connection
+   with such action, including any appeal of such action. This section
+   shall survive the termination of this License.
+ .
+   13) Miscellaneous. If any provision of this License is held to be
+   unenforceable, such provision shall be reformed only to the extent
+   necessary to make it enforceable.
+ .
+   14) Definition of "You" in This License. "You" throughout this License,
+   whether in upper or lower case, means an individual or a legal entity
+   exercising rights under, and complying with all of the terms of, this
+   License. For legal entities, "You" includes any entity that controls,
+   is controlled by, or is under common control with you. For purposes of
+   this definition, "control" means (i) the power, direct or indirect, to
+   cause the direction or management of such entity, whether by contract
+   or otherwise, or (ii) ownership of fifty percent (50%) or more of the
+   outstanding shares, or (iii) beneficial ownership of such entity.
+ .
+   15) Right to Use. You may use the Original Work in all ways not
+   otherwise restricted or conditioned by this License or by law, and
+   Licensor promises not to interfere with or be responsible for such uses
+   by You.
+ .
+   16) Modification of This License. This License is Copyright © 2005
+   Lawrence Rosen. Permission is granted to copy, distribute, or
+   communicate this License without modification. Nothing in this License
+   permits You to modify this License as applied to the Original Work or
+   to Derivative Works. However, You may modify the text of this License
+   and copy, distribute or communicate your modified version (the
+   "Modified License") and apply it to other original works of authorship
+   subject to the following conditions: (i) You may not indicate in any
+   way that your Modified License is the "Open Software License" or "OSL"
+   and you may not use those names in the name of your Modified License;
+   (ii) You must replace the notice specified in the first paragraph above
+   with the notice "Licensed under <insert your license name here>" or
+   with a notice of your own that is not confusingly similar to the notice
+   in this License; and (iii) You may not claim that your original works
+   are open source software unless your Modified License has been approved
+   by Open Source Initiative (OSI) and You comply with its license review
+   and certification process.
+ .
+   17) Non-Profit Amendment. The name of this amended version of the Open
+   Software License ("OSL 3.0") is "Non-Profit Open Software License 3.0".
+   The original OSL 3.0 license has been amended as follows:
+ .
+   (a) Licensor represents and declares that it is a not-for-profit
+   organization that derives no revenue whatsoever from the distribution
+   of the Original Work or Derivative Works thereof, or from support or
+   services relating thereto.
+ .
+   (b) The first sentence of Section 7 ["Warranty of Provenance"] of OSL
+   3.0 has been stricken. For Original Works licensed under this
+   Non-Profit OSL 3.0, LICENSOR OFFERS NO WARRANTIES WHATSOEVER.
+ .
+   (c) In the first sentence of Section 8 ["Limitation of Liability"] of
+   this Non-Profit OSL 3.0, the list of damages for which LIABILITY IS
+   LIMITED now includes "direct" damages.
+ .
+   (d) The proviso in Section 1(c) of this License now refers to this
+   "Non-Profit Open Software License" rather than the "Open Software
+   License". You may distribute or communicate the Original Work or
+   Derivative Works thereof under this Non-Profit OSL 3.0 license only if
+   You make the representation and declaration in paragraph (a) of this
+   Section 17. Otherwise, You shall distribute or communicate the Original
+   Work or Derivative Works thereof only under the OSL 3.0 license and You
+   shall publish clear licensing notices so stating. Also by way of
+   clarification, this License does not authorize You to distribute or
+   communicate works under this Non-Profit OSL 3.0 if You received them
+   under the original OSL 3.0 license.
+ .
+   (e) Original Works licensed under this license shall reference
+   "Non-Profit OSL 3.0" in licensing notices to distinguish them from
+   works licensed under the original OSL 3.0 license.
 
 Files: debian/*
-Copyright: © 2014 maintainername <maintainer at e.mail>
-License: <license>
+Copyright: 2014 Andreas Tille <tille at debian.org>
+License: LGPL-3+
+ On Debian systems the full text of the GNU Lesser General Public License
+ is available at /usr/share/common-licenses/LGPL-3
+
diff --git a/debian/faq.txt b/debian/faq.txt
new file mode 100644
index 0000000..8f15168
--- /dev/null
+++ b/debian/faq.txt
@@ -0,0 +1,268 @@
+Support FAQ
+===========
+
+This page contains answers to many of the common questions asked about VarScan
+usage, performance, input/output, etc.
+
+
+USAGE QUESTIONS
+
+Which version of VarScan should I use?
+Which VarScan command should I use?
+Can I use VarScan on WGS, exome, or RNA-seq data?
+Does VarScan work for pooled samples?
+How do I use VarScan for validation?
+Can I use VarScan on my model organism or microbial genome?
+
+
+INPUT QUESTIONS
+
+What input format should I give to VarScan?
+Which aligner should I use?
+Do I need Illumina or Phred scale base qualities?
+What about mapping qualities?
+
+
+OUTPUT QUESTIONS
+
+What do the output columns mean?
+How are the p-values calculated?
+Why are all of my p-values 0.98?
+How should I filter the output files?
+Does VarScan provide a confidence score similar to SAMtools?
+
+
+COMMON ISSUES, WARNINGS, AND ERRORS
+
+Warnings about "resetting normal" or "resetting tumor" file
+Read counts from VarScan are different from SAMtools/IGV counts
+
+
+
+USAGE QUESTIONS
+
+Which version of VarScan should I use?
+
+Always use the latest version! Currently, this is v2.3.x. There are differences
+between versions; notably, between v2.2.2 and v2.2.3 we adjusted how reads are
+counted for indels, which had exhibited strange behavior due to some
+peculiarities in how SAMtools represents indels in the pileup files.
+
+
+Which VarScan command should I use?
+
+If you have a single sample and wish to call variants, you can use pileup2snp
+to call SNPs, pileup2indel to call indels, or pileup2cns to call consensus
+genotypes at every position meeting the coverage requirement. To call both SNPs
+and indels simultaneously, use pileup2cns with the --variants parameter set to
+1. If you have tumor-normal pairs, you should use the somatic command to call
+mutations (SNVs/indels) and the copynumber command to call copy number
+alterations (CNAs).
+
+
+Can I use VarScan on WGS, exome, or RNA-seq data?
+
+Yes, you can use VarScan for any of these. The default settings are optimized
+for exome data, where one expects to have at least 10x or 20x coverage across
+targeted exons. By lowering the minimum coverage requirement (to perhaps 6x)
+one can call variants in lower-coverage data. Warning: the output files for WGS
+data may be quite large, and the runtimes longer, since it will call 3+ million
+variants per genome. VarScan works for RNA-seq data as well, though this data
+tends to be noisier for variant calling due to RT errors, allele-specific
+expression, and alignment difficulties.
+
+
+Does VarScan work for pooled samples?
+
+Absolutely. It's simply a matter of setting appropriate input parameters,
+particularly --min-coverage, --min-var-freq, and --p-value. In pooled data, you
+might specify a higher minimum coverage, a lower variant allele frequency
+(conservatively, 0.50 / the number of samples), and a less-stringent p-value to
+detect rare variants.
+
+
+How do I use VarScan for validation?
+
+If validating germline SNPs/indels on an orthogonal sequencing platform, use
+the pileup2cns command, which will call consensus genotypes at all positions
+with sufficient coverage, rather than just the variants. This lets you
+determine sites that are refuted as wild-type. You might start with the
+following parameters:
+
+
+--min-coverage 20
+--min-var-freq 0.08
+--p-value 0.05
+
+If validating somatic SNPs/indels on an orthogonal sequencing platform, use the
+somatic command with --validation set to 1. You might start with the following
+parameters:
+
+
+--min-coverage 20
+--min-var-freq 0.08
+--p-value 0.10
+--somatic-p-value 0.05
+--validation 1
+
+
+
+Can I use VarScan on my model organism or microbial genome?
+
+Of course. All you need is a BAM file of your reads aligned to a reference
+sequence. If you don't have a reference sequence to which you might align
+reads, then no.
+
+
+INPUT QUESTIONS
+
+What input format should I give to VarScan?
+
+VarScan expects its input in SAMtools pileup format, which is obtained from a
+BAM file via the samtools pileup command. For example:
+
+
+samtools pileup -f myReference.fasta myReads.bam >myPileup.pileup
+java -jar VarScan.jar pileup2snp myPileup.pileup
+
+To save on disk space and I/O, you can also use a UNIX "pipe" command to
+forward the pileup output directly into VarScan:
+
+
+samtools pileup -f myRef.fasta myBam.bam | java -jar VarScan.jar pileup2snp
+
+
+
+Which aligner should I use?
+
+Great question! The choice of an aligner is an important one, and entire review
+articles have been written on the topic. For practical purposes, you should use
+an aligner whose output is (or can be converted to) a SAM/BAM file. I have
+heard that popular aligners include BWA, Bowtie, and Novoalign for Illumina
+data, SHRiMP and BFAST for SOLiD data, and SSAHA2 and BWA-SW for Roche/454
+data.
+
+
+Do I need Illumina or Phred scale base qualities?
+
+VarScan expects Phred-scaled (Phred+33, also called "Sanger") base qualities,
+in which a score of 20 indicates a 1/100 probability of base error. By
+default, VarScan requires a minimum base quality of 15 or 20 (depending on the
+application), so this value should be adjusted appropriately if alternate base
+qualities are used.
+
+
+What about mapping qualities?
+
+
+Many aligners provide a Phred-scaled quality value (mapping quality) for every
+read's alignment, which is correlated to the probability that the read is
+correctly mapped. A mapping quality of 0 typically indicates that the given
+read has many possible mapping locations of equal probability, and that the
+location given was chosen randomly. Thus, it's best to exclude reads with
+mapping quality of 0 from most downstream analyses. A minimum mapping quality
+of 10 is even better. It's possible to apply this threshold to a BAM file using
+SAMtools, as follows:
+
+samtools view -b -q 10 myBam.bam | samtools pileup -f myRef.fasta - |
+java -jar VarScan.jar pileup2snp
+
+
+
+OUTPUT QUESTIONS
+
+What do the output columns mean?
+
+By default, all VarScan output files should include headers. For detailed
+descriptions of the output columns and their meanings, see the "output"
+sections of the germline or somatic calling pages. You can also specify VCF
+output, which is widely documented.
+
+
+How are the p-values calculated?
+
+P-values are calculated using a Fisher's Exact Test on the read counts
+supporting reference and variant alleles. For details, see the germline or
+somatic calling pages.
+
+
+Why are all of my p-values 0.98?
+
+The p-value calculations are computationally expensive, so if the user doesn't
+specify a p-value threshold, VarScan skips the calculation in pileup2snp,
+pileup2indel, and pileup2cns. Instead, it inserts a dummy value of 0.98. To get
+p-values, set the --p-value parameter to something like 0.10, 0.05, or 0.01.
+
+
+How should I filter the output files?
+
+VarScan's default parameters aim to be sensitive when detecting variants, with
+the trade-off that it will report a lot of candidate variants, many of which
+will be false positives. These should be further filtered by coverage, variant
+allele frequency, strand representation, and p-value to isolate high-confidence
+calls. When performing the initial discovery, it is recommended that you set
+--strand-filter to 1. After discovery, you should run VarScan filter or
+somaticFilter to further refine your variant calls.
+
+
+Does VarScan provide a confidence score similar to SAMtools?
+
+It is possible to calculate a Phred-scaled confidence score from the p-value
+that VarScan provides:
+
+$score = -10 * log10($p_value);
+$score = 255 if($score > 255);
+
+This is done on a per-sample basis when --vcf-output is specified, since VCF
+format expects Phred-scaled confidence scores.
+
+
+COMMON ISSUES, WARNINGS, AND ERRORS
+
+Warnings about "resetting normal" or "resetting tumor" file
+
+If you provide two separate input (pileup) files, VarScan attempts to use the
+chromosome and position to simultaneously parse both files and match them up.
+Doing so while accounting for alphanumeric chromosome names is difficult,
+especially when there are chromosomes for which only one sample had coverage.
+VarScan guards against chromosomes being missed due to a sorting issue by
+resetting one sample's pileup file. When you have many chromosomes or contigs
+with coverage in just one sample, you'll see a lot of these warnings.
+
+The best way to address this simultaneous parsing issue is quite simple:
+provide VarScan with a two-sample MPILEUP file (normal and tumor) instead of
+two individual pileup files:
+
+samtools mpileup -f reference.fasta -q 1 -B normal.bam tumor.bam >normal-tumor.mpileup
+java -jar VarScan.jar somatic normal-tumor.mpileup normal-tumor.varScan.output --mpileup 1
+
+In the mpileup file, SAMtools already does the chromosome and position
+matching-up, so there's no room for error. You'll also notice that I provided
+the -B parameter. This disables SAMtools BAQ computation, which is turned on by
+default in SAMtools mpileup but is, as the author admits, occasionally too
+stringent for variant calling.
+
+Read counts from VarScan are different from SAMtools/IGV counts
+
+By default, SAMtools and IGV show and count all bases at a given position,
+regardless of base quality. In contrast, VarScan requires that bases meet the
+minimum Phred quality score (default 15 for most commands) to count them for
+things like read counts (reads1, reads2) and to compute variant allele
+frequency. However, when VarScan reports the depth (such as in the DP field of
+VCF output), it reports SAMtools raw depth. To get VarScan read counts to more
+closely match another tool, set the parameter --min-avg-qual 0. And use
+caution! Low-quality bases, with the occasional exception of BAQ penalties,
+should not be trusted.
+
+Also, VarScan reports variants on a biallelic basis. That is, for a given SNP
+call, the "reads1" column is the number of reference-supporting reads (RD), and
+the "reads2" column is the number of variant-supporting reads (AD). There may
+be additional reads at that position showing other bases (SNP or indel
+variants). If these other variants meet the calling criteria, they will be
+reported in their own line. If not, it may look like you have "missing" reads.
+
+
+
+Copyright © 2009-2013 by Washington University in St. Louis.
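
A minimal sketch, outside the patch itself, of the arithmetic quoted in
debian/faq.txt above: the Phred scale for base qualities, the Phred-scaled
confidence score derived from a p-value, and the conservative variant allele
frequency suggested for pooled samples. The function names are hypothetical
and are not part of VarScan. Treating a p-value of 0 as the capped score is an
assumption of this sketch, since the FAQ's Perl formula does not cover that
case.

```python
import math


def phred_to_error_probability(quality):
    """Return the error probability implied by a Phred-scaled quality."""
    return 10 ** (-quality / 10)


def p_value_to_phred_score(p_value, cap=255):
    """Return -10 * log10(p_value), capped at `cap` as in the FAQ formula."""
    if p_value <= 0:
        # log10(0) is undefined; returning the cap here is an assumption
        # of this sketch, not documented VarScan behavior.
        return cap
    return min(-10 * math.log10(p_value), cap)


def pooled_min_var_freq(sample_count):
    """Return the FAQ's conservative --min-var-freq: 0.50 / number of samples."""
    return 0.50 / sample_count
```

For example, p_value_to_phred_score(0.05) gives roughly 13, and
pooled_min_var_freq(10) gives 0.05, which could then be passed to
--min-var-freq.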
diff --git a/debian/get-manual b/debian/get-manual
new file mode 100755
index 0000000..27d123d
--- /dev/null
+++ b/debian/get-manual
@@ -0,0 +1,21 @@
+#!/bin/sh
+TMP=`mktemp`
+OUT=using-varscan.txt
+w3m -dump http://varscan.sourceforge.net/using-varscan.html > $TMP
+cat >$OUT <<EOT
+VarScan User's Manual
+=====================
+EOT
+sed "1,/^VarScan User's Manual/d" $TMP >> $OUT
+
+
+OUT=faq.txt
+w3m -dump http://varscan.sourceforge.net/support-faq.html > $TMP
+cat >$OUT <<EOT
+Support FAQ
+===========
+EOT
+sed "1,/^Support FAQ/d" $TMP >> $OUT
+
+rm -f $TMP
+
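
A minimal sketch, outside the patch itself, of what the sed expression
"1,/^TITLE/d" in debian/get-manual does to the w3m dump: it deletes everything
from the first line through the first later line matching the title, so only
the page body remains. The function name is hypothetical. The sketch assumes
standard sed range semantics, where the end pattern is only tested from the
second line onward and a missing match deletes every line.

```python
import re


def drop_through_first_match(lines, pattern):
    """Mimic sed '1,/pattern/d' on a list of lines."""
    regex = re.compile(pattern)
    for index, line in enumerate(lines):
        # The range starts on line 1, so the end pattern is only
        # checked from the second line (index 1) onward.
        if index >= 1 and regex.search(line):
            return lines[index + 1:]
    # No closing match: the range runs to the end of input.
    return []
```

Given a dump whose navigation text precedes the "VarScan User's Manual" title
line, the function returns only the lines after that title, which is what the
script appends beneath its own freshly written heading.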
diff --git a/debian/manifest b/debian/manifest
new file mode 100644
index 0000000..416f0ea
--- /dev/null
+++ b/debian/manifest
@@ -0,0 +1,3 @@
+Manifest-Version: 1.0
+Main-Class: net.sf.varscan.VarScan
+
diff --git a/debian/upstream/metadata b/debian/upstream/metadata
index d8b5812..428fd3e 100644
--- a/debian/upstream/metadata
+++ b/debian/upstream/metadata
@@ -1,12 +1,13 @@
 Reference:
-  Author: 
-  Title: 
-  Journal: 
-  Year: 
-  Volume: 
-  Number: 
-  Pages: 
-  DOI: 
-  PMID:
-  URL: 
-  eprint: 
+  Author: Daniel C. Koboldt and Qunyuan Zhang and David E. Larson and Dong Shen and Michael D. McLellan and Ling Lin and Christopher A. Miller and Elaine R. Mardis and Li Ding and Richard K. Wilson
+  Title: "VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing"
+  Journal: Genome Res.
+  Year: 2012
+  Volume: 22
+  Number: 3
+  Pages: 568-76
+  DOI: 10.1101/gr.129684.111
+  PMID: 22300766
+  URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3290792/
+  eprint: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3290792/pdf/568.pdf
+
diff --git a/debian/using-varscan.txt b/debian/using-varscan.txt
new file mode 100644
index 0000000..8d03809
--- /dev/null
+++ b/debian/using-varscan.txt
@@ -0,0 +1,801 @@
+VarScan User's Manual
+=====================
+
+VarScan is coded in Java, and should be executed from the command line
+(Terminal, in Linux/UNIX/OSX, or Command Prompt in MS Windows). For variant
+calling, you will need a pileup file. See the How to Build A Pileup File
+section for details. Running VarScan with no arguments prints the usage
+information. Because some fields changed as of VarScan v2.2.3, we are providing
+updated documentation for the current release. For documentation of v2.2.2 and
+prior, see below.
+
+
+VarScan Documentation (v2.2.3 and later)
+
+
+        USAGE: java -jar VarScan.jar  [COMMAND] [OPTIONS]
+
+        COMMANDS:
+
+        Single-sample Calling:
+        pileup2snp [pileup file]
+        pileup2indel [pileup file]
+        pileup2cns [pileup file]
+
+        Multi-sample Calling:
+        mpileup2snp [mpileup file]
+        mpileup2indel [mpileup file]
+        mpileup2cns [mpileup file]
+
+        Tumor-normal Comparison:
+        somatic [normal pileup] [tumor pileup] or [normal-tumor mpileup]
+        copynumber [normal pileup] [tumor pileup] or [normal-tumor mpileup]
+
+        Variant Filtering:
+        filter [variants file]
+        somaticFilter [mutations file]
+
+        Utility Functions:
+        limit [variants file]
+        readcounts [pileup file]
+        compare [file1] [file2]
+
+
+pileup2snp
+
+This command calls SNPs from a pileup file based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar pileup2snp [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+        OUTPUT
+        Tab-delimited SNP calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Cons            Consensus genotype of sample in IUPAC format.
+        Reads1          reads supporting reference allele
+        Reads2          reads supporting variant allele
+        VarFreq         frequency of variant allele by read count
+        Strands1        strands on which reference allele was observed
+        Strands2        strands on which variant allele was observed
+        Qual1           average base quality of reference-supporting read bases
+        Qual2           average base quality of variant-supporting read bases
+        Pvalue          Significance of variant read count vs. expected baseline error
+        MapQual1        Average map quality of ref reads (only useful if in pileup)
+        MapQual2        Average map quality of var reads (only useful if in pileup)
+        Reads1Plus      Number of reference-supporting reads on + strand
+        Reads1Minus     Number of reference-supporting reads on - strand
+        Reads2Plus      Number of variant-supporting reads on + strand
+        Reads2Minus     Number of variant-supporting reads on - strand
+        VarAllele       Most frequent non-reference allele observed
+
+
+pileup2indel
+
+This command calls indels from a pileup file based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar pileup2indel [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+        OUTPUT
+        Tab-delimited indel calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Cons            Consensus genotype of sample; */(var) indicates heterozygous
+        Reads1          reads supporting reference allele
+        Reads2          reads supporting variant allele
+        VarFreq         frequency of variant allele by read count
+        Strands1        strands on which reference allele was observed
+        Strands2        strands on which variant allele was observed
+        Qual1           average base quality of reference-supporting read bases
+        Qual2           average base quality of variant-supporting read bases
+        Pvalue          Significance of variant read count vs. expected baseline error
+        MapQual1        Average map quality of ref reads (only useful if in pileup)
+        MapQual2        Average map quality of var reads (only useful if in pileup)
+        Reads1Plus      Number of reference-supporting reads on + strand
+        Reads1Minus     Number of reference-supporting reads on - strand
+        Reads2Plus      Number of variant-supporting reads on + strand
+        Reads2Minus     Number of variant-supporting reads on - strand
+        VarAllele       Most frequent non-reference allele observed
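+
+As an illustrative sketch (the file names sample.pileup and sample.indel are
+hypothetical, and capturing the calls by redirecting standard output is an
+assumption), an invocation overriding two of the defaults might look like:
+
+        java -jar VarScan.jar pileup2indel sample.pileup --min-coverage 10 --min-reads2 4 > sample.indel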
+
+
+pileup2cns
+
+This command makes consensus calls (SNP/Indel/Reference) from a pileup file
+based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar pileup2cns [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+        OUTPUT
+        Tab-delimited consensus calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Cons            Consensus genotype of sample; */(var) indicates heterozygous
+        Reads1          reads supporting reference allele
+        Reads2          reads supporting variant allele
+        VarFreq         frequency of variant allele by read count
+        Strands1        strands on which reference allele was observed
+        Strands2        strands on which variant allele was observed
+        Qual1           average base quality of reference-supporting read bases
+        Qual2           average base quality of variant-supporting read bases
+        Pvalue          Significance of variant read count vs. expected baseline error
+        MapQual1        Average map quality of ref reads (only useful if in pileup)
+        MapQual2        Average map quality of var reads (only useful if in pileup)
+        Reads1Plus      Number of reference-supporting reads on + strand
+        Reads1Minus     Number of reference-supporting reads on - strand
+        Reads2Plus      Number of variant-supporting reads on + strand
+        Reads2Minus     Number of variant-supporting reads on - strand
+        VarAllele       Most frequent non-reference allele observed
+
+mpileup2snp
+
+This command calls SNPs from an mpileup file based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar mpileup2snp [mpileup file] OPTIONS
+        mpileup file - The SAMtools mpileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --min-freq-for-hom      Minimum frequency to call homozygote [0.75]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+        --strand-filter Ignore variants with >90% support on one strand [1]
+        --output-vcf    If set to 1, outputs in VCF format
+        --variants      Report only variant (SNP/indel) positions (mpileup2cns only) [0]
+
+
+        OUTPUT
+        Tab-delimited SNP calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             variant allele observed
+        PoolCall        Cross-sample call using all data (Cons:Cov:Reads1:Reads2:Freq:P-value)
+                        Cons - consensus genotype in IUPAC format
+                        Cov - total depth of coverage
+                        Reads1 - number of reads supporting reference
+                        Reads2 - number of reads supporting variant
+                        Freq - the variant allele frequency by read count
+                        P-value - FET p-value of observed reads vs expected non-variant
+        StrandFilt      Information to look for strand bias using all reads (R1+:R1-:R2+:R2-:pval)
+                        R1+ = reference supporting reads on forward strand
+                        R1- = reference supporting reads on reverse strand
+                        R2+ = variant supporting reads on forward strand
+                        R2- = variant supporting reads on reverse strand
+                        pval = FET p-value for strand distribution, R1 versus R2
+        SamplesRef      Number of samples called reference (wildtype)
+        SamplesHet      Number of samples called heterozygous-variant
+        SamplesHom      Number of samples called homozygous-variant
+        SamplesNC       Number of samples not covered / not called
+        SampleCalls     The calls for each sample in the mpileup, space-delimited
+                        Each sample has six values separated by colons:
+                        Cons - consensus genotype in IUPAC format
+                        Cov - total depth of coverage
+                        Reads1 - number of reads supporting reference
+                        Reads2 - number of reads supporting variant
+                        Freq - the variant allele frequency by read count
+                        P-value - FET p-value of observed reads vs expected non-variant
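+
+A sketch of requesting VCF output from mpileup2snp (cohort.mpileup and
+cohort.snp.vcf are hypothetical file names; capturing the result by
+redirecting standard output is an assumption):
+
+        java -jar VarScan.jar mpileup2snp cohort.mpileup --min-var-freq 0.05 --output-vcf 1 > cohort.snp.vcf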
+
+
+mpileup2indel
+
+This command calls indels from an mpileup file based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar mpileup2indel [mpileup file] OPTIONS
+        mpileup file - The SAMtools mpileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --min-freq-for-hom      Minimum frequency to call homozygote [0.75]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+        --strand-filter Ignore variants with >90% support on one strand [1]
+        --output-vcf    If set to 1, outputs in VCF format
+        --variants      Report only variant (SNP/indel) positions (mpileup2cns only) [0]
+
+
+        OUTPUT
+        Tab-delimited indel calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             variant allele observed
+        PoolCall        Cross-sample call using all data (Cons:Cov:Reads1:Reads2:Freq:P-value)
+                                Cons - consensus genotype in IUPAC format
+                                Cov - total depth of coverage
+                                Reads1 - number of reads supporting reference
+                                Reads2 - number of reads supporting variant
+                                Freq - the variant allele frequency by read count
+                                P-value - FET p-value of observed reads vs expected non-variant
+        StrandFilt      Information to look for strand bias using all reads, format R1+:R1-:R2+:R2-:pval
+                                R1+ = reference supporting reads on forward strand
+                                R1- = reference supporting reads on reverse strand
+                                R2+ = variant supporting reads on forward strand
+                                R2- = variant supporting reads on reverse strand
+                                pval = FET p-value for strand distribution, R1 versus R2
+        SamplesRef      Number of samples called reference (wildtype)
+        SamplesHet      Number of samples called heterozygous-variant
+        SamplesHom      Number of samples called homozygous-variant
+        SamplesNC       Number of samples not covered / not called
+        SampleCalls     The calls for each sample in the mpileup, space-delimited
+                        Each sample has six values separated by colons:
+                        Cons - consensus genotype in IUPAC format
+                        Cov - total depth of coverage
+                        Reads1 - number of reads supporting reference
+                        Reads2 - number of reads supporting variant
+                        Freq - the variant allele frequency by read count
+                        P-value - FET p-value of observed reads vs expected non-variant
+
+
+mpileup2cns
+
+This command makes consensus calls (SNP/Indel/Reference) from an mpileup file
+based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar mpileup2cns [mpileup file] OPTIONS
+        mpileup file - The SAMtools mpileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --min-freq-for-hom      Minimum frequency to call homozygote [0.75]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+        --strand-filter Ignore variants with >90% support on one strand [1]
+        --output-vcf    If set to 1, outputs in VCF format
+        --variants      Report only variant (SNP/indel) positions (mpileup2cns only) [0]
+
+
+        OUTPUT
+        Tab-delimited consensus calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             variant allele observed
+        PoolCall        Cross-sample call using all data (Cons:Cov:Reads1:Reads2:Freq:P-value)
+                                Cons - consensus genotype in IUPAC format
+                                Cov - total depth of coverage
+                                Reads1 - number of reads supporting reference
+                                Reads2 - number of reads supporting variant
+                                Freq - the variant allele frequency by read count
+                                P-value - FET p-value of observed reads vs expected non-variant
+        StrandFilt      Information to look for strand bias using all reads, format R1+:R1-:R2+:R2-:pval
+                                R1+ = reference supporting reads on forward strand
+                                R1- = reference supporting reads on reverse strand
+                                R2+ = variant supporting reads on forward strand
+                                R2- = variant supporting reads on reverse strand
+                                pval = FET p-value for strand distribution, R1 versus R2
+        SamplesRef      Number of samples called reference (wildtype)
+        SamplesHet      Number of samples called heterozygous-variant
+        SamplesHom      Number of samples called homozygous-variant
+        SamplesNC       Number of samples not covered / not called
+        SampleCalls     The calls for each sample in the mpileup, space-delimited
+                        Each sample has six values separated by colons:
+                        Cons - consensus genotype in IUPAC format
+                        Cov - total depth of coverage
+                        Reads1 - number of reads supporting reference
+                        Reads2 - number of reads supporting variant
+                        Freq - the variant allele frequency by read count
+                        P-value - FET p-value of observed reads vs expected non-variant
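+
+To illustrate the colon-separated layout, a made-up per-sample entry (not
+taken from real VarScan output; the exact number formatting is an assumption)
+could be read as follows:
+
+        R:30:18:12:40%:1.5E-4
+        Cons    = R       (IUPAC code for A/G)
+        Cov     = 30
+        Reads1  = 18
+        Reads2  = 12
+        Freq    = 40%     (12 variant reads out of 30)
+        P-value = 1.5E-4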
+
+
+somatic
+
+This command calls variants and identifies their somatic status (Germline/LOH/
+Somatic) using pileup files from a matched tumor-normal pair.
+
+        USAGE: java -jar VarScan.jar somatic [normal_pileup] [tumor_pileup] [output] OPTIONS
+        normal_pileup - The SAMtools pileup file for Normal
+        tumor_pileup - The SAMtools pileup file for Tumor
+        output - Output base name for SNP and indel output
+
+You can also give it a single mpileup file with normal and tumor data.
+
+
+        USAGE: java -jar VarScan.jar somatic [normal-tumor.mpileup] [output] --mpileup 1 OPTIONS
+        normal-tumor.mpileup - The SAMtools mpileup file with normal and then tumor
+        output - Output base name for SNP and indel output
+
+Both formats of the command share these common options:
+
+
+        OPTIONS:
+        --output-snp - Output file for SNP calls [default: output.snp]
+        --output-indel - Output file for indel calls [default: output.indel]
+        --min-coverage - Minimum coverage in normal and tumor to call variant [8]
+        --min-coverage-normal - Minimum coverage in normal to call somatic [8]
+        --min-coverage-tumor - Minimum coverage in tumor to call somatic [6]
+        --min-var-freq - Minimum variant frequency to call a heterozygote [0.10]
+        --min-freq-for-hom      Minimum frequency to call homozygote [0.75]
+        --normal-purity - Estimated purity (non-tumor content) of normal sample [1.00]
+        --tumor-purity - Estimated purity (tumor content) of tumor sample [1.00]
+        --p-value - P-value threshold to call a heterozygote [0.99]
+        --somatic-p-value - P-value threshold to call a somatic site [0.05]
+        --strand-filter - If set to 1, removes variants with >90% strand bias
+        --validation - If set to 1, outputs all compared positions even if non-variant
+
+Note that more specific options (e.g. min-coverage-normal) will override the
+default or specified value of less specific options (e.g. min-coverage).
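+
+For example, in the following sketch (file names are hypothetical), and going
+by the note above, normal positions would need at least 10 reads because only
+min-coverage applies to them, while tumor positions would need only 6 because
+min-coverage-tumor overrides it:
+
+        java -jar VarScan.jar somatic normal.pileup tumor.pileup sample --min-coverage 10 --min-coverage-tumor 6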
+
+The normal and tumor purity values should be a value between 0 and 1. The
+default (1) implies that the normal is 100% pure with no contaminating tumor
+cells, and the tumor is 100% pure with no contaminating stromal or other
+non-malignant cells. You would change tumor-purity to something less than 1 if
+you have a low-purity tumor sample and thus expect lower variant allele
+frequencies for mutations. You would change normal-purity to something less
+than 1 only if it's possible that there will be some tumor content in your
+"normal" sample, e.g. adjacent normal tissue for a solid tumor, malignant blood
+cells in the skin punch normal for some liquid tumors, etc.
+
+There are two p-value options. One (p-value) is the significance threshold for
+the first-pass algorithm that determines, for each position, if either normal
+or tumor is variant at that position. The second (somatic-p-value) is more
+important; this is the threshold below which read count differences between
+tumor and normal are deemed significant enough to classify the variant as a
+somatic mutation or an LOH event. In the case of a shared (germline) variant,
+this p-value is used to determine if the combined normal and tumor evidence
+differ significantly enough from the null hypothesis (no variant with same
+coverage) to report the variant. See the somatic mutation calling section for
+details.
+
+
+        OUTPUT
+        Two tab-delimited files (SNPs and Indels) with the following columns:
+        chrom                   chromosome name
+        position                position (1-based from the pileup)
+        ref                     reference allele at this position
+        var                     variant allele at this position
+        normal_reads1           reads supporting reference allele
+        normal_reads2           reads supporting variant allele
+        normal_var_freq         frequency of variant allele by read count
+        normal_gt               genotype call for Normal sample
+        tumor_reads1            reads supporting reference allele
+        tumor_reads2            reads supporting variant allele
+        tumor_var_freq          frequency of variant allele by read count
+        tumor_gt                genotype call for Tumor sample
+        somatic_status          status of variant (Germline, Somatic, or LOH)
+        variant_p_value         Significance of variant read count vs. baseline error rate
+        somatic_p_value         Significance of tumor read count vs. normal read count
+        tumor_reads1_plus       Ref-supporting reads from + strand in tumor
+        tumor_reads1_minus      Ref-supporting reads from - strand in tumor
+        tumor_reads2_plus       Var-supporting reads from + strand in tumor
+        tumor_reads2_minus      Var-supporting reads from - strand in tumor
+
+
+copynumber
+
+This command reports raw regions of relative copy number (tumor depth versus
+normal depth) using pileup files from a matched tumor-normal pair.
+
+        USAGE: java -jar VarScan.jar copynumber [normal_pileup] [tumor_pileup] [output] OPTIONS
+        normal_pileup - The SAMtools pileup file for Normal
+        tumor_pileup - The SAMtools pileup file for Tumor
+        output - Output base name for the copynumber output
+
+You can also give it a single mpileup file with normal and tumor data.
+
+
+        USAGE: java -jar VarScan.jar copynumber [normal-tumor.mpileup] [output] --mpileup 1 OPTIONS
+        normal-tumor.mpileup - The SAMtools mpileup file with normal and then tumor
+        output - Output base name for the copynumber output
+
+Both formats of the command share these common options:
+
+
+        OPTIONS:
+        --min-base-qual - Minimum base quality to count for coverage [20]
+        --min-map-qual - Minimum read mapping quality to count for coverage [20]
+        --min-coverage - Minimum coverage threshold for copynumber segments [20]
+        --min-segment-size - Minimum number of consecutive bases to report a segment [10]
+        --max-segment-size - Max size before a new segment is made [100]
+        --p-value - P-value threshold for significant copynumber change-point [0.01]
+        --data-ratio - The normal/tumor input data ratio for copynumber adjustment [1.0]
+
+Note: The data ratio is intended to help you account for overall differences in
+the amount of sequencing coverage between normal and tumor, which might
+otherwise give the appearance of global copy number differences. If normal has
+more data than tumor, set this to something greater than 1. If tumor has more
+data than normal, adjust it to something below 1. A basic formula for data
+ratio might be something like ratio = normal_unique_bp / tumor_unique_bp where
+unique base pairs are computed as mapped_non_dup_reads * read_length.
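+
+As a worked example with made-up numbers, suppose both samples have 100 bp
+reads, normal has 400 million mapped non-duplicate reads, and tumor has 500
+million:
+
+        normal_unique_bp = 400,000,000 * 100 = 40,000,000,000
+        tumor_unique_bp  = 500,000,000 * 100 = 50,000,000,000
+        ratio            = 40,000,000,000 / 50,000,000,000 = 0.8
+
+Tumor has more data than normal here, so the ratio falls below 1, and the
+command could be given --data-ratio 0.8.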
+
+
+        OUTPUT
+        chrom                   Chromosome name
+        chr_start               Region start position (1-based from the pileup)
+        chr_stop                Region stop position (1-based from the pileup)
+        num_positions           Size of the region in base pairs
+        normal_depth            Average normal sequence depth for the region
+        tumor_depth             Average tumor sequence depth for the region
+        log2_ratio              Log-base-2 ratio of adjusted tumor depth over normal depth
+        gc_content              Estimated GC content of the region (0-100)
+
+The raw regions reported by VarScan are delineated by drops in coverage or
+changes in the tumor/normal ratio, so there are many small, nearby regions with
+similar copy number. It is therefore recommended that raw VarScan copynumber
+output be processed with circular binary segmentation (CBS) or a similar
+algorithm, which will generate larger segments delineated by statistically
+significant change points. See the copy number calling section for details.
+
+filter
+
+This command filters variants in a file by coverage, supporting reads, variant
+frequency, or average base quality. It is for use with output from pileup2snp
+or pileup2indel.
+
+        USAGE: java -jar VarScan.jar filter [variants file] OPTIONS
+        variants file - A file of SNP or indel calls from VarScan pileup2snp or pileup2indel
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [10]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-strands2  Minimum # of strands on which variant observed (1 or 2) [1]
+        --min-avg-qual  Minimum average base quality for variant-supporting reads [20]
+        --min-var-freq  Minimum variant allele frequency threshold [0.20]
+        --p-value       Default p-value threshold for calling variants [1e-01]
+        --indel-file    File of indels for filtering nearby SNPs, from pileup2indel command
+        --output-file   File to contain variants passing filters
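+
+A sketch of filtering SNP calls while also supplying an indel file for
+filtering nearby SNPs (all file names are hypothetical):
+
+        java -jar VarScan.jar filter sample.snp --min-coverage 10 --min-var-freq 0.20 --indel-file sample.indel --output-file sample.snp.filter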
+
+
+
+somaticFilter
+
+This command filters somatic mutation calls to remove clusters of false
+positives and SNV calls near indels. Note: this is a basic filter. More
+advanced filtering strategies consider mapping quality, read mismatches,
+soft-trimming, and other factors when deciding whether or not to filter a
+variant. See the VarScan 2 publication (Koboldt et al., Genome Research, Feb
+2012) for details.
+
+        USAGE: java -jar VarScan.jar somaticFilter [mutations file] OPTIONS
+        mutations file - A file of SNVs from VarScan somatic
+
+        OPTIONS:
+        --min-coverage  Minimum read depth [10]
+        --min-reads2    Minimum supporting reads for a variant [2]
+        --min-strands2  Minimum # of strands on which variant observed (1 or 2) [1]
+        --min-avg-qual  Minimum average base quality for variant-supporting reads [20]
+        --min-var-freq  Minimum variant allele frequency threshold [0.20]
+        --p-value       Default p-value threshold for calling variants [1e-01]
+        --indel-file    File of indels for filtering nearby SNPs
+        --output-file   Optional output file for filtered variants
+
+
+limit
+
+This command limits variants in a file to a set of positions or regions.
+
+        USAGE: java -jar VarScan.jar limit [infile] OPTIONS
+        infile - A file of chromosome-positions, tab-delimited
+
+        OPTIONS:
+        --positions-file - a file of chromosome-positions, tab-delimited
+        --regions-file - a file of chromosome-start-stops, tab-delimited
+        --output-file - Output file for the matching variants
+
+
+readcounts
+
+This command reports the read counts for each base at positions in a pileup
+file.
+
+        USAGE: java -jar VarScan.jar readcounts [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --variants-file A list of variants at which to report readcounts
+        --output-file   Output file to contain the readcounts
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-base-qual Minimum base quality at a position to count a read [30]
+
+
+compare
+
+This command performs set-comparison operations on two files of variants.
+
+        USAGE: java -jar VarScan.jar compare [file1] [file2] [type] [output] OPTIONS
+        file1 - A file of chromosome-positions, tab-delimited
+        file2 - A file of chromosome-positions, tab-delimited
+        type - Type of comparison [intersect|merge|unique1|unique2]
+        output - Output file for the comparison result
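+
+For instance, assuming two hypothetical variant files run1.snp and run2.snp,
+the positions present in both could be written to shared.snp with:
+
+        java -jar VarScan.jar compare run1.snp run2.snp intersect shared.snp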
+
+
+
+For detailed usage information, see the VarScan JavaDoc.
+
+
+
+
+VarScan Documentation (v2.2.2 and before)
+
+
+        USAGE: java -jar VarScan.jar [COMMAND] [OPTIONS]
+
+        COMMANDS
+        pileup2snp [pileup file]
+        pileup2indel [pileup file]
+        pileup2cns [pileup file]
+        somatic [normal pileup] [tumor pileup]
+        filter [variants file]
+        somaticFilter [mutations file]
+        limit [variants file]
+        readcounts [pileup file]
+        compare [file1] [file2]
+
+
+
+pileup2snp
+
+This command calls SNPs from a pileup file based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar pileup2snp [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [10]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+        OUTPUT
+        Tab-delimited SNP calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             variant allele at this position
+        Reads1          reads supporting reference allele
+        Reads2          reads supporting variant allele
+        VarFreq         frequency of variant allele by read count
+        Strands1        strands on which reference allele was observed
+        Strands2        strands on which variant allele was observed
+        Qual1           average base quality of reference-supporting read bases
+        Qual2           average base quality of variant-supporting read bases
+        Pvalue          Significance of variant read count vs. expected baseline error
+
+
+pileup2indel
+
+This command calls indels from a pileup file based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar pileup2indel [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+        OUTPUT
+        Tab-delimited indel calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             variant allele at this position
+        Reads1          reads supporting reference allele
+        Reads2          reads supporting variant allele
+        VarFreq         frequency of variant allele by read count
+        Strands1        strands on which reference allele was observed
+        Strands2        strands on which variant allele was observed
+        Qual1           average base quality of reference-supporting read bases
+        Qual2           average base quality of variant-supporting read bases
+        Pvalue          Significance of variant read count vs. expected baseline error
+
+
+pileup2cns
+
+This command makes consensus calls (SNP/Indel/Reference) from a pileup file
+based on user-defined parameters:
+
+        USAGE: java -jar VarScan.jar pileup2cns [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+        OUTPUT
+        Tab-delimited consensus calls with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             consensus call (reference, IUPAC SNP code, or indel)
+        Reads1          reads supporting reference allele
+        Reads2          reads supporting variant allele
+        VarFreq         frequency of variant allele by read count
+        Strands1        strands on which reference allele was observed
+        Strands2        strands on which variant allele was observed
+        Qual1           average base quality of reference-supporting read bases
+        Qual2           average base quality of variant-supporting read bases
+        Pvalue          Significance of variant read count vs. expected baseline error
+
+
+somatic
+
+This command calls variants and identifies their somatic status (Germline/LOH/
+Somatic) using pileup files from a matched tumor-normal pair.
+
+        USAGE: java -jar VarScan.jar somatic [normal_pileup] [tumor_pileup] [output] OPTIONS
+        normal_pileup - The SAMtools pileup file for Normal
+        tumor_pileup - The SAMtools pileup file for Tumor
+        output - Output base name for SNP and indel output
+
+        OPTIONS:
+        --output-snp    Output file for SNP calls [output.snp]
+        --output-indel  Output file for indel calls [output.indel]
+        --min-coverage  Minimum coverage in normal and tumor to call variant [10]
+        --min-coverage-normal   Minimum coverage in normal to call somatic [10]
+        --min-coverage-tumor    Minimum coverage in tumor to call somatic [5]
+        --min-var-freq  Minimum variant frequency to call a heterozygote [0.20]
+        --p-value       P-value threshold to call a heterozygote [1.0e-01]
+        --somatic-p-value       P-value threshold to call a somatic site [1.0e-04]
+
+        OUTPUT
+        Two tab-delimited files (SNPs and Indels) with the following columns:
+        Chrom           chromosome name
+        Position        position (1-based)
+        Ref             reference allele at this position
+        Var             variant allele at this position
+        Normal_Reads1   reads supporting reference allele
+        Normal_Reads2   reads supporting variant allele
+        Normal_VarFreq  frequency of variant allele by read count
+        Normal_Gt       genotype call for Normal sample
+        Tumor_Reads1    reads supporting reference allele
+        Tumor_Reads2    reads supporting variant allele
+        Tumor_VarFreq   frequency of variant allele by read count
+        Tumor_Gt        genotype call for Tumor sample
+        Somatic_Status  status of variant (Germline, Somatic, or LOH)
+        Pvalue          Significance of variant read count vs. expected baseline error
+        Somatic_Pvalue  Significance of tumor read count vs. normal read count
+
+
+filter
+
+This command filters variants in a file by coverage, supporting reads, variant
+frequency, or average base quality.
+
+        USAGE: java -jar VarScan.jar filter [variants file] OPTIONS
+        variants file - A file of SNP or indel calls from VarScan
+
+        OPTIONS:
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-reads2    Minimum supporting reads at a position to call variants [2]
+        --min-avg-qual  Minimum base quality at a position to count a read [15]
+        --min-var-freq  Minimum variant allele frequency threshold [0.01]
+        --p-value       Default p-value threshold for calling variants [99e-02]
+
+
+somaticFilter
+
+This command filters somatic mutation calls to remove clusters of false
+positives and SNV calls near indels.
+
+        USAGE: java -jar VarScan.jar somaticFilter [mutations file] OPTIONS
+        mutations file - A file of SNVs from VarScan somatic
+
+        OPTIONS:
+        --min-coverage  Minimum read depth [10]
+        --min-reads2    Minimum supporting reads for a variant [2]
+        --min-strands2  Minimum # of strands on which variant observed (1 or 2) [1]
+        --min-avg-qual  Minimum average base quality for variant-supporting reads [20]
+        --min-var-freq  Minimum variant allele frequency threshold [0.20]
+        --p-value       Default p-value threshold for calling variants [1e-01]
+        --indel-file    File of indels for filtering nearby SNPs
+        --output-file   Optional output file for filtered variants
+
+
+limit
+
+This command limits variants in a file to a set of positions or regions.
+
+        USAGE: java -jar VarScan.jar limit [infile] OPTIONS
+        infile - A file of chromosome-positions, tab-delimited
+
+        OPTIONS:
+        --positions-file        A file of chromosome-positions, tab-delimited
+        --regions-file  A file of chromosome-start-stops, tab-delimited
+        --output-file   Output file for the matching variants
+
+
+readcounts
+
+This command reports the read counts for each base at positions in a pileup
+file.
+
+        USAGE: java -jar VarScan.jar readcounts [pileup file] OPTIONS
+        pileup file - The SAMtools pileup file
+
+        OPTIONS:
+        --variants-file A list of variants at which to report readcounts
+        --output-file   Output file to contain the readcounts
+        --min-coverage  Minimum read depth at a position to make a call [8]
+        --min-base-qual Minimum base quality at a position to count a read [30]
+
+
+compare
+
+This command performs set-comparison operations on two files of variants.
+
+        USAGE: java -jar VarScan.jar compare [file1] [file2] [type] [output] OPTIONS
+        file1 - A file of chromosome-positions, tab-delimited
+        file2 - A file of chromosome-positions, tab-delimited
+        type - Type of comparison [intersect|merge|unique1|unique2]
+        output - Output file for the comparison result
+
+
+
+For detailed usage information, see the VarScan JavaDoc.
+
+
+How to Build a SAMtools (m)pileup File
+
+
+The variant calling features of VarScan for single samples (pileup2snp,
+pileup2indel, pileup2cns) and multiple samples (mpileup2snp, mpileup2indel,
+mpileup2cns, and somatic) expect input in SAMtools pileup or mpileup format. In
+current versions of SAMtools, the "pileup" command has been replaced by the
+"mpileup" command. For a single sample the two commands behave very similarly
+and produce output in the same format, except that mpileup applies BAQ
+adjustments by default. When given multiple BAM files, however, SAMtools
+mpileup generates a multi-sample pileup format that must be processed with the
+mpileup2* commands in VarScan. To build an mpileup file, you will need:
+
+  • One or more BAM files ("myData.bam") that have been sorted using the sort
+    command of SAMtools.
+  • The reference sequence ("reference.fasta") to which reads were aligned, in
+    FASTA format.
+  • The SAMtools software package.
+
+
+Generate an mpileup file with the following command:
+
+
+samtools mpileup -f [reference sequence] [BAM file(s)] >myData.mpileup
+
+
+Note: to save disk space and file I/O, you can pipe mpileup output directly
+into VarScan. For example:
+
+One sample:
+samtools mpileup -f reference.fasta myData.bam | java -jar VarScan.v2.2.jar pileup2snp
+
+Multiple samples:
+samtools mpileup -f reference.fasta sample1.bam sample2.bam | java -jar VarScan.v2.2.jar mpileup2snp
+
+Copyright © 2009-2013 by Washington University in St. Louis.
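
Reader's note on the somatic OUTPUT columns documented in using-varscan.txt above: the following is a minimal Python sketch, not part of the commit or of VarScan itself, of how the tab-delimited SNP and indel result files could be screened for somatic calls. It assumes the columns appear in exactly the documented order, that a header line (if any) can be recognised by a non-numeric Position field, and it reuses the documented --somatic-p-value default of 1.0e-04 as the cutoff. The function name somatic_calls and the two sample rows are hypothetical.

```python
import csv
import io

# Column order as listed in the OUTPUT section of the somatic command.
COLUMNS = [
    "Chrom", "Position", "Ref", "Var",
    "Normal_Reads1", "Normal_Reads2", "Normal_VarFreq", "Normal_Gt",
    "Tumor_Reads1", "Tumor_Reads2", "Tumor_VarFreq", "Tumor_Gt",
    "Somatic_Status", "Pvalue", "Somatic_Pvalue",
]


def somatic_calls(handle, max_somatic_p=1.0e-04):
    """Yield rows whose Somatic_Status is 'Somatic' and whose
    Somatic_Pvalue is at or below max_somatic_p."""
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) < len(COLUMNS):
            continue  # skip blank or truncated lines
        row = dict(zip(COLUMNS, fields))
        if not row["Position"].isdigit():
            continue  # assumption: a non-numeric position marks a header line
        if row["Somatic_Status"] != "Somatic":
            continue
        if float(row["Somatic_Pvalue"]) <= max_somatic_p:
            yield row


# Hypothetical two-row input, invented purely for illustration.
example = io.StringIO(
    "chr1\t100\tA\tG\t20\t0\t0%\tA\t10\t10\t50%\tR\tSomatic\t1.0\t2.5E-5\n"
    "chr1\t200\tC\tT\t10\t10\t50%\tY\t9\t11\t55%\tY\tGermline\t1.0E-9\t0.5\n"
)
hits = list(somatic_calls(example))
print([(r["Chrom"], r["Position"]) for r in hits])
```

In real use the io.StringIO object would be replaced by an open file handle on the output.snp or output.indel file. The frequency columns are left as strings because the documentation above does not state their exact format.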

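Reader's note on the compare command documented above: the four comparison types (intersect, merge, unique1, unique2) read naturally as set operations on chromosome-position keys. The Python sketch below illustrates that reading only; it is not VarScan's implementation. It assumes that "merge" means the union of both files and that "unique1" and "unique2" mean positions found only in file1 or only in file2. The helper names load_positions and compare, and the sample data, are hypothetical.

```python
def load_positions(lines):
    """Collect (chromosome, position) keys from tab-delimited lines."""
    keys = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            keys.add((fields[0], fields[1]))
    return keys


def compare(keys1, keys2, kind):
    """Return the key set selected by one of the four comparison types."""
    if kind == "intersect":
        return keys1 & keys2   # present in both files
    if kind == "merge":
        return keys1 | keys2   # assumption: union of both files
    if kind == "unique1":
        return keys1 - keys2   # assumption: only in file1
    if kind == "unique2":
        return keys2 - keys1   # assumption: only in file2
    raise ValueError("unknown comparison type: " + kind)


# Hypothetical inputs standing in for file1 and file2.
file1 = ["chr1\t100\n", "chr1\t200\n"]
file2 = ["chr1\t200\n", "chr2\t50\n"]
a, b = load_positions(file1), load_positions(file2)
for kind in ("intersect", "merge", "unique1", "unique2"):
    print(kind, sorted(compare(a, b, kind)))
```

Positions are kept as strings so that the keys match the file contents exactly. Sorting is therefore lexical, which is adequate for display but not for genomic ordering.
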
-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/varscan.git
