[med-svn] [dazzdb] 01/01: Create manpage stubs from README contents

Afif Elghraoui afif-guest at moszumanska.debian.org
Mon Sep 14 05:39:52 UTC 2015


This is an automated email from the git hooks/post-receive script.

afif-guest pushed a commit to branch master
in repository dazzdb.

commit a4e0884de3568954a4e470b4153706ad2ece0732
Author: Afif Elghraoui <afif at ghraoui.name>
Date:   Sun Sep 13 22:38:34 2015 -0700

    Create manpage stubs from README contents
---
 debian/man/Catrack.1.md   | 20 ++++++++++++++++++++
 debian/man/DAM2fasta.1.md | 21 +++++++++++++++++++++
 debian/man/DB2fasta.1.md  | 27 +++++++++++++++++++++++++++
 debian/man/DB2quiva.1.md  | 24 ++++++++++++++++++++++++
 debian/man/DBdust.1.md    | 32 ++++++++++++++++++++++++++++++++
 debian/man/DBrm.1.md      | 19 +++++++++++++++++++
 debian/man/DBshow.1.md    | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 debian/man/DBsplit.1.md   | 28 ++++++++++++++++++++++++++++
 debian/man/DBstats.1.md   | 23 +++++++++++++++++++++++
 debian/man/Makefile       | 12 ++++++++++++
 debian/man/fasta2DAM.1.md | 21 +++++++++++++++++++++
 debian/man/fasta2DB.1.md  | 28 ++++++++++++++++++++++++++++
 debian/man/quiva2DB.1.md  | 25 +++++++++++++++++++++++++
 debian/man/simulator.1.md | 37 +++++++++++++++++++++++++++++++++++++
 14 files changed, 363 insertions(+)

diff --git a/debian/man/Catrack.1.md b/debian/man/Catrack.1.md
new file mode 100644
index 0000000..41adc6a
--- /dev/null
+++ b/debian/man/Catrack.1.md
@@ -0,0 +1,20 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+Catrack [-v] <path:db|dam> <track:name>
+
+Find all block tracks of the form .<path>.#.<track>... and merge them into a single
+track, .<path>.<track>..., for the given DB or DAM.   The block track files must all
+encode the same kind of track data (this is checked), and the files must exist for
+block 1, 2, 3, ... up to the last block number.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/DAM2fasta.1.md b/debian/man/DAM2fasta.1.md
new file mode 100644
index 0000000..aaac72c
--- /dev/null
+++ b/debian/man/DAM2fasta.1.md
@@ -0,0 +1,21 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+DAM2fasta [-vU] [-w<int(80)>] <path:dam>
+
+# DESCRIPTION
+
+The set of .fasta files for the given map DB or DAM are recreated from the DAM
+exactly as they were input. That is, this is a perfect inversion, including the
+reconstitution of the proper .fasta headers and the concatenation of contigs with
+the proper number of N's between them. By default the output sequences are in lower
+case and 80 chars per line. The -U option specifies upper case should be used, and
+the characters per line, or line width, can be set to any positive value with
+the -w option.
+
+# SEE ALSO
diff --git a/debian/man/DB2fasta.1.md b/debian/man/DB2fasta.1.md
new file mode 100644
index 0000000..acb8fab
--- /dev/null
+++ b/debian/man/DB2fasta.1.md
@@ -0,0 +1,27 @@
+% DB2FASTA(1) 1.0
+%
+% September 2015
+
+# NAME
+
+DB2fasta - create fasta files from a Dazzler database
+
+# SYNOPSIS
+
+**DB2fasta** [**-vU**] [**-w***int(80)*] *path:db*
+
+# DESCRIPTION
+
+The set of .fasta files for the given DB are recreated from the DB exactly as
+they were input. That is, this is a perfect inversion, including the
+reconstitution of the proper .fasta headers. Because of this property, one can,
+if desired, delete the .fasta source files once they are in the DB as they can
+always be recreated from it. By default the output sequences are in lower case
+and 80 chars per line.  The **-U** option specifies upper case should be used,
+and the characters per line, or line width, can be set to any positive value
+with the **-w** option.
+
+# SEE ALSO
+
+**daligner**(1)
+**fasta2DB**(1)
diff --git a/debian/man/DB2quiva.1.md b/debian/man/DB2quiva.1.md
new file mode 100644
index 0000000..4b1a03b
--- /dev/null
+++ b/debian/man/DB2quiva.1.md
@@ -0,0 +1,24 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+DB2quiva [-vU] <path:db>
+
+# DESCRIPTION
+
+The set of .quiva files within the given DB are recreated from the DB exactly as they
+were input.  That is, this is a perfect inversion, including the reconstitution of the
+proper .quiva headers.  Because of this property, one can, if desired, delete the
+.quiva source files once they are in the DB as they can always be recreated from it.
+By .fastq convention each QV vector is output as a line without new-lines, and by
+default the Deletion Tag entry is in lower case letters.  The -U option specifies
+upper case letters should be used instead.
+
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/DBdust.1.md b/debian/man/DBdust.1.md
new file mode 100644
index 0000000..383707d
--- /dev/null
+++ b/debian/man/DBdust.1.md
@@ -0,0 +1,32 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+DBdust [-b] [-w<int(64)>] [-t<double(2.)>] [-m<int(10)>] <path:db|dam>
+
+Runs the symmetric DUST algorithm over the reads in the untrimmed DB <path>.db or
+<path>.dam producing a track .<path>.dust[.anno,.data] that marks all intervals of low
+complexity sequence, where the scan window is of size -w, the threshold for being a
+low-complexity interval is -t, and only perfect intervals of size greater than -m are
+recorded.  If the -b option is set then the definition of low complexity takes into
+account the frequency of a given base.  It is important to set this flag for genomes
+with a strong AT/GC bias, albeit the code is a tad slower.  The command is
+incremental: if given a DB to which new data has been added since it was last run on
+the DB, it will extend the track to include the new reads.  The dust track, if
+present, is understood and used by DBshow, DBstats, and dalign.
+
+DBdust can also be run over an untrimmed DB block in which case it outputs a track
+encoding where the track file names contain the block number, e.g. .FOO.3.dust.anno
+and .FOO.3.dust.data, given FOO.3 on the command line.  We call this a *block track*.
+This permits job parallelism in block-sized chunks, and the resulting sequence of
+block tracks can then be merged into a track for the entire untrimmed DB with Catrack.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/DBrm.1.md b/debian/man/DBrm.1.md
new file mode 100644
index 0000000..c923cd9
--- /dev/null
+++ b/debian/man/DBrm.1.md
@@ -0,0 +1,19 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+DBrm <path:db|dam> ...
+
+Delete all the files for the given databases.  Do not use **rm**(1) to remove a database, as
+there are at least two and often several secondary files for each DB including track
+files, and all of these are removed by DBrm.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/DBshow.1.md b/debian/man/DBshow.1.md
new file mode 100644
index 0000000..c163fa9
--- /dev/null
+++ b/debian/man/DBshow.1.md
@@ -0,0 +1,46 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+DBshow [-unqUQ] [-w<int(80)>] [-m<track>]+
+                    <path:db|dam> [ <reads:FILE> | <reads:range> ... ]
+
+Displays the requested reads in the database <path>.db or <path>.dam.  By default the
+command applies to the trimmed database, but if -u is set then the entire DB is used.
+If no read arguments are given then every read in the database or database block is
+displayed.  Otherwise the input file or the list of supplied integer ranges give the
+ordinal positions in the actively loaded portion of the db.  In the case of a file, it
+should simply contain a read index, one per line.  In the other case, a read range is
+either a lone integer or the symbol $, in which case the read range consists of just
+that read (the last read in the database if $).  One may also give two positive
+integers separated by a dash to indicate a range of integers, where again a $
+represents the index of the last read in the actively loaded db.  For example,
+1 3-5 $ displays reads 1, 3, 4, 5, and the last read in the active db.  As another
+example, 1-$ displays every read in the active db (the default).
+
+By default a .fasta file of the read sequences is displayed.  If the -q option is
+set, then the QV streams are also displayed in a non-standard modification of the
+fasta format.  If the -n option is set then the DNA sequence is *not* displayed.
+If the -Q option is set then a .quiva file is displayed, and in this case the -n
+and -m options may not be set (and the -q and -w options have no effect).
+
+If one or more masks are set with the -m option then the track intervals are also
+displayed in an additional header line and the bases within an interval are displayed
+in the case opposite that used for all the other bases.  By default the output
+sequences are in lower case and 80 chars per line.  The -U option specifies upper
+case should be used, and the characters per line, or line width, can be set to any
+positive value with the -w option.
+
+The .fasta or .quiva files that are output can be converted into a DB by fasta2DB
+and quiva2DB (if the -q and -n options are not set and no -m options are set),
+giving one a simple way to make a DB of a subset of the reads for testing purposes.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/DBsplit.1.md b/debian/man/DBsplit.1.md
new file mode 100644
index 0000000..2a60e26
--- /dev/null
+++ b/debian/man/DBsplit.1.md
@@ -0,0 +1,28 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+DBsplit [-a] [-x<int>] [-s<int(200)>] <path:db|dam>
+
+Divide the database <path>.db or <path>.dam conceptually into a series of blocks
+referable to on the command line as <path>.1, <path>.2, ...  If the -x option is set
+then all reads less than the given length are ignored, and if the -a option is not
+set then secondary reads from a given well are also ignored.  The remaining reads,
+constituting what we call the trimmed DB, are split amongst the blocks so that each
+block is of size -s * 1Mbp except for the last which necessarily contains a smaller
+residual.  The default value for -s is 200Mbp because blocks of this size can be
+compared by our "overlapper" dalign in roughly 16Gb of memory.  The blocks are very
+space efficient in that their sub-index of the master .idx is computed on the fly
+when loaded, and the .bps and .qvs files (if a .db) of base pairs and quality values,
+respectively, are shared with the master DB.  Any relevant portions of tracks
+associated with the DB are also computed on the fly when loading a database block.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/DBstats.1.md b/debian/man/DBstats.1.md
new file mode 100644
index 0000000..c240a33
--- /dev/null
+++ b/debian/man/DBstats.1.md
@@ -0,0 +1,23 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+DBstats [-nu] [-b<int(1000)>] [-m<track>]+ <path:db|dam>
+
+Show overview statistics for all the reads in the trimmed database <path>.db or
+<path>.dam, including a histogram of read lengths where the bucket size is set
+with the -b option (default 1000).  If the -u option is given then the untrimmed
+database is summarized.  If the -n option is given then the histogram of read lengths
+is not displayed.  Any track such as a "dust" track that gives a series of
+intervals along the read can be specified with the -m option in which case a summary
+and a histogram of the interval lengths is displayed.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/Makefile b/debian/man/Makefile
new file mode 100644
index 0000000..7733e1b
--- /dev/null
+++ b/debian/man/Makefile
@@ -0,0 +1,12 @@
+
+MANPAGES = $(basename $(wildcard  *.1.md))
+
+all: $(MANPAGES)
+
+%.1: %.1.md
+	pandoc -s -f markdown -t man $< > $@
+
+clean:
+	$(RM) $(MANPAGES)
+
+.PHONY: clean
diff --git a/debian/man/fasta2DAM.1.md b/debian/man/fasta2DAM.1.md
new file mode 100644
index 0000000..2b64f93
--- /dev/null
+++ b/debian/man/fasta2DAM.1.md
@@ -0,0 +1,21 @@
+% (1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+fasta2DAM [-v] <path:dam> ( -f<file> | <input:fasta> ... )
+
+# DESCRIPTION
+
+Builds a map DB or DAM from the list of .fasta files following the map database name
+argument, or if the -f option is used, the list of .fasta files in <file>.  Any .fasta
+entry that has a run of N's in it will be split into separate "contig" entries and
+the interval of the contig in the original entry recorded.  The header for each .fasta
+entry is saved with the contigs created from it.
+
+# OPTIONS
+
+# SEE ALSO
diff --git a/debian/man/fasta2DB.1.md b/debian/man/fasta2DB.1.md
new file mode 100644
index 0000000..c1d8713
--- /dev/null
+++ b/debian/man/fasta2DB.1.md
@@ -0,0 +1,28 @@
+% FASTA2DB(1) 1.0
+%
+% September 2015
+
+# NAME
+
+fasta2DB - create a Dazzler database from fasta files
+
+# SYNOPSIS
+
+**fasta2DB** [**-v**] *path:db* {**-f***file* | *input:fasta* ...}
+
+# DESCRIPTION
+
+Builds an initial database, or adds to an existing database, the list of
+.fasta files following the database name argument, or if the **-f** option is
+used, the list of .fasta files in *file*. A given .fasta file can only be
+added once to the DB (this is checked by the command). The .fasta headers must
+be in the "Pacbio" format (i.e. the output of the Pacbio tools or our
+**dextract**(1) program) and the well, pulse interval, and read quality are
+extracted from the header and kept with each read record. If the files are
+being added to an existing database, and the partition settings of the DB have
+already been set (see **DBsplit**(1)), then the partitioning of the database
+is updated to include the new data.
+
+# SEE ALSO
+
+**daligner**(1)
diff --git a/debian/man/quiva2DB.1.md b/debian/man/quiva2DB.1.md
new file mode 100644
index 0000000..4d46eef
--- /dev/null
+++ b/debian/man/quiva2DB.1.md
@@ -0,0 +1,25 @@
+% QUIVA2DB(1) 1.0
+%
+% September 2015
+
+# NAME
+
+quiva2DB - add .quiva files to an existing Dazzler database
+
+# SYNOPSIS
+
+**quiva2DB** [**-vl**] *path:db* (**-f***file* | *input:quiva* ... )
+
+# DESCRIPTION
+
+Adds the given .quiva files on the command line or in the file specified by the
+**-f** option to an existing DB "path". The input files must be added in the
+same order as the .fasta files were and have the same root names,
+e.g. FOO.fasta and FOO.quiva. The files can be added incrementally but must be
+added in the same order as the .fasta files. This is enforced by the program.
+With the **-l** option set the compression scheme is a bit lossy to get more
+compression (see the description of dexqv in the DEXTRACTOR module).
+
+# SEE ALSO
+
+**daligner**(1)
diff --git a/debian/man/simulator.1.md b/debian/man/simulator.1.md
new file mode 100644
index 0000000..aec8a14
--- /dev/null
+++ b/debian/man/simulator.1.md
@@ -0,0 +1,37 @@
+% SIMULATOR(1) 1.0
+%
+% September 2015
+
+# NAME
+
+# SYNOPSIS
+
+# DESCRIPTION
+
+simulator <genlen:double> [-c<double(20.)>] [-b<double(.5)>] [-r<int>]
+                              [-m<int(10000)>]  [-s<int(2000)>]
+                              [-x<int(4000)>]   [-e<double(.15)>]
+                              [-M<file>]
+
+In addition to the DB commands we include here, somewhat tangentially, a simple
+simulator that generates synthetic reads for a random genome.  simulator first
+generates a fake genome of size genlen*1Mb long, that has an AT-bias of -b.  It then
+generates sample reads of mean length -m from a log-normal length distribution with
+standard deviation -s, but ignores reads of length less than -x.  It collects enough
+reads to cover the genome -c times and introduces -e fraction errors into each read
+where the ratio of insertions, deletions, and substitutions is set by defined
+constants INS_RATE (default 73%) and DEL_RATE (default 20%) within generate.c.  One
+can also control the rate at which reads are picked from the forward and reverse
+strands by setting the defined constant FLIP_RATE (default 50/50).  The -r option seeds
+the random number generator for the generation of the genome so that one can
+reproducibly generate the same underlying genome to sample from.  If this parameter is
+missing, then the job id of the invocation seeds the random number generator.  The
+output is sent to the standard output (i.e. it is a UNIX pipe).  The output is in
+Pacbio .fasta format suitable as input to fasta2DB.  Finally, the -M option requests
+that the coordinates from which each read has been sampled are written to the indicated
+file, one line per read, ASCII encoded.  This "map" file essentially tells one where
+every read belongs in an assembly and is very useful for debugging and testing
+purposes.  If a read pair is say b,e then if b < e the read was sampled from [b,e] in
+the forward direction, and if b > e from [e,b] in the reverse direction.
+
+# SEE ALSO
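
Note on the read-range syntax described in the DBshow page of this patch: the rules (a lone integer or `$`, or two endpoints separated by a dash, with `$` standing for the last read in the active db) can be illustrated with a small sketch. This is not DBshow's code. The function name `parse_read_ranges` and the `last_read` parameter are hypothetical, and rejecting reversed or non-positive ranges is an assumption, since the text does not say how DBshow treats them.

```python
def parse_read_ranges(args, last_read):
    """Expand DBshow-style read ranges into a list of 1-based read indices.

    `args` is a list of tokens such as ["1", "3-5", "$"].
    `last_read` is the index that the symbol `$` stands for.
    """

    def endpoint(token):
        # `$` denotes the last read in the actively loaded db.
        if token == "$":
            return last_read
        value = int(token)  # raises ValueError on non-numeric input
        if value < 1:
            raise ValueError(f"read index must be positive: {token!r}")
        return value

    reads = []
    for arg in args:
        first, dash, last = arg.partition("-")
        lo = endpoint(first)
        # A lone integer or `$` is a range consisting of just that read.
        hi = endpoint(last) if dash else lo
        if lo > hi:
            # Assumption: reversed ranges are treated as an error here.
            raise ValueError(f"empty or reversed range: {arg!r}")
        reads.extend(range(lo, hi + 1))
    return reads


if __name__ == "__main__":
    # With 10 reads in the active db, "1 3-5 $" selects 1, 3, 4, 5 and 10.
    print(parse_read_ranges(["1", "3-5", "$"], last_read=10))
```

The same helper expands `1-$` to every read in the active db, which matches the default behaviour described for DBshow when no read arguments are given.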

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/dazzdb.git


