[med-svn] r10605 - in trunk/packages/cd-hit/trunk/debian: . patches
Andreas Tille
tille at alioth.debian.org
Thu Apr 26 09:14:26 UTC 2012
Author: tille
Date: 2012-04-26 09:14:25 +0000 (Thu, 26 Apr 2012)
New Revision: 10605
Added:
trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch
Removed:
trunk/packages/cd-hit/trunk/debian/cdhit-2d.1
trunk/packages/cd-hit/trunk/debian/cdhit.1
trunk/packages/cd-hit/trunk/debian/manpages
Modified:
trunk/packages/cd-hit/trunk/debian/changelog
trunk/packages/cd-hit/trunk/debian/control
trunk/packages/cd-hit/trunk/debian/patches/series
trunk/packages/cd-hit/trunk/debian/rules
Log:
Use help2man in the build process to get more manpages; drop the previous manually enhanced manpages (the difference is that the sections "SEE ALSO" and "AUTHOR" are not specifically marked, but we have eight more manpages that are always up to date)
Deleted: trunk/packages/cd-hit/trunk/debian/cdhit-2d.1
===================================================================
--- trunk/packages/cd-hit/trunk/debian/cdhit-2d.1 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/cdhit-2d.1 2012-04-26 09:14:25 UTC (rev 10605)
@@ -1,143 +0,0 @@
-.TH CDHIT-2D "1" "December 2011" "cdhit-2d" "User Commands"
-.SH NAME
-cdhit-2d \- quickly group sequences in db1 or db2 format
-.SH SYNOPSIS
-.B cdhit-2d
-[\fIOptions\fR]
-.SH DESCRIPTION
-.PP
-Options
-.TP
-\fB\-i\fR
-input filename for db1 in fasta format, required
-.HP
-\fB\-i2\fR input filename for db2 in fasta format, required
-.TP
-\fB\-o\fR
-output filename, required
-.TP
-\fB\-c\fR
-sequence identity threshold, default 0.9
-this is the default cd\-hit's "global sequence identity" calculated as:
-number of identical amino acids in alignment
-divided by the full length of the shorter sequence
-.TP
-\fB\-G\fR
-use global sequence identity, default 1
-if set to 0, then use local sequence identity, calculated as :
-number of identical amino acids in alignment
-divided by the length of the alignment
-NOTE!!! don't use \fB\-G\fR 0 unless you use alignment coverage controls
-see options \fB\-aL\fR, \fB\-AL\fR, \fB\-aS\fR, \fB\-AS\fR
-.TP
-\fB\-b\fR
-band_width of alignment, default 20
-.TP
-\fB\-M\fR
-memory limit (in MB) for the program, default 800; 0 for unlimitted;
-.TP
-\fB\-T\fR
-number of threads, default 1; with 0, all CPUs will be used
-.TP
-\fB\-n\fR
-word_length, default 5, see user's guide for choosing it
-.TP
-\fB\-l\fR
-length of throw_away_sequences, default 10
-.TP
-\fB\-t\fR
-tolerance for redundance, default 2
-.TP
-\fB\-d\fR
-length of description in .clstr file, default 20
-if set to 0, it takes the fasta defline and stops at first space
-.TP
-\fB\-s\fR
-length difference cutoff, default 0.0
-if set to 0.9, the shorter sequences need to be
-at least 90% length of the representative of the cluster
-.TP
-\fB\-S\fR
-length difference cutoff in amino acid, default 999999
-if set to 60, the length difference between the shorter sequences
-and the representative of the cluster can not be bigger than 60
-.HP
-\fB\-s2\fR length difference cutoff for db1, default 1.0
-.IP
-by default, seqs in db1 >= seqs in db2 in a same cluster
-if set to 0.9, seqs in db1 may just >= 90% seqs in db2
-.HP
-\fB\-S2\fR length difference cutoff, default 0
-.IP
-by default, seqs in db1 >= seqs in db2 in a same cluster
-if set to 60, seqs in db2 may 60aa longer than seqs in db1
-.HP
-\fB\-aL\fR alignment coverage for the longer sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AL\fR alignment coverage control for the longer sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.HP
-\fB\-aS\fR alignment coverage for the shorter sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AS\fR alignment coverage control for the shorter sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.TP
-\fB\-A\fR
-minimal alignment coverage control for the both sequences, default 0
-alignment must cover >= this value for both sequences
-.HP
-\fB\-uL\fR maximum unmatched percentage for the longer sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.HP
-\fB\-uS\fR maximum unmatched percentage for the shorter sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.TP
-\fB\-U\fR
-maximum unmatched length, default 99999999
-if set to 10, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10 bases
-.TP
-\fB\-B\fR
-1 or 0, default 0, by default, sequences are stored in RAM
-if set to 1, sequence are stored on hard drive
-it is recommended to use \fB\-B\fR 1 for huge databases
-.TP
-\fB\-p\fR
-1 or 0, default 0
-if set to 1, print alignment overlap in .clstr file
-.TP
-\fB\-g\fR
-1 or 0, default 0
-by cd\-hit's default algorithm, a sequence is clustered to the first
-cluster that meet the threshold (fast cluster). If set to 1, the program
-will cluster it into the most similar cluster that meet the threshold
-(accurate but slow mode)
-but either 1 or 0 won't change the representatives of final clusters
-.HP
-\fB\-h\fR print this help
-.SH AUTHOR
-Questions, bugs, contact Weizhong Li at liwz at sdsc.edu
-.IP
-If you find cd\-hit useful, please kindly cite:
-.IP
-"Clustering of highly homologous sequences to reduce thesize of large protein database", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2001) 17:282\-283
-"Cd\-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences", Weizhong Li & Adam Godzik. Bioinformatics, (2006) 22:1658\-1659
-.PP
-This manual page was written by Andreas Tille <tille at debian.org> using
-help2man for the \fBDebian GNU/Linux\fP system (but may be used by
-others). Permission is granted to copy, distribute and/or modify this
-document under the terms of the GNU General Public License, Version 2
-any later version published by the Free Software Foundation.
-
Deleted: trunk/packages/cd-hit/trunk/debian/cdhit.1
===================================================================
--- trunk/packages/cd-hit/trunk/debian/cdhit.1 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/cdhit.1 2012-04-26 09:14:25 UTC (rev 10605)
@@ -1,133 +0,0 @@
-.TH CDHIT "1" "December 2011" "4.5.7 (built on Dec 22 2011)" "User Commands"
-.SH NAME
-cdhit \- quickly group sequences
-.SH SYNOPSIS
-.B cdhit
-[\fIOptions\fR]
-.SH DESCRIPTION
-.PP
-Options
-.TP
-\fB\-i\fR
-input filename in fasta format, required
-.TP
-\fB\-o\fR
-output filename, required
-.TP
-\fB\-c\fR
-sequence identity threshold, default 0.9
-this is the default cd\-hit's "global sequence identity" calculated as:
-number of identical amino acids in alignment
-divided by the full length of the shorter sequence
-.TP
-\fB\-G\fR
-use global sequence identity, default 1
-if set to 0, then use local sequence identity, calculated as :
-number of identical amino acids in alignment
-divided by the length of the alignment
-NOTE!!! don't use \fB\-G\fR 0 unless you use alignment coverage controls
-see options \fB\-aL\fR, \fB\-AL\fR, \fB\-aS\fR, \fB\-AS\fR
-.TP
-\fB\-b\fR
-band_width of alignment, default 20
-.TP
-\fB\-M\fR
-memory limit (in MB) for the program, default 800; 0 for unlimitted;
-.TP
-\fB\-T\fR
-number of threads, default 1; with 0, all CPUs will be used
-.TP
-\fB\-n\fR
-word_length, default 5, see user's guide for choosing it
-.TP
-\fB\-l\fR
-length of throw_away_sequences, default 10
-.TP
-\fB\-t\fR
-tolerance for redundance, default 2
-.TP
-\fB\-d\fR
-length of description in .clstr file, default 20
-if set to 0, it takes the fasta defline and stops at first space
-.TP
-\fB\-s\fR
-length difference cutoff, default 0.0
-if set to 0.9, the shorter sequences need to be
-at least 90% length of the representative of the cluster
-.TP
-\fB\-S\fR
-length difference cutoff in amino acid, default 999999
-if set to 60, the length difference between the shorter sequences
-and the representative of the cluster can not be bigger than 60
-.HP
-\fB\-aL\fR alignment coverage for the longer sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AL\fR alignment coverage control for the longer sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.HP
-\fB\-aS\fR alignment coverage for the shorter sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AS\fR alignment coverage control for the shorter sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.TP
-\fB\-A\fR
-minimal alignment coverage control for the both sequences, default 0
-alignment must cover >= this value for both sequences
-.HP
-\fB\-uL\fR maximum unmatched percentage for the longer sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.HP
-\fB\-uS\fR maximum unmatched percentage for the shorter sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.TP
-\fB\-U\fR
-maximum unmatched length, default 99999999
-if set to 10, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10 bases
-.TP
-\fB\-B\fR
-1 or 0, default 0, by default, sequences are stored in RAM
-if set to 1, sequence are stored on hard drive
-it is recommended to use \fB\-B\fR 1 for huge databases
-.TP
-\fB\-p\fR
-1 or 0, default 0
-if set to 1, print alignment overlap in .clstr file
-.TP
-\fB\-g\fR
-1 or 0, default 0
-by cd\-hit's default algorithm, a sequence is clustered to the first
-cluster that meet the threshold (fast cluster). If set to 1, the program
-will cluster it into the most similar cluster that meet the threshold
-(accurate but slow mode)
-but either 1 or 0 won't change the representatives of final clusters
-.HP
-\fB\-h\fR print this help
-.SH SEE ALSO
-Questions, bugs, contact Limin Fu at l2fu at ucsd.edu, or Weizhong Li at liwz at sdsc.edu
-For updated versions and information, please visit: http://cd\-hit.org
-.IP
-cd\-hit web server is also available from http://cd\-hit.org
-.SH AUTHOR
-If you find cd\-hit useful, please kindly cite:
-.IP
-"Clustering of highly homologous sequences to reduce thesize of large protein database", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2001) 17:282\-283
-"Tolerating some redundancy significantly speeds up clustering of large protein databases", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2002) 18:77\-82
-.PP
-This manual page was written by Andreas Tille <tille at debian.org> using
-help2man for the \fBDebian GNU/Linux\fP system (but may be used by
-others). Permission is granted to copy, distribute and/or modify this
-document under the terms of the GNU General Public License, Version 2
-any later version published by the Free Software Foundation.
Modified: trunk/packages/cd-hit/trunk/debian/changelog
===================================================================
--- trunk/packages/cd-hit/trunk/debian/changelog 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/changelog 2012-04-26 09:14:25 UTC (rev 10605)
@@ -1,4 +1,4 @@
-cd-hit (4.6-2012-04-25-1) UNRELEASED; urgency=low
+cd-hit (4.6-2012-04-25-1) unstable; urgency=low
* New upstream version incorporating the previous patches as
well as the LaTeX source of the documentation
@@ -7,6 +7,14 @@
* debian/{control,rules}: Use mpi version of cd-hit
* README.Debian: Tell users that we are using mpi and they should
ask for an alternative if needed
+ * debian/*.1, debian/manpages: Deleted in favour of autogenerated
+ manpages using help2man; Remark: the manually edited pages were
+ slightly better regarding "SEE ALSO" and "AUTHORS" sections but
+ it is better to auto-generate the pages to stay with changes of
+ future versions
+ * debian/rules: Use help2man 2 create manpages whereever possible
+ * debian/patches/enable_help2man.patch: Fix some minor issues to
+ get less error output in man pages
-- Andreas Tille <tille at debian.org> Thu, 26 Apr 2012 07:54:23 +0200
Modified: trunk/packages/cd-hit/trunk/debian/control
===================================================================
--- trunk/packages/cd-hit/trunk/debian/control 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/control 2012-04-26 09:14:25 UTC (rev 10605)
@@ -5,7 +5,7 @@
Uploaders: Tim Booth <tbooth at ceh.ac.uk>,
Andreas Tille <tille at debian.org>
DM-Upload-Allowed: yes
-Build-Depends: debhelper (>= 9), mpi-default-dev
+Build-Depends: debhelper (>= 9), mpi-default-dev, help2man
Standards-Version: 3.9.3
Homepage: http://weizhong-lab.ucsd.edu/cd-hit/
Vcs-Browser: http://svn.debian.org/wsvn/debian-med/trunk/packages/cd-hit/trunk/
Deleted: trunk/packages/cd-hit/trunk/debian/manpages
===================================================================
--- trunk/packages/cd-hit/trunk/debian/manpages 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/manpages 2012-04-26 09:14:25 UTC (rev 10605)
@@ -1 +0,0 @@
-debian/*.1
Added: trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch
===================================================================
--- trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch (rev 0)
+++ trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch 2012-04-26 09:14:25 UTC (rev 10605)
@@ -0,0 +1,25 @@
+Author: Andreas Tille <tille at debian.org>
+Description: Help help2man to run without producing errors
+
+--- cd-hit-v4.6-2012-04-25.orig/cd-hit-2d-para.pl
++++ cd-hit-v4.6-2012-04-25/cd-hit-2d-para.pl
+@@ -55,7 +55,7 @@
+ elsif ($arg eq "--Q" ) { $queue = shift; }
+ elsif ($arg eq "--T" ) { $queue_type = shift; }
+ elsif ($arg eq "--R" ) { $restart_in = shift; }
+- else {$arg_pass .= " $arg " . shift; }
++ else {$arg_pass .= " $arg " ; $arg_pass .= shift if (shift) }
+ }
+ ($in and $out) || print_usage();
+ if (not ($seg_no2 >1)) {
+--- cd-hit-v4.6-2012-04-25.orig/cd-hit-para.pl
++++ cd-hit-v4.6-2012-04-25/cd-hit-para.pl
+@@ -52,7 +52,7 @@
+ elsif ($arg eq "--Q") { $queue = shift; }
+ elsif ($arg eq "--T") { $queue_type = shift; }
+ elsif ($arg eq "--R") { $restart_in = shift; }
+- else {$arg_pass .= " $arg " . shift; }
++ else {$arg_pass .= " $arg " ; $arg_pass .= shift if (shift) }
+ }
+ ($in and $out) || print_usage();
+
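Note on the patched "else" branch above: in Perl the condition of a statement modifier is evaluated before the statement, so `$arg_pass .= shift if (shift)` removes two elements from the argument list (one for the test, one for the append), and a literal value of 0 tests false. The sketch below is not part of this commit. It illustrates, in shell, what is presumably the intended behaviour: append the next argument only when one is left, and consume it exactly once. The function name collect_passthrough is hypothetical.

```shell
# Hypothetical helper, not part of the cd-hit sources or of this commit.
# Collects "option value" pairs, taking each value exactly once and
# tolerating a trailing option that has no value (e.g. a bare -h).
collect_passthrough() {
    arg_pass=""
    while [ "$#" -gt 0 ]; do
        arg="$1"
        shift
        arg_pass="$arg_pass $arg"
        # Append the value only if one is left; shift a single time.
        if [ "$#" -gt 0 ]; then
            arg_pass="$arg_pass $1"
            shift
        fi
    done
    # Drop the leading separator before printing.
    printf '%s\n' "${arg_pass# }"
}

collect_passthrough -c 0.9 -n 5
collect_passthrough -h
```

Testing for remaining arguments rather than for the truth of the next value keeps values such as 0 intact.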
Modified: trunk/packages/cd-hit/trunk/debian/patches/series
===================================================================
--- trunk/packages/cd-hit/trunk/debian/patches/series 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/patches/series 2012-04-26 09:14:25 UTC (rev 10605)
@@ -1 +1,2 @@
use-dpkg-buildflags.patch
+enable_help2man.patch
Modified: trunk/packages/cd-hit/trunk/debian/rules
===================================================================
--- trunk/packages/cd-hit/trunk/debian/rules 2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/rules 2012-04-26 09:14:25 UTC (rev 10605)
@@ -8,13 +8,80 @@
#export DH_VERBOSE=1
pkg := $(shell dpkg-parsechangelog | sed -n 's/^Source: //p')
+ver := $(shell dpkg-parsechangelog | sed -ne 's/^Version: \(\([0-9]\+\):\)\?\(.*\)-.*/\3/p')
+mandir=$(CURDIR)/debian/$(pkg)/usr/share/man/man1/
+
%:
dh $@
override_dh_auto_build:
dh_auto_build -- openmp=yes
+override_dh_installman:
+ mkdir -p $(mandir)
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='quickly group sequences' \
+ $(CURDIR)/cd-hit | \
+ sed -e 's/^cd-\(hit \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+ > $(mandir)/cdhit.1
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='quickly group sequences in db1 or db2 format' \
+ $(CURDIR)/cd-hit-2d | \
+ sed -e 's/^cd-\(hit-2d \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+ > $(mandir)/cdhit-2d.1
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='run CD-HIT algorithm on RNA/DNA sequences' \
+ $(CURDIR)/cd-hit-est | \
+ sed -e 's/^cd-\(hit-est \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+ > $(mandir)/cdhit-est.1
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='run CD-HIT algorithm on RNA/DNA sequences in db1 or db2 format' \
+ $(CURDIR)/cd-hit-est-2d | \
+ sed -e 's/^cd-\(hit-est-2d \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+ > $(mandir)/cdhit-est-2d.1
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='divide a big clustering job into pieces to run cd-hit-2d or cd-hit-est-2d jobs' \
+ $(CURDIR)/cd-hit-2d-para.pl > $(mandir)/cd-hit-2d-para.1
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='divide a big clustering job into pieces to run cd-hit or cd-hit-est jobs' \
+ $(CURDIR)/cd-hit-para.pl > $(mandir)/cd-hit-para.1
+
+ # psi-cd-hit.pl is throwing some errors which are fixed using sed
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='runs similar algorithm like CD-HIT but using BLAST to calculate similarities' \
+ $(CURDIR)/psi-cd-hit.pl | \
+ sed -e '/^Name "main::.*" used only once:/d' \
+ > $(mandir)/psi-cd-hit.1
+
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='runs similar algorithm like CD-HIT but using BLAST to calculate similarities in db1 or db2 format' \
+ $(CURDIR)/psi-cd-hit-2d.pl > $(mandir)/psi-cd-hit-2d.1
+
+ # FIXME: what is the difference between psi-cd-hit-2d.pl and psi-cd-hit-2d-g1.pl ?
+ help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+ --name='runs similar algorithm like CD-HIT but using BLAST to calculate similarities in db1 or db2 format' \
+ $(CURDIR)/psi-cd-hit-2d-g1.pl > $(mandir)/psi-cd-hit-2d-g1.1
+
+ # No help output from
+ # cd-hit-div.pl
+ # clstr2tree.pl
+ # clstr_merge.pl
+ # clstr_merge_noorder.pl
+ # clstr_reduce.pl
+ # clstr_renumber.pl
+ # clstr_rev.pl
+ # clstr_sort_by.pl
+ # clstr_sort_prot_by
+ # make_multi_seq
+ # psi-cd-hit-local.pl -> even throws several "used only once: possible typo" errors
+
override_dh_auto_install:
dh_auto_install -- PREFIX=debian/$(pkg)/usr/lib/cd-hit
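
The two sed expressions added to debian/rules can be tried outside the build. The sketch below assumes GNU sed (`\+` and `\?` are GNU extensions in basic regular expressions). The function names upstream_version and rename_page are hypothetical wrappers, and the sample input lines are illustrative rather than real dpkg-parsechangelog or help2man output.

```shell
# Hypothetical wrapper around the expression used for $(ver): strip an
# optional epoch ("1:") and the trailing Debian revision ("-1").
upstream_version() {
    printf '%s\n' "$1" |
        sed -ne 's/^Version: \(\([0-9]\+\):\)\?\(.*\)-.*/\3/p'
}

# Hypothetical wrapper around the rename applied to the cd-hit page:
# rewrite "cd-hit" to "cdhit" on the NAME line and on the ".B" line.
rename_page() {
    printf '%s\n' "$1" |
        sed -e 's/^cd-\(hit \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/'
}

upstream_version 'Version: 4.6-2012-04-25-1'
upstream_version 'Version: 1:4.6-2012-04-25-1'
rename_page 'cd-hit \- quickly group sequences'
rename_page '.B cd-hit'
```

Because `\(.*\)` is greedy, only the part after the last hyphen is treated as the Debian revision, so the hyphens in the upstream version 4.6-2012-04-25 survive.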