[med-svn] r10605 - in trunk/packages/cd-hit/trunk/debian: . patches

Andreas Tille tille at alioth.debian.org
Thu Apr 26 09:14:26 UTC 2012


Author: tille
Date: 2012-04-26 09:14:25 +0000 (Thu, 26 Apr 2012)
New Revision: 10605

Added:
   trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch
Removed:
   trunk/packages/cd-hit/trunk/debian/cdhit-2d.1
   trunk/packages/cd-hit/trunk/debian/cdhit.1
   trunk/packages/cd-hit/trunk/debian/manpages
Modified:
   trunk/packages/cd-hit/trunk/debian/changelog
   trunk/packages/cd-hit/trunk/debian/control
   trunk/packages/cd-hit/trunk/debian/patches/series
   trunk/packages/cd-hit/trunk/debian/rules
Log:
Use help2man in the build process to get more manpages; drop the previous manually enhanced manpages (the difference is that the sections "SEE ALSO" and "AUTHOR" are not specifically marked, but we get eight more manpages that are always up to date)


Deleted: trunk/packages/cd-hit/trunk/debian/cdhit-2d.1
===================================================================
--- trunk/packages/cd-hit/trunk/debian/cdhit-2d.1	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/cdhit-2d.1	2012-04-26 09:14:25 UTC (rev 10605)
@@ -1,143 +0,0 @@
-.TH CDHIT-2D "1" "December 2011" "cdhit-2d" "User Commands"
-.SH NAME
-cdhit-2d \- quickly group sequences in db1 or db2 format
-.SH SYNOPSIS
-.B cdhit-2d
-[\fIOptions\fR]
-.SH DESCRIPTION
-.PP
-Options
-.TP
-\fB\-i\fR
-input filename for db1 in fasta format, required
-.HP
-\fB\-i2\fR input filename for db2 in fasta format, required
-.TP
-\fB\-o\fR
-output filename, required
-.TP
-\fB\-c\fR
-sequence identity threshold, default 0.9
-this is the default cd\-hit's "global sequence identity" calculated as:
-number of identical amino acids in alignment
-divided by the full length of the shorter sequence
-.TP
-\fB\-G\fR
-use global sequence identity, default 1
-if set to 0, then use local sequence identity, calculated as :
-number of identical amino acids in alignment
-divided by the length of the alignment
-NOTE!!! don't use \fB\-G\fR 0 unless you use alignment coverage controls
-see options \fB\-aL\fR, \fB\-AL\fR, \fB\-aS\fR, \fB\-AS\fR
-.TP
-\fB\-b\fR
-band_width of alignment, default 20
-.TP
-\fB\-M\fR
-memory limit (in MB) for the program, default 800; 0 for unlimitted;
-.TP
-\fB\-T\fR
-number of threads, default 1; with 0, all CPUs will be used
-.TP
-\fB\-n\fR
-word_length, default 5, see user's guide for choosing it
-.TP
-\fB\-l\fR
-length of throw_away_sequences, default 10
-.TP
-\fB\-t\fR
-tolerance for redundance, default 2
-.TP
-\fB\-d\fR
-length of description in .clstr file, default 20
-if set to 0, it takes the fasta defline and stops at first space
-.TP
-\fB\-s\fR
-length difference cutoff, default 0.0
-if set to 0.9, the shorter sequences need to be
-at least 90% length of the representative of the cluster
-.TP
-\fB\-S\fR
-length difference cutoff in amino acid, default 999999
-if set to 60, the length difference between the shorter sequences
-and the representative of the cluster can not be bigger than 60
-.HP
-\fB\-s2\fR length difference cutoff for db1, default 1.0
-.IP
-by default, seqs in db1 >= seqs in db2 in a same cluster
-if set to 0.9, seqs in db1 may just >= 90% seqs in db2
-.HP
-\fB\-S2\fR length difference cutoff, default 0
-.IP
-by default, seqs in db1 >= seqs in db2 in a same cluster
-if set to 60, seqs in db2 may 60aa longer than seqs in db1
-.HP
-\fB\-aL\fR alignment coverage for the longer sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AL\fR alignment coverage control for the longer sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.HP
-\fB\-aS\fR alignment coverage for the shorter sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AS\fR alignment coverage control for the shorter sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.TP
-\fB\-A\fR
-minimal alignment coverage control for the both sequences, default 0
-alignment must cover >= this value for both sequences
-.HP
-\fB\-uL\fR maximum unmatched percentage for the longer sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.HP
-\fB\-uS\fR maximum unmatched percentage for the shorter sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.TP
-\fB\-U\fR
-maximum unmatched length, default 99999999
-if set to 10, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10 bases
-.TP
-\fB\-B\fR
-1 or 0, default 0, by default, sequences are stored in RAM
-if set to 1, sequence are stored on hard drive
-it is recommended to use \fB\-B\fR 1 for huge databases
-.TP
-\fB\-p\fR
-1 or 0, default 0
-if set to 1, print alignment overlap in .clstr file
-.TP
-\fB\-g\fR
-1 or 0, default 0
-by cd\-hit's default algorithm, a sequence is clustered to the first
-cluster that meet the threshold (fast cluster). If set to 1, the program
-will cluster it into the most similar cluster that meet the threshold
-(accurate but slow mode)
-but either 1 or 0 won't change the representatives of final clusters
-.HP
-\fB\-h\fR print this help
-.SH AUTHOR
-Questions, bugs, contact Weizhong Li at liwz at sdsc.edu
-.IP
-If you find cd\-hit useful, please kindly cite:
-.IP
-"Clustering of highly homologous sequences to reduce thesize of large protein database", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2001) 17:282\-283
-"Cd\-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences", Weizhong Li & Adam Godzik. Bioinformatics, (2006) 22:1658\-1659
-.PP
-This manual page was written by Andreas Tille <tille at debian.org> using
-help2man for the \fBDebian GNU/Linux\fP system (but may be used by
-others).  Permission is granted to copy, distribute and/or modify this
-document under the terms of the GNU General Public License, Version 2
-any later version published by the Free Software Foundation.
-

Deleted: trunk/packages/cd-hit/trunk/debian/cdhit.1
===================================================================
--- trunk/packages/cd-hit/trunk/debian/cdhit.1	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/cdhit.1	2012-04-26 09:14:25 UTC (rev 10605)
@@ -1,133 +0,0 @@
-.TH CDHIT "1" "December 2011" "4.5.7 (built on Dec 22 2011)" "User Commands"
-.SH NAME
-cdhit \- quickly group sequences
-.SH SYNOPSIS
-.B cdhit
-[\fIOptions\fR]
-.SH DESCRIPTION
-.PP
-Options
-.TP
-\fB\-i\fR
-input filename in fasta format, required
-.TP
-\fB\-o\fR
-output filename, required
-.TP
-\fB\-c\fR
-sequence identity threshold, default 0.9
-this is the default cd\-hit's "global sequence identity" calculated as:
-number of identical amino acids in alignment
-divided by the full length of the shorter sequence
-.TP
-\fB\-G\fR
-use global sequence identity, default 1
-if set to 0, then use local sequence identity, calculated as :
-number of identical amino acids in alignment
-divided by the length of the alignment
-NOTE!!! don't use \fB\-G\fR 0 unless you use alignment coverage controls
-see options \fB\-aL\fR, \fB\-AL\fR, \fB\-aS\fR, \fB\-AS\fR
-.TP
-\fB\-b\fR
-band_width of alignment, default 20
-.TP
-\fB\-M\fR
-memory limit (in MB) for the program, default 800; 0 for unlimitted;
-.TP
-\fB\-T\fR
-number of threads, default 1; with 0, all CPUs will be used
-.TP
-\fB\-n\fR
-word_length, default 5, see user's guide for choosing it
-.TP
-\fB\-l\fR
-length of throw_away_sequences, default 10
-.TP
-\fB\-t\fR
-tolerance for redundance, default 2
-.TP
-\fB\-d\fR
-length of description in .clstr file, default 20
-if set to 0, it takes the fasta defline and stops at first space
-.TP
-\fB\-s\fR
-length difference cutoff, default 0.0
-if set to 0.9, the shorter sequences need to be
-at least 90% length of the representative of the cluster
-.TP
-\fB\-S\fR
-length difference cutoff in amino acid, default 999999
-if set to 60, the length difference between the shorter sequences
-and the representative of the cluster can not be bigger than 60
-.HP
-\fB\-aL\fR alignment coverage for the longer sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AL\fR alignment coverage control for the longer sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.HP
-\fB\-aS\fR alignment coverage for the shorter sequence, default 0.0
-.IP
-if set to 0.9, the alignment must covers 90% of the sequence
-.HP
-\fB\-AS\fR alignment coverage control for the shorter sequence, default 99999999
-.IP
-if set to 60, and the length of the sequence is 400,
-then the alignment must be >= 340 (400\-60) residues
-.TP
-\fB\-A\fR
-minimal alignment coverage control for the both sequences, default 0
-alignment must cover >= this value for both sequences
-.HP
-\fB\-uL\fR maximum unmatched percentage for the longer sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.HP
-\fB\-uS\fR maximum unmatched percentage for the shorter sequence, default 1.0
-.IP
-if set to 0.1, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10% of the sequence
-.TP
-\fB\-U\fR
-maximum unmatched length, default 99999999
-if set to 10, the unmatched region (excluding leading and tailing gaps)
-must not be more than 10 bases
-.TP
-\fB\-B\fR
-1 or 0, default 0, by default, sequences are stored in RAM
-if set to 1, sequence are stored on hard drive
-it is recommended to use \fB\-B\fR 1 for huge databases
-.TP
-\fB\-p\fR
-1 or 0, default 0
-if set to 1, print alignment overlap in .clstr file
-.TP
-\fB\-g\fR
-1 or 0, default 0
-by cd\-hit's default algorithm, a sequence is clustered to the first
-cluster that meet the threshold (fast cluster). If set to 1, the program
-will cluster it into the most similar cluster that meet the threshold
-(accurate but slow mode)
-but either 1 or 0 won't change the representatives of final clusters
-.HP
-\fB\-h\fR print this help
-.SH SEE ALSO
-Questions, bugs, contact Limin Fu at l2fu at ucsd.edu, or Weizhong Li at liwz at sdsc.edu
-For updated versions and information, please visit: http://cd\-hit.org
-.IP
-cd\-hit web server is also available from http://cd\-hit.org
-.SH AUTHOR
-If you find cd\-hit useful, please kindly cite:
-.IP
-"Clustering of highly homologous sequences to reduce thesize of large protein database", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2001) 17:282\-283
-"Tolerating some redundancy significantly speeds up clustering of large protein databases", Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2002) 18:77\-82
-.PP
-This manual page was written by Andreas Tille <tille at debian.org> using
-help2man for the \fBDebian GNU/Linux\fP system (but may be used by
-others).  Permission is granted to copy, distribute and/or modify this
-document under the terms of the GNU General Public License, Version 2
-any later version published by the Free Software Foundation.

Modified: trunk/packages/cd-hit/trunk/debian/changelog
===================================================================
--- trunk/packages/cd-hit/trunk/debian/changelog	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/changelog	2012-04-26 09:14:25 UTC (rev 10605)
@@ -1,4 +1,4 @@
-cd-hit (4.6-2012-04-25-1) UNRELEASED; urgency=low
+cd-hit (4.6-2012-04-25-1) unstable; urgency=low
 
   * New upstream version incorporating the previous patches as
     well as the LaTeX source of the documentation
@@ -7,6 +7,14 @@
   * debian/{control,rules}: Use mpi version of cd-hit
   * README.Debian: Tell users that we are using mpi and they should
     ask for an alternative if needed
+  * debian/*.1, debian/manpages: Deleted in favour of autogenerated
+    manpages using help2man; Remark: the manually edited pages were
+    slightly better regarding "SEE ALSO" and "AUTHORS" sections but
+    it is better to auto-generate the pages to keep up with changes
+    in future versions
+  * debian/rules: Use help2man to create manpages wherever possible
+  * debian/patches/enable_help2man.patch: Fix some minor issues to
+    get less error output in man pages
 
  -- Andreas Tille <tille at debian.org>  Thu, 26 Apr 2012 07:54:23 +0200
 

Modified: trunk/packages/cd-hit/trunk/debian/control
===================================================================
--- trunk/packages/cd-hit/trunk/debian/control	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/control	2012-04-26 09:14:25 UTC (rev 10605)
@@ -5,7 +5,7 @@
 Uploaders: Tim Booth <tbooth at ceh.ac.uk>,
  Andreas Tille <tille at debian.org>
 DM-Upload-Allowed: yes
-Build-Depends: debhelper (>= 9), mpi-default-dev
+Build-Depends: debhelper (>= 9), mpi-default-dev, help2man
 Standards-Version: 3.9.3
 Homepage: http://weizhong-lab.ucsd.edu/cd-hit/
 Vcs-Browser: http://svn.debian.org/wsvn/debian-med/trunk/packages/cd-hit/trunk/

Deleted: trunk/packages/cd-hit/trunk/debian/manpages
===================================================================
--- trunk/packages/cd-hit/trunk/debian/manpages	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/manpages	2012-04-26 09:14:25 UTC (rev 10605)
@@ -1 +0,0 @@
-debian/*.1

Added: trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch
===================================================================
--- trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch	                        (rev 0)
+++ trunk/packages/cd-hit/trunk/debian/patches/enable_help2man.patch	2012-04-26 09:14:25 UTC (rev 10605)
@@ -0,0 +1,25 @@
+Author: Andreas Tille <tille at debian.org>
+Description: Help help2man to run without producing errors
+
+--- cd-hit-v4.6-2012-04-25.orig/cd-hit-2d-para.pl
++++ cd-hit-v4.6-2012-04-25/cd-hit-2d-para.pl
+@@ -55,7 +55,7 @@
+   elsif ($arg eq "--Q" ) { $queue      = shift; }
+   elsif ($arg eq "--T" ) { $queue_type = shift; }
+   elsif ($arg eq "--R" ) { $restart_in = shift; }
+-  else  {$arg_pass         .= " $arg " . shift; }
++  else  {$arg_pass         .= " $arg " ; $arg_pass .= shift if (@ARGV) }
+ }
+ ($in and $out) || print_usage();
+ if (not ($seg_no2 >1)) {
+--- cd-hit-v4.6-2012-04-25.orig/cd-hit-para.pl
++++ cd-hit-v4.6-2012-04-25/cd-hit-para.pl
+@@ -52,7 +52,7 @@
+   elsif ($arg eq "--Q") { $queue      = shift; }
+   elsif ($arg eq "--T") { $queue_type = shift; }
+   elsif ($arg eq "--R") { $restart_in = shift; }
+-  else  {$arg_pass       .= " $arg " . shift; }
++  else  {$arg_pass         .= " $arg " ; $arg_pass .= shift if (@ARGV) }
+ }
+ ($in and $out) || print_usage();
+ 

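The fallback branch touched by enable_help2man.patch is meant to pass an unrecognised option through together with its value, without complaining when no value follows (as happens when help2man calls the script with a lone --help). The shell sketch below is only an analogue of that intent, not the Perl code from the patch; the function name collect_passthrough is hypothetical. The point it illustrates is to test whether an argument remains before consuming it, so that exactly one value is taken per option.

```shell
#!/bin/sh
# Hypothetical analogue of the Perl fallback branch; not the patched code itself.
# Collects each unrecognised option and, when one is available, the value after it.
collect_passthrough() {
    arg_pass=""
    while [ $# -gt 0 ]; do
        arg=$1
        shift
        arg_pass="$arg_pass $arg"
        # Check for a remaining argument first, then consume exactly one.
        if [ $# -gt 0 ]; then
            arg_pass="$arg_pass $1"
            shift
        fi
    done
    printf '%s\n' "$arg_pass"
}

collect_passthrough --help        # a lone option, no value to append
collect_passthrough -c 0.9 -n 5   # option/value pairs are kept together
```
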
Modified: trunk/packages/cd-hit/trunk/debian/patches/series
===================================================================
--- trunk/packages/cd-hit/trunk/debian/patches/series	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/patches/series	2012-04-26 09:14:25 UTC (rev 10605)
@@ -1 +1,2 @@
 use-dpkg-buildflags.patch
+enable_help2man.patch

Modified: trunk/packages/cd-hit/trunk/debian/rules
===================================================================
--- trunk/packages/cd-hit/trunk/debian/rules	2012-04-26 08:21:52 UTC (rev 10604)
+++ trunk/packages/cd-hit/trunk/debian/rules	2012-04-26 09:14:25 UTC (rev 10605)
@@ -8,13 +8,80 @@
 #export DH_VERBOSE=1
 
 pkg := $(shell dpkg-parsechangelog | sed -n 's/^Source: //p')
+ver := $(shell dpkg-parsechangelog | sed -ne 's/^Version: \(\([0-9]\+\):\)\?\(.*\)-.*/\3/p')
 
+mandir=$(CURDIR)/debian/$(pkg)/usr/share/man/man1/
+
 %:
 	dh $@
 
 override_dh_auto_build:
 	dh_auto_build -- openmp=yes
 
+override_dh_installman:
+	mkdir -p $(mandir)
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='quickly group sequences' \
+	   $(CURDIR)/cd-hit	| \
+	   sed -e 's/^cd-\(hit \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+	   > $(mandir)/cdhit.1
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='quickly group sequences in db1 or db2 format' \
+	   $(CURDIR)/cd-hit-2d	| \
+	   sed -e 's/^cd-\(hit-2d \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+	   	> $(mandir)/cdhit-2d.1
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='run CD-HIT algorithm on RNA/DNA sequences' \
+	   $(CURDIR)/cd-hit-est	| \
+	   sed -e 's/^cd-\(hit-est \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+		> $(mandir)/cdhit-est.1
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='run CD-HIT algorithm on RNA/DNA sequences in db1 or db2 format' \
+	   $(CURDIR)/cd-hit-est-2d	| \
+	   sed -e 's/^cd-\(hit-est-2d \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/' \
+		> $(mandir)/cdhit-est-2d.1
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='divide a big clustering job into pieces to run cd-hit-2d or cd-hit-est-2d jobs' \
+	   $(CURDIR)/cd-hit-2d-para.pl	> $(mandir)/cd-hit-2d-para.1
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='divide a big clustering job into pieces to run cd-hit or cd-hit-est jobs' \
+	    $(CURDIR)/cd-hit-para.pl	> $(mandir)/cd-hit-para.1
+
+	# psi-cd-hit.pl is throwing some errors which are fixed using sed
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='runs similar algorithm like CD-HIT but using BLAST to calculate similarities' \
+	   $(CURDIR)/psi-cd-hit.pl	| \
+	   sed -e '/^Name "main::.*" used only once:/d' \
+		> $(mandir)/psi-cd-hit.1
+
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='runs similar algorithm like CD-HIT but using BLAST to calculate similarities in db1 or db2 format' \
+	      $(CURDIR)/psi-cd-hit-2d.pl	> $(mandir)/psi-cd-hit-2d.1
+
+	# FIXME: what is the difference between psi-cd-hit-2d.pl and psi-cd-hit-2d-g1.pl ?
+	help2man --no-info --no-discard-stderr --version-string='$(ver)' \
+	   --name='runs similar algorithm like CD-HIT but using BLAST to calculate similarities in db1 or db2 format' \
+	      $(CURDIR)/psi-cd-hit-2d-g1.pl	> $(mandir)/psi-cd-hit-2d-g1.1
+
+	# No help output from
+	#   cd-hit-div.pl
+	#   clstr2tree.pl
+	#   clstr_merge.pl
+	#   clstr_merge_noorder.pl
+	#   clstr_reduce.pl
+	#   clstr_renumber.pl
+	#   clstr_rev.pl
+	#   clstr_sort_by.pl
+	#   clstr_sort_prot_by
+	#   make_multi_seq
+	#   psi-cd-hit-local.pl    -> even throws several "used only once: possible typo" errors
+
 override_dh_auto_install:
 	dh_auto_install -- PREFIX=debian/$(pkg)/usr/lib/cd-hit
 
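The two kinds of sed expression added to debian/rules can be tried in isolation. The sketch below assumes GNU sed (for \+ and \? in basic regular expressions); the function names and sample inputs are illustrative and not part of the package. The first function mirrors the `ver` variable: it drops an optional epoch and everything after the last hyphen of the Version field. The second mirrors the post-processing of the help2man output for cd-hit, which renames the command to cdhit in the NAME line and in the bold synopsis line.

```shell
#!/bin/sh
# Hypothetical helpers for experimenting; not part of debian/rules.

# Upstream version from a "Version:" line, as the `ver` variable computes it:
# an optional epoch ("N:") is skipped and the Debian revision after the
# last hyphen is cut off. Prints nothing if the line has no hyphen.
upstream_version() {
    printf '%s\n' "$1" | sed -ne 's/^Version: \(\([0-9]\+\):\)\?\(.*\)-.*/\3/p'
}

# Rename "cd-hit" to "cdhit" in the NAME line ("cd-hit \- ...") and in
# the ".B cd-hit" synopsis line of a man page read from standard input.
rename_in_manpage() {
    sed -e 's/^cd-\(hit \\-\)/cd\1/' -e 's/^.B cd-hit/.B cdhit/'
}

upstream_version 'Version: 4.6-2012-04-25-1'
printf '%s\n' 'cd-hit \- quickly group sequences' '.B cd-hit' | rename_in_manpage
```

Because the `.*` before the hyphen is greedy, only the part after the last hyphen is treated as the Debian revision, so hyphens inside the upstream version (as in 4.6-2012-04-25) are preserved.
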