[med-svn] [fastaq] 04/05: change man page creation

Fri Aug 21 22:32:22 UTC 2015

This is an automated email from the git hooks/post-receive script.

sascha-guest pushed a commit to branch master
in repository fastaq.

commit 1e6837d09c20ce0305a409f14ea4a5a7ab730aa0
Author: Sascha Steinbiss <sascha at steinbiss.name>
Date:   Fri Aug 21 22:28:09 2015 +0000

    change man page creation
---
 debian/control      | 142 +---------------------------------------------------
 debian/make_man     |  16 ++++++
 debian/rules        |   5 +-
 debian/usage_to_man | 142 ----------------------------------------------------
 4 files changed, 20 insertions(+), 285 deletions(-)

diff --git a/debian/control b/debian/control
index 06f13f1..72eba03 100644
--- a/debian/control
+++ b/debian/control
@@ -22,149 +22,9 @@ Depends: ${python3:Depends},
          ${misc:Depends}
 Description: FASTA and FASTQ file manipulation tools
  A collection of scripts that perform useful and common
- fasta/q manipulation tasks.
+ FASTA/FASTQ manipulation tasks.
  .
  All scripts automatically detect whether the input is
  a FASTA or FASTQ file.
  .
  Input and output files can be gzipped.
- .
- fastaq_capillary_to_pairs -
- Given a fasta/q file of capillary reads,
- makes an interleaved file of read pairs
- .
- fastaq_chunker -
- Splits a multi fasta/q file into separate files.
- Splits sequences into chunks of a fixed size.
- .
- fastaq_count_sequences -
- Counts the number of sequences in a fasta/q file
- .
- fastaq_deinterleave -
- Deinterleaves fasta/q file, so that reads are written
- alternately between two output files
- .
- fastaq_enumerate_names -
- Renames sequences in a file, calling them 1,2,3...
- .
- fastaq_expand_nucleotides -
- Makes all combinations of sequences in input file
- by using all possibilities of redundant bases.
- e.g. ART could be AAT or AGT.
- .
- fastaq_extend_gaps -
- Extends the length of all gaps (and trims the start/end
- of sequences) in a fasta/q file.
- .
- fastaq_fasta_to_fastq -
- Given a fasta and qual file, makes a fastq file.
- .
- fastaq_filter -
- Filters a fasta/q file by sequence length and/or
- by name matching a regular expression.
- .
- fastaq_get_ids -
- Gets IDs from each sequence in a fasta or fastq file.
- .
- fastaq_get_seq_flanking_gaps -
- Gets the sequences either side of gaps in a fasta/q file.
- .
- fastaq_insert_or_delete_bases -
- Deletes or inserts bases at given position(s)
- from a fasta/q file.
- .
- fastaq_interleave -
- Interleaves two fasta/q files, so that reads are written
- alternately first/second in output file.
- .
- fastaq_long_read_simulate -
- Simulates long reads from a fasta/q file. Can optionally
- make insertions into the reads, like pacbio does.
- .
- fastaq_make_random_contigs -
- Makes a multi-fasta file of random sequences,
- all of the same length. Each base has equal chance of
- being A,C,G or T
- .
- fastaq_merge -
- Converts multi fasta/q file to single sequence file,
- preserving original order of sequences.
- .
- fastaq_replace_bases -
- Replaces all occurences of one letter with another in
- a fasta/q file.
- .
- fastaq_reverse_complement -
- Reverse complements all sequences in a fasta/q file
- .
- fastaq_scaffolds_to_contigs -
- Creates a file of contigs from a file of scaffolds - i.e.
- breaks at every gap in the input.
- .
- fastaq_search_for_seq -
- Searches for an exact match on a given string and its
- reverese complement, in every sequences of a fasta/q file.
- Case insensitive. Guaranteed to find all hits.
- .
- fastaq_sequence_trim -
- Trims sequences off the start of all sequences in a pair
- of fasta/q files, whenever there is a perfect match.
- Only keeps a read pair if both reads of the pair are at
- least a minimum length after any trimming.
- .
- fastaq_split_by_base_count -
- Splits a multi fasta/q file into separate files.
- Does not split sequences. Puts up to max_bases
- into each split file. The exception is that any
- sequence longer than max_bases is put into its own file.
- .
- fastaq_strip_illumina_suffix -
- Strips /1 or /2 off the end of every read name
- in a fasta/q file.
- .
- fastaq_to_fake_qual -
- Makes fake quality scores file from a fasta/q file.
- .
- fastaq_to_fasta -
- Converts sequence file to FASTA format.
- .
- fastaq_to_mira_xml -
- Creates an xml file from a fasta/q file of reads,
- for use with Mira assembler.
- .
- fastaq_to_orfs_gff -
- Writes a GFF file of open reading frames from a fasta/q file
- .
- fastaq_to_perfect_reads -
- Makes perfect paired end fastq reads from a fasta/q file,
- with insert sizes sampled from a normal distribution.
- Read orientation is innies. Output is an interleaved fastq file.
- .
- fastaq_to_random_subset -
- Takes a random subset of reads from a fasta/q file and optionally
- the corresponding read from a mates file.
- Ouptut is interleaved if mates file given.
- .
- fastaq_to_tiling_bam -
- Takes a fasta/q file. Makes a BAM file containing perfect
- (unpaired) reads tiling the whole genome.
- .
- fastaq_to_unique_by_id -
- Removes duplicate sequences from a fasta/q file,
- based on their names. If the same name is found
- more than once, then the longest sequence is kept.
- Order of sequences is preserved in output.
- .
- fastaq_translate -
- Translates all sequences in a fasta or fastq file.
- Output is always fasta format
- .
- fastaq_trim_ends -
- Trims set number of bases off each sequence in a fasta/q file
- .
- fastaq_trim_Ns_at_end -
- Trims any Ns off each sequence in a fasta/q file.
- Does nothing to gaps in the middle, just trims the ends
- .
- A developer API is also provided by this package.
- There are plenty of examples in tasks.py
diff --git a/debian/make_man b/debian/make_man
new file mode 100644
index 0000000..5487e23
--- /dev/null
+++ b/debian/make_man
@@ -0,0 +1,16 @@
+#!/usr/bin/perl
+use strict;
+use warnings;
+
+mkdir('debian/man');
+`help2man -N -o debian/man/fastaq.1 -n 'FASTA and FASTQ file manipulation tools' --no-discard-stderr --version-string=3.6.1 scripts/fastaq`;
+`sed -i 's/.SH DESCRIPTION/.SH DESCRIPTION\\n.nf/' debian/man/fastaq.1 `;
+
+while(<>) {
+	chomp;
+	my ($name, $desc) = split(/\s{2,}/);
+	my $uname = uc($name);
+	`help2man -N -o debian/man/fastaq-$name.1 -n '$desc' --version-string=3.6.1 'scripts/fastaq $name'`;
+	`sed -i 's/.TH FASTAQ /.TH FASTAQ-$uname /' debian/man/fastaq-$name.1`;
+	`sed -i 's/fastaq $name/fastaq_$name/' debian/man/fastaq-$name.1`;
+}
\ No newline at end of file
diff --git a/debian/rules b/debian/rules
index 58f2a1b..013ee91 100755
--- a/debian/rules
+++ b/debian/rules
@@ -16,10 +16,11 @@ override_dh_auto_build:
 	cd $(CURDIR)/doc
 
 override_dh_auto_clean:
-	rm -rf build .pybuild
+	rm -rf build .pybuild doc pyfastaq.egg-info
+	find . -name __pycache__ | xargs rm -rf
 	rm -rf $(mandir)
 
 override_dh_installman:
 	mkdir -p $(mandir)
-	$(debfolder)/usage_to_man
+	scripts/fastaq 2>&1 | tail -n +13 | perl debian/make_man
 	dh_installman --
\ No newline at end of file
diff --git a/debian/usage_to_man b/debian/usage_to_man
deleted file mode 100755
index ff45c2b..0000000
--- a/debian/usage_to_man
+++ /dev/null
@@ -1,142 +0,0 @@
-#!/usr/bin/perl
-use strict;
-use warnings;
-
-#Converts Fastaq python scripts usage into man pages.
-#The man pages are placed in the man folder of the main Fastaq directory
-
-createManPages();
-
-sub createManPages {
-
-  my $source= 'scripts';
-  my $destination= 'debian/man';
-  my $app_name = 'Fastaq';
-  my $descriptions = shortDescription();
-
-  unless ( -d $destination ) {
-    system(mkdir $destination);
-  }
-
-  my @files;
-
-  push(@files,`ls $source/fastaq_*`);
-
-  if ( scalar @files > 0 ) {
-
-    print "Creating manpages\n";
-    for my $file ( @files ) {
-      $file =~ s/\n$//;
-
-      my $filename = $file;
-      $filename =~ s/$source\///;
-
-      my $uc_filename = uc($filename);
-      my $man_file = $filename;
-
-      $man_file = $destination . '/' . $man_file . '.1';
-
-      open (my $man_fh, ">", $man_file);
-
-      my $grep_string = $filename . ': error: too few arguments';
-
-      my $cmd = "help2man -m $filename -n $filename --no-discard-stderr $file | sed 's/usage://gi'";
-      my @output;
-      push(@output, `$cmd`);
-
-      for my $line (@output) {
-	$line =~ s/\n$//;
-
-      }
-
-      for (my $i = 0; $i < scalar @output; $i++) {
-	my $output_line = $output[$i];
-
-	if ($output_line =~ m/^\.TH/) {
-	  $output_line =~ s/\s+/ /g;
-	  $output_line =~ s/(\.TH) ("\d+") ("[a-zA-Z0-9_ ]*") ("[a-zA-Z0-9_<>\[\]\/\.\(\), ]*") ("[a-zA-Z0-9_]*")/$1 $uc_filename $2 $3 "$app_name" "Fastaq executables"/;
-	}
-
-	$output_line =~ s/ \\- $filename/$filename \- $descriptions->{$filename}/;
-
-	if ( $output_line =~ m/^.PP/ && $output[$i + 1] =~ m/^$filename\:/ ) {
-	  $output_line = $output[$i + 1] = '';
-	}
-
-	if ($output_line =~ m/^\.SH "SEE ALSO"/) {
-	  last;
-	}
-	print $man_fh "$output_line\n";
-      }
-
-      writeAuthorAndCopyright($man_fh,$filename);
-      close($man_fh);
-    }
-    print "Manpage creation complete\n";
-  }
-}
-
-sub writeAuthorAndCopyright {
-
-  my ($man_fh,$filename) = @_;
-
-  my $author_blurb = <<END_OF_AUTHOR_BLURB;
-.SH "AUTHOR"
-.sp
-$filename was originally written by Martin Hunt (mh12\@sanger.ac.uk)
-END_OF_AUTHOR_BLURB
-
-  print $man_fh "$author_blurb\n";
-
-  my $copyright_blurb = <<'END_OF_C_BLURB';
-.SH "COPYING"
-.sp
-Wellcome Trust Sanger Institute Copyright \(co 2013 Wellcome Trust Sanger Institute This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version\&.
-END_OF_C_BLURB
-
-  print $man_fh "$copyright_blurb\n";
-
-}
-
-
-sub shortDescription {
-
-    my %descriptions = (
-	fastaq_capillary_to_pairs => 'makes an interleaved file of read pairs',
-	fastaq_chunker => 'splits a multi fasta/q file into separate files',
-	fastaq_count_sequences => 'counts the number of sequences in a fasta/q file',
-	fastaq_deinterleave => 'deinterleaves fasta/q file',
-	fastaq_enumerate_names => 'renames sequences in a file, calling them 1,2,3...',
-	fastaq_expand_nucleotides => 'makes all combinations of sequences in input file',
-	fastaq_extend_gaps => 'extends the length of all gaps in a fasta/q file',
-	fastaq_fasta_to_fastq => 'given a fasta and qual file, makes a fastq file',
-	fastaq_filter => 'filters a fasta/q file by sequence length and/or by name',
-	fastaq_get_ids => 'gets ids from each sequence in a fasta or fastq file',
-	fastaq_get_seq_flanking_gaps => 'gets the sequences either side of gaps in a fasta/q file',
-	fastaq_insert_or_delete_bases => 'deletes or inserts bases at given position(s)',
-	fastaq_interleave => 'interleaves two fasta/q files',
-	fastaq_long_read_simulate => 'simulates long reads from a fasta/q file',
-	fastaq_make_random_contigs => 'makes a multi-fasta file of random sequences',
-	fastaq_merge => 'converts multi fasta/q file to single sequence file',
-	fastaq_replace_bases => 'replaces all occurences of one letter with another',
-	fastaq_reverse_complement => 'reverse complements all sequences',
-	fastaq_scaffolds_to_contigs => 'creates a file of contigs from a file of scaffolds',
-	fastaq_search_for_seq => 'searches for an exact match on a given string and its reverese complement. guaranteed to find all hits',
-	fastaq_sequence_trim => 'trims sequences off the start of all sequences in a pair of fasta/q files',
-	fastaq_split_by_base_count => 'splits a multi fasta/q file into separate files',
-	fastaq_strip_illumina_suffix => 'strips /1 or /2 off the end of every read name',
-	fastaq_to_fake_qual => 'makes fake quality scores file',
-	fastaq_to_fasta => 'converts sequence file to fasta format',
-	fastaq_to_mira_xml => 'creates an xml file from a fasta/q file of reads, for use with mira assembler',
-	fastaq_to_orfs_gff => 'writes a gff file of open reading frames',
-	fastaq_to_perfect_reads => 'makes perfect paired end fastq reads',
-	fastaq_to_random_subset => 'takes a random subset of reads',
-	fastaq_to_tiling_bam => 'makes a bam file containing perfect (unpaired) reads tiling the whole genome',
-	fastaq_to_unique_by_id => 'removes duplicate sequences',
-	fastaq_translate => 'translates all sequences',
-	fastaq_trim_ends => 'trims set number of bases off each sequence',
-	fastaq_trim_Ns_at_end => 'trims any ns off each sequence'
-	);
-
-    return(\%descriptions);
-}

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/fastaq.git