[med-svn] r405 - in trunk/packages: . sim4 sim4/trunk sim4/trunk/debian sim4/trunk/debian/patches

naoliv at alioth.debian.org
Wed Aug 15 16:51:36 UTC 2007


Author: naoliv
Date: 2007-08-15 16:51:35 +0000 (Wed, 15 Aug 2007)
New Revision: 405

Added:
   trunk/packages/sim4/
   trunk/packages/sim4/branches/
   trunk/packages/sim4/tags/
   trunk/packages/sim4/trunk/
   trunk/packages/sim4/trunk/debian/
   trunk/packages/sim4/trunk/debian/changelog
   trunk/packages/sim4/trunk/debian/compat
   trunk/packages/sim4/trunk/debian/control
   trunk/packages/sim4/trunk/debian/copyright
   trunk/packages/sim4/trunk/debian/dirs
   trunk/packages/sim4/trunk/debian/install
   trunk/packages/sim4/trunk/debian/manpages
   trunk/packages/sim4/trunk/debian/patches/
   trunk/packages/sim4/trunk/debian/patches/Makefile.diff
   trunk/packages/sim4/trunk/debian/rules
   trunk/packages/sim4/trunk/debian/sim4.1
   trunk/packages/sim4/trunk/debian/watch
Log:
Initial upload of sim4


Added: trunk/packages/sim4/trunk/debian/changelog
===================================================================
--- trunk/packages/sim4/trunk/debian/changelog	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/changelog	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,21 @@
+sim4 (0.0.20030921-2) unstable; urgency=low
+
+  * Updated patches/Makefile.diff, so sim4 doesn't get stripped when
+    using DEB_BUILD_OPTIONS=nostrip (Closes: #438021);
+  * Changed maintainer to Debian-Med Packaging Team
+    <debian-med-packaging at lists.alioth.debian.org>;
+  * Added SVN repository URL in debian/control;
+  * Updated my email address;
+  * Updated Standards-Version to 3.7.2;
+  * Updated debhelper compatibility level to 5;
+  * Updated FSF address in copyright file;
+  * Updated watch file.
+
+ -- Nelson A. de Oliveira <naoliv at debian.org>  Wed, 15 Aug 2007 11:53:53 -0300
+
+sim4 (0.0.20030921-1) unstable; urgency=low
+
+  * Initial release. (Closes: #321180)
+
+ -- Nelson A. de Oliveira <naoliv at gmail.com>  Wed,  3 Aug 2005 18:21:06 -0300
+

Added: trunk/packages/sim4/trunk/debian/compat
===================================================================
--- trunk/packages/sim4/trunk/debian/compat	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/compat	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1 @@
+5

Added: trunk/packages/sim4/trunk/debian/control
===================================================================
--- trunk/packages/sim4/trunk/debian/control	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/control	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,30 @@
+Source: sim4
+Section: science
+Priority: optional
+Maintainer: Debian-Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
+Uploaders: Nelson A. de Oliveira <naoliv at debian.org>
+Build-Depends: cdbs, debhelper (>= 5)
+Standards-Version: 3.7.2
+XS-Vcs-Browser: http://svn.debian.org/wsvn/debian-med/trunk/packages/sim4/trunk/
+XS-Vcs-Svn: svn://svn.debian.org/svn/debian-med/trunk/packages/sim4
+
+Package: sim4
+Architecture: any
+Depends: ${shlibs:Depends}, ${misc:Depends}
+Description: tool for aligning cDNA and genomic DNA
+ sim4 is a similarity-based tool for aligning an expressed DNA sequence
+ (EST, cDNA, mRNA) with a genomic sequence for the gene. It also detects end
+ matches when the two input sequences overlap at one end (i.e., the start of
+ one sequence overlaps the end of the other).
+ .
+ sim4 employs a blast-based technique to first determine the basic matching
+ blocks representing the "exon cores". In this first stage, it detects all
+ possible exact matches of W-mers (i.e., DNA words of size W) between the two
+ sequences and extends them to maximal scoring gap-free segments. In the
+ second stage, the exon cores are extended into the adjacent as-yet-unmatched
+ fragments using greedy alignment algorithms, and heuristics are used to favor
+ configurations that conform to the splice-site recognition signals (GT-AG,
+ CT-AC). If necessary, the process is repeated with less stringent parameters
+ on the unmatched fragments.
+ .
+  Homepage: http://www.bx.psu.edu/miller_lab/

Added: trunk/packages/sim4/trunk/debian/copyright
===================================================================
--- trunk/packages/sim4/trunk/debian/copyright	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/copyright	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,32 @@
+This package was debianized by Nelson A. de Oliveira <naoliv at gmail.com> on
+Wed,  3 Aug 2005 18:21:06 -0300.
+
+It was downloaded from http://globin.cse.psu.edu/ftp/dist/sim4/
+
+Upstream author: Liliana Florea <florea at gwu.edu>
+
+Copyright Holder: 
+
+Copyright (C) 1998-2001  Liliana Florea
+Copyright (C) 1998-2001  Scott Schwartz
+
+License:
+
+   This package is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This package is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this package; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+   MA 02110-1301, USA.
+
+On Debian systems, the complete text of the GNU General
+Public License can be found in `/usr/share/common-licenses/GPL'.
+

Added: trunk/packages/sim4/trunk/debian/dirs
===================================================================
--- trunk/packages/sim4/trunk/debian/dirs	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/dirs	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1 @@
+usr/bin

Added: trunk/packages/sim4/trunk/debian/install
===================================================================
--- trunk/packages/sim4/trunk/debian/install	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/install	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1 @@
+sim4 usr/bin

Added: trunk/packages/sim4/trunk/debian/manpages
===================================================================
--- trunk/packages/sim4/trunk/debian/manpages	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/manpages	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1 @@
+debian/sim4.1

Added: trunk/packages/sim4/trunk/debian/patches/Makefile.diff
===================================================================
--- trunk/packages/sim4/trunk/debian/patches/Makefile.diff	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/patches/Makefile.diff	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,14 @@
+diff -Nur sim4-0.0.20030921/Makefile sim4-0.0.20030921.new/Makefile
+--- sim4-0.0.20030921/Makefile	2003-09-21 13:55:12.000000000 -0300
++++ sim4-0.0.20030921.new/Makefile	2007-08-15 12:00:04.000000000 -0300
+@@ -3,8 +3,8 @@
+ # the best optimization flag is for your computer.
+ # For Sun's compilers under Solaris, ``-fast'' works well.
+ # For gcc, ``-O2'' works well.
+-CC=cc
+-CFLAGS=-O
++CC=gcc
++CFLAGS=-g -O2 -Wall
+ LDLIBS=-lm
+  
+ sim4:

Added: trunk/packages/sim4/trunk/debian/rules
===================================================================
--- trunk/packages/sim4/trunk/debian/rules	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/rules	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,8 @@
+#!/usr/bin/make -f
+
+include /usr/share/cdbs/1/rules/simple-patchsys.mk
+include /usr/share/cdbs/1/rules/debhelper.mk
+include /usr/share/cdbs/1/class/makefile.mk
+
+DEB_DESTDIR = $(CURDIR)/debian/sim4
+DEB_MAKE_CLEAN_TARGET := clean


Property changes on: trunk/packages/sim4/trunk/debian/rules
___________________________________________________________________
Name: svn:executable
   + *

Added: trunk/packages/sim4/trunk/debian/sim4.1
===================================================================
--- trunk/packages/sim4/trunk/debian/sim4.1	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/sim4.1	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,86 @@
+.TH SIM4 1 "Wed, 03 Aug 2005 18:40:58 -0300"
+.SH NAME
+sim4 \- align an expressed DNA sequence with a genomic sequence
+.SH SYNOPSIS
+.B sim4
+\fIseqfile1\fR \fIseqfile2\fR {[WXKCRDHAPNBS]=\fIvalue\fR}
+.SH DESCRIPTION
+\fBsim4\fP is a similarity-based tool for aligning an expressed DNA sequence (EST, cDNA, mRNA) with a genomic sequence for the gene. It also detects end matches when the two input sequences overlap at one end (i.e., the start of one sequence overlaps the end of the other). If \fIseqfile2\fR is a database of sequences, the sequence in \fIseqfile1\fR will be aligned with each of the sequences in \fIseqfile2\fR.
+
+\fBsim4\fP employs a blast-based technique to first determine the basic matching blocks representing the "exon cores". In this first stage, it detects all possible exact matches of W-mers (i.e., DNA words of size W) between the two sequences and extends them to maximal scoring gap-free segments. In the second stage, the exon cores are extended into the adjacent as-yet-unmatched fragments using greedy alignment algorithms, and heuristics are used to favor configurations that conform to the splice-site recognition signals (GT-AG, CT-AC). If necessary, the process is repeated with less stringent parameters on the unmatched fragments.
+
+By default, \fBsim4\fP searches both strands and reports the best match, measured by the number of matching nucleotides found in the alignment. The R command line option can be used to restrict the search to one orientation (strand) only.
+
+Currently, five major alignment display options are supported, controlled by the A option. By default (A=0), only the endpoints, overall similarity, and orientation of the introns are reported. An arrow sign (`->' or `<-') indicates the orientation of the intron (`+' or `-' strand), when the signals flanking the intron have three or more position matches with either the GT-AG or the CT-AC splice recognition signals. When the same number of matches is found for both orientations, the intron is reported as ambiguous, and represented by `--'. The sign `==' marks the absence from the alignment of a cDNA fragment starting at that position. Alternative formats (lav-block format, text, PipMaker-type `exons file', or certain combinations of these options) can be requested by specifying a different value for A.
+
+If the P option is specified with a non-zero value, \fBsim4\fP will remove any 3'-end poly-A tails that it detects in the alignment.
+
+Occasionally, \fBsim4\fP may miss an internal exon that is surrounded by very large introns, typically longer than 100 Kb. When this is suspected, the H option can be used to reset the exons' weight to compensate for the intron gap penalty.
+
+Ambiguity codes are by default allowed in sequence data, but \fBsim4\fP treats them non-differentially. If desired, the B command option can restrict the set of acceptable characters to A,C,G,T,N and X only.
+
+\fBsim4\fP compares the lengths of the input sequences to distinguish between the cDNA (`short') and the genomic (`long') components in the comparison. When \fIseqfile2\fR contains a collection of sequences, the first entry in the file will be used to determine the type of this and all subsequent comparisons.
+
+In the description below, the term MSP denotes a \fIM\fRaximal \fIS\fRegment \fIP\fRair, that is, a pair of highly similar fragments in the two sequences, obtained during the blast-like procedure by extending a W-mer hit by matches and perhaps a few mismatches.
+.PP
+.SH OPTIONS
+The algorithm parameters (included in the first two sections below) have already been tuned and do not normally require adjustment by the user.
+
+Parameters internal to the blast-like procedure:
+.TP
+.B W
+Sets the word size for blast hits in the first stage of the algorithm. The default value is 12, but it can be increased for a more stringent search or decreased to find weaker matches.
+.TP
+.B X
+Controls the limits for terminating word extensions in the blast-like stage of the algorithm. The default value is 12.
+.TP
+.B K
+Sets the threshold for the MSP scores when determining the basic `exon cores', during the first stage of the algorithm. (If this option is not specified, the threshold is computed from the lengths of the sequences, using statistical criteria.) For example, a good value for genomic sequences in the range of a few hundred Kb is 16. To avoid spurious matches, however, a larger value may be needed for longer sequences.
+.TP
+.B C
+Sets the threshold for the MSP scores when aligning the as-yet-unmatched fragments, during the second stage of the algorithm. By default, the smaller of the constant 12 and a statistics-based threshold is chosen.
+.PP
+Additional algorithm parameters:
+.TP
+.B D
+Sets the bound for the "diagonal" distance between consecutive MSPs in an exon. The default value is 10.
+.PP
+Context parameters:
+.TP
+.B R
+Specifies the direction of the search. If R=0, only the "+" (direct) strand is searched. If R=1, only the "-" (reverse complement) matches are sought. By default (R=2), sim4 searches both strands and reports the best match, measured by the number of matching pairs in the alignment.
+.TP
+.B A
+Specifies the format of the output: exon endpoints only (A=0), exon endpoints and boundaries of the coding region (CDS) in the genomic sequence, when specified for the input mRNA (A=5), alignment text (A=1), alignment in lav-block format (A=2), or both exon endpoints and alignment text (A=3 or A=4). If a reverse complement match is found, A=0,1,2,3,5 will give its position in the "+" strand of the longer sequence and the "-" strand of the shorter sequence. A=4 will give its position in the "+" strand of the first sequence (seqfile1) and the "-" strand of the second sequence (seqfile2), regardless of which sequence is longer. The A=5 option can be used with the S command line option to specify the endpoints of the CDS in the mRNA, and produces output in the `exons file' format required by PipMaker.
+.TP
+.B P
+Specifies whether or not the program should report the fragment of the alignment containing the poly-A tail (if found). By default (P=0) the alignment is displayed as computed; a non-zero value makes sim4 remove the poly-A tails. When this feature is enabled, all display options produce additional lav alignment headers.
+.TP
+.B H
+Resets the MSPs' weight to compensate for very large introns. The default value is H=500, but some introns larger than 100 Kb may require higher values, typically between 1000 and 2500. This option should be used cautiously, generally in cases where an unmatched internal portion of the cDNA may disguise a missed exon within a very large intron. It is not recommended for ESTs, where it may produce spurious exons.
+.TP
+.B N
+Requests an additional search for small marginal exons (N=1) guided by the splice-site recognition signals. This option can be used when a high accuracy match is expected. The default value is N=0, specifying no additional search.
+.TP
+.B B
+Controls the set of characters allowed in the input sequences. By default (B=1), ambiguity characters (ABCDGHKMNRSTVWXY) are allowed. By specifying B=0, the set of acceptable characters is restricted to A,C,G,T,N and X only.
+.TP
+.B S
+Allows the user to specify the endpoints of the CDS in the input mRNA, with the syntax: S=n1..n2. This option is only available with the A=5 flag, which produces output in the format required by PipMaker. Alternatively, the CDS coordinates can appear as a CDS=n1..n2 construct in the FastA header of the mRNA sequence. When the second file is an mRNA database, the command line specification for the CDS will apply to the first sequence in the file only.
+.SH EXAMPLES
+sim4 est genomic
+        
+sim4 genomic estdb
+
+sim4 est genomic A=1 P=1 
+
+sim4 est1 est2 R=1
+
+sim4 mRNA genomic A=5 S=123..1020
+        
+sim4 mouse_cDNA human_genomic K=15 C=11 A=3 W=10
+.SH AUTHORS
+sim4 was written by Liliana Florea <florea at gwu.edu> and Scott Schwartz.
+.PP
+This manual page was written by Nelson A. de Oliveira <naoliv at gmail.com>, based on the online documentation at http://globin.cse.psu.edu/html/docs/sim4.html, 
+for the Debian project (but may be used by others).
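
Note on the man page above: the first stage it describes (exact W-mer
matches extended to gap-free segments) can be illustrated with a small
sketch. This is not sim4's code. The function names wmer_hits and
extend_exact are hypothetical, and the extension here stops at the first
mismatch, whereas the man page says sim4 tolerates a few mismatches
(controlled by X) and scores the segments.

```python
from collections import defaultdict


def wmer_hits(cdna, genomic, w=12):
    """Return (cdna_pos, genomic_pos) pairs where a word of size w matches exactly."""
    # Index every W-mer of the genomic sequence by its start position.
    index = defaultdict(list)
    for j in range(len(genomic) - w + 1):
        index[genomic[j:j + w]].append(j)

    # Look up every W-mer of the cDNA in that index.
    hits = []
    for i in range(len(cdna) - w + 1):
        for j in index.get(cdna[i:i + w], ()):
            hits.append((i, j))
    return hits


def extend_exact(cdna, genomic, i, j, w):
    """Extend a W-mer hit along its diagonal while characters match exactly.

    Simplification (assumption): no mismatches are allowed, so this is
    stricter than the blast-like extension described in the man page.
    Returns (cdna_start, genomic_start, length) of the gap-free segment.
    """
    start_i, start_j = i, j
    while start_i > 0 and start_j > 0 and cdna[start_i - 1] == genomic[start_j - 1]:
        start_i -= 1
        start_j -= 1

    end_i, end_j = i + w, j + w
    while end_i < len(cdna) and end_j < len(genomic) and cdna[end_i] == genomic[end_j]:
        end_i += 1
        end_j += 1

    return start_i, start_j, end_i - start_i
```

With a toy word size of 4, all hits of "CGTACG" against "AAACGTACGTTTT"
fall on one diagonal and extend to a single segment of length 6. Real runs
use the default W=12.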

Added: trunk/packages/sim4/trunk/debian/watch
===================================================================
--- trunk/packages/sim4/trunk/debian/watch	                        (rev 0)
+++ trunk/packages/sim4/trunk/debian/watch	2007-08-15 16:51:35 UTC (rev 405)
@@ -0,0 +1,3 @@
+version=3
+opts="uversionmangle=s/^/0.0./;s/-//g" \
+http://globin.cse.psu.edu/ftp/dist/sim4/ sim4.(.*)\.tar\.gz
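
Note on the watch file above: the two uversionmangle rules turn an upstream
date-style version into the Debian upstream version used in the changelog
(0.0.20030921). The sketch below replays them in Python rather than Perl.
The tarball name sim4.2003-09-21.tar.gz is an assumed example of what the
upstream directory contains, and the helper names are hypothetical.

```python
import re


def upstream_version(filename):
    """Extract the version using the same pattern as the watch file."""
    match = re.fullmatch(r"sim4.(.*)\.tar\.gz", filename)
    return match.group(1) if match else None


def mangle_upstream_version(version):
    """Apply the watch file's uversionmangle rules in order."""
    version = re.sub(r"^", "0.0.", version)  # s/^/0.0./  : prefix the version
    version = re.sub(r"-", "", version)      # s/-//g     : drop every hyphen
    return version


# Assumed upstream file name, used only for illustration.
raw = upstream_version("sim4.2003-09-21.tar.gz")
print(raw, "->", mangle_upstream_version(raw))
```

The order of the rules does not matter here, because the prefix contains no
hyphen.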



