[Debian-med-packaging] [j.johnson at imb.uq.edu.au: Re: Installation of binary tools inside MEME]
Tim Booth
avarus at fastmail.fm
Thu Feb 14 18:08:08 UTC 2013
Hi Andreas,
Yes, I did start looking at Meme but quickly realised it was a lot more
work than I thought to do a proper job on it. I think all I wanted to
do in the first instance was to get an updated glam2 binary package
based upon the improved glam2 source within the meme code. I guess this
is now the definitive glam2 as the original standalone source hasn't
been updated since 2008.
Armed with the list below, it should be fairly straightforward to make a
passably neat meme package, with the binaries living in /usr/lib/meme
and a little wrapper to set the path so that the "user" programs can see
the "utility" programs. It looks from the SVN logs like you are working
on this right now? Or did you want me to have a crack?
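
A minimal sketch of the kind of wrapper meant above, written in Python purely for illustration. It assumes the real binaries live in /usr/lib/meme and that the wrapper is installed in /usr/bin under the name of the tool it launches. The function names `wrapped_env`, `target_for` and `launch` are hypothetical and not part of MEME or of any existing package.

```python
import os

# Assumption: private directory holding the real MEME binaries.
MEME_LIBDIR = "/usr/lib/meme"


def wrapped_env(environ, libdir=MEME_LIBDIR):
    """Return a copy of environ with libdir prepended to PATH."""
    env = dict(environ)
    old_path = env.get("PATH", "")
    if old_path:
        env["PATH"] = libdir + os.pathsep + old_path
    else:
        env["PATH"] = libdir
    return env


def target_for(argv0, libdir=MEME_LIBDIR):
    """Map the name the wrapper was called as to the real program."""
    return os.path.join(libdir, os.path.basename(argv0))


def launch(argv):
    """Replace this process with the real program.

    "Utility" programs such as meme.bin or mast2txt are then found
    through the adjusted PATH without being installed in /usr/bin.
    """
    target = target_for(argv[0])
    os.execve(target, [target] + list(argv[1:]), wrapped_env(os.environ))
```

An installed wrapper would simply end with `launch(sys.argv)` after importing `sys`. One copy, symlinked once per "user" program, would be enough because the target is derived from the name it was invoked as.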
Cheers,
TIM
On Thu, 2013-02-14 at 08:28 +0100, Andreas Tille wrote:
> Hi Tim,
>
> as you are, at least in my perception, the main driver behind meme, could
> you write a first answer to this mail on the Debian Med mailing list?
>
> My motivation is to do as much as possible in preparation for Kiel so that
> we will be able to do the last polishing / testing there.
>
> See you
>
> Andreas.
>
> ----- Forwarded message from James Johnson <j.johnson at imb.uq.edu.au> -----
>
> Date: Thu, 14 Feb 2013 14:41:28 +1000
> From: James Johnson <j.johnson at imb.uq.edu.au>
> To: Andreas Tille <andreas at fam-tille.de>
> CC: MEME Support <meme at nbcr.net>,
> Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>,
> "H. Soon Gweon" <soonio at gmail.com>,
> Faheem Mitha <faheem at faheem.info>
> Subject: Re: Installation of binary tools inside MEME
> X-Spam_score: 0.0
>
> Hi Andreas,
>
> On 13/02/13 16:36, Andreas Tille wrote:
> > Hi James,
> >
> > On Wed, Feb 13, 2013 at 10:18:58AM +1000, James Johnson wrote:
> >> ...
> >>> Any hint would be welcome.
> >> I'll work on a list of what each of the programs is used for and get
> >> back to you. Even discounting the fossils there are probably quite a
> >> few scripts and programs only relevant to installing the MEME Suite
> >> on a webserver. For example a local user would almost never want to
> >> use the update_db script as it's a very clumsy way to get sequence
> >> data for specific tasks.
> > This sounds great - so we stay in idle mode until we hear some news from
> > your side.
> >
> > Thanks for your quick and helpful response
> I've attached an annotated list of the things that the MEME Suite
> currently installs to the bin directory in our main development branch
> (there may be minor differences to the current distribution). There's
> quite a few things that shouldn't be in there like Python libraries
> (there are 4) and a few Perl libraries (there are 2). There are 6
> programs which have no good reason to be there anymore annotated as
> "Fossil".
>
> Aside from that there are programs which have been obsoleted because
> there's a newer, better version (see most of the mhmm related things)
> and a few scripts which are only used by us developers. There's also a
> few scripts that would only be useful to someone running a webserver.
>
> After that the decisions get a lot harder. There are programs which
> are generally only called by other programs in the suite, however if
> they're not in the bin directory they won't be found. I'm not sure how
> you're going to work around the restriction on installing only
> programs "a user should execute" with those ones (like meme.bin called
> by the meme script or mast2txt called by mast). There's also quite a
> lot of programs that might be conceivably useful to someone somewhere
> so I'm not sure how you decide with those.
>
> ~James
>
> >
> > Andreas.
> >
>
>
> alphtype Webserver determines if an alphabet string is DNA or PROTEIN, I will probably reimplement as part of a Perl module
> ama Useful calculates average/maximum motif score for sequences, first step in gomo analysis
> ama-qvalues Useful calculates q-values for ama output
> ame Useful calculates motif enrichment in sequences
> beadstring Obsolete Obsoleted by MCAST
> beeml2meme Useful converts motifs to MEME format from BEEML format
> cat_max Fossil ...
> centrimo Useful calculates areas of localised motif enrichment
> ceqlogo Rarely Useful generates a single motif logo, it's not very user friendly and mostly called through the c interface
> changetoweb Fossil ...
> chen2meme Useful converts motifs to MEME format from Chen format
> clustalw2fasta Possibly Useful converts sequences in clustalw format to fasta format
> clustalw2phylip Possibly Useful converts sequences in clustalw format to phylip format
> clustalw-io Not Useful allows testing of the clustalw parser
> compare_dates Fossil another script used by the "download" script
> compute-prior-dist Possibly Useful computes the distribution of priors in a MEME PSP file
> compute-uniform-priors Possibly Useful computes a uniform prior psp file equal to the mean of all input priors in another psp file (missing doc)
> create-priors Useful allows running MEME in discriminative mode by creating a position specific prior file
> download Fossil old code for downloading sequence databases
> draw-mhmm Possibly Useful produces a Graphviz representation of an MHMM model
> dreme Useful Discover short DNA motifs
> dust Useful filters low complexity regions from sequences
> fasta-center Possibly Useful filters a set of sequences to only leave the central region
> fasta-dinucleotide-shuffle Useful shuffles a sequence while maintaining di-nucleotide frequencies
> fasta-dinucleotide-shuffle.py Python Library I'll get this moved into the libs directory like with the Perl modules.
> fasta-fetch Possibly Useful Seems to use an index generated by fasta-make-index to fetch sequences out of a fasta file
> fasta-get-markov Useful generates a Markov model of letter frequencies used as backgrounds by many MEME Suite programs
> fasta-hamming-enrich Possibly Useful compute the Hamming distance from a word to each sequence in two sets, apply Fisher Exact test
> fasta-hamming-enrich.py Python Library ...
> fasta-io Not Useful just allows testing of the fasta file reading
> fasta-make-index Possibly Useful makes an index of a fasta file
> fasta-most Useful finds the length of sequence that occurs most, used by MEME-ChIP
> fasta-shuffle-letters Possibly Useful shuffles letters of a sequence, though fasta-dinucleotide-shuffle is better
> fasta-subsample Useful selects a subset of the sequences
> fasta-unique-names Possibly Useful makes sequence names unique, replaces U+0001 with space
> fimo Useful searches for motif sites
> fisher_exact Possibly Useful computes the result of the Fisher Exact test with the given numbers
> fitevd Possibly Useful fits an extreme value distribution to a set of score-length pairs.
> gendb Possibly Useful generates a synthetic fasta database from a background model
> get_db_csv Webserver queries an online sequence repository for its databases and creates a csv file for update_db to use
> getsize Useful measures statistics about a fasta file
> glam2 Useful Discover gapped motifs
> glam2format Possibly Useful converts GLAM2 output to FASTA (with gaps) or MSF
> glam2html Useful converts GLAM2 output to HTML, called by glam2 but not often by users
> glam2mask Useful used with GLAM2 to mask out found motifs and find weaker ones
> glam2psfm Useful convert GLAM2 output to a MEME motif
> glam2scan Useful scans a sequence with a GLAM2 motif
> glam2scan2html Useful converts GLAM2SCAN output to HTML, called by glam2scan but not often by users
> gomo Useful finds enriched GO terms associated with high ranking genes
> gomo_highlight Useful post processes gomo XML output to include further information which makes the HTML better
> hart2meme-bkg Possibly Useful Convert a Hartemink background to a MEME background
> hartemink2psp Possibly Useful Convert a Hartemink PSP file into a MEME PSP file
> hypergeometric.py Python Library ...
> iupac2meme Useful Make a MEME motif from an IUPAC string
> jaspar2meme Useful Convert motifs from JASPAR to MEME format
> llr Possibly Useful Compute the probability distribution for the log-likelihood ratio (LLR) of N letters.
> mast Useful Find sequences which best match a group of motifs
> mast2txt Useful Convert mast XML output to mast text output. Called by mast
> mcast Useful Find matches to a motif hidden markov model
> meme Useful Discover motifs, this script calls meme.bin handling the details of parallelization
> meme2images Useful Create motif logos for all motifs in a MEME motif file
> meme2meme Useful Combine multiple MEME motif files into 1
> meme.bin Useful Discover motifs, this is typically called by the meme script
> meme-chip Useful Discover motifs, look for enriched motifs, calls MEME, DREME, CentriMo, TOMTOM, eventually FIMO and SpaMo
> meme-get-motif Possibly Useful Extract motifs from a MEME text file.
> meme-rename Possibly Useful Renames all the output HTML files from MEME-ChIP so they can be kept in one folder and emailed easily
> meme-xml-html Possibly Useful Does an XML transformation to convert XML output to HTML output, not actually specific to MEME
> metameme Fossil Used to handle web jobs for metameme
> mhmm Obsolete Given MEME motif, write a motif-based HMM, obsoleted by MCAST
> mhmm2html Obsolete Convert MHMM output to HTML, obsoleted by MCAST
> mhmme Obsolete obsoleted by MCAST
> mhmm-io Not Useful allows testing of reading/writing MHMM models
> mhmms Obsolete obsoleted by MCAST
> mhmmscan Obsolete obsoleted by MCAST
> motiph Possibly Useful part of a published paper so we want to keep it around
> nmica2meme Useful converts motifs from NMICA format to MEME format
> oldmeme2meme Fossil converts really really old MEME files into only really old MEME files...
> plotgen Webserver used to generate usage plots from the logs
> pmp_bf Possibly Useful calculates the statistical power of a phylogenetic motif model
> priority2meme Useful converts motifs in priority format to MEME format
> prior_utils.pl Perl library ...
> psp-gen Useful calculates PSP files for MEME
> purge Possibly Useful filters sequences to remove repeats
> qvalue Possibly Useful computes q-values from a list of p-values
> ramen Useful (bugs?) integration was never tested so it may have major bugs, does regression analysis of motif enrichment
> ranksum_test Possibly Useful calculates the rank-sum test
> read_fasta_file.pl Perl Library ...
> readseq Useful converts sequence formats
> reconcile-tree-alignment Possibly Useful identify the intersection of the sets of sequence IDs and leaf labels
> reduce-alignment Possibly Useful Extract specified columns from a multiple alignment.
> remove-alignment-gaps Possibly Useful Remove from an alignment all columns that correspond to a gap in a specified species.
> rna2meme Useful convert an RNA sequence to its binding motif in MEME format
> scpd2meme Useful convert motifs in SCPD format to MEME format
> sd Possibly Useful calculates mean and standard deviation of a list of numbers
> sequence.py Python Library ...
> shadow Not Useful related to the motiph program but never went anywhere, Perform phylogenetic shadowing
> spamo Useful motif spacing enrichment analysis
> taipale2meme Useful converts motifs from Taipale format to MEME format
> tamo2meme Useful converts motifs from TAMO format to MEME format
> tomtom Useful comparison of DNA motifs
> transfac2meme Useful converts motifs from Transfac matrices to MEME format
> tree Obsolete obsoleted by MCAST
> uniprobe2meme Useful convert motifs from Uniprobe format to MEME format
> update_db Webserver download sequences listed on the page get_db_list.cgi if their timestamp is newer
> update_meme_tests Not Useful updates the MEME and MAST smoke tests (they must be run first), shouldn't be needed by end users
> xsltproc_lite Rarely Useful used to generate the documentation from XML, the next version will not need it
>
>
> ----- End forwarded message -----
>
--
If you can't find an apposite quote for your sig, just make one up.
- Anon