[Debian-med-packaging] Installation of binary tools inside MEME

James Johnson j.johnson at imb.uq.edu.au
Thu Feb 14 04:41:28 UTC 2013


Hi Andreas,

On 13/02/13 16:36, Andreas Tille wrote:
> Hi James,
>
> On Wed, Feb 13, 2013 at 10:18:58AM +1000, James Johnson wrote:
>> ...
>>> Any hint would be welcome.
>> I'll work on a list of what each of the programs is used for and get
>> back to you. Even discounting the fossils there are probably quite a
>> few scripts and programs only relevant to installing the MEME Suite
>> on a webserver. For example a local user would almost never want to
>> use the update_db script as it's a very clumsy way to get sequence
>> data for specific tasks.
> This sounds great - so we stay in idle mode until we hear some news from
> your side.
>
> Thanks for your quick and helpful response
I've attached an annotated list of the things that the MEME Suite
currently installs to the bin directory in our main development branch
(there may be minor differences from the current distribution). There
are quite a few things that shouldn't be in there, such as Python
libraries (there are 4) and Perl libraries (there are 2). There are
also 6 programs, annotated as "Fossil", which no longer have any good
reason to be there.

Aside from that, there are programs which have been obsoleted by a
newer, better version (see most of the mhmm-related things) and a few
scripts which are only used by us developers. There are also a few
scripts that would only be useful to someone running a webserver.

After that the decisions get a lot harder. There are programs which are
generally only called by other programs in the suite; however, if
they're not in the bin directory they won't be found. I'm not sure how
you're going to work around the restriction on installing only programs
"a user should execute" with those ones (like meme.bin called by the
meme script or mast2txt called by mast). There are also quite a lot of
programs that might conceivably be useful to someone somewhere, so I'm
not sure how you decide with those.
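
For illustration only, one way a calling script could locate such a
helper when it lives outside the bin directory is to search a private
directory before falling back to PATH. This is a sketch under
assumptions, not how the MEME Suite scripts actually work: the function
name find_helper and the directory /usr/lib/meme are hypothetical.

```python
import os
import shutil


def find_helper(name, private_dirs=()):
    """Return the full path of helper `name`, or None if it is not found.

    `private_dirs` (hypothetical, e.g. ["/usr/lib/meme"]) is searched
    before the directories listed in the PATH environment variable.
    """
    entries = list(private_dirs)
    entries += os.environ.get("PATH", os.defpath).split(os.pathsep)
    # Drop empty entries so the current directory is not searched by accident.
    search_path = os.pathsep.join(entry for entry in entries if entry)
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Hypothetical private directory; prints None if mast2txt is not installed.
    print(find_helper("mast2txt", ["/usr/lib/meme"]))
```

The calling script would still need to be changed to use such a lookup,
so this only moves the problem rather than removing it.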

~James

>
>        Andreas.
>

-------------- next part --------------
alphtype                        	Webserver         	determines if an alphabet string is DNA or PROTEIN, I will probably reimplement as part of a Perl module
ama                             	Useful            	calculates average/maximum motif score for sequences, first step in gomo analysis
ama-qvalues                     	Useful            	calculates q-values for ama output
ame                             	Useful            	calculates motif enrichment in sequences
beadstring                      	Obsolete          	Obsoleted by MCAST
beeml2meme                      	Useful            	converts motifs to MEME format from BEEML format
cat_max                         	Fossil            	...        
centrimo                        	Useful            	calculates areas of localised motif enrichment
ceqlogo                         	Rarely Useful     	generates a single motif logo, it's not very user friendly and is mostly called through the C interface
changetoweb                     	Fossil            	...
chen2meme                       	Useful            	converts motifs to MEME format from Chen format
clustalw2fasta                  	Possibly Useful   	converts sequences in clustalw format to fasta format
clustalw2phylip                 	Possibly Useful   	converts sequences in clustalw format to phylip format
clustalw-io                     	Not Useful        	allows testing of the clustalw parser
compare_dates                   	Fossil            	another script used by the "download" script
compute-prior-dist              	Possibly Useful   	computes the distribution of priors in a MEME PSP file
compute-uniform-priors          	Possibly Useful   	computes a uniform prior psp file equal to the mean of all input priors in another psp file (missing doc)
create-priors                   	Useful            	allows running MEME in discriminative mode by creating a position specific prior file
download                        	Fossil            	old code for downloading sequence databases
draw-mhmm                       	Possibly Useful   	produces a Graphviz representation of an MHMM model
dreme                           	Useful            	Discover short DNA motifs
dust                            	Useful            	filters low complexity regions from sequences
fasta-center                    	Possibly Useful   	filters a set of sequences to only leave the central region
fasta-dinucleotide-shuffle      	Useful            	shuffles a sequence while maintaining di-nucleotide frequencies
fasta-dinucleotide-shuffle.py   	Python Library    	I'll get this moved into the libs directory like with the Perl modules.
fasta-fetch                     	Possibly Useful   	Seems to use an index generated by fasta-make-index to fetch sequences out of a fasta file
fasta-get-markov                	Useful            	generates a Markov model of letter frequencies used as backgrounds by many MEME Suite programs
fasta-hamming-enrich            	Possibly Useful   	compute the Hamming distance from a word to each sequence in two sets, apply Fisher Exact test
fasta-hamming-enrich.py         	Python Library    	...
fasta-io                        	Not Useful        	just allows testing of the fasta file reading
fasta-make-index                	Possibly Useful   	makes an index of a fasta file
fasta-most                      	Useful            	finds the length of sequence that occurs most, used by MEME-ChIP
fasta-shuffle-letters           	Possibly Useful   	shuffles letters of a sequence, though fasta-dinucleotide-shuffle is better
fasta-subsample                 	Useful            	selects a subset of the sequences
fasta-unique-names              	Possibly Useful   	makes sequence names unique, replaces U+0001 with space
fimo                            	Useful            	searches for motif sites
fisher_exact                    	Possibly Useful   	computes the result of the Fisher Exact test with the given numbers
fitevd                          	Possibly Useful   	fits an extreme value distribution to a set of score-length pairs.
gendb                           	Possibly Useful   	generates a synthetic fasta database from a background model
get_db_csv                      	Webserver         	queries an online sequence repository for its databases and creates a csv file for update_db to use
getsize                         	Useful            	measures statistics about a fasta file
glam2                           	Useful            	Discover gapped motifs
glam2format                     	Possibly Useful   	converts GLAM2 output to FASTA (with gaps) or MSF
glam2html                       	Useful            	converts GLAM2 output to HTML, called by glam2 but not often by users
glam2mask                       	Useful            	used with GLAM2 to mask out found motifs and find weaker ones
glam2psfm                       	Useful            	convert GLAM2 output to a MEME motif
glam2scan                       	Useful            	scans a sequence with a GLAM2 motif
glam2scan2html                  	Useful            	converts GLAM2SCAN output to HTML, called by glam2scan but not often by users
gomo                            	Useful            	finds enriched GO terms associated with high ranking genes
gomo_highlight                  	Useful            	post processes gomo XML output to include further information which makes the HTML better
hart2meme-bkg                   	Possibly Useful   	Convert a Hartemink background to a MEME background
hartemink2psp                   	Possibly Useful   	Convert a Hartemink PSP file into a MEME PSP file
hypergeometric.py               	Python Library    	...
iupac2meme                      	Useful            	Make a MEME motif from an IUPAC string
jaspar2meme                     	Useful            	Convert motifs from JASPAR to MEME format
llr                             	Possibly Useful   	Compute the probability distribution for the log-likelihood ratio (LLR) of N letters.
mast                            	Useful            	Find sequences which best match a group of motifs
mast2txt                        	Useful            	Convert mast XML output to mast text output. Called by mast
mcast                           	Useful            	Find matches to a motif hidden Markov model
meme                            	Useful            	Discover motifs, this script calls meme.bin handling the details of parallelization
meme2images                     	Useful            	Create motif logos for all motifs in a MEME motif file
meme2meme                       	Useful            	Combine multiple MEME motif files into 1
meme.bin                        	Useful            	Discover motifs, this is typically called by the meme script
meme-chip                       	Useful            	Discover motifs, look for enriched motifs, calls MEME, DREME, CentriMo, TOMTOM, eventually FIMO and SpaMo
meme-get-motif                  	Possibly Useful   	Extract motifs from a MEME text file.
meme-rename                     	Possibly Useful   	Renames all the output HTML files from MEME-ChIP so they can be kept in one folder and emailed easily
meme-xml-html                   	Possibly Useful   	Does an XML transformation to convert XML output to HTML output, not actually specific to MEME
metameme                        	Fossil            	Used to handle web jobs for metameme
mhmm                            	Obsolete          	Given MEME motif, write a motif-based HMM, obsoleted by MCAST
mhmm2html                       	Obsolete          	Convert MHMM output to HTML, obsoleted by MCAST
mhmme                           	Obsolete          	obsoleted by MCAST
mhmm-io                         	Not Useful        	allows testing of reading/writing MHMM models
mhmms                           	Obsolete          	obsoleted by MCAST
mhmmscan                        	Obsolete          	obsoleted by MCAST
motiph                          	Possibly Useful   	part of a published paper so we want to keep it around
nmica2meme                      	Useful            	converts motifs from NMICA format to MEME format
oldmeme2meme                    	Fossil            	converts really really old MEME files into only really old MEME files...
plotgen                         	Webserver         	used to generate usage plots from the logs
pmp_bf                          	Possibly Useful   	calculates the statistical power of a phylogenetic motif model
priority2meme                   	Useful            	converts motifs in priority format to MEME format
prior_utils.pl                  	Perl Library      	...
psp-gen                         	Useful            	calculates PSP files for MEME
purge                           	Possibly Useful   	filters sequences to remove repeats
qvalue                          	Possibly Useful   	computes q-values from a list of p-values
ramen                           	Useful (bugs?)    	integration was never tested so it may have major bugs, does regression analysis of motif enrichment
ranksum_test                    	Possibly Useful   	calculates the rank-sum test
read_fasta_file.pl              	Perl Library      	...
readseq                         	Useful            	converts sequence formats
reconcile-tree-alignment        	Possibly Useful   	identify the intersection of the sets of sequence IDs and leaf labels
reduce-alignment                	Possibly Useful   	Extract specified columns from a multiple alignment.
remove-alignment-gaps           	Possibly Useful   	Remove from an alignment all columns that correspond to a gap in a specified species. 
rna2meme                        	Useful            	convert an RNA sequence to its binding motif in MEME format
scpd2meme                       	Useful            	convert motifs in SCPD format to MEME format
sd                              	Possibly Useful   	calculates mean and standard deviation of a list of numbers
sequence.py                     	Python Library    	...
shadow                          	Not Useful        	related to the motiph program but never went anywhere, Perform phylogenetic shadowing
spamo                           	Useful            	motif spacing enrichment analysis
taipale2meme                    	Useful            	converts motifs from Taipale format to MEME format
tamo2meme                       	Useful            	converts motifs from TAMO format to MEME format
tomtom                          	Useful            	comparison of DNA motifs
transfac2meme                   	Useful            	converts motifs from Transfac matrices to MEME format
tree                            	Obsolete          	obsoleted by MCAST
uniprobe2meme                   	Useful            	convert motifs from Uniprobe format to MEME format
update_db                       	Webserver         	downloads sequences listed on the page get_db_list.cgi if their timestamp is newer
update_meme_tests               	Not Useful        	updates the MEME and MAST smoke tests (they must be run first), shouldn't be needed by end users
xsltproc_lite                   	Rarely Useful     	used to generate the documentation from XML, the next version will not need it
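
Since the list above is tab-separated (name, annotation, description),
it can be grouped by annotation to see at a glance which programs fall
into each category. The following is only a rough sketch: the function
name group_by_category is made up for illustration, and the sample rows
are copied from the list with shortened descriptions.

```python
from collections import defaultdict


def group_by_category(text):
    """Map each annotation category (lowercased) to its program names."""
    groups = defaultdict(list)
    for line in text.splitlines():
        # Columns are tab-separated; the padding spaces are stripped.
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        name, category = fields[0], fields[1]
        # Lowercase so "Perl library" and "Perl Library" land together.
        groups[category.lower()].append(name)
    return dict(groups)


# A few rows copied from the list above, descriptions shortened.
SAMPLE = "\n".join([
    "cat_max        \tFossil        \t...",
    "meme.bin       \tUseful        \tDiscover motifs",
    "prior_utils.pl \tPerl library  \t...",
    "read_fasta_file.pl\tPerl Library\t...",
    "update_db      \tWebserver     \tdownload sequences",
])

if __name__ == "__main__":
    for category, names in sorted(group_by_category(SAMPLE).items()):
        print(category, names)
```

Run against the full attachment, the same function would give one list
of names per annotation, which could serve as a starting point for
deciding what goes into the bin directory.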
