[med-svn] r2195 - trunk/community/infrastructure/getData

smoe-guest at alioth.debian.org
Tue Jul 8 10:34:13 UTC 2008


Author: smoe-guest
Date: 2008-07-08 10:34:12 +0000 (Tue, 08 Jul 2008)
New Revision: 2195

Added:
   trunk/community/infrastructure/getData/README
Modified:
   trunk/community/infrastructure/getData/TODO
   trunk/community/infrastructure/getData/getData
Log:
Added Pfam A,B,C, no unpacking yet, though.


Added: trunk/community/infrastructure/getData/README
===================================================================
--- trunk/community/infrastructure/getData/README	                        (rev 0)
+++ trunk/community/infrastructure/getData/README	2008-07-08 10:34:12 UTC (rev 2195)
@@ -0,0 +1,18 @@
+getData
+=======
+
+While the community is discussing the possibility of preparing large Debian packages
+for the distribution of biological databases, this approach sets out to automate the
+provisioning itself: not the Debian package, but the local updates and first
+installations of the data.
+
+The entries swiss.dat (the manually curated fraction of the UniProt protein sequence
+database) and trembl.dat (the automated translation of coding sequences in the
+EMBL nucleotide sequence database) should already work well. For sites that
+have the EMBOSS toolkit installed, the respective indexing is performed as well.
+
+Please add databases for your purposes and let feedback fly in.
+
+Steffen Moeller
+
+-- Steffen Moeller <moeller at debian.org>  Tue, 08 Jul 2008 12:33:06 +0200

Modified: trunk/community/infrastructure/getData/TODO
===================================================================
--- trunk/community/infrastructure/getData/TODO	2008-07-08 07:49:30 UTC (rev 2194)
+++ trunk/community/infrastructure/getData/TODO	2008-07-08 10:34:12 UTC (rev 2195)
@@ -1,11 +1,18 @@
 getData - TODO
 ==============
 
+* separate machinery from database descriptions
 
-* separate machinery from database descriptions
+  Some configuration file should be read at the time the
+  application is started.
+
 * allow for Debian maintainers to install additional
   database descriptions
+
+  This could be achieved by reading in all files in a directory such as /etc/getData.d.
+
 * come up with a set of standard use cases as a start
+
 * define Debian packages and/or applications that must
   be installed prior to the execution of some scripts.
 

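The first two TODO items (reading a configuration at startup, and picking up extra database descriptions from a directory such as /etc/getData.d) could be approached roughly as follows. This is only a sketch under assumptions, not the script's actual code: the helper name load_descriptions, the ".conf" suffix, and the convention that each file is a Perl fragment returning a hash reference are all hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: merge all *.conf files found in $dir into one hash of
# database descriptions. Each file is ASSUMED to be a Perl fragment that
# evaluates to a hash reference in the same shape getData already uses, e.g.
#   +{ "pfam-a" => { name => "...", source => "wget --mirror ..." } }
sub load_descriptions {
    my ($dir) = @_;
    my %databases;

    # A missing or unreadable directory simply contributes nothing.
    opendir(my $dh, $dir) or return \%databases;
    my @files = sort grep { /\.conf\z/ } readdir($dh);
    closedir($dh);

    for my $name (@files) {
        # "do" searches @INC for relative paths, so hand it an absolute one.
        my $path = File::Spec->rel2abs(File::Spec->catfile($dir, $name));
        my $entries = do $path;
        unless (ref($entries) eq 'HASH') {
            warn "Skipping $path: "
               . ($@ || $! || "did not return a hash reference") . "\n";
            next;
        }
        # Files are read in sorted order; later files override earlier keys.
        @databases{ keys %$entries } = values %$entries;
    }
    return \%databases;
}

# Usage: list whatever descriptions are installed locally (possibly none).
my $databases = load_descriptions("/etc/getData.d");
print "$_\n" for sort keys %$databases;
```

Since "do" executes the file as Perl code, this only makes sense for a directory that is writable by trusted users (root, package maintainers). A plain data format would avoid that at the cost of a parser.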
Modified: trunk/community/infrastructure/getData/getData
===================================================================
--- trunk/community/infrastructure/getData/getData	2008-07-08 07:49:30 UTC (rev 2194)
+++ trunk/community/infrastructure/getData/getData	2008-07-08 10:34:12 UTC (rev 2195)
@@ -222,6 +222,21 @@
 		source => "wget --mirror http://www.reactome.org/download/interactions.README.txt http://www.reactome.org/download/current/homo_sapiens.interactions.txt.gz"
 	},
 
+	"pfam-a" => {
+		name => "Pfam-A : Manually curated protein families and domains, only the seed is presented.",
+		source => "wget --mirror ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-A-seed.gz"
+	},
+
+	"pfam-b" => {
+		name => "Pfam-B : Automated assembly of protein families",
+		source => "wget --mirror ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-B.gz"
+	},
+
+	"pfam-c" => {
+		name => "Pfam-C : Clans of sequences that may be assigned to multiple Pfam entries",
+		source => "wget --mirror ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-C.gz"
+	},
+
 	"trembl.dat" => {
 		name => "UniProt - TrEMBL in EMBL format",
 		source => "wget --mirror ftp://ftp.ebi.ac.uk/pub/databases/swissprot/release_compressed/uniprot_trembl.dat.gz",
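The log notes that the Pfam files are not unpacked yet. One way this step might look is sketched below, assuming the downloads sit where "wget --mirror" puts them by default (a directory tree named after the host and remote path). The helper names mirrored_path and unpack_gz are hypothetical and not part of getData. IO::Uncompress::Gunzip ships with core Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: map a source URL to the local path that
# "wget --mirror" uses by default, i.e. the URL without its scheme.
sub mirrored_path {
    my ($url) = @_;
    (my $path = $url) =~ s{^[a-z]+://}{}i;
    return $path;
}

# Hypothetical helper: write the uncompressed data next to the .gz file
# and return the new file name. The .gz itself is left untouched.
sub unpack_gz {
    my ($gz) = @_;
    (my $out = $gz) =~ s/\.gz\z// or die "Not a .gz file: $gz\n";
    gunzip($gz => $out) or die "gunzip failed for $gz: $GunzipError\n";
    return $out;
}

my @sources = (
    "ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-A-seed.gz",
    "ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-B.gz",
    "ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-C.gz",
);

for my $url (@sources) {
    my $gz = mirrored_path($url);
    next unless -e $gz;    # nothing mirrored yet for this entry
    print "Unpacked ", unpack_gz($gz), "\n";
}
```

Keeping the compressed file in place matters here: "--mirror" turns on timestamp checking, and wget compares against the local copy to decide whether a new download is needed.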



