[med-svn] [Debian Wiki] Update of "DebianMed/Meeting/Aberdeen2014_Report" by TimBooth
Debian Wiki
debian-www at lists.debian.org
Wed Feb 19 22:48:10 UTC 2014
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Debian Wiki" for change notification.
The "DebianMed/Meeting/Aberdeen2014_Report" page has been changed by TimBooth:
https://wiki.debian.org/DebianMed/Meeting/Aberdeen2014_Report?action=diff&rev1=2&rev2=3
Comment:
Done, at last
* #4 Install “ARB” X11 GUI phylogenetics suite (part of Bio-Linux)
* Looking at MrBayes MPI on Sunday. There were some issues getting this to work.
+ * Diagnosed problems on Sunday - fix requires package rebuild
+ ==== Work on Edam/Debtags/Tools registry ====
+ ''Investigate possible mechanisms of data interchange''
+
+ Steffen Möller, Matúš Kalaš, Kristoffer Rapacki, Olivier Sallou, Piotr Chmura, Emil Rydza
+ (probably splitting into subgroups)
+
+ * '''Main activities'''
+ * Mapping of DebTags to EDAM
+ * Draft 4.0 of tool description model
+ * Synchronisation of Debian Med's packages with the Tool Registry
+ * Integration of EDAM annotations into Debian Med
+ * '''Achievements'''
+ * made sure that the Registry description model accommodates all the attributes needed by Debian/DebianMed
+ * made sure that all the attributes essential to the Registry exist in DebianMed
+ * established from where in DebianMed the tool descriptions can be obtained regularly via programmatic access
+ * Outcome: Integration of Debtags with EDAM and with the registry model was determined to be '''doable without loss of data.'''
+ * This meeting has sparked off significant tasks and collaborations in this area
+ * Production of status report - see below.
+
+ ==== Packaging demonstration ====
+ Brad Chapman, Daniel Barker, Andreas Tille, Detlef Wolf, Iain Learmonth
+
+ * Packaged seqTK as a demo packaging task
+ * Demo went well - seqTK and DNAclust both packaged
+ * Attendees started their own packaging tasks:
+ * python3-fitbitscraper now built - asked Deb Python about pushing it
+ * started on ngila - work in progress
+
+ ==== Packaging the PubMed search (C + Java) from BioinfoC ====
+ Detlef Wolf, with help from Olivier Sallou, Andreas Tille, Jorge Soares, Iain Learmonth, Steffen Moeller, Tim Booth
+
+ * Roadmap to PubmedSearch packaging
+ * libbioinfoc-0.1.0: setup of GNU build system (configure.ac, Makefile.am)
+ to produce library (for shared and static linking)
+ * towards Debian initiation: gpg key & show passport, alioth account
+ * Java part of pubmed search: for next sprint
+ * Many contributors to improving the build system, so that
+ * AM build now works and an initial package has been produced.
+ * Gave an impromptu demo at 2pm (http://bioinfoc.ch)
+ * continued work to neaten up the package on Sunday:
+ * Steffen added the example prog to the bioinfoc package.
+ * Also the library gets a debugging package,
+ * close to ready for a Debian upload.
=== Personal Reports ===
@@ -71, +115 @@
* Gave a demo of a Python application visualising personal health data from FitBit ([[http://i.imgur.com/0NUpGc2.png|Screenshot]]).
* Took some pictures, link at bottom of main page.
+ ==== Steffen Möller ====
+
+ * Did a package of bio-parser-isatab (Perl lib) and it is nearly ready for commit
+
+ ==== Jorge Soares ====
+
+ * Commit fixes to snp-sites and tidy up
+ * Committed. snp-sites v1.5.0 has now been installed on all sid architectures
+ * Successful debugging of upstream issues
+ * Package Fastaq
+ * Initial debian git commit of Fastaq python package.
+ * Initial editing of several debian files.
+
==== Tim Booth ====
* Gave a short talk on Bio-Linux and recent updates
@@ -81, +138 @@
* Discussed roadmap to making Bio-Linux love the Galaxy toolshed with Brad and Peter C
* Planned with Kristoffer how to use the Tools Registry in BL and how to contribute
* Connected to the Qlustar cluster and tried some basic ops
+
+ ==== Niall Beard ====
+
+ * Interested in new packages + involvement in the tools registry group (Biocatalogue)
+ * Maybe looking at external tools in taverna with Steffen
+ * Joined Andreas and started packaging Coot - work ongoing
+ * Proceeded to productive discussion on tool description related issues
+
+ ==== Olivier Sallou ====
+
+ * Fix biojava and libgo-perl
+ * Package new upstream version biojava3
+ * New packages: discosnp and mapsembler2
+
+ ==== Peter Cock ====
+
+ * Worked with Brad and Tim - incl looking at Galaxy DEB package.
+ * Planned BOF at the next Galaxy conference after in-depth discussion on Galaxy toolshed issues and sane packaging.
+ * Looked into packaging an astronomy package.
+
+ ==== Brad Chapman ====
+
+ * Gave talk and demo on Cloud BioLinux
+ * Participated in packaging demo
+ * Worked on the manifest idea – list installed progs in CBL
+
+ A critical missing component of CloudBioLinux full and flavor-based custom
+ installs is defining the full environment of packages and versions available on
+ the system. We worked at the 2011 [BOSC] Codefest hackathon to add minimal support for
+ creating this full manifest of packages and versions, but the script required
+ integration into production workflows and numerous cleanups. During the first
+ day of the DebianMed Sprint I focused on converting this manifest creation into
+ a [[https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/manifest.py|production ready importable module]]. It
+ now handles creation of YAML files with packages and versions for all install
+ methods supported by CloudBioLinux (Debian packages; Python, R and Ruby library
+ installs; Homebrew packages; and custom CloudBioLinux scripts). The Debian
+ version is 10x faster than previously thanks to tips on querying apt repos from
+ Tim Booth.
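
As a rough illustration of the manifest idea described above, here is a minimal sketch that turns a package/version listing into a small YAML document. It is not the code from cloudbio/manifest.py: the function names, the YAML layout, and the sample package versions are assumptions made for this example. The input is assumed to be tab-separated text such as `dpkg-query -W -f='${Package}\t${Version}\n'` would print.

```python
def parse_dpkg_listing(listing):
    """Turn tab-separated 'name<TAB>version' lines into a {name: version} dict."""
    packages = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, version = line.split("\t", 1)
        packages[name] = version
    return packages


def to_yaml(packages):
    """Emit a minimal YAML mapping by hand, sorted by package name."""
    lines = []
    for name in sorted(packages):
        # Quote the version so strings like 0.7 are never read as numbers.
        lines.append('%s:\n  version: "%s"' % (name, packages[name]))
    return "\n".join(lines) + "\n"


# Hypothetical sample input; real input would come from the package manager.
sample = "samtools\t0.1.19-1\nbwa\t0.7.5a-2\n"
manifest = parse_dpkg_listing(sample)
print(to_yaml(manifest))
```

Collecting every package in one query and parsing the output in memory, rather than asking about packages one by one, is one plausible way to get the kind of speed-up mentioned above, although the report does not say which technique was actually used.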
+
+ These updates to manifest creation make it possible to integrate it into
+ existing tools that use CloudBioLinux for installation. The community developed
+ open source [[https://github.com/chapmanb/bcbio-nextgen|bcbio-nextgen]]
+ next-generation sequencing pipeline uses this, and we adjusted the build scripts
+ to
+ [[https://github.com/chapmanb/bcbio-nextgen/blob/master/bcbio/install.py#L231|generate manifests on installation]]
+ and then use these manifests to provide a list of the biological packages that
+ run as
+ [[https://github.com/chapmanb/bcbio-nextgen/blob/master/bcbio/provenance/programs.py#L184|part of the pipeline]].
+ This replaces brittle code existing in bcbio-nextgen and ties automated
+ installation to the new manifest feature, ensuring that manifest creation will be
+ regularly updated going forward for all CloudBioLinux installs.
+
+ Additionally, I worked to learn Debian package building thanks to help from
+ Andreas Tille. This resulted in creation of my first Debian package for
+ [[https://github.com/ekg/freebayes|FreeBayes]], a highly accurate variant caller
+ from Erik Garrison in the Marth Lab. I pushed a nearly completed version to
+ DebianMed, which Andreas helped to finalize and make available. The hope for
+ future versions of CloudBioLinux is to move back to Debian/Ubuntu based support
+ inside Docker containers, which will help this package replace a custom build
+ function in CloudBioLinux with a proper package.
+
+ === Report of tool registry working group (Reporting by Matúš) ===
+
+
+ ==== Summary ====
+
+ '''The following was achieved:'''
+ * we made sure that the Registry description model accommodates all the attributes needed by Debian/DebianMed
+ * we made sure that all the attributes essential to the Registry exist in DebianMed
+ * we established from where in DebianMed the tool descriptions can be obtained regularly via programmatic access
+
+ '''Motivation:'''
+ * Community of ToolRegistry and Debian Med is expected to significantly overlap -> effort should not be performed redundantly
+ * Expected Synergies
+ * increased visibility of Debian Med's efforts to scientific community
+ * head start for ToolRegistry with data provided
+ * facilitation of Debian packaging with tool descriptions, prioritization of efforts
+ * any mechanism for ensuring that tool description of desired 'tasks' is in the Tool Registry? (Not via harassment of Andreas)
+ * maybe in the later future: test I/O data pairs for automated testing & benchmarking may be recorded in the registry and useful for automated testing in Debian
+ * Best possible annotation for tools (and databases) in Computational Biology, resulting in improved accessibility, visibility & attribution, and provenance within the field
+
+ '''Constraints and challenges:'''
+ * Non-intrusive to ease acceptance in working communities
+ * maintainers are not forced to link to the Tool Registry
+ * maintainers may perhaps have a choice of ignoring the Tool Registry, importing information from the registry upon request, or some form of automatic updates with or without confirmation
+ * Licensing of debian-provided annotation - [[http://anonscm.debian.org/viewvc/debian-med/trunk/packages/dialign/trunk/debian/copyright?view=markup|Example here]]
+ * Difficulty of reconciling "source" packages and their general annotation with Debian's more fine-grained separation of binaries, APIs/libs, data, scripts, debug information ... and many bits and pieces that should be considered intrinsic parts of one tool
+ * Other way round too, package being a collection or an ad hoc cluster of tools
+
+ '''Ideas for implementation:'''
+ * The Tool Registry harvesting information regularly from Debian Med and including references to the created Tool Registry entries back into Debian
+ * ''into debian/control (probably not) and/or debian/upstream and/or 'tasks' (these 2 probably reasonable and possibly optional)''
+ * Debian Maintainers are encouraged to add tags to reference an eventual Tool Registry entry and an option for eventual automated imports from the registry
+ * ''into debian/control (probably not) and/or debian/upstream and/or 'tasks' (these 2 probably reasonable and possibly optional)''
+ * Tool information in Debian Med:
+ * [[UltimateDebianDatabase|Ultimate Debian Database]] should integrate all information from
+ * which packages are in Debian
+ * debian/control - [[http://anonscm.debian.org/viewvc/debian-med/trunk/packages/dialign/trunk/debian/control?view=markup|Example here]]
+ * debian/upstream [[http://anonscm.debian.org/viewvc/debian-med/trunk/packages/dialign/trunk/debian/upstream?view=markup|Example here]]
+ * ''[[http://anonscm.debian.org/viewvc/debian-med/trunk/package_template|recommended template of the debian files]]''
+ * See Also: [[Debtags|DebTags on the Wiki]], [[http://debtags.alioth.debian.org/paper-debtags.html|DebTags paper]], [[http://en.wikipedia.org/wiki/Faceted_classification|Faceted Classification]], [[http://debtags.debian.net|DebTags home]], [[https://wiki.debian.org/Debtags/FAQ|DebTags FAQ]]
+ * [[http://anonscm.debian.org/viewvc/blends/projects/med/trunk/debian-med/tasks|'tasks' page]]; [[http://blends.debian.org/med/tasks/bio|is shown here]]; [[http://blends.debian.org/blends/apa.html#staticwebpages|populated from here]]; [[http://anonscm.debian.org/gitweb/?p=blends/website.git;a=blob;f=webtools/tasks.py|and code is here]]
+ * description of unpackaged tools ignored until Tool Registry finds it useful to import them
+ * additional information about packages ignored until found useful
+ * matching of packages with registry entries may be implemented via information in the 'tasks' file (this may be likely in case the references are not desired in a package itself)
+ * '' '''Important:''' Andreas has recently made it so that almost all relevant stuff from the ‘tasks’ file is in UDD ''
+ * Contact information in the debian/copyright - [[http://anonscm.debian.org/viewvc/debian-med/trunk/packages/dialign/trunk/debian/copyright?view=markup|Example maintained here]]
+ * ''Andreas may be willing to include these into the UDD, as now they aren’t there''
+ * online manpages can be included in registry among documentation URLs [[http://manpages.debian.net/cgi-bin/man.cgi?query=<pkgname>]]
+ * Access to UDD via public pythonned postgres
+ * First attempt:
+ 1. Get all Deb Med descriptions from UDD
+ 2a. Description of packages that had already been recorded in the registry will be fully overwritten
+ 3a. Return registry accessions for newly created (or all) entries
+ 4. Let Andreas et al decide in which form they want to get them and record them
+ * In a later iteration:
+ 2b. Solve synchronisation to allow update of descriptions without overwriting (likely via timestamps of imported information)
+ 3b. Let Debian people decide whether, how - and eventually with what options - updated information about packages is recorded back to Debian
+ * Integration of information (the “federated” model):
+ 2c. Handle synchronisation of information updates from multiple sources (for simplicity start with Deb Med and SEQwiki?)
+ * In future, would it be of interest to automatically (optionally manually) populate debian/upstream with enriched scientific & semantic information? [[https://docs.google.com/document/d/19VpzwxZdlz1K4P1q1a-WYZUtiSXwUp2nafM716dzW8I/edit?pli=1#bookmark=id.v415n1pdyfl0|See also the sketch here]]
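
The "first attempt" and "later iteration" steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Registry` class, its method names, the `REG0001`-style accessions, and the description texts are all hypothetical, and an in-memory dict stands in for both UDD and the Tool Registry.

```python
import itertools


class Registry:
    """Toy in-memory stand-in for the Tool Registry (hypothetical API)."""

    def __init__(self):
        self.entries = {}               # package name -> entry dict
        self._ids = itertools.count(1)  # source of made-up accession numbers

    def _get_or_create(self, name):
        if name not in self.entries:
            self.entries[name] = {
                "accession": "REG%04d" % next(self._ids),
                "description": None,
                "timestamp": None,
            }
        return self.entries[name]

    def import_overwrite(self, descriptions, timestamp=0):
        """Steps 1, 2a, 3a: overwrite every description, return accessions."""
        accessions = {}
        for name, text in descriptions.items():
            entry = self._get_or_create(name)
            entry["description"] = text
            entry["timestamp"] = timestamp
            accessions[name] = entry["accession"]
        return accessions

    def import_if_newer(self, name, text, timestamp):
        """Step 2b: replace a description only when the import is newer."""
        entry = self._get_or_create(name)
        if entry["timestamp"] is None or timestamp > entry["timestamp"]:
            entry["description"] = text
            entry["timestamp"] = timestamp
            return True
        return False


registry = Registry()
# Placeholder descriptions; real ones would be fetched from UDD.
accessions = registry.import_overwrite(
    {"dialign": "placeholder text A", "snp-sites": "placeholder text B"},
    timestamp=1,
)
stale = registry.import_if_newer("dialign", "older text", timestamp=0)
fresh = registry.import_if_newer("dialign", "newer text", timestamp=2)
print(accessions, stale, fresh)
```

The returned accessions are what step 4 would hand back to Debian. How they are recorded there is left open, as in the list above.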
+
+ '''Other sources:'''
+ * [[http://directory.fsf.org/wiki/Main_Page|Free software directory]] of FSF harvests and integrates information from multiple sources including but not limited to UDD
+ * Should certainly be heavily included in the Tool Registry effort
+ * Would be enormously useful to get design suggestion from FSF directory architects in particular about the information integration (“federated” model)
+ * [[http://taverna.nordugrid.org/sharedRepository/index.php|Nordugrid Taverna WF elements]]
+ * Useful for import to registry, or are those tools anyway better described elsewhere?
+ * Registry accessions could be included into the Nordugrid XML description
+ * [[http://nebc.nerc.ac.uk/tools/bio-linux/package-list|Bio-Linux]] has only a few packages that aren’t in Deb Med or aren’t planned to be included in Deb, and still are well-defined software (these may be e.g. unfree or hard to package or too ad-hoc packages)
+ * Should be added to the Tool Registry manually, done person-to-person with Tim (who knows everything relevant about those tools that are relevant for registration)
+ * CloudBioLinux is full of various stuff
+ * thorough tool information starting to be in focus now: the “manifest” which is going to be a YAML about installed stuff. Expected with thrill! :-)
+ * Information from the Tool Registry may be of great benefit to CloudBioLinux
+ * Debian Nonfree: Is included in UDD and among tasks. Bits that aren’t included in those should be included in those :)
+
+
+ '''Integration of EDAM annotations into Debian Med'''
+ * DebTags, Enrico Zini [http://debtags.debian.net, https://wiki.debian.org/Debtags/FAQ]
+ * ‘tasks’ categorisation
+ * '''Challenges:'''
+ A. EDAM concepts need to be identified by alphanumeric IDs/URIs, because terms may & do change in time
+ -- At the same time, of course, the terms need to be presented to both the users and annotators
+ A. Lower priority but possibly high coolness: Search/filtering/grouping by EDAM DAG
+ * '''Solutions:'''
+ A. Separate mapping file (to be packaged) between DebTags and external vocabularies
+ -- Start with EDAM and Media types (in order to have more than EDAM only)
+ -- Before the larger mapping effort, DebTags need to be refactored (by us in accord with Enrico)
+ -- After the mapping, information about the external concepts should be shown to Debian taggers in the tagging Web app
+ A. Record information about available external vocabularies in the mapping files, at the Facet level
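
One possible shape for the separate mapping file proposed above is sketched below as a plain Python structure. Everything specific in it is an assumption for illustration: the tag names, the `topic_0000`/`format_0000` identifiers, and the media type are placeholders, not real DebTags or EDAM entries. The point is only that concepts are keyed by stable IDs/URIs while the human-readable label travels alongside, and that available vocabularies are recorded per facet.

```python
# Hypothetical mapping between DebTags and external vocabularies.
MAPPING = {
    # Vocabularies available for each facet (recorded at the facet level).
    "facets": {
        "field": ["EDAM"],
        "works-with-format": ["EDAM", "Media types"],
    },
    # Tag -> external concepts, identified by ID/URI, with a display label.
    "tags": {
        "field::example-topic": [
            {"vocabulary": "EDAM",
             "id": "http://edamontology.org/topic_0000",  # placeholder ID
             "label": "Example topic"},
        ],
        "works-with-format::example-format": [
            {"vocabulary": "EDAM",
             "id": "http://edamontology.org/format_0000",  # placeholder ID
             "label": "Example format"},
            {"vocabulary": "Media types",
             "id": "application/x-example",  # placeholder media type
             "label": "Example format"},
        ],
    },
}


def concepts_for(tag, vocabulary=None):
    """Return the external concepts mapped to a tag, optionally filtered."""
    concepts = MAPPING["tags"].get(tag, [])
    if vocabulary is None:
        return list(concepts)
    return [c for c in concepts if c["vocabulary"] == vocabulary]


def vocabularies_for(tag):
    """Look up which vocabularies are declared for the tag's facet."""
    facet = tag.split("::", 1)[0]
    return MAPPING["facets"].get(facet, [])


print(concepts_for("works-with-format::example-format", "Media types"))
print(vocabularies_for("field::example-topic"))
```

Because lookups go through the ID, a renamed EDAM term would only require updating the `label` field, which matches the first challenge listed above.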
+
+
+ '''Tool description model draft 4.0'''
+ * Alignment with Deb Med pkg description
+ * status: fully drafted except DebTags (todo during refactoring of DebTags - see also above)
+ * Compatibility of the tool description XSD with Emil’s & Piotr’s tooling: …
+ * Finishing the tool description XSD and making it compatible with Emil’s & Piotr’s tooling & v.v.
+ * Main bits to finish: Interfaces, Versions
+ * Todo soon: Release new minor BioXSD version catering for the needs of the tool description XSD
+
+ === Miscellaneous ===
+
+ Keysigning all round.
+
----
CategorySprint