[med-svn] [Git][med-team/sepp][master] 6 commits: Using configuration files per user in ~/.sepp
Pierre Gruet
gitlab at salsa.debian.org
Fri Oct 9 17:01:24 BST 2020
Pierre Gruet pushed to branch master at Debian Med / sepp
Commits:
5f269fbe by Pierre Gruet at 2020-10-08T15:55:34+02:00
Using configuration files per user in ~/.sepp
- - - - -
bca24558 by Pierre Gruet at 2020-10-08T15:56:18+02:00
Correcting another wrong file read in the Java part
- - - - -
f738f0a1 by Pierre Gruet at 2020-10-08T21:49:57+02:00
Documenting the changes from upstream concerning configuration files
debian/README.source and debian/README.Debian explain what changes have been
made and what the user needs to do.
Manpages have been updated to reflect this.
- - - - -
eeabe9c4 by Pierre Gruet at 2020-10-08T22:16:40+02:00
Postponing the inclusion of upp until its dependency pasta is packaged
- - - - -
5eedbb21 by Pierre Gruet at 2020-10-08T22:19:09+02:00
Preparing changelog with ITP number
- - - - -
05aab011 by Pierre Gruet at 2020-10-09T17:54:06+02:00
Polishing files in debian/ with minor changes
Deleting extra whitespaces
Correcting spelling errors
Using neutral they/their instead of she/her
- - - - -
20 changed files:
- + debian/README.Debian
- + debian/README.source
- debian/changelog
- debian/control
- debian/copyright
- debian/docs
- debian/install
- debian/man/run_abundance.py.1
- debian/man/run_sepp.py.1
- debian/man/run_tipp.py.1
- debian/man/run_upp.py.1
- debian/man/sepp.config.5
- debian/man/tipp.config.5
- debian/man/upp.config.5
- debian/patches/use_etc_configuration_files.patch → debian/patches/configuration_files_in_etc_and_per_user.patch
- debian/patches/looking_for_integer_in_placements.patch
- debian/patches/series
- debian/rules
- debian/sepp.manpages
- debian/watch
Changes:
=====================================
debian/README.Debian
=====================================
@@ -0,0 +1,37 @@
+README sepp for Debian
+======================
+
+Programs can be run with the Python launcher files placed into /usr/bin.
+
+Further information can be found in the tutorial and README files written by
+the upstream developers, which are installed in /usr/share/doc/sepp/.
+The Debian packaging nevertheless incorporates some changes, mainly concerning
+configuration files. These are described below.
+
+
+Differences to sepp (upstream)
+------------------------------
+
+The upstream packaging of sepp includes a file path.config which contains a
+single line with the directory in which the configuration files lie.
+
+We have removed this file and instead placed default versions of the
+configuration files in /etc/sepp/ . These can be overridden by every user by
+copying them into a .sepp/ directory in their home directory, for instance by
+executing
+ mkdir ~/.sepp && cp /etc/sepp/*.config ~/.sepp/
+
+We have also renamed the upstream file from main.config to sepp.config .
+
+
+Tipp: getting the reference dataset and setting things up
+---------------------------------------------------------
+
+Tipp uses a reference dataset, which can be found at
+ https://github.com/tandyw/tipp-reference/releases/download/v2.0.0/tipp.zip
+
+Users who want to run Tipp should therefore download the dataset and then either
+ask their system administrator to set its path in the [reference] section of
+/etc/sepp/tipp.config (see manpage tipp.config(5)), or create their own
+configuration files as explained above and set the path to the dataset in the
+[reference] section of ~/.sepp/tipp.config.
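The copy step described in the README above can also be sketched in Python. This is only an illustration, not part of the package: the function name copy_default_configs and its parameters are hypothetical. Unlike the shell one-liner, it does not fail when ~/.sepp already exists and it leaves existing user copies untouched.

```python
import shutil
from pathlib import Path


def copy_default_configs(system_dir="/etc/sepp", user_dir=None):
    """Copy *.config files from system_dir into user_dir without overwriting.

    Hypothetical helper: returns the list of files that were newly created.
    """
    src = Path(system_dir)
    dest = Path(user_dir) if user_dir is not None else Path.home() / ".sepp"
    dest.mkdir(parents=True, exist_ok=True)

    created = []
    for config in sorted(src.glob("*.config")):
        target = dest / config.name
        if target.exists():
            # Keep the user's edited copy rather than replacing it.
            continue
        shutil.copy2(config, target)
        created.append(target)
    return created
```

Calling copy_default_configs() with no arguments would populate ~/.sepp from /etc/sepp; passing explicit directories makes it easy to try out in a scratch location first.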
=====================================
debian/README.source
=====================================
@@ -0,0 +1,15 @@
+We have brought changes to the upstream packaging by not using the path.config
+file from upstream and instead providing configuration files as conffiles: a
+default version is placed into /etc/sepp, and every user can make a copy to a
+dot directory in their home directory in order to be able to edit them. This
+is explained in debian/README.Debian.
+
+Besides, to be run, tipp depends on a reference dataset that has to be
+downloaded by the user. Again this is explained in debian/README.Debian.
+
+We have not packaged upp for the moment although the effort of preparing it has
+been made, because its dependency pasta is not packaged yet. As the main
+functionalities of the package are already brought by sepp and tipp, we think
+it is worth having sepp and tipp in Debian and adding upp later.
+
+ -- Pierre Gruet <pgtdebian at free.fr> Thu, 08 Oct 2020 21:57:59 +0200
=====================================
debian/changelog
=====================================
@@ -1,5 +1,5 @@
sepp (4.3.10+dfsg-1) UNRELEASED; urgency=medium
- * Initial release (Closes: #<bug>)
+ * Initial release (Closes: #971870)
- -- Pierre Gruet <pgtdebian at free.fr> Wed, 07 Oct 2020 21:32:48 +0200
+ -- Pierre Gruet <pgtdebian at free.fr> Thu, 08 Oct 2020 22:18:43 +0200
=====================================
debian/control
=====================================
@@ -37,9 +37,8 @@ Depends: ${shlibs:Depends},
libcommons-logging-java,
ncbi-blast+
Description: methods use ensembles of Hidden Markov Models (HMM)
- The tools SEPP, TIPP, UPP and HIPPI implementing three methods use
- ensembles of Hidden Markov Models (HMMs) in different ways, each
- focusing on a different problem.
+ The tools SEPP and TIPP implementing these methods use ensembles of Hidden
+ Markov Models (HMMs) in different ways, each focusing on a different problem.
.
SEPP stands for "SATe-enabled Phylogenetic Placement", and addresses the
problem of phylogenetic placement of short reads into reference
@@ -48,12 +47,3 @@ Description: methods use ensembles of Hidden Markov Models (HMM)
TIPP stands for "Taxonomic Identification and Phylogenetic Profiling",
and addresses the problem of taxonomic identification and abundance
profiling of metagenomic data.
- .
- UPP stands for "Ultra-large alignments using Phylogeny-aware Profiles",
- and addresses the problem of alignment of very large datasets,
- potentially containing fragmentary data. UPP can align datasets with up
- to 1,000,000 sequences.
- .
- HIPPI stands for "Highly Accurate Protein Family Classification with
- Ensembles of HMMs", and addresses the problem of classifying query
- sequences to protein families.
=====================================
debian/copyright
=====================================
@@ -14,7 +14,7 @@ License: GPL-3+
Files: debian/*
Copyright: 2020 Andreas Tille <tille at debian.org>
- 2020 Pierre Gruet <pgtdebian at ree.fr>
+ 2020 Pierre Gruet <pgtdebian at free.fr>
License: GPL-3+
License: GPL-3+
=====================================
debian/docs
=====================================
@@ -1,6 +1,6 @@
README.SEPP.md
README.TIPP.md
-README.UPP.md
+# README.UPP.md
tutorial/sepp-tutorial.md
tutorial/tipp-tutorial.md
-tutorial/upp-tutorial.md
+# tutorial/upp-tutorial.md
=====================================
debian/install
=====================================
@@ -1,4 +1,4 @@
sepp.config etc/sepp
tipp.config etc/sepp
-upp.config etc/sepp
+# upp.config etc/sepp
tools/merge/*JsonMerger.jar usr/share/sepp/
=====================================
debian/man/run_abundance.py.1
=====================================
@@ -5,7 +5,7 @@ run_abundance.py \- helper script to estimate the abundance at a given taxonomic
usage: run_abundance.py [\-h] [\-v] [\-A N] [\-P N] [\-F N] [\-\-distance DISTANCE]
.TP
[\-M DIAMETER] [\-S DECOMP] [\-p DIR] [\-o OUTPUT]
-[\-d OUTPUT_DIR] [\-c CONFIG] [\-t TREE] [\-r RAXML]
+[\-d OUTPUT_DIR] [\-t TREE] [\-r RAXML]
[\-a ALIGN] [\-f FRAG] [\-m MOLECULE] [\-x N]
[\-cp CHCK_FILE] [\-cpi N] [\-seed N] [\-at N] [\-pt N]
[\-g N] [\-b N] [\-bin N] [\-D] [\-C N] [\-G GENES]
@@ -74,11 +74,6 @@ this info file to set model parameters), a backbone alignment file (in
fasta format), and a fasta file including fragments. The input sequences
are assumed to be DNA unless specified otherwise.
.TP
-\fB\-c\fR CONFIG, \fB\-\-config\fR CONFIG
-A config file, including options used to run SEPP.
-Options provided as command line arguments overwrite
-config file values for those options. [default: None]
-.TP
\fB\-t\fR TREE, \fB\-\-tree\fR TREE
Input tree file (newick format) [default: None]
.TP
=====================================
debian/man/run_sepp.py.1
=====================================
@@ -5,7 +5,7 @@ run_sepp.py \- a phylogenetic placement tool
usage: run_sepp.py [\-h] [\-v] [\-A N] [\-P N] [\-F N] [\-D DISTANCE] [\-M DIAMETER]
.IP
[\-S DECOMP] [\-p DIR] [\-o OUTPUT] [\-d OUTPUT_DIR]
-[\-c CONFIG] [\-t TREE] [\-r RAXML] [\-a ALIGN] [\-f FRAG]
+[\-t TREE] [\-r RAXML] [\-a ALIGN] [\-f FRAG]
[\-m MOLECULE] [\-x N] [\-cp CHCK_FILE] [\-cpi N] [\-seed N]
.PP
This script runs the SEPP algorithm on an input tree, alignment, fragment
@@ -72,11 +72,6 @@ this info file to set model parameters), a backbone alignment file (in
fasta format), and a fasta file including fragments. The input sequences
are assumed to be DNA unless specified otherwise.
.TP
-\fB\-c\fR CONFIG, \fB\-\-config\fR CONFIG
-A config file, including options used to run SEPP.
-Options provided as command line arguments overwrite
-config file values for those options. [default: None]
-.TP
\fB\-t\fR TREE, \fB\-\-tree\fR TREE
Input tree file (newick format) [default: None]
.TP
=====================================
debian/man/run_tipp.py.1
=====================================
@@ -5,13 +5,20 @@ run_tipp.py \- an identification and phylogenetic profiling tool
usage: run_tipp.py [\-h] [\-v] [\-A N] [\-P N] [\-F N] [\-\-distance DISTANCE]
.IP
[\-M DIAMETER] [\-S DECOMP] [\-p DIR] [\-o OUTPUT]
-[\-d OUTPUT_DIR] [\-c CONFIG] [\-t TREE] [\-r RAXML] [\-a ALIGN]
+[\-d OUTPUT_DIR] [\-t TREE] [\-r RAXML] [\-a ALIGN]
[\-f FRAG] [\-m MOLECULE] [\-x N] [\-cp CHCK_FILE] [\-cpi N]
[\-seed N] [\-R N] [\-at N] [\-D] [\-pt N] [\-PD N]
[\-tx TAXONOMY] [\-txm MAPPING] [\-adt TREE] [\-C N]
.PP
This script runs the SEPP algorithm on an input tree, alignment, fragment
-file, and RAxML info file.
+file, and RAxML info file. It uses a reference dataset which has to be
+downloaded from
+\fBhttps://github.com/tandyw/tipp-reference/releases/download/v2.0.0/tipp.zip\fR
+.PP
+If the local administrator has not set the path to this reference dataset in
+/etc/sepp/tipp.config, you should copy this file to ~/.sepp/ and put the path
+to the dataset in the \fBreference\fR section of the configuration file,
+see \fBtipp.config\fR(5).
.SS "optional arguments:"
.TP
\fB\-h\fR, \fB\-\-help\fR
@@ -74,11 +81,6 @@ this info file to set model parameters), a backbone alignment file (in
fasta format), and a fasta file including fragments. The input sequences
are assumed to be DNA unless specified otherwise.
.TP
-\fB\-c\fR CONFIG, \fB\-\-config\fR CONFIG
-A config file, including options used to run SEPP.
-Options provided as command line arguments overwrite
-config file values for those options. [default: None]
-.TP
\fB\-t\fR TREE, \fB\-\-tree\fR TREE
Input tree file (newick format) [default: None]
.TP
=====================================
debian/man/run_upp.py.1
=====================================
@@ -6,7 +6,7 @@ usage: run_upp.py [\-h] [\-v] [\-F N] [\-D DISTANCE] [\-\-diameter DIAMETER]
.IP
[\-p DIR] [\-o OUTPUT] [\-d OUTPUT_DIR] [\-m MOLECULE] [\-x N]
[\-cp CHCK_FILE] [\-cpi N] [\-seed N] [\-A N] [\-M N] [\-T N]
-[\-B N] [\-S DECOMP] [\-s SEQ] [\-c CONFIG] [\-t TREE] [\-a ALIGN]
+[\-B N] [\-S DECOMP] [\-s SEQ] [\-t TREE] [\-a ALIGN]
[\-l N] [\-P N] [\-r RAXML] [\-f FRAG]
.PP
This script runs the UPP algorithm on set of sequences. A backbone alignment
@@ -95,11 +95,6 @@ alignment is given, the sequence file will be randomly
split into a backbone set (size set to B) and query
set (remaining sequences), [default: None]
.TP
-\fB\-c\fR CONFIG, \fB\-\-config\fR CONFIG
-A config file, including options used to run UPP.
-Options provided as command line arguments overwrite
-config file values for those options. [default: None]
-.TP
\fB\-t\fR TREE, \fB\-\-tree\fR TREE
Input tree file (newick format) [default: None]
.TP
=====================================
debian/man/sepp.config.5
=====================================
@@ -130,6 +130,11 @@ A string, indicating the name of the placer to use. Currently only \fB"pplacer"\
A boolean \fBTrue\fR or \fBFalse\fR, indicating whether to apply weights while caring for fragments or not.
.RE
+.SH FILES
+\fB/etc/sepp/sepp.config\fR
+.PP
+\fB~/.sepp/sepp.config\fR, if existing, overrides the one in /etc/sepp.
+
.SH "SEE ALSO"
.PP
\fBrun_sepp.py\fR(1)
=====================================
debian/man/tipp.config.5
=====================================
@@ -159,6 +159,11 @@ The section begins with \fB[tipp]\fR, and it has the following field:
An boolean \fBtrue\fR or \fBfalse\fR, currently not used.
.RE
+.SH FILES
+\fB/etc/sepp/tipp.config\fR
+.PP
+\fB~/.sepp/tipp.config\fR, if existing, overrides the one in /etc/sepp.
+
.SH "SEE ALSO"
.PP
\fBrun_tipp.py\fR(1)
=====================================
debian/man/upp.config.5
=====================================
@@ -140,6 +140,11 @@ The string \fBrun_pasta.py\fR, pointing to the PASTA program, which has to be
in the PATH.
.RE
+.SH FILES
+\fB/etc/sepp/upp.config\fR
+.PP
+\fB~/.sepp/upp.config\fR, if existing, overrides the one in /etc/sepp.
+
.SH "SEE ALSO"
.PP
\fBrun_upp.py\fR(1)
=====================================
debian/patches/use_etc_configuration_files.patch → debian/patches/configuration_files_in_etc_and_per_user.patch
=====================================
@@ -1,48 +1,62 @@
-Description: using configuration files that are in /etc
+Description: using configuration files that are in /etc or in ~/.sepp
Author: Pierre Gruet <pgtdebian at free.fr>
Forwarded: not-needed
-Last-Update: 2020-09-19
+Last-Update: 2020-10-08
--- a/run_tipp_tool.py
+++ b/run_tipp_tool.py
-@@ -43,9 +43,7 @@
+@@ -43,10 +43,11 @@
return args
-root_p = open(os.path.join(os.path.split(
- os.path.split(sepp.__file__)[0])[0], "home.path")).readlines()[0].strip()
-tipp_config_path = os.path.join(root_p, "tipp.config")
-+tipp_config_path = "/etc/sepp/tipp.config"
-
+-
++home = os.path.expanduser("~")
++if os.path.isfile(home + "/.sepp/tipp.config"):
++ tipp_config_path = home + "/.sepp/tipp.config"
++else:
++ tipp_config_path = "/etc/sepp/tipp.config"
def profile(input, gene, output, prefix, threshold):
+ sepp.config.set_main_config_path(tipp_config_path)
--- a/sepp/config.py
+++ b/sepp/config.py
-@@ -48,9 +48,7 @@
+@@ -48,10 +48,11 @@
_LOG = get_logger(__name__)
-root_p = open(os.path.join(os.path.split(
- os.path.split(__file__)[0])[0], "home.path")).readlines()[0].strip()
-main_config_path = os.path.join(root_p, "main.config")
-+main_config_path = "/etc/sepp/sepp.config"
-
+-
++home = os.path.expanduser("~")
++if os.path.isfile(home + "/.sepp/sepp.config"):
++ main_config_path = home + "/.sepp/sepp.config"
++else:
++ main_config_path = "/etc/sepp/sepp.config"
def set_main_config_path(filename):
+ global main_config_path
--- a/sepp/ensemble.py
+++ b/sepp/ensemble.py
-@@ -161,7 +161,7 @@
+@@ -161,7 +161,11 @@
def augment_parser():
- sepp.config.set_main_config_path(os.path.expanduser("~/.sepp/upp.config"))
-+ sepp.config.set_main_config_path("/etc/sepp/upp.config")
++ home = os.path.expanduser("~")
++ if os.path.isfile(home + "/.sepp/upp.config"):
++ sepp.config.set_main_config_path(home + "/.sepp/upp.config")
++ else:
++ sepp.config.set_main_config_path("/etc/sepp/upp.config")
parser = sepp.config.get_parser()
parser.description = (
"This script runs the UPP algorithm on set of sequences. A backbone "
--- a/sepp/exhaustive_tipp.py
+++ b/sepp/exhaustive_tipp.py
-@@ -483,12 +483,7 @@
+@@ -483,12 +483,11 @@
def augment_parser():
@@ -52,20 +66,28 @@ Last-Update: 2020-09-19
- os.path.split(
- __file__)[0])[0], "home.path")).readlines()[0].strip()
- tipp_config_path = os.path.join(root_p, "tipp.config")
-+ tipp_config_path = "/etc/sepp/tipp.config"
++ home = os.path.expanduser("~")
++ if os.path.isfile(home + "/.sepp/tipp.config"):
++ tipp_config_path = home + "/.sepp/tipp.config"
++ else:
++ tipp_config_path = "/etc/sepp/tipp.config"
sepp.config.set_main_config_path(tipp_config_path)
# default_settings['DEF_P'] = (100 ,
# "Number of taxa (i.e. no decomposition)")
--- a/sepp/exhaustive_upp.py
+++ b/sepp/exhaustive_upp.py
-@@ -324,9 +324,7 @@
+@@ -324,9 +324,11 @@
def augment_parser():
- root_p = open(os.path.join(os.path.split(
- os.path.split(__file__)[0])[0], "home.path")).readlines()[0].strip()
- upp_config_path = os.path.join(root_p, "upp.config")
-+ upp_config_path = "/etc/sepp/upp.config"
++ home = os.path.expanduser("~")
++ if os.path.isfile(home + "/.sepp/upp.config"):
++ upp_config_path = home + "/.sepp/upp.config"
++ else:
++ upp_config_path = "/etc/sepp/upp.config"
sepp.config.set_main_config_path(upp_config_path)
parser = sepp.config.get_parser()
parser.description = (
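The patch above repeats the same lookup in five places: use ~/.sepp/<name> if that file exists, otherwise fall back to /etc/sepp/<name>. A minimal sketch of that rule as a single function follows. The name resolve_config_path and its parameters are hypothetical and do not appear in the patch or in upstream sepp.

```python
import os


def resolve_config_path(name, user_dir=None, system_dir="/etc/sepp"):
    """Return the per-user copy of `name` if it exists, else the system one.

    Hypothetical helper mirroring the if/else blocks added by the patch.
    """
    if user_dir is None:
        user_dir = os.path.join(os.path.expanduser("~"), ".sepp")
    user_path = os.path.join(user_dir, name)
    if os.path.isfile(user_path):
        return user_path
    return os.path.join(system_dir, name)
```

Under this sketch, each patched call site would reduce to one line such as sepp.config.set_main_config_path(resolve_config_path("tipp.config")). The explicit user_dir and system_dir parameters exist only so the rule can be exercised without touching the real home directory.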
=====================================
debian/patches/looking_for_integer_in_placements.patch
=====================================
@@ -4,7 +4,7 @@ Description: looking for the right index in placements JSONArray
for the right position.
Author: Pierre Gruet <pgtdebian at free.fr>
Forwarded: https://github.com/smirarab/sepp/issues/86
-Last-Update: 2020-10-05
+Last-Update: 2020-10-08
--- a/tools/merge/src/phylolab/taxonamic/PPlacerJSONMerger.java
+++ b/tools/merge/src/phylolab/taxonamic/PPlacerJSONMerger.java
@@ -30,3 +30,29 @@ Last-Update: 2020-10-05
if (pr.getDouble(3) > ((Double) mainEdgeLen.get(newLab))
.doubleValue())
+--- a/tools/merge/src/phylolab/taxonamic/JSONMerger.java
++++ b/tools/merge/src/phylolab/taxonamic/JSONMerger.java
+@@ -205,6 +205,12 @@
+ HashMap < String, String > labelMap = mapTreeBranchNames(jsonTree, originalTree);
+
+ JSONArray placements = json.getJSONArray("placements");
++ JSONArray fields = json.getJSONArray("fields");
++
++ int locEdgeNum=0;
++ while (! fields.getString(locEdgeNum).equals("edge_num")) {
++ locEdgeNum++;
++ }
+
+ for (Iterator < JSONObject > iterator = placements.iterator(); iterator.hasNext();) {
+ JSONObject placement = iterator.next();
+@@ -254,8 +260,8 @@
+ /*
+ * Adjust the placement edge label
+ */
+- String newLab = (String) labelMap.get(precord.getString(0));
+- newRecord.set(0, new Integer(newLab));
++ String newLab = (String) labelMap.get(precord.getString(locEdgeNum));
++ newRecord.set(locEdgeNum, new Integer(newLab));
+ /*
+ * Adjust edge length values to correspond to somewhere on the main tree.
+ * TODO: This is pretty bad. We should fix this.
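The Java change above stops assuming that the edge number is the first column of each placement record and instead looks up the position of "edge_num" in the "fields" array. The same idea is sketched below in Python for illustration. The function names and the sample data are made up, and the sketch raises an explicit error when "edge_num" is absent, which goes beyond what the patch itself does.

```python
def edge_num_index(fields):
    """Return the position of "edge_num" in the "fields" array."""
    try:
        return fields.index("edge_num")
    except ValueError:
        raise ValueError('"fields" has no "edge_num" entry') from None


def relabel_record(record, index, label_map):
    """Return a copy of one placement record with its edge label remapped."""
    new_record = list(record)
    new_record[index] = int(label_map[str(record[index])])
    return new_record


# Made-up sample data, only to show the column lookup.
fields = ["distal_length", "edge_num", "like_weight_ratio"]
record = [0.25, 7, 1.0]
label_map = {"7": "42"}

index = edge_num_index(fields)
print(relabel_record(record, index, label_map))
```

Looking the column up once before iterating over the placements, as the patch does with locEdgeNum, keeps the per-record work unchanged while making the code independent of the column order in the input file.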
=====================================
debian/patches/series
=====================================
@@ -3,7 +3,7 @@ use_debian_packaged_guppy_from_pplacer.patch
change_java_version_for_ant.patch
json_collections.patch
java_build.patch
-use_etc_configuration_files.patch
+configuration_files_in_etc_and_per_user.patch
looking_for_integer_in_placements.patch
deactivating_log_test.patch
make_split_sequences_script.patch
=====================================
debian/rules
=====================================
@@ -1,21 +1,8 @@
#!/usr/bin/make -f
-# DH_VERBOSE := 1
export LC_ALL=C.UTF-8
include /usr/share/dpkg/default.mk
-# this provides:
-# DEB_SOURCE: the source package name
-# DEB_VERSION: the full version of the package (epoch + upstream vers. + revision)
-# DEB_VERSION_EPOCH_UPSTREAM: the package's version without the Debian revision
-# DEB_VERSION_UPSTREAM_REVISION: the package's version without the Debian epoch
-# DEB_VERSION_UPSTREAM: the package's upstream version
-# DEB_DISTRIBUTION: the distribution(s) listed in the current entry of debian/changelog
-# SOURCE_DATE_EPOCH: the source release date as seconds since the epoch, as
-# specified by <https://reproducible-builds.org/specs/source-date-epoch/>
-
-# for hardening you might like to uncomment this:
-# export DEB_BUILD_MAINT_OPTIONS=hardening=+all
%:
dh $@ --with python3 --buildsystem=pybuild
@@ -61,6 +48,9 @@ override_dh_auto_install:
# not really part of the software but instead a helper script.
mkdir -p debian/sepp/usr/share/sepp
mv debian/sepp/usr/bin/split_sequences.py debian/sepp/usr/share/sepp
+
+ # Not installing upp for now.
+ $(RM) debian/sepp/usr/bin/run_upp.py
# To test, we move the test tree into a subdir of /tmp and we proceed to the
# following changes:
@@ -86,5 +76,7 @@ ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
PYTHONPATH=$$PYTHONPATH:../../ python3 -m unittest discover -v
endif
+# Invoking dh_installman with --language=C to get "traditional" manpages
+# although the files we put in /usr/bin end with ".py".
override_dh_installman:
dh_installman --language=C
=====================================
debian/sepp.manpages
=====================================
@@ -1,2 +1,8 @@
-debian/man/*.1
-debian/man/*.5
+debian/man/run_abundance.py.1
+debian/man/run_sepp.py.1
+debian/man/run_tipp.py.1
+debian/man/run_tipp_tool.py.1
+# debian/man/run_upp.py.1
+debian/man/sepp.config.5
+debian/man/tipp.config.5
+# debian/man/upp.config.5
=====================================
debian/watch
=====================================
@@ -1,4 +1,4 @@
- version=4
+version=4
opts="repacksuffix=+dfsg,dversionmangle=auto,repack,compression=xz,filenamemangle=s%(?:.*?)?v?(\d[\d.]*)\.tar\.gz%@PACKAGE@-$1.tar.gz%" \
https://github.com/smirarab/sepp//releases .*/archive/v?@ANY_VERSION@\.tar\.gz
View it on GitLab: https://salsa.debian.org/med-team/sepp/-/compare/f625ecdfe81b837b16ba9081941187688a2bc813...05aab011776cfd4a695a212e0d0fe0cd4af18924