[med-svn] [Git][med-team/resfinder][upstream] New upstream version 4.1.5

Nilesh Patra (@nilesh) gitlab at salsa.debian.org
Sun Jul 4 14:09:43 BST 2021



Nilesh Patra pushed to branch upstream at Debian Med / resfinder


Commits:
327f6198 by Nilesh Patra at 2021-07-04T17:16:42+05:30
New upstream version 4.1.5
- - - - -


19 changed files:

- README.md
- cge/out/result.py
- cge/out/util/generator.py
- cge/out/valueparsers.py
- cge/phenotype2genotype/isolate.py
- cge/pointfinder.py
- cge/resfinder.py
- cge/standardize_results.py
- + database_tests.md
- dockerfile
- run_resfinder.py
- + scripts/env_variables.txt
- scripts/resfinder.nf
- scripts/resfinder_asm.nf
- + scripts/wdl/README.md
- + scripts/wdl/computerome.conf
- + scripts/wdl/input.json
- + scripts/wdl/resfinder.wdl
- tests/functional_tests.py


Changes:

=====================================
README.md
=====================================
@@ -17,7 +17,7 @@ then the dependencies, and finally the databases. A more detailed breakdown of t
 installation is provided below:
 
 1. Install ResFinder tool
-2. Install python modules: Tabulate, BioPython, CGECore
+2. Install python modules
 3. Install BLAST (optional)
 4. install KMA (optional)
 5. Download ResFinder database
@@ -37,8 +37,8 @@ Setting up ResFinder script and database
 # Go to wanted location for resfinder
 cd /path/to/some/dir
 
-# Clone branch 4.0 and enter the resfinder directory
-git clone -b 4.0 https://git@bitbucket.org/genomicepidemiology/resfinder.git
+# Clone the latest version and enter the resfinder directory
+git clone https://git@bitbucket.org/genomicepidemiology/resfinder.git
 cd resfinder
 
 ```
@@ -51,7 +51,7 @@ KMA is used to analyse read data (ie. FASTQ files).
 #### Python modules: Tabulate, BioPython, CGECore and Python-Git
 To install the needed python modules you can use pip
 ```bash
-pip3 install tabulate biopython cgecore gitpython
+pip3 install tabulate biopython cgecore gitpython python-dateutil
 ```
 For more information visit the respective website
 ```url
@@ -179,6 +179,7 @@ kma_index -i db_pointfinder/mycobacterium_tuberculosis/*.fsa -o db_pointfinder/m
 ```
 
 ### Test ResFinder installation
+(This will not function with the docker installation.)
 If you did not install BLAST, test 1 and 3 will fail. If you did not install KMA, test 2
 and 4 will fail.
 The 4 tests will take approximately 5-60 seconds in total, depending on your system.
@@ -259,6 +260,11 @@ optional arguments:
                         Path to kma
   -s SPECIES, --species SPECIES
                         Species in the sample
+						Available species: Campylobacter, Campylobacter jejuni, Campylobacter coli, 
+						Enterococcus faecalis, Enterococcus faecium, Escherichia coli, Helicobacter pylori,
+						Klebsiella, Mycobacterium tuberculosis, Neisseria gonorrhoeae,
+						Plasmodium falciparum, Salmonella, Salmonella enterica, Staphylococcus aureus
+						-s "Other" can be used for metagenomic samples or samples with unknown species.
   -db_res DB_PATH_RES, --db_path_res DB_PATH_RES
                         Path to the databases for ResFinder
   -db_res_kma DB_PATH_RES_KMA, --db_path_res_kma DB_PATH_RES_KMA
@@ -290,6 +296,7 @@ optional arguments:
 											  Threshold for identity of Pointfinder. If None is
 											  selected, the minimum coverage of ResFinder will be
 											  used.
+
 ```
 
 ### Web-server
@@ -298,41 +305,27 @@ A webserver implementing the methods is available at the [CGE
 website](http://www.genomicepidemiology.org/) and can be found here:
 https://cge.cbs.dtu.dk/services/ResFinder/
 
-### Docker
-
-The databases needs to be cloned and possibly index seperately. This is described on each of the database bitbucket sites:
-```url
-https://bitbucket.org/genomicepidemiology/resfinder_db/overview
-https://bitbucket.org/genomicepidemiology/pointfinder_db/overview
-```
-
-The databases needs to be mounted when running docker using the -v option.
-
-It can be desirable to create environment variables:
+### Install ResFinder with Docker
+If you would like to build a docker image with ResFinder, make sure you have cloned the ResFinder directory as well as installed and indexed the databases: `db_pointfinder` and `db_resfinder`. Then run the following commands:
 ```bash
-PF_DB=/Users/rolf/ownCloud2/Scripts-CGE/database_pointfinder
-RF_DB=/Users/rolf/ownCloud2/Scripts-CGE/resfinder_db
+# Go to ResFinder directory
+cd path/to/resfinder
+# Build docker image with name resfinder
+docker build -t resfinder .
 ```
+When running the docker make sure to mount the `db_resfinder` and the `db_pointfinder` with the flag -v, as shown in the examples below. 
 
-#### Build and test docker image
+You can test the installation by running the docker with the test files: 
 ```bash
-# Go to directory containing the dockerfile
-cd /some/dir
-# Build image
-docker build -t resfinder .
-# Test installation
-docker run --rm -v $(pwd):/workdir -v $PF_DB:/resfinder/db_pointfinder -v $RF_DB:/resfinder/db_resfinder --entrypoint /resfinder/tests/functional_tests.py resfinder
-```
-If you did not use environment variables, simply replace them with the appropriate paths.
+cd path/to/resfinder/
+mkdir results
 
-#### Runnning the docker image
+# Run with raw data (this command mounts the results to the local directory "results")
+docker run --rm -it -v $(pwd)/db_resfinder/:/usr/src/db_resfinder -v $(pwd)/results/:/usr/src/results resfinder -ifq /usr/src/tests/data/test_isolate_01_1.fq /usr/src/tests/data/test_isolate_01_2.fq -acq -db_res /usr/src/db_resfinder -o /usr/src/results
 
-The docker image should be used as an executable.
-Example:
-```bash
-docker run --rm -v $(pwd):/workdir -v $PF_DB:/resfinder/db_pointfinder -v $RF_DB:/resfinder/db_resfinder resfinder -o <output_dir> -s "<species>" --acquired --point -ifq <path/to/*.fq>
+# Run with assembled data (this command mounts the results to the local directory "results")
+docker run --rm -it -v $(pwd)/db_resfinder/:/usr/src/db_resfinder  -v $(pwd)/results/:/usr/src/results resfinder -ifa /usr/src/tests/data/test_isolate_01.fa -acq -db_res /usr/src/db_resfinder -o /usr/src/results
 ```
-If you did not use environment variables, simply replace them with the appropriate paths.
 
 Citation
 =======
@@ -366,4 +359,4 @@ Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
-limitations under the License.
+limitations under the License.
\ No newline at end of file


=====================================
cge/out/result.py
=====================================
@@ -78,6 +78,20 @@ class Result(dict):
         else:
             self[cl] = res
 
+    def modify_class(self, cl, result_type=None, **kwargs):
+        type = self._get_type(result_type, **kwargs)
+        res = Result(result_type=type, **kwargs)
+        res_id = res["ref_id"].replace("_", ";;")
+        for key, value in res.items():
+            if key not in self[cl][res_id]:
+                self[cl][res_id][key] = value
+            elif self[cl][res_id][key] != value:
+                self[cl][res_id][key] = \
+                        self[cl][res_id][key] \
+                        + ", " + value
+            else:
+                pass
+
     def check_results(self, errors=None):
         self.errors = {}
 

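Note on the `modify_class` hunk above: the method merges a second result into an entry that already exists. Keys that are missing are copied, and keys whose values differ are joined with ", ". The sketch below reproduces that rule on plain dicts. It is only an illustration: `merge_entry` and the sample values are hypothetical, and the `str()` conversion via `format` is an addition of this sketch, because joining with `+ ", " +` would raise a TypeError for non-string values.

```python
def merge_entry(existing, new):
    """Merge `new` into `existing` in place and return it.

    Hypothetical helper illustrating the merge rule in the diff:
    - key missing in `existing`: copy the value
    - key present with a different value: join both with ", "
    - key present with the same value: leave it untouched
    """
    for key, value in new.items():
        if key not in existing:
            existing[key] = value
        elif existing[key] != value:
            # format() converts both sides to text; this is an addition of
            # the sketch so that numeric values can be joined as well.
            existing[key] = "{}, {}".format(existing[key], value)
    return existing


# Made-up example hits, not real ResFinder output.
first_hit = {"ref_id": "geneA_1_ACC0001", "identity": 100.0,
             "query_id": "contig_1"}
second_hit = {"ref_id": "geneA_1_ACC0001", "identity": 99.5,
              "query_id": "contig_1", "coverage": 98.0}

merged = merge_entry(dict(first_hit), second_hit)
print(merged)
```

Once two differing values have been joined, the field holds a string even if the inputs were numbers, so downstream code that expects a float for such a field would need to handle that.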

=====================================
cge/out/util/generator.py
=====================================
@@ -1,6 +1,7 @@
 #!/usr/bin/env python3
 
 from git import Repo
+from git.exc import InvalidGitRepositoryError
 from datetime import datetime, timezone
 
 from ..result import Result
@@ -29,7 +30,7 @@ class Generator():
     def get_version_commit(gitdir):
         try:
             repo = Repo(gitdir)
-        except:
+        except InvalidGitRepositoryError:
             return ("unknown", "unknown")
 
         com2tag = {}


=====================================
cge/out/valueparsers.py
=====================================
@@ -45,5 +45,5 @@ class ValueParsers():
     def parse_float(val):
         try:
             val = float(val)
-        except ValueError:
+        except TypeError:
             return "Value must be a float. Value was: {}".format(val)

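Note on the `parse_float` hunk above: `float()` raises ValueError for a string it cannot parse (for example "abc") and TypeError for an object that is not a string or number (for example None). Catching only one of the two lets the other propagate. A minimal sketch that handles both follows; the None-on-success return convention is an assumption, since the hunk shows only the failure branch.

```python
def parse_float(val):
    """Return an error message if `val` cannot be read as a float, else None.

    The None-on-success convention is an assumption of this sketch.
    """
    try:
        float(val)
    except (TypeError, ValueError):
        return "Value must be a float. Value was: {}".format(val)
    return None


print(parse_float("0.95"))   # parses fine
print(parse_float("abc"))    # float("abc") raises ValueError
print(parse_float(None))     # float(None) raises TypeError
```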

=====================================
cge/phenotype2genotype/isolate.py
=====================================
@@ -103,7 +103,7 @@ class Isolate(dict):
                 phenodb_id = ref_id
             # Amino acid mutation
             else:
-                phenodb_id = ref_id[:-3] + feat_res_dict["var_aa"]
+                phenodb_id = ref_id[:-1] + feat_res_dict["var_aa"]
             return phenodb_id
 
         elif(type == "genes"):
@@ -112,13 +112,10 @@ class Isolate(dict):
 
     def load_finder_results(self, std_table, phenodb, type):
         for key, feat_info in std_table[type].items():
-
             if(type == "genes"
                and re.search("PointFinder", feat_info["ref_database"])):
                continue
-
             unique_id = Isolate.get_phenodb_id(feat_info, type)
-
             phenotypes = phenodb.get(unique_id, None)
             ab_class = set()
             if(phenotypes):
@@ -146,7 +143,6 @@ class Isolate(dict):
             nucleotide_mut = True
         else:
             nucleotide_mut = False
-
         feat_res = ResMutation(unique_id=unique_id,
                                seq_region=";;".join(feat_info["genes"]),
                                pos=feat_info["ref_start_pos"],
@@ -257,7 +253,6 @@ class Isolate(dict):
                         nucleotide_mut = True
                     else:
                         nucleotide_mut = False
-
                     feat_res = ResMutation(unique_id=unique_id,
                                            seq_region=feature["template_name"],
                                            pos=feature["query_start_pos"],


=====================================
cge/pointfinder.py
=====================================
@@ -190,7 +190,7 @@ class PointFinder(CGEFinder):
         ]
         # Get all drug names and add header of all drugs to prediction file
         drug_lst = [drug for drug in self.drug_genes.keys()]
-        output_strings[2] = "Sample ID\t" + "\t".join(drug_lst) + "\n"
+        output_strings[2] = "\t".join(drug_lst) + "\n"
 
         # Define variables to write temporary output into
         total_unknown_str = ""
@@ -980,7 +980,8 @@ class PointFinder(CGEFinder):
         del_codon = PointFinder.get_codon(sbjct_seq, codon_no, start_offset)
         pos_name = "p.%s%d" % (PointFinder.aa(del_codon), codon_no)
 
-        if len(sbjct_rf_indel) == 3:
+        # This has been changed
+        if len(sbjct_rf_indel) == 3 and mutation == "del":
             return pos_name + mutation
 
         end_codon_no = codon_no + math.ceil(len(sbjct_nucs) / 3) - 1
@@ -988,10 +989,8 @@ class PointFinder(CGEFinder):
                                           start_offset)
         pos_name += "_%s%d%s" % (PointFinder.aa(end_codon), end_codon_no,
                                  mutation)
-
         if mutation == "delins":
             pos_name += aa_alt
-
         return pos_name
 
     @staticmethod
@@ -1214,9 +1213,6 @@ class PointFinder(CGEFinder):
                     try:
                         indel_data = indels[indel_no]
                     except IndexError:
-                        print(sbjct_codon, qry_codon)
-                        print(indels)
-                        print(gene, indel_data, indel_no)
                         sys.exit("indel_data list is out of range, bug!")
 
                     mut = indel_data[0]
@@ -1275,11 +1271,11 @@ class PointFinder(CGEFinder):
                                 sbjct_seq, indel, sbjct_rf_indel, qry_rf_indel,
                                 codon_no, mut, sbjct_start - 1)
                         )
-
                         if "Frameshift" in mut_name:
                             mut_name = (mut_name.split("-")[0]
                                         + "- Frame restored")
-
+                        if mut_name is "p.V940delins - Frame restored":
+                            sys.exit()
                     mis_matches += [[mut, codon_no_indel, seq_pos, indel,
                                      mut_name, sbjct_rf_indel, qry_rf_indel,
                                      aa_ref, aa_alt]]
@@ -1383,6 +1379,11 @@ class PointFinder(CGEFinder):
                 r"^p.(\D{1})(\d+)_(\D{1})(\d+)delins(\S+)$", m)
             single_delins_match = re.search(
                 r"^p.(\D{1})(\d+)delins(\S+)$", m)
+            # TODO: is both necessary?
+            multi_delins_match2 = re.search(
+                r"^p.(\D{1})(\d+)_(\D{1})(\d+)delins$", m)
+            single_delins_match2 = re.search(
+                r"^p.(\D{1})(\d+)delins$", m)
             multi_ins_match = re.search(
                 r"^p.(\D{1})(\d+)_(\D{1})(\d+)ins(\D*)$", m)
             if(multi_delins_match or single_delins_match):
@@ -1485,7 +1486,6 @@ class PointFinder(CGEFinder):
             # nuc_alt = mis_matches[i][6]
             ref = mis_matches[i][-2]
             alt = mis_matches[i][-1]
-
             mut_dict = PointFinder.mutstr2mutdict(mut_name)
 
             mut_id = ("{gene}_{pos}_{alt}"


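Note on the `delins` patterns added in the hunk above (the "is both necessary?" TODO): the original two patterns end in `(\S+)`, which needs at least one character after `delins`, so a mutation string without an alternative sequence only matches the two new patterns. The sketch below runs the four patterns, copied from the diff, against a few sample strings. The dictionary layout, the function name and the sample strings are assumptions for illustration.

```python
import re

# Patterns copied verbatim from the hunk above.
DELINS_PATTERNS = {
    "multi_delins": r"^p.(\D{1})(\d+)_(\D{1})(\d+)delins(\S+)$",
    "single_delins": r"^p.(\D{1})(\d+)delins(\S+)$",
    "multi_delins_no_alt": r"^p.(\D{1})(\d+)_(\D{1})(\d+)delins$",
    "single_delins_no_alt": r"^p.(\D{1})(\d+)delins$",
}


def classify_delins(mut_str):
    """Return the names of all patterns that match `mut_str`."""
    return [name for name, pattern in DELINS_PATTERNS.items()
            if re.search(pattern, mut_str)]


for sample in ("p.V940delinsGA", "p.V940delins",
               "p.V940_G942delinsA", "p.V940_G942delins"):
    print(sample, classify_delins(sample))
```

Each sample matches exactly one pattern. Note that the `.` after `p` is unescaped in all four patterns, so it matches any character; a string such as "pXV940delins" is also accepted.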
=====================================
cge/resfinder.py
=====================================
@@ -117,7 +117,7 @@ class ResFinder(CGEFinder):
 
         return std_results
 
-    def write_results(self, out_path, result, res_type):
+    def write_results(self, out_path, result, res_type, software="ResFinder"):
         """
         """
         if(res_type == ResFinder.TYPE_BLAST):
@@ -130,15 +130,15 @@ class ResFinder(CGEFinder):
             result_str = self.results_to_str(res_type=res_type,
                                              results=result)
 
-        with open(out_path + "/results_tab.txt", "w") as fh:
+        with open(out_path + "/{}_results_tab.txt".format(software), "w") as fh:
             fh.write(result_str[0])
-        with open(out_path + "/results_table.txt", "w") as fh:
+        with open(out_path + "/{}_results_table.txt".format(software), "w") as fh:
             fh.write(result_str[1])
-        with open(out_path + "/results.txt", "w") as fh:
+        with open(out_path + "/{}_results.txt".format(software), "w") as fh:
             fh.write(result_str[2])
-        with open(out_path + "/Resistance_gene_seq.fsa", "w") as fh:
+        with open(out_path + "/{}_Resistance_gene_seq.fsa".format(software), "w") as fh:
             fh.write(result_str[3])
-        with open(out_path + "/Hit_in_genome_seq.fsa", "w") as fh:
+        with open(out_path + "/{}_Hit_in_genome_seq.fsa".format(software), "w") as fh:
             fh.write(result_str[4])
 
     def blast(self, inputfile, out_path, min_cov=0.9, threshold=0.6,
@@ -424,8 +424,8 @@ class ResFinder(CGEFinder):
         """
         """
         # Check if databases and config file are correct/correponds
-        if databases is '':
-                sys.exit("Input Error: No database was specified!\n")
+        if databases == '':
+            sys.exit("Input Error: No database was specified!\n")
         elif databases is None:
             # Choose all available databases from the config file
             self.databases = self.configured_dbs.keys()
@@ -473,7 +473,7 @@ class ResFinder(CGEFinder):
                     db = "%s/%s.%s" % (self.db_path, db_prefix, ext)
                     if not os.path.exists(db):
                         sys.exit(("Input Error: The database file (%s) "
-                                  "could not be found!") % (db_path))
+                                  "could not be found!") % (db))
 
                 if db_prefix not in self.configured_dbs:
                     self.configured_dbs[db_prefix] = []

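Note on the `databases is ''` to `databases == ''` change above: `is` tests object identity while `==` tests value equality, and two equal strings are not guaranteed to be the same object. The same consideration applies to the `mut_name is "p.V940delins - Frame restored"` check added in cge/pointfinder.py. The sketch below builds a string at runtime in the same way that hunk builds `mut_name`; the helper name `databases_missing` is hypothetical.

```python
def databases_missing(databases):
    """Hypothetical helper: True when an empty string was passed."""
    return databases == ""


target = "p.V940delins - Frame restored"
# Built at runtime, mirroring mut_name.split("-")[0] + "- Frame restored".
mut_name = "p.V940delins - Frameshift".split("-")[0] + "- Frame restored"

equal_by_value = mut_name == target
same_object = mut_name is target  # identity; the result depends on the interpreter

print(equal_by_value, same_object)
print(databases_missing(""), databases_missing(None))
```

Only the value comparison is reliable here. Whether the identity comparison is true depends on how the interpreter happens to allocate and intern strings.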

=====================================
cge/standardize_results.py
=====================================
@@ -12,7 +12,6 @@ import json
 class SeqVariationResult(dict):
     def __init__(self, res_collection, mismatch, region_results, db_name):
         self.res_collection = res_collection
-
         self.load_var_type(mismatch[0])
         self["ref_start_pos"] = mismatch[1]
         self["ref_end_pos"] = mismatch[2]
@@ -23,14 +22,20 @@ class SeqVariationResult(dict):
         if(len(mismatch) > 7):
             self["ref_aa"] = mismatch[7].lower()
             self["var_aa"] = mismatch[8].lower()
-
         region_name = region_results[0]["ref_id"]
         region_name = PhenoDB.if_promoter_rename(region_name)
 
         self["type"] = "seq_variation"
-        self["ref_id"] = ("{id}{deli}{pos}{deli}{var}"
-                          .format(id=region_name, pos=self["ref_start_pos"],
-                                  var=self["var_codon"], deli="_"))
+        if(len(mismatch) > 7):
+            self["ref_id"] = ("{id}{deli}{pos}{deli}{var}"
+                              .format(id=region_name,
+                                      pos=self["ref_start_pos"],
+                                      var=self["var_aa"], deli="_"))
+        else:
+            self["ref_id"] = ("{id}{deli}{pos}{deli}{var}"
+                              .format(id=region_name,
+                                      pos=self["ref_start_pos"],
+                                      var=self["var_codon"], deli="_"))
         self["key"] = self._get_unique_key()
         self["seq_var"] = mut_string
 
@@ -85,13 +90,13 @@ class GeneResult(dict):
 
         self["ref_start_pos"] = res["sbjct_start"]
         self["ref_end_pos"] = res["sbjct_end"]
-        self["key"] = self._get_unique_gene_key(res_collection)
         self["identity"] = res["perc_ident"]
         self["alignment_length"] = res["HSP_length"]
         self["ref_gene_lenght"] = res["sbjct_length"]
         self["query_id"] = res["contig_name"]
         self["query_start_pos"] = res["query_start"]
         self["query_end_pos"] = res["query_end"]
+        self["key"] = self._get_unique_gene_key(res_collection)
 
         # BLAST coverage formatted results
         coverage = res.get("coverage", None)
@@ -108,7 +113,6 @@ class GeneResult(dict):
 
         db_key = DatabaseHandler.get_key(res_collection, db_name)
         self["ref_database"] = db_key
-
         self.remove_NAs()
 
     @staticmethod
@@ -139,15 +143,30 @@ class GeneResult(dict):
                         .format(deli=delimiter, var=self.variant, **self))
         if(self.db_name == "PointFinder"):
             gene_key = self["name"]
-
         # Attach random string if key already exists
         minimum_gene_key = gene_key
+        if gene_key in res_collection["genes"]:
+            if(self["query_id"] == "NA"):
+                gene_key = self.get_rnd_unique_gene_key(
+                    gene_key, res_collection, minimum_gene_key, delimiter)
+            elif (self["query_id"]
+                    != res_collection["genes"][gene_key]["query_id"]
+                  or self["query_start_pos"]
+                    != res_collection["genes"][gene_key]["query_start_pos"]
+                  or self["query_end_pos"]
+                    != res_collection["genes"][gene_key]["query_end_pos"]):
+                gene_key = self.get_rnd_unique_gene_key(
+                    gene_key, res_collection, minimum_gene_key, delimiter)
+
+        return gene_key
+
+    def get_rnd_unique_gene_key(self, gene_key, res_collection,
+                                minimum_gene_key, delimiter):
         while(gene_key in res_collection["genes"]):
             rnd_str = GeneResult.randomString()
             gene_key = ("{key}{deli}{rnd}"
                         .format(key=minimum_gene_key, deli=delimiter,
                                 rnd=rnd_str))
-
         return gene_key
 
     @staticmethod
@@ -162,10 +181,11 @@ class PhenotypeResult(dict):
         self["category"] = "amr"
         self["key"] = antibiotic.name
         self["amr_classes"] = antibiotic.classes
-        self["amr_resistance"] = antibiotic.name
+        self["resistance"] = antibiotic.name
+        self["resistant"] = False
 
-    def resistant(self, res):
-        self["amr_resistant"] = res
+    def set_resistant(self, res):
+        self["resistant"] = res
 
     def add_feature(self, res_collection, isolate, feature):
         # Get all keys in the result that matches the feature in question.
@@ -174,10 +194,8 @@ class PhenotypeResult(dict):
         # they will all have different keys, but identical ref ids.
 
         ref_id, type = PhenotypeResult.get_ref_id_and_type(feature, isolate)
-
         feature_keys = PhenotypeResult.get_keys_matching_ref_id(
             ref_id, res_collection[type])
-
         # Add keys to phenotype results
         pheno_feat_keys = self.get(type, [])
         pheno_feat_keys = pheno_feat_keys + feature_keys
@@ -190,7 +208,6 @@ class PhenotypeResult(dict):
             pheno_keys = feat_result.get("phenotypes", [])
             pheno_keys.append(self["key"])
             feat_result["phenotypes"] = pheno_keys
-
         if(type == "genes"):
             db_key = DatabaseHandler.get_key(res_collection, "ResFinder")
         elif(type == "seq_variations"):
@@ -216,6 +233,7 @@ class PhenotypeResult(dict):
         for key, results in res_collection.items():
             if(ref_id == results["ref_id"]):
                 out_keys.append(key)
+
         return out_keys
 
 
@@ -229,19 +247,15 @@ class ResFinderResultHandler():
             for phenodb_ab in isolate.resprofile.phenodb.antibiotics[ab_class]:
 
                 phenotype = PhenotypeResult(phenodb_ab)
-
                 # Isolate is resistant towards the antibiotic
                 if(phenodb_ab in isolate.resprofile.resistance):
-                    phenotype.resistant(True)
+                    phenotype.set_resistant(True)
 
                     isolate_ab = isolate.resprofile.resistance[phenodb_ab]
                     for unique_id, feature in isolate_ab.features.items():
-                        phenotype.add_feature(res_collection, isolate, feature)
-
-                # No resistance found
-                else:
-                    phenotype.resistant(False)
-
+                        if(isinstance(feature, ResGene)):
+                            phenotype.add_feature(res_collection, isolate,
+                                                  feature)
                 res_collection.add_class(cl="phenotypes", **phenotype)
 
     @staticmethod
@@ -256,9 +270,11 @@ class ResFinderResultHandler():
             for unique_id, hit_db in db.items():
                 if(unique_id in res["excluded"]):
                     continue
-
                 gene_result = GeneResult(res_collection, hit_db, ref_db_name)
-                res_collection.add_class(cl="genes", **gene_result)
+                if gene_result["key"] in res_collection["genes"]:
+                    res_collection.modify_class(cl="genes", **gene_result)
+                else:
+                    res_collection.add_class(cl="genes", **gene_result)
 
 
 class DatabaseHandler():
@@ -293,20 +309,15 @@ class PointFinderResultHandler():
             for phenodb_ab in isolate.resprofile.phenodb.antibiotics[ab_class]:
 
                 phenotype = PhenotypeResult(phenodb_ab)
-
                 # Isolate is resistant towards the antibiotic
                 if(phenodb_ab in isolate.resprofile.resistance):
-                    phenotype.resistant(True)
+                    phenotype.set_resistant(True)
 
                     isolate_ab = isolate.resprofile.resistance[phenodb_ab]
                     for unique_id, feature in isolate_ab.features.items():
-                        if(isinstance(feature, Gene)):
-                            phenotype.add_gene(res_collection, isolate, feature)
-
-                # No resistance found
-                else:
-                    phenotype.resistant(False)
-
+                        if(isinstance(feature, ResMutation)):
+                            phenotype.add_feature(res_collection, isolate,
+                                                  feature)
                 res_collection.add_class(cl="phenotypes", **phenotype)
 
     @staticmethod
@@ -347,8 +358,6 @@ class PointFinderResultHandler():
             mismatches = db["mis_matches"]
 
 #DEBUG
-#            print("MISMATCHES: {}".format(mismatches))
-
             for mismatch in mismatches:
                 seq_var_result = SeqVariationResult(
                     res_collection, mismatch, gene_results, ref_db_name)

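Note on the gene key handling in the `GeneResult` hunks above: a key that already exists is reused when the new hit has the same `query_id`, `query_start_pos` and `query_end_pos` (the result handler then merges it via `modify_class`). A random suffix is attached only when the query is "NA" or the location differs. A minimal sketch of that rule on plain dicts follows. All function names, the suffix length and the ";;" delimiter are assumptions; the diff does not show the actual delimiter or `randomString` implementation.

```python
import random
import string


def random_string(length=4):
    """Random lowercase suffix; length and alphabet are assumptions."""
    return "".join(random.choice(string.ascii_lowercase)
                   for _ in range(length))


def unique_gene_key(gene_key, hit, genes, delimiter=";;"):
    """Return a key for `hit` following the collision rule in the diff.

    `genes` maps existing keys to hit dicts. Hypothetical helper.
    """
    if gene_key not in genes:
        return gene_key
    known = genes[gene_key]
    fields = ("query_id", "query_start_pos", "query_end_pos")
    same_location = (hit["query_id"] != "NA"
                     and all(hit[f] == known[f] for f in fields))
    if same_location:
        # Same hit seen again: reuse the key so the entries can be merged.
        return gene_key
    candidate = gene_key
    while candidate in genes:
        candidate = "{}{}{}".format(gene_key, delimiter, random_string())
    return candidate


key = "geneA;;1;;ACC0001"  # made-up key
genes = {key: {"query_id": "contig_1", "query_start_pos": 10,
               "query_end_pos": 900}}
same_hit = {"query_id": "contig_1", "query_start_pos": 10,
            "query_end_pos": 900}
other_hit = {"query_id": "contig_1", "query_start_pos": 2000,
             "query_end_pos": 2890}

print(unique_gene_key(key, same_hit, genes))
print(unique_gene_key(key, other_hit, genes))
```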

=====================================
database_tests.md
=====================================
@@ -0,0 +1,84 @@
+# PhenoDB Test
+
+It is not necessary to test the validity of a database that has just been
+cloned. The main reason for running the tests is to check a database that has been
+altered. The tests are by no means exhaustive and will not guarantee a valid
+database.
+
+This file will test the validity of the databases installed in the default
+locations. These are:
+- [resfinder app dir]/db_resfinder
+- [resfinder app dir]/db_pointfinder
+Where [resfinder app dir] is the root directory of the ResFinder application.
+You will find the "run_resfinder.py" file in this directory.
+
+Run the following command to test validity of databases.
+
+```bash
+
+python3 -m doctest database_tests.md
+
+```
+
+*Note*: Change the database locations to be tested by changing the first two
+lines of the Python code below in this file.
+
+```python
+
+>>> db_resfinder = "db_resfinder/"
+>>> db_pointfinder = "db_pointfinder/"
+```
+
+## Test phenotype.txt and resistens-overview.txt files
+
+```python
+
+>>> from cge.phenotype2genotype.res_profile import PhenoDB
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    acquired_file="{}phenotypes.txt".format(db_resfinder),
+...    point_file="{}campylobacter/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}enterococcus_faecalis/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}enterococcus_faecium/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}escherichia_coli/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}helicobacter_pylori/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}klebsiella/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}mycobacterium_tuberculosis/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}neisseria_gonorrhoeae/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}plasmodium_falciparum/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}salmonella/resistens-overview.txt".format(db_pointfinder))
+
+>>> phenodb = PhenoDB(
+...    abclassdef_file="{}antibiotic_classes.txt".format(db_resfinder),
+...    point_file="{}staphylococcus_aureus/resistens-overview.txt".format(db_pointfinder))
+
+
+```

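Note on database_tests.md above: `python3 -m doctest` collects every `>>>` example from a text file and treats the non-blank lines that follow an example, up to the next blank line or `>>>` prompt, as expected output. In a markdown file this means a closing code fence placed directly after the last example is read as expected output, so a blank line should separate them. The sketch below runs doctest on a small markdown string through the standard `doctest` API. The markdown content and the function name are hypothetical.

```python
import doctest

# Built at runtime so this sketch can itself sit inside a fenced block.
FENCE = "`" * 3

# Hypothetical markdown text. The blank line before the closing fence
# keeps the fence from being read as expected output.
MARKDOWN = "\n".join([
    "# Example",
    "",
    FENCE + "python",
    "",
    '>>> db_resfinder = "db_resfinder/"',
    '>>> "{}phenotypes.txt".format(db_resfinder)',
    "'db_resfinder/phenotypes.txt'",
    "",
    FENCE,
])


def run_markdown_doctests(text, name="markdown_snippet"):
    """Run all >>> examples found in `text` and return the test results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, {}, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_markdown_doctests(MARKDOWN)
print(results.failed, results.attempted)
```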

=====================================
dockerfile
=====================================
@@ -1,56 +1,53 @@
-FROM debian:9.4-slim
+FROM debian:stretch
 
 ENV DEBIAN_FRONTEND noninteractive
 
-# Setup .bashrc file for convenience during debugging
-RUN echo "alias ls='ls -h --color=tty'\n"\
-"alias ll='ls -lrt'\n"\
-"alias l='less'\n"\
-"alias du='du -hP --max-depth=1'\n"\
-"alias cwd='readlink -f .'\n"\
-"PATH=$PATH\n">> ~/.bashrc
+### RUN set -ex; \
 
-RUN set -ex && \
-    # Basix setup \
-    apt-get update -y -qq && \
+RUN apt-get update -qq; \
     apt-get install -y -qq git \
     apt-utils \
     wget \
     python3-pip \
+    ncbi-blast+ \
     libz-dev \
-    && \
-    # Python 3 setup \
-    pip3 install --upgrade pip setuptools && \
-    ln -sf /usr/bin/pip3 /usr/bin/pip && \
-    ln -sf /usr/bin/python3 /usr/bin/python && \
-    # Install python dependencies \
-    pip install cython tabulate numpy biopython && \
-    rm -rf /var/cache/apt/* /var/lib/apt/lists/*
-
-# Install BLAST
-RUN wget -q ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.7.1/ncbi-blast-2.7.1+-x64-linux.tar.gz && \
-    tar -zxvf ncbi-blast-2.7.1+-x64-linux.tar.gz && \
-    ln -s /ncbi-blast-2.7.1+/bin/blastn /usr/bin/blastn && \
-    rm ncbi-blast-2.7.1+-x64-linux.tar.gz
-
-# Install ResFinder and databases
-RUN pip install cgecore==1.5.1 && \
-    git clone -b 4.0 https://bitbucket.org/genomicepidemiology/resfinder.git && \
-    mkdir resfinder/db_resfinder && \
-    mkdir resfinder/db_pointfinder && \
-    chmod a+x resfinder/run_resfinder.py && \
-    ln -s /resfinder/run_resfinder.py /usr/bin/resfinder && \
-    # Install KMA \
-    apt-get install -y -qq libz-dev && \
-    git clone --branch 1.2.11 https://bitbucket.org/genomicepidemiology/kma.git resfinder/cge/kma && \
-    cd resfinder/cge/kma && make && cd ../.. && \
-    ln -s /resfinder/cge/kma/kma /usr/bin/kma && \
-    ln -s /resfinder/cge/kma/kma_index /usr/bin/kma_index && \
-    apt-get clean && \
-    rm -rf /var/cache/apt/* /var/lib/apt/lists/*
+    ; \
+    rm -rf /var/cache/apt/* /var/lib/apt/lists/*;
 
 ENV DEBIAN_FRONTEND Teletype
 
-WORKDIR /workdir
+# Install python dependencies
+RUN pip3 install -U biopython==1.73 tabulate cgecore gitpython python-dateutil;
+
+# RESFINDER setup
+COPY run_resfinder.py /usr/src/run_resfinder.py
+
+ADD cge /usr/src/cge
+ADD tests /usr/src/tests
+
+# Install kma
+RUN cd /usr/src/cge; \
+    git clone --depth 1 https://bitbucket.org/genomicepidemiology/kma.git; \
+    cd kma && make; \
+    mv kma* /bin/
+
+
+RUN chmod 755 /usr/src/run_resfinder.py
+RUN chmod 755 /usr/src/tests/functional_tests.py
+
+
+ENV PATH $PATH:/usr/src
+# Setup .bashrc file for convenience during debugging
+RUN echo "alias ls='ls -h --color=tty'\n"\
+"alias ll='ls -lrt'\n"\
+"alias l='less'\n"\
+"alias du='du -hP --max-depth=1'\n"\
+"alias cwd='readlink -f .'\n"\
+"PATH=$PATH\n">> ~/.bashrc
+
+
+# Change working directory
+WORKDIR "/usr/src/"
 
-ENTRYPOINT ["/resfinder/run_resfinder.py"]
+# Execute program when running the container
+ENTRYPOINT ["python3", "/usr/src/run_resfinder.py"]
\ No newline at end of file


=====================================
run_resfinder.py
=====================================
@@ -30,60 +30,6 @@ import json
 def eprint(*args, **kwargs):
     print(*args, file=sys.stderr, **kwargs)
 
-
-def create_tab_acquired(isolate, phenodb):
-    """ Alternative method to create the downloadeable tabbed result file. This
-         method will include the additional information from the phenotype
-         database.
-    """
-    output_str = ("Resistance gene\tIdentity\tAlignment Length/Gene Length\t"
-                  "Position in reference\tContig\tPosition in contig\t"
-                  "Phenotype\tClass\tPMID\tAccession no.\tNotes\n")
-
-    for unique_id in isolate:
-        for feature in isolate[unique_id]:
-
-            # Extract phenotypes
-            phenotype_out_list = []
-            phenotype = phenodb[feature.unique_id]
-
-            # Append stars to phenotypes that are suggested by the curators and
-            # not published
-            for antibiotic in phenotype.phenotype:
-                if(antibiotic in phenotype.sug_phenotype):
-                    antibiotic = antibiotic + "*"
-                phenotype_out_list.append(antibiotic)
-
-            phenotype_out_str = ",".join(phenotype_out_list)
-
-            output_str += (feature.hit.name + "\t"
-                           + str(feature.hit.identity) + "\t"
-                           + str(feature.hit.match_length)
-                           + "/" + str(feature.hit.ref_length) + "\t"
-                           + str(feature.hit.start_ref)
-                           + ".." + str(feature.hit.end_ref) + "\t"
-                           + feature.seq_region + "\t"
-                           + str(feature.start)
-                           + ".." + str(feature.end) + "\t"
-                           + phenotype_out_str + "\t"
-                           + ",".join(phenotype.ab_class) + "\t"
-                           + ",".join(phenotype.pmid) + "\t"
-                           + feature.hit.acc + "\t"
-                           + phenotype.notes + "\n")
-
-    # Find AMR classes with no hits
-    no_class_hits = []
-    for ab_class in phenodb.antibiotics:
-        if(ab_class not in isolate.resprofile.resistance_classes):
-            no_class_hits.append(ab_class)
-
-    if(no_class_hits):
-        output_str += ("\nNo hits found in the classes: "
-                       + ",".join(no_class_hits) + "\n")
-
-    return output_str
-
-
 # TODO: Add fix species choice
 species_transl = {"c. jejuni": "campylobacter jejuni",
                   "c.jejuni": "campylobacter jejuni",
@@ -177,6 +123,11 @@ parser.add_argument("-c", "--point",
 parser.add_argument("-db_point", "--db_path_point",
                     help="Path to the databases for PointFinder",
                     default=None)
+parser.add_argument("-db_point_kma", "--db_path_point_kma",
+                    help="Path to the PointFinder databases indexed with KMA. \
+                          Defaults to the 'kma_indexing' directory inside the \
+                          given database directory.",
+                    default=None)
 parser.add_argument("-g",
                     dest="specific_gene",
                     nargs='+',
@@ -214,6 +165,9 @@ parser.add_argument("--pickle",
 
 args = parser.parse_args()
 
+if(args.species is not None and args.species.lower() == "other"):
+    args.species = None
+
 if(args.point and not args.species):
     sys.exit("ERROR: Chromosomal point mutations cannot be located if no "
              "species has been provided. Please provide species using the "
@@ -270,6 +224,7 @@ else:
     kma = None
 
 db_path_point = None
+
 if(args.species):
     args.species = args.species.lower()
 
@@ -319,6 +274,7 @@ if args.acquired is False and args.point is False:
     sys.exit("Please specify to look for acquired resistance genes, "
              "chromosomal mutations or both!\n")
 
+# Check ResFinder database
 if(args.db_path_res is None):
     args.db_path_res = (os.path.dirname(
         os.path.realpath(__file__)) + "/db_resfinder")
@@ -327,7 +283,6 @@ if(not os.path.exists(args.db_path_res)):
     sys.exit("Could not locate ResFinder database path: %s"
              % args.db_path_res)
 
-# Check ResFinder KMA database
 if(args.db_path_res_kma is None and args.acquired and args.inputfastq):
     args.db_path_res_kma = args.db_path_res
     if(not os.path.exists(args.db_path_res_kma)):
@@ -347,7 +302,6 @@ if(args.acquired):
 if(args.point):
     DatabaseHandler.load_database_metadata("PointFinder", std_result,
                                            db_path_point_root)
-
 ##########################################################################
 # ResFinder
 ##########################################################################
@@ -406,7 +360,7 @@ if args.acquired is True:
         # DEPRECATED
         # use std_result
         new_std_res = ResFinder.old_results_to_standard_output(
-            blast_results.results, software="ResFinder", version="4.0.0",
+            blast_results.results, software="ResFinder", version="4.1.0",
             run_date="fake_run_date", run_cmd="Fake run cmd",
             id=sample_name)
 
@@ -439,7 +393,7 @@ if args.acquired is True:
         # DEPRECATED
         # use std_result
         new_std_res = ResFinder.old_results_to_standard_output(
-            kma_run.results, software="ResFinder", version="4.0.0",
+            kma_run.results, software="ResFinder", version="4.1.0",
             run_date="fake_run_date", run_cmd="Fake run cmd",
             id=sample_name)
 
@@ -527,7 +481,7 @@ if args.point is True and args.species:
     # DEPRECATED
     # use std_result
     new_std_pnt = finder.old_results_to_standard_output(
-        result=results_pnt, software="ResFinder", version="4.0.0",
+        result=results_pnt, software="ResFinder", version="4.1.0",
         run_date="fake_run_date", run_cmd="Fake run cmd",
         id=sample_name)
 
@@ -545,8 +499,7 @@ if args.point is True and args.species:
                                                  "PointFinder")
 
 #DEBUG
-#    print("\n\nSTD RES:\n{}".format(json.dumps(std_result)))
-
+#        print("STD RESULT:\n{}".format(json.dumps(std_result)))
 ##########################################################################
 # Phenotype to genotype
 ##########################################################################
@@ -560,10 +513,8 @@ else:
 res_pheno_db = PhenoDB(
     abclassdef_file=(args.db_path_res + "/antibiotic_classes.txt"),
     acquired_file=args.db_path_res + "/phenotypes.txt", point_file=point_file)
-
 # Isolate object store results
 isolate = Isolate(name=sample_name)
-
 if(args.acquired):
     isolate.load_finder_results(std_table=std_result,
                                 phenodb=res_pheno_db,
@@ -580,10 +531,12 @@ if(args.point):
     #                                 phenodb=res_pheno_db)
     # isolate.load_pointfinder_tab(args.out_path + "/PointFinder_results.txt",
     #                                      res_pheno_db)
-
 isolate.calc_res_profile(res_pheno_db)
+if(args.acquired):
+    ResFinderResultHandler.load_res_profile(std_result, isolate)
+if(args.point):
+    PointFinderResultHandler.load_res_profile(std_result, isolate)
 
-ResFinderResultHandler.load_res_profile(std_result, isolate)
 
 #TODO
 std_result_file = "{}/std_format_under_development.json".format(args.out_path)
@@ -592,7 +545,6 @@ with open(std_result_file, 'w') as fh:
 
 # Create and write the downloadable tab file
 pheno_profile_str = isolate.profile_to_str_table(header=True)
-
 # TODO: REMOVE THE NEED FOR THE PICKLED FILE
 if(args.pickle):
     isolate_pickle = open("{}/isolate.p".format(args.out_path), "wb")
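
The species handling added to run_resfinder.py in the hunk at `@@ -214,6 +165,9 @@` can be sketched in isolation as follows. This is a minimal illustration, not the upstream code: `build_parser`, `normalize_species` and `check_point_request` are hypothetical helper names, only two of the script's options are reproduced, and treating `--point` as a `store_true` flag is an assumption.

```python
import argparse


def build_parser():
    # Only the two options relevant here; the real script defines many more.
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", "--species", default=None)
    # Assumption: --point behaves like a boolean switch.
    parser.add_argument("-c", "--point", action="store_true", default=False)
    return parser


def normalize_species(args):
    """Treat the placeholder species 'other' (any case) as 'no species'."""
    if args.species is not None and args.species.lower() == "other":
        args.species = None
    return args


def check_point_request(args):
    """Return an error message if --point is requested without a species."""
    if args.point and not args.species:
        return ("ERROR: Chromosomal point mutations cannot be located if no "
                "species has been provided.")
    return None


if __name__ == "__main__":
    parsed = normalize_species(build_parser().parse_args(["-s", "Other", "--point"]))
    print(check_point_request(parsed))
```

Because the normalization runs before the `--point` check, a run with `-s other --point` is rejected in the same way as a run with no species at all, which is presumably why the Nextflow and WDL wrappers below omit `--point` for "other".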


=====================================
scripts/env_variables.txt
=====================================
@@ -0,0 +1,11 @@
+export CGE_PYTHON="python3"
+export CGE_KMA="/media/sf_nextcloud/Scripts-CGE/resfinder/cge/kma/kma"
+export CGE_BLASTN="blastn"
+export CGE_RESFINDER="/media/sf_nextcloud/Scripts-CGE/resfinder/run_resfinder.py"
+export CGE_RESFINDER_RESGENE_DB="/media/sf_nextcloud/Scripts-CGE/resfinder/db_resfinder"
+export CGE_RESFINDER_RESPOINT_DB="/media/sf_nextcloud/Scripts-CGE/resfinder/db_pointfinder"
+export CGE_RESFINDER_DISINF_DB="/media/sf_nextcloud/Scripts-CGE/resfinder/db_disinfinder"
+export CGE_RESFINDER_GENE_COV="0.6"
+export CGE_RESFINDER_GENE_ID="0.8"
+export CGE_RESFINDER_POINT_COV="0.6"
+export CGE_RESFINDER_POINT_ID="0.8"


=====================================
scripts/resfinder.nf
=====================================
@@ -6,7 +6,7 @@ resfinder = "/home/projects/cge/apps/resfinder/resfinder/run_resfinder.py"
 params.indir = './'
 params.ext = '.fq.gz'
 params.outdir = '.'
-params.species
+params.species = 'other'
 
 println("Search pattern: $params.indir*{1,2}$params.ext")
 
@@ -30,10 +30,17 @@ process resfinder{
 
     """
     set +u
-    module unload perl
+    module unload mgmapper metabat fastqc
+    module unload ncbi-blast perl
     source /home/projects/cge/apps/env/rf4_env/bin/activate
+    module load perl
     module load ncbi-blast/2.8.1+
-    $python3 $resfinder -acq --point -ifq $datasetFile -o '$params.outdir/$sampleID' -s '$params.species'
+    if [ $params.species = 'other' ]
+    then
+        $python3 $resfinder -acq -ifq $datasetFile -o '$params.outdir/$sampleID' -s '$params.species'
+    else
+        $python3 $resfinder -acq -ifq $datasetFile -o '$params.outdir/$sampleID' -s '$params.species' --point
+    fi
     """
 }
 


=====================================
scripts/resfinder_asm.nf
=====================================
@@ -7,7 +7,7 @@ params.input = './*.fa'
 // params.indir = './'
 // params.ext = '.fa'
 params.outdir = '.'
-params.species
+params.species = 'other'
 
 println("Search pattern: $params.input")
 
@@ -31,10 +31,17 @@ process resfinder{
 
     """
     set +u
-    module unload perl
+    module unload mgmapper metabat fastqc
+    module unload ncbi-blast perl
     source /home/projects/cge/apps/env/rf4_env/bin/activate
+    module load perl
     module load ncbi-blast/2.8.1+
-    $python3 $resfinder -acq --point -ifa $datasetFile -o '$params.outdir/$sampleID' -s '$params.species'
+    if [ $params.species = 'other' ]
+    then
+        $python3 $resfinder -acq -ifa $datasetFile -o '$params.outdir/$sampleID' -s '$params.species'
+    else
+        $python3 $resfinder -acq -ifa $datasetFile -o '$params.outdir/$sampleID' -s '$params.species' --point
+    fi
     """
 }
 


=====================================
scripts/wdl/README.md
=====================================
@@ -0,0 +1,161 @@
+# Quick guide to running ResFinder with Cromwell
+
+### Disclaimer
+Support is not offered for running Cromwell, and no files in this directory are
+guaranteed to work. These files were uploaded as inspiration. Please do not
+report issues relating to this directory.
+
+## Prepare input files
+
+Two input files are needed:
+
+1. input_data.tsv
+2. input.json
+
+Templates can be found in the ResFinder directory scripts/wdl.
+
+### input_data.tsv
+Tab separated file. Should contain columns in the following order:
+
+1. Absolute path to fasta/fastq file 1
+2. Absolute path to fastq file 2 (Can be empty, but must exist)
+3. Species
+4. Type of data, must be one of: assembly, paired
+
+Each row should contain a single sample.
+
+#### Species
+If the species cannot be provided, put "other" (case sensitive).
+
+#### Type of data
+
+* assembly: Fasta file containing contigs from a de novo assembly.
+* paired: A pair of fastq files containing read data for forward and reverse
+reads.
+* single: **Not implemented** Read data from single-end sequencing.
+
+
+#### Example
+```
+
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01_2.fq	Escherichia	coli	paired
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05_2.fq	Escherichia	coli	paired
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a_2.fq	Escherichia	coli	paired
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b_2.fq	Escherichia	coli	paired
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01.fa		Escherichia	coli	assembly
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_02.fa		Escherichia	coli	assembly
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_03.fa		Escherichia	coli	assembly
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05.fa		Escherichia	coli	assembly
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a.fa		Escherichia	coli	assembly
+/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b.fa		Escherichia	coli	assembly
+
+```
+
+### input.json
+JSON formatted file containing input and output information.
+
+The file should consist of a single dict/hash/map with the following keys:
+
+* Resistance.inputSamplesFile: Absolute path to input_data.tsv
+* Resistance.outputDir: Absolute path to output directory.
+* Resistance.geneCov: Fraction of gene coverage needed for resistance gene hits.
+* Resistance.geneID: Fraction of nucleotide identity needed in resistance gene
+hits.
+* Resistance.pointCov: Fraction of gene coverage needed for point mutation gene
+hits.
+* Resistance.pointID: Fraction of nucleotide identity needed in point mutation gene
+hits.
+
+If you are running on Computerome and using the input.json template, you probably
+won't need to change the following:
+
+* Resistance.python: Path to python3 interpreter.
+* Resistance.kma: Path to kma application.
+* Resistance.blastn: Path to blastn application.
+* Resistance.resfinder: Path to run_resfinder.py.
+* Resistance.resDB: Path to ResFinder database.
+* Resistance.pointDB: Path to PointFinder database.
+
+The values of Resistance.inputSamplesFile and Resistance.outputDir should be the
+absolute paths to input_data.tsv and the desired output directory, respectively.
+
+#### Example
+
+```json
+
+{
+  "Resistance.inputSamplesFile": "/home/projects/cge/people/rkmo/delme/res_input.tsv",
+  "Resistance.outputDir": "/home/projects/cge/people/rkmo/delme/",
+  "Resistance.geneCov": 0.6,
+  "Resistance.geneID": 0.8,
+  "Resistance.pointCov": 0.6,
+  "Resistance.pointID": 0.8,
+  "Resistance.python": "python3",
+  "Resistance.kma": "/home/projects/cge/apps/resfinder/resfinder/cge/kma/kma",
+  "Resistance.blastn": "blastn",
+  "Resistance.resfinder": "/home/projects/cge/apps/resfinder/resfinder/run_resfinder.py",
+  "Resistance.resDB": "/home/projects/cge/apps/resfinder/resfinder/db_resfinder",
+  "Resistance.pointDB": "/home/projects/cge/apps/resfinder/resfinder/db_pointfinder"
+}
+
+```
+
+## Run Cromwell
+
+Cromwell needs Java to run. Load a valid Java module, for example:
+
+```bash
+
+module load openjdk/16
+
+```
+
+A Cromwell call looks like this:
+
+```bash
+
+java -Dconfig.file=<CONF> -jar <CROMWELL> run <WDL> --inputs <JSON>
+
+```
+
+### <CONF> and <CROMWELL>
+Computerome specific.
+
+* <CONF>: Path to Computerome configuration for Cromwell. You need to change
+this if you are not running Cromwell on Computerome. Computerome path:
+/home/projects/cge/apps/resfinder/resfinder/scripts/wdl/computerome.conf
+
+* <CROMWELL>: Path to Cromwell jar file on Computerome:
+/services/tools/cromwell/50/cromwell-50.jar
+
+### <WDL>
+ResFinder specific.
+
+* <WDL>: Path to wdl file that specifies how to run ResFinder. Path to
+resfinder.wdl on Computerome:
+/home/projects/cge/apps/resfinder/resfinder/scripts/wdl/resfinder.wdl
+
+### <JSON>
+User/run specific.
+
+Path to input.json. Specifies all the parameters for ResFinder (See above).
+
+### Run example
+
+```bash
+
+java -Dconfig.file=/home/projects/cge/apps/resfinder/resfinder/scripts/wdl/computerome.conf -jar /services/tools/cromwell/50/cromwell-50.jar run /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/resfinder.wdl --inputs /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/input.json
+
+```
+
+### Post run
+
+All ResFinder output will be located in the provided output directory.
+
+In the directory where you execute Cromwell the following two directories will
+also be created:
+
+* cromwell-executions
+* cromwell-workflow-logs
+
+They contain logging information and cached results.
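
The two input files described in the README above could be generated along these lines. This is a sketch under stated assumptions: the helper names (`sample_row`, `samples_to_tsv`, `run_specific_inputs`) and the example paths are hypothetical, and only the run-specific JSON keys are produced; the tool and database paths from the template would still have to be added.

```python
import csv
import io
import json

VALID_TYPES = ("assembly", "paired")  # "single" is listed as not implemented


def sample_row(path1, path2, species, data_type):
    """Return one input_data.tsv row in the documented column order."""
    if data_type not in VALID_TYPES:
        raise ValueError("type of data must be one of: " + ", ".join(VALID_TYPES))
    if data_type == "assembly":
        path2 = ""  # column 2 stays empty for assemblies but must still exist
    # The README asks for the literal string "other" when no species is known.
    return [path1, path2, species or "other", data_type]


def samples_to_tsv(rows):
    """Serialize rows as tab-separated text, one sample per line."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def run_specific_inputs(tsv_path, out_dir, gene_cov=0.6, gene_id=0.8,
                        point_cov=0.6, point_id=0.8):
    """Build only the run-specific keys of input.json."""
    return {
        "Resistance.inputSamplesFile": tsv_path,
        "Resistance.outputDir": out_dir,
        "Resistance.geneCov": gene_cov,
        "Resistance.geneID": gene_id,
        "Resistance.pointCov": point_cov,
        "Resistance.pointID": point_id,
    }


if __name__ == "__main__":
    rows = [
        sample_row("/data/s1_1.fq", "/data/s1_2.fq", "Escherichia coli", "paired"),
        sample_row("/data/s2.fa", "", None, "assembly"),
    ]
    print(samples_to_tsv(rows), end="")
    print(json.dumps(run_specific_inputs("/data/input_data.tsv", "/data/out"), indent=2))
```

The default thresholds match the values in the input.json template (0.6 coverage, 0.8 identity).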


=====================================
scripts/wdl/computerome.conf
=====================================
@@ -0,0 +1,49 @@
+# TORQUE as a backend for Cromwell on Computerome
+
+# Here is where you can define the backend providers that Cromwell understands.
+# The default is a local provider.
+# To add additional backend providers, you should copy paste additional backends
+# of interest that you can find in the cromwell.example.backends
+# folder at https://www.github.com/broadinstitute/cromwell
+# Other backend providers include SGE, SLURM, Docker, udocker, Singularity, etc.
+# Don't forget you will need to customize them for your particular use case.
+
+backend {
+
+    # Override the default backend.
+    default = TORQUE
+
+    # The list of providers.
+    providers {
+
+        TORQUE {
+
+            # The actor that runs the backend.
+            actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
+
+            # The backend custom configuration.
+            config {
+
+                # Number of concurrent jobs allowed
+                concurrent-job-limit = 500
+
+                # The list of possible runtime custom attributes.
+                runtime-attributes = """
+                String walltime = "1:00:00"
+                Int cpu = 1
+                Float memory_mb = 2048.0
+                String queue = "cge"
+                """
+
+                submit = "qsub -W group_list=${queue} -A ${queue} -N ${job_name} -lwalltime=${walltime},nodes=1:ppn=${cpu},mem=${ceil(memory_mb)}mb -d ${cwd} -o ${out} -e ${err} ${script}"
+
+                kill = "qdel ${job_id}"
+                check-alive = "qstat ${job_id}"
+                job-id-regex = "(\\d+)"
+            }
+
+        }
+
+    }
+
+}
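
As a rough illustration of what the `submit` template in computerome.conf expands to, the substitution can be mimicked with plain string formatting. This only imitates the placeholder filling with the defaults from `runtime-attributes`; it says nothing about how Cromwell itself resolves these values, and `render_submit` and the example paths are hypothetical.

```python
import math


def render_submit(job_name, cwd, out, err, script,
                  walltime="1:00:00", cpu=1, memory_mb=2048.0, queue="cge"):
    """Fill the qsub template using the defaults declared in the config."""
    return (
        f"qsub -W group_list={queue} -A {queue} -N {job_name} "
        f"-lwalltime={walltime},nodes=1:ppn={cpu},mem={math.ceil(memory_mb)}mb "
        f"-d {cwd} -o {out} -e {err} {script}"
    )


if __name__ == "__main__":
    print(render_submit("resfinder_job", "/tmp/run", "/tmp/run/stdout",
                        "/tmp/run/stderr", "/tmp/run/script"))
```

Rounding `memory_mb` up with `ceil` keeps the `mem=` request an integer number of megabytes even when the attribute is given as a float.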


=====================================
scripts/wdl/input.json
=====================================
@@ -0,0 +1,14 @@
+{
+    "Resistance.inputSamplesFile": "/home/projects/cge/people/rkmo/delme/res_input.tsv",
+    "Resistance.outputDir": "/home/projects/cge/people/rkmo/delme/",
+    "Resistance.geneCov": 0.6,
+    "Resistance.geneID": 0.8,
+    "Resistance.pointCov": 0.6,
+    "Resistance.pointID": 0.8,
+    "Resistance.python": "python3",
+    "Resistance.kma": "/home/projects/cge/apps/resfinder/resfinder/cge/kma/kma",
+    "Resistance.blastn": "blastn",
+    "Resistance.resfinder": "/home/projects/cge/apps/resfinder/resfinder/run_resfinder.py",
+    "Resistance.resDB": "/home/projects/cge/apps/resfinder/resfinder/db_resfinder",
+    "Resistance.pointDB": "/home/projects/cge/apps/resfinder/resfinder/db_pointfinder"
+}


=====================================
scripts/wdl/resfinder.wdl
=====================================
@@ -0,0 +1,110 @@
+workflow Resistance {
+    File inputSamplesFile
+    Array[Array[File]] inputSamples = read_tsv(inputSamplesFile)
+
+    String outputDir
+    Float geneCov
+    Float geneID
+    Float pointCov
+    Float pointID
+    String python
+    String kma
+    String blastn
+    String resfinder
+    String resDB
+    String pointDB
+
+    scatter (sample in inputSamples) {
+        call ResFinder {
+            input: inputSample=sample,
+                outputRoot=outputDir,
+                geneCov=geneCov,
+                geneID=geneID,
+                pointCov=pointCov,
+                pointID=pointID,
+                python=python,
+                kma=kma,
+                blastn=blastn,
+                resfinder=resfinder,
+                resDB=resDB,
+                pointDB=pointDB
+        }
+    }
+}
+
+task ResFinder {
+
+    Array[String] inputSample
+    Float geneCov
+    Float geneID
+    Float pointCov
+    Float pointID
+    String python
+    String kma
+    String blastn
+    String resfinder
+    String resDB
+    String pointDB
+
+    String inputPath1 = inputSample[0]
+    String inputPath2 = inputSample[1]
+    String species = inputSample[2]
+    String inputType = inputSample[3]
+
+    String outputRoot
+    String filename = basename(inputPath1)
+    String out_dir_name = "${outputRoot}/${filename}.rf_out"
+
+    command {
+
+        set +u
+        module unload mgmapper metabat fastqc
+        module unload ncbi-blast perl
+        source /home/projects/cge/apps/env/rf4_env/bin/activate
+        module load perl
+        module load ncbi-blast/2.8.1+
+
+        mkdir ${out_dir_name}
+
+        inputArgs=""
+        pointArgs=""
+
+        if [ "${species}" = "other" ] && [ "${inputType}" = "paired" ]
+        then
+            inputArgs+="-ifq ${inputPath1} ${inputPath2}"
+        elif [ "${inputType}" = "paired" ]
+        then
+            inputArgs+="-ifq ${inputPath1} ${inputPath2}"
+            pointArgs+="--point --db_path_point ${pointDB} --min_cov_point ${pointCov} --threshold_point ${pointID}"
+        elif [ "${species}" = "other" ] && [ "${inputType}" = "assembly" ]
+        then
+            inputArgs+="-ifa ${inputPath1}"
+        elif [ "${inputType}" = "assembly" ]
+        then
+            inputArgs+="-ifa ${inputPath1}"
+            pointArgs+="--point --db_path_point ${pointDB} --min_cov_point ${pointCov} --threshold_point ${pointID}"
+        fi
+
+        ${python} ${resfinder} \
+            $inputArgs \
+            --blastPath ${blastn} \
+            --kmaPath ${kma} \
+            --species "${species}" \
+            --db_path_res ${resDB} \
+            --acquired \
+            --acq_overlap 30 \
+            --min_cov ${geneCov} \
+            --threshold ${geneID} \
+            -o ${out_dir_name} \
+            $pointArgs
+    }
+    output {
+        File rf_out = "${out_dir_name}/std_format_under_development.json"
+    }
+    runtime {
+        walltime: "1:00:00"
+        cpu: 2
+        memory: "4 GB"
+        queue: "cge"
+    }
+}
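
The four-way `if`/`elif` chain in the WDL `command` block makes two independent decisions: the input flag depends only on the data type, and the PointFinder arguments depend only on whether the species is "other". A minimal sketch of that logic, with the hypothetical helper name `build_args` and hypothetical example paths:

```python
def build_args(input_path1, input_path2, species, input_type,
               point_db, point_cov, point_id):
    """Return (input_args, point_args) as argument lists."""
    if input_type == "paired":
        input_args = ["-ifq", input_path1, input_path2]
    elif input_type == "assembly":
        input_args = ["-ifa", input_path1]
    else:
        # Unlike the shell version, which leaves the arguments empty,
        # this sketch rejects unknown data types explicitly.
        raise ValueError("unsupported input type: " + repr(input_type))

    point_args = []
    if species != "other":  # case-sensitive, as in the shell test
        point_args = ["--point",
                      "--db_path_point", point_db,
                      "--min_cov_point", str(point_cov),
                      "--threshold_point", str(point_id)]
    return input_args, point_args


if __name__ == "__main__":
    print(build_args("/data/s1.fa", "", "Escherichia coli", "assembly",
                     "/db/db_pointfinder", 0.6, 0.8))
```

Returning lists rather than concatenated strings avoids word-splitting problems if a path contains spaces, which the unquoted `$inputArgs` expansion in the shell version would not handle.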


=====================================
tests/functional_tests.py
=====================================
@@ -62,7 +62,6 @@ class ResFinderRunTest(unittest.TestCase):
         except OSError:
             procs = run(["rm", "-r", run_test_dir])
 
-
     def test_on_data_with_just_acquired_resgene_using_blast(self):
         # Maria has an E. coli isolate, with unknown resistance.
         # At first, she just wants to know which acquired resistance genes are
@@ -94,11 +93,11 @@ class ResFinderRunTest(unittest.TestCase):
         procs = run(cmd_acquired, shell=True, stdout=PIPE, stderr=PIPE,
                     check=True)
 
-        fsa_hit = test1_dir + "/Hit_in_genome_seq.fsa"
-        fsa_res = test1_dir + "/Resistance_gene_seq.fsa"
-        res_table = test1_dir + "/results_table.txt"
-        res_tab = test1_dir + "/results_tab.txt"
-        results = test1_dir + "/results.txt"
+        fsa_hit = test1_dir + "/ResFinder_Hit_in_genome_seq.fsa"
+        fsa_res = test1_dir + "/ResFinder_Resistance_gene_seq.fsa"
+        res_table = test1_dir + "/ResFinder_results_table.txt"
+        res_tab = test1_dir + "/ResFinder_results_tab.txt"
+        results = test1_dir + "/ResFinder_results.txt"
 
         with open(fsa_hit, "r") as fh:
             check_result = fh.readline()
@@ -152,11 +151,11 @@ class ResFinderRunTest(unittest.TestCase):
         procs = run(cmd_acquired, shell=True, stdout=PIPE, stderr=PIPE,
                     check=True)
 
-        fsa_hit = test2_dir + "/Hit_in_genome_seq.fsa"
-        fsa_res = test2_dir + "/Resistance_gene_seq.fsa"
-        res_table = test2_dir + "/results_table.txt"
-        res_tab = test2_dir + "/results_tab.txt"
-        results = test2_dir + "/results.txt"
+        fsa_hit = test2_dir + "/ResFinder_Hit_in_genome_seq.fsa"
+        fsa_res = test2_dir + "/ResFinder_Resistance_gene_seq.fsa"
+        res_table = test2_dir + "/ResFinder_results_table.txt"
+        res_tab = test2_dir + "/ResFinder_results_tab.txt"
+        results = test2_dir + "/ResFinder_results.txt"
 
         with open(fsa_hit, "r") as fh:
             check_result = fh.readline()
@@ -279,7 +278,7 @@ def parse_args():
     parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
     group = parser.add_argument_group("Options")
     group.add_argument('-res_help', "--resfinder_help",
-                        action="help")
+                       action="help")
     group.add_argument("-db_res", "--db_path_res",
                        help="Path to the databases for ResFinder",
                        default="./db_resfinder")



View it on GitLab: https://salsa.debian.org/med-team/resfinder/-/commit/327f6198e3e9e684df4d913db5274b72f888135a
