[med-svn] [Git][med-team/augur][master] 4 commits: routine-update: New upstream version

Wed Nov 18 20:19:44 GMT 2020


Nilesh Patra pushed to branch master at Debian Med / augur


Commits:
dd68af4c by Nilesh Patra at 2020-11-19T01:45:16+05:30
routine-update: New upstream version

- - - - -
4b456b79 by Nilesh Patra at 2020-11-19T01:45:17+05:30
New upstream version 10.1.1
- - - - -
7e062b3d by Nilesh Patra at 2020-11-19T01:46:00+05:30
Update upstream source from tag 'upstream/10.1.1'

Update to upstream version '10.1.1'
with Debian dir ff951131f95f40da596973f0482a664285a4c27f
- - - - -
5230ab1d by Nilesh Patra at 2020-11-19T01:46:12+05:30
routine-update: Ready to upload to unstable

- - - - -


25 changed files:

- CHANGES.md
- README.md
- augur/__version__.py
- augur/filter.py
- augur/utils.py
- debian/changelog
- docs/conf.py
- DEV_DOCS.md → docs/contribute/DEV_DOCS.md
- − docs/faq/augur_snakemake.md
- − docs/faq/community_hosting.md
- docs/faq/faq.rst
- + docs/faq/what-is-a-build.md
- docs/index.rst
- docs/releases/migrating-v5-v6.md
- docs/releases/v6.md
- docs/tutorials/tb_tutorial.md
- − docs/tutorials/tutorials.rst
- docs/tutorials/zika_tutorial.md
- docs/usage/cli/filter.rst
- docs/usage/usage.rst
- setup.py
- tests/builds/zika.t
- tests/functional/ancestral.t
- tests/functional/refine.t
- tests/test_utils.py


Changes:

=====================================
CHANGES.md
=====================================
@@ -3,6 +3,25 @@
 ## __NEXT__
 
 
+## 10.1.1 (16 November 2020)
+
+### Bug Fixes
+
+* dependencies: Require the most recent minor versions of TreeTime (0.8.X) to fix numpy matrix errors [#633][]
+
+[#633]: https://github.com/nextstrain/augur/pull/633
+
+## 10.1.0 (13 November 2020)
+
+### Features
+
+* docs: Migrate non-reference documentation to docs.nextstrain.org [#620][]
+* filter: Add `--exclude-ambiguous-dates-by` flag to enable exclusion of samples with ambiguous dates [#623][] and [#631][]
+
+[#620]: https://github.com/nextstrain/augur/pull/620
+[#623]: https://github.com/nextstrain/augur/pull/623
+[#631]: https://github.com/nextstrain/augur/pull/631
+
 ## 10.0.4 (6 November 2020)
 
 ### Bug Fixes


=====================================
README.md
=====================================
@@ -28,7 +28,7 @@ The output of augur is a series of JSONs that can be used to visualize your resu
 * [Technical documentation for Augur](https://nextstrain-augur.readthedocs.io/en/stable/installation/installation.html)
 * [Contributor guide](https://github.com/nextstrain/.github/blob/master/CONTRIBUTING.md)
 * [Project board with available issues](https://github.com/orgs/nextstrain/projects/6)
-* [Developer docs for Augur](./DEV_DOCS.md)
+* [Developer docs for Augur](./docs/contribute/DEV_DOCS.md)
 
 ## Quickstart
 


=====================================
augur/__version__.py
=====================================
@@ -1,4 +1,4 @@
-__version__ = '10.0.4'
+__version__ = '10.1.1'
 
 
 def is_augur_version_compatible(version):


=====================================
augur/filter.py
=====================================
@@ -10,7 +10,7 @@ import numpy as np
 import sys
 import datetime
 import treetime.utils
-from .utils import read_metadata, get_numerical_dates, run_shell_command, shquote
+from .utils import read_metadata, get_numerical_dates, run_shell_command, shquote, is_date_ambiguous
 
 comment_char = '#'
 
@@ -105,6 +105,8 @@ def register_arguments(parser):
                                 help="Exclude samples matching these conditions. Ex: \"host=rat\" or \"host!=rat\". Multiple values are processed as OR (matching any of those specified will be excluded), not AND")
     parser.add_argument('--include-where', nargs='+',
                                 help="Include samples with these values. ex: host=rat. Multiple values are processed as OR (having any of those specified will be included), not AND. This rule is applied last and ensures any sequences matching these rules will be included.")
+    parser.add_argument('--exclude-ambiguous-dates-by', choices=['any', 'day', 'month', 'year'],
+                                help='Exclude ambiguous dates by day (e.g., 2020-09-XX), month (e.g., 2020-XX-XX), year (e.g., 200X-10-01), or any date fields. An ambiguous year makes the corresponding month and day ambiguous, too, even if those fields have unambiguous values (e.g., "201X-10-01"). Similarly, an ambiguous month makes the corresponding day ambiguous (e.g., "2010-XX-01").')
     parser.add_argument('--query', help="Filter samples by attribute. Uses Pandas Dataframe querying, see https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query for syntax.")
     parser.add_argument('--output', '-o', help="output file", required=True)
 
@@ -231,6 +233,17 @@ def run(args):
             num_excluded_by_length = len(seq_keep) - len(seq_keep_by_length)
             seq_keep = seq_keep_by_length
 
+    # filter by ambiguous dates
+    num_excluded_by_ambiguous_date = 0
+    if args.exclude_ambiguous_dates_by and 'date' in meta_columns:
+        seq_keep_by_date = []
+        for seq_name in seq_keep:
+            if not is_date_ambiguous(meta_dict[seq_name]['date'], args.exclude_ambiguous_dates_by):
+                seq_keep_by_date.append(seq_name)
+
+        num_excluded_by_ambiguous_date = len(seq_keep) - len(seq_keep_by_date)
+        seq_keep = seq_keep_by_date
+
     # filter by date
     num_excluded_by_date = 0
     if (args.min_date or args.max_date) and 'date' in meta_columns:
@@ -423,6 +436,8 @@ def run(args):
         print("\t%i of these were filtered out by the query:\n\t\t\"%s\"" % (num_excluded_by_query, args.query))
     if args.min_length:
         print("\t%i of these were dropped because they were shorter than minimum length of %sbp" % (num_excluded_by_length, args.min_length))
+    if args.exclude_ambiguous_dates_by and num_excluded_by_ambiguous_date:
+        print("\t%i of these were dropped because of their ambiguous date in %s" % (num_excluded_by_ambiguous_date, args.exclude_ambiguous_dates_by))
     if (args.min_date or args.max_date) and 'date' in meta_columns:
         print("\t%i of these were dropped because of their date (or lack of date)" % (num_excluded_by_date))
     if args.non_nucleotide:
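
Note on the hunks above: the new flag reaches run() as args.exclude_ambiguous_dates_by, and the added loop keeps only the sequences whose date is not flagged, counting the rest for the summary message. A minimal sketch of that bookkeeping follows. It is not the augur code itself: drop_ambiguous and any_field_masked are hypothetical helpers, the metadata is made up, the stand-in predicate only models the "any" mode, and the "'date' in meta_columns" guard from the diff is omitted.

```python
import argparse

# The flag as defined in the diff, attached to a throwaway parser.
parser = argparse.ArgumentParser(prog="filter-sketch")
parser.add_argument('--exclude-ambiguous-dates-by',
                    choices=['any', 'day', 'month', 'year'])


def drop_ambiguous(seq_keep, meta_dict, ambiguous_by, is_ambiguous):
    """Return (kept names, number excluded), mirroring the loop added to run()."""
    kept = [name for name in seq_keep
            if not is_ambiguous(meta_dict[name]['date'], ambiguous_by)]
    return kept, len(seq_keep) - len(kept)


def any_field_masked(date, ambiguous_by):
    """Hypothetical stand-in for is_date_ambiguous; only models the 'any' mode."""
    return "X" in date


# Hypothetical metadata; strain names and dates are invented for illustration.
meta_dict = {
    "strain_a": {"date": "2020-09-14"},
    "strain_b": {"date": "2020-09-XX"},
    "strain_c": {"date": "2020-XX-XX"},
}

args = parser.parse_args(['--exclude-ambiguous-dates-by', 'any'])
seq_keep = list(meta_dict)

num_excluded = 0
# argparse leaves the attribute as None when the flag is absent,
# so the filter is skipped unless the user asks for it.
if args.exclude_ambiguous_dates_by:
    seq_keep, num_excluded = drop_ambiguous(
        seq_keep, meta_dict, args.exclude_ambiguous_dates_by, any_field_masked)

print(seq_keep, num_excluded)
```

Because the option has no default, omitting it leaves the attribute as None and the filter is skipped, which matches the guard in the diff.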


=====================================
augur/utils.py
=====================================
@@ -73,6 +73,37 @@ def ambiguous_date_to_date_range(uncertain_date, fmt, min_max_year=None):
 def read_metadata(fname, query=None):
     return MetadataFile(fname, query).read()
 
+def is_date_ambiguous(date, ambiguous_by="any"):
+    """
+    Returns whether a given date string in the format of YYYY-MM-DD is ambiguous by a given part of the date (e.g., day, month, year, or any parts).
+
+    Parameters
+    ----------
+    date : str
+        Date string in the format of YYYY-MM-DD
+    ambiguous_by : str
+        Field of the date string to test for ambiguity ("day", "month", "year", "any")
+    """
+    date_components = date.split('-', 2)
+
+    if len(date_components) == 3:
+        year, month, day = date_components
+    elif len(date_components) == 2:
+        year, month = date_components
+        day = "XX"
+    else:
+        year = date_components[0]
+        month = "XX"
+        day = "XX"
+
+    # Determine ambiguity hierarchically such that, for example, an ambiguous
+    # month implicates an ambiguous day even when day information is available.
+    return any((
+        "X" in year,
+        "X" in month and ambiguous_by in ("any", "month", "day"),
+        "X" in day and ambiguous_by in ("any", "day")
+    ))
+
 def get_numerical_dates(meta_dict, name_col = None, date_col='date', fmt=None, min_max_year=None):
     if fmt:
         from datetime import datetime
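
Note on the hunk above: the hierarchical rule in is_date_ambiguous is easiest to see with a few inputs. The sketch below copies the function body from the diff (docstring shortened) so it runs without augur installed. The sample dates are illustrative only, and the behaviour shown is what this copied code does, not a statement about later augur releases.

```python
def is_date_ambiguous(date, ambiguous_by="any"):
    """Copy of the function added in this diff, reproduced for a standalone demo."""
    date_components = date.split('-', 2)

    if len(date_components) == 3:
        year, month, day = date_components
    elif len(date_components) == 2:
        year, month = date_components
        day = "XX"
    else:
        year = date_components[0]
        month = "XX"
        day = "XX"

    # An ambiguous year always counts; an ambiguous month counts for
    # "any", "month" and "day"; an ambiguous day counts for "any" and "day".
    return any((
        "X" in year,
        "X" in month and ambiguous_by in ("any", "month", "day"),
        "X" in day and ambiguous_by in ("any", "day")
    ))


# Illustrative inputs (not taken from any real dataset).
examples = [
    ("2020-09-XX", "day"),    # masked day, asked about day
    ("2020-09-XX", "month"),  # masked day only, asked about month
    ("2020-XX-01", "day"),    # masked month implies an ambiguous day
    ("2020-XX-XX", "year"),   # year itself is fully specified
    ("201X-10-01", "year"),   # masked year
    ("2020", "any"),          # missing month and day are treated as "XX"
    ("2020-10-01", "any"),    # fully specified date
]

for date, by in examples:
    print(f"{date!r:14} by {by!r:8} -> {is_date_ambiguous(date, by)}")
```

One consequence worth noting: a date given only as a year or year-month is treated as if the missing fields were "XX", so it is dropped under "any" and "day", and a bare year also under "month".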


=====================================
debian/changelog
=====================================
@@ -1,3 +1,10 @@
+augur (10.1.1-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version
+
+ -- Nilesh Patra <npatra974 at gmail.com>  Thu, 19 Nov 2020 01:46:12 +0530
+
 augur (10.0.4-1) unstable; urgency=medium
 
   * Team upload.


=====================================
docs/conf.py
=====================================
@@ -53,7 +53,7 @@ author = prose_list(git_authors())
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = ['recommonmark', 'sphinx.ext.autodoc', 'sphinxarg.ext', 'sphinx.ext.napoleon', 'sphinx_markdown_tables']
+extensions = ['recommonmark', 'sphinx.ext.autodoc', 'sphinxarg.ext', 'sphinx.ext.napoleon', 'sphinx_markdown_tables', 'sphinx.ext.intersphinx']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -61,7 +61,19 @@ templates_path = ['_templates']
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store',
+    'contribute/DEV_DOCS.md',
+    'faq/colors.md',
+    'faq/fasta_input.md',
+    'faq/import-beast.md',
+    'faq/lat_longs.md',
+    'faq/seq_traits.md',
+    'faq/translate_ref.md',
+    'faq/vcf_input.md',
+    'tutorials/tb_tutorial.md',
+    'tutorials/zika_tutorial.md',
+    'usage/augur_snakemake.md',
+]
 
 # A string of reStructuredText that will be included at the end of every source
 # file that is read. This is a possible place to add substitutions that should
@@ -78,6 +90,12 @@ rst_epilog = f"""
 #
 html_theme = 'nextstrain-sphinx-theme'
 
+html_theme_options = {
+    'logo_only': False, # if True, don't display project name at top of the sidebar
+    'collapse_navigation': False, # if True, no [+] icons in sidebar
+    'titles_only': True, # if True, page subheadings not included in nav
+}
+
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css"..
@@ -88,3 +106,9 @@ html_static_path = ['_static']
 html_css_files = [
     'css/custom.css',
 ]
+
+# -- Cross-project references ------------------------------------------------
+
+intersphinx_mapping = {
+    'docs.nextstrain.org': ('https://docs.nextstrain.org/en/latest/', None),
+}


=====================================
DEV_DOCS.md → docs/contribute/DEV_DOCS.md
=====================================
@@ -1,6 +1,6 @@
-# Development Docs for Contributors
+# Augur Development Docs for Contributors
 
-Thank you for helping us to improve Nextstrain! This document describes:
+Thank you for helping us to improve Augur! This document describes:
 
 - Getting Started
 - Contributing code


=====================================
docs/faq/augur_snakemake.md deleted
=====================================
@@ -1 +0,0 @@
-../usage/augur_snakemake.md
\ No newline at end of file


=====================================
docs/faq/community_hosting.md deleted
=====================================
@@ -1,45 +0,0 @@
-# Sharing your analysis
-
-[nextstrain.org](https://nextstrain.org) has a feature that allows you to share your own analysis through the nextstrain.org website.
-This works using github as a repository for your analysis files.
-To share an analysis, you need to create a repository.
-The name of the repository will be what is used to access the results.
-Within this repository, there should a folder called auspice which contains the output json files of the augur pipeline.
-Importantly, the name of these files has to start with the name of the repository.
-If so, you should be able to access your analysis via the nextstrain community feature.
-
-As an example, lets look at one of our nextstrain community analysis.
-The following link shows you an analysis we made a few month ago of many influenza B sequences:
-
-[nextstrain.org/community/neherlab/allflu/B_ha](https://nextstrain.org/community/neherlab/allflu/B_ha)
-
-The analysis files are hosted on our github page in the reposity
-
-[github.com/neherlab/allflu](https://github.com/neherlab/allflu)
-
-In this repository, you'll find the folder `auspice` which contains the files
-```
-allflu_B_ha_meta.json
-allflu_B_ha_tree.json
-```
-Note that all files start with "allflu" which matches the name of the repository.
-In fact, there are multiple analysis in this folder. The corresponding files all start with "allflu" but they differ in the viral lineage they correspond to:
-```
-allflu_B_ha_meta.json
-allflu_B_ha_tree.json
-allflu_h1n1_ha_meta.json
-allflu_h1n1_ha_tree.json
-allflu_h1n1pdm_ha_meta.json
-allflu_h1n1pdm_ha_tree.json
-allflu_h3n2_ha_meta.json
-allflu_h3n2_ha_tree.json
-```
-All these can be accessed as
-
-[nextstrain.org/community/neherlab/allflu/h1n1_ha](https://nextstrain.org/community/neherlab/allflu/h1n1_ha)
-
-[nextstrain.org/community/neherlab/allflu/h3n2_ha](https://nextstrain.org/community/neherlab/allflu/h3n2_ha)
-
-[nextstrain.org/community/neherlab/allflu/h1n1pdm_ha](https://nextstrain.org/community/neherlab/allflu/h1n1pdm_ha)
-
-


=====================================
docs/faq/faq.rst
=====================================
@@ -10,15 +10,7 @@ common questions and problems users run into.
    :maxdepth: 1
    :glob:
 
+   what-is-a-build
    metadata
-   translate_ref
    clades
-   community_hosting
-   import-beast
-   colors
-   lat_longs
    Specifying `refine` rates <refine>
-   Using Augur and Snakemake <augur_snakemake>
-   vcf_input
-   fasta_input
-   seq_traits
\ No newline at end of file


=====================================
docs/faq/what-is-a-build.md
=====================================
@@ -0,0 +1,32 @@
+# The concept of a 'build'
+
+Nextstrain's focus on providing a _real-time_ snapshot of evolving pathogen populations necessitates a reproducible analysis that can be rerun when new sequences are available.
+The individual steps necessary to repeat analysis together comprise a "build".
+
+
+Because no two datasets or pathogens are the same, we build Augur to be flexible and suitable for different analyses.
+The individual Augur commands are composable, and can be mixed and matched with other scripts as needed.
+These steps, taken together, are what we refer to as a "build".
+
+
+### Example build
+
+The [Zika virus tutorial](https://docs.nextstrain.org/en/latest/tutorials/zika.html#build-steps) describes a build which contains the following steps:
+
+1. Prepare pathogen sequences and metadata
+2. Align sequences
+3. Construct a phylogeny from aligned sequences
+4. Annotate the phylogeny with inferred ancestral pathogen dates, sequences, and traits
+5. Export the annotated phylogeny and corresponding metadata into auspice-readable format
+
+and each of these can be run via a separate `augur` command.
+
+If you look at the [other tutorials](https://docs.nextstrain.org/en/latest/tutorials/index.html), each one uses a slightly different combination of `augur` commands depending on the pathogen.
+
+### Snakemake
+
+While it is possible to run a build by running each of the individual steps, we typically group these together into a make-type file.
+[Snakemake](https://snakemake.readthedocs.io/en/stable/index.html) is "a tool to create reproducible and scalable data analyses... via a human-readable, Python-based language."
+
+> Snakemake is installed as part of the [conda environment](https://docs.nextstrain.org/en/latest/guides/install/local-installation.html#install-augur-auspice-with-conda) or the [docker container](https://docs.nextstrain.org/en/latest/guides/install/cli-install.html).
+If you ever see a build which has a "Snakefile" then you can run this by typing `snakemake --cores 1` or `nextstrain build --cpus 1 .`, respectively.


=====================================
docs/index.rst
=====================================
@@ -5,9 +5,19 @@ Augur: A bioinformatics toolkit for phylogenetic analysis
     *One held to foretell events by omens.*
     (`Merriam-Webster <https://www.merriam-webster.com/dictionary/augur>`__)
 
+.. note::
+   The documentation you are viewing is Augur's reference guide, which means it is information-oriented and targeted at users who just need info about how Augur works.
+    
+   * If you have a question about how to achieve a specific goal with Augur, check out our :doc:`Augur-focused How-to Guides section <docs.nextstrain.org:guides/bioinformatics/index>` in the main Nextstrain documentation.
+   * If you want to learn the basics of how to use Augur from scratch, check out our :doc:`Zika tutorial <docs.nextstrain.org:tutorials/zika_tutorial>` in the main Nextstrain documentation.
+   * If you want to understand how Augur fits together with Auspice to visualize results, check out our :doc:`Data Formats section <docs.nextstrain.org:reference/formats/data-formats>` in the main Nextstrain documentation.
+
+
+
 Augur is a bioinformatics toolkit to track evolution from sequence and serological data.
 It provides a collection of commands which are designed to be composable into larger processing pipelines.
 Augur originated as part of `Nextstrain <https://nextstrain.org>`__, an open-source project to harness the scientific and public health potential of pathogen genome data.
+All source code is available on `GitHub <https://github.com/nextstrain/augur>`__.
 
 .. note:: We have just released version 6 of augur -- `check our upgrading guide <releases/migrating-v5-v6.html>`__
 
@@ -26,7 +36,6 @@ The ``refine`` step is necessary to ensure that cross-referencing between tree n
 The different augur modules can be strung together by workflow managers like snakemake and nextflow.
 The nextstrain team uses `snakemake <https://snakemake.readthedocs.io/en/stable/>`__ to run and manage the different analysis that you see on `nextstrain.org <https://nextstrain.org>`__.
 
-
 .. toctree::
    :maxdepth: 2
    :caption: Table of contents
@@ -37,7 +46,6 @@ The nextstrain team uses `snakemake <https://snakemake.readthedocs.io/en/stable/
    usage/usage
    releases/releases
    faq/faq
-   tutorials/tutorials
    examples/examples
    api/api
    authors/authors


=====================================
docs/releases/migrating-v5-v6.md
=====================================
@@ -119,7 +119,8 @@ These may have been inferred for internal nodes by Augur functions like `augur t
 Certain traits have a geographic interpretation, e.g. "country".
 Auspice will attempt to display these traits on a map (and provide a drop-down to switch between them if there are more than one).
 
-> _Make sure that these have corresponding entry in the lat-longs TSV file supplied to `export`. See how to do this [here](/faq/lat_longs)._
+> _Make sure that these have corresponding entry in the lat-longs TSV file supplied to `export`. See how to do this [here](https://docs.nextstrain.org/en/latest/guides/bioinformatics/lat_longs.html)._
+
 
 
 ---
@@ -528,5 +529,6 @@ In Auspice v2, all values are now displayed exactly as they arrive, allowing use
 
 Don't forget to also change them in any custom lat-long and/or coloring files you are using. We've also become stricter about the format of the files that pass in color and lat-long information. Previously, it didn't matter if columns were separated by spaces or tabs - now, they must be separated by tabs.
 
-You can find out more about how to add [custom coloring](/faq/colors) and [lat-long](/faq/lat_longs) values.
+You can find out more about how to add [custom coloring](https://docs.nextstrain.org/en/latest/guides/bioinformatics/colors.html) and [lat-long](https://docs.nextstrain.org/en/latest/guides/bioinformatics/lat_longs.html) values.
+
 If you use the command `parse` to generate a metadata table from fields in a fasta header, you can use the flag `--prettify-fields` to apply some prettifying operations to specific metadata entries, see the documentation [`parse`](/usage/cli/parse).


=====================================
docs/releases/v6.md
=====================================
@@ -46,7 +46,7 @@ Users can ask for this output and specify a file name using `--output-sequences`
 <span style='color: orange'>Deprecation warning:</span> The argument `--output` is now deprecated. Please use `--output-node-data` instead.
 
 ## Import BEAST MCC trees
-We now have instructions and functionality to import BEAST trees, see [here](/faq/import-beast).
+We now have instructions and functionality to import BEAST trees, see [here](https://docs.nextstrain.org/en/latest/guides/bioinformatics/import-beast.html).
 
 ## Prettifying of strings
 Previous auspice version "prettified" metadata strings (like changing 'north_america' to 'North America').
@@ -110,4 +110,4 @@ We've tried to use redirects to ensure that all the old links continue to work.
 * Errors in formatting of input files (e.g. metadata files, Auspice config files) weren't handled nicely, often resulting in hard-to-interpret stack traces.
 We now try to catch these and print an error indicating the offending file.
 
-* Tests using Python version 2 have now been removed.
\ No newline at end of file
+* Tests using Python version 2 have now been removed.


=====================================
docs/tutorials/tb_tutorial.md
=====================================
@@ -7,7 +7,7 @@ As in the Zika fasta-input [tutorial](zika_tutorial), we'll build up a Snakefile
 
 ## Setup
 
-To run this tutorial you'll need to [install augur](../installation/installation) and [install Snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html).
+To run this tutorial you'll need to [install augur](../guides/install/augur_install.md) and [install Snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html).
 
 ## Build steps
 Nextstrain builds typically require the following steps:


=====================================
docs/tutorials/tutorials.rst deleted
=====================================
@@ -1,12 +0,0 @@
-=========
-Tutorials
-=========
-
-.. note:: We have just released version 6 of augur -- `check our upgrading guide <../releases/migrating-v5-v6.html>`__
-
-.. toctree::
-   :maxdepth: 1
-   :caption: Available tutorials
-
-   zika_tutorial
-   tb_tutorial


=====================================
docs/tutorials/zika_tutorial.md
=====================================
@@ -7,7 +7,7 @@ We will work off the tutorial for Zika virus on the [nextstrain web site](https:
 
 ## Setup
 
-To run this tutorial you'll need to [install augur](../installation/installation) and [install Snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html).
+To run this tutorial you'll need to [install augur](../guides/install/augur_install.md) and [install Snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html).
 
 ## Augur commands
 


=====================================
docs/usage/cli/filter.rst
=====================================
@@ -15,7 +15,7 @@ augur filter
 How we subsample sequences in the zika-tutoral
 ==============================================
 
-As an example, we'll look that the ``filter`` command in greater detail using material form the `zika tutorial <../../tutorials/zika_tutorial.html>`__.
+As an example, we'll look that the ``filter`` command in greater detail using material form the :doc:`zika tutorial <docs.nextstrain.org:tutorials/zika_tutorial>`.
 The filter command allows you to selected various subsets of your input data for different types of analysis.
 A simple example use of this command would be
 
@@ -45,7 +45,7 @@ To drop such strains, you can pass the name of this file to the augur filter com
              --output filtered.fasta
 
 (To improve legibility, we have wrapped the command across multiple lines.)
-If you run this command (you should be able to copy-paste this into your terminal) on the data provided in the `zika tutorial <zika_tutorial.html>`__, you should see that one of the sequences in the data set was dropped since its name was in the ``dropped_strains.txt`` file.
+If you run this command (you should be able to copy-paste this into your terminal) on the data provided in the :doc:`zika tutorial <docs.nextstrain.org:tutorials/zika_tutorial>`, you should see that one of the sequences in the data set was dropped since its name was in the ``dropped_strains.txt`` file.
 
 Another common filtering operation is subsetting of data to a achieve a more even spatio-temporal distribution or to cut-down data set size to more manageable numbers.
 The filter command allows you to select a specific number of sequences from specific groups, for example one sequence per month from each country:


=====================================
docs/usage/usage.rst
=====================================
@@ -26,6 +26,5 @@ For instance, the documentation for `augur filter <./cli/filter.html>`__ shows h
    :maxdepth: 2
 
    cli/cli
-   augur_snakemake
    json_format
    envvars


=====================================
setup.py
=====================================
@@ -49,11 +49,11 @@ setuptools.setup(
     python_requires = '>={}'.format('.'.join(str(n) for n in min_version)),
     install_requires = [
         "bcbio-gff >=0.6.0, ==0.6.*",
-        "biopython >=1.67, <=1.78",
+        "biopython >=1.67, <=1.76",
         "jsonschema >=3.0.0, ==3.*",
         "packaging >=19.2",
         "pandas >=1.0.0, ==1.*",
-        "phylo-treetime >=0.7.4, ==0.7.*"
+        "phylo-treetime ==0.8.*"
     ],
     extras_require = {
         'full': [


=====================================
tests/builds/zika.t
=====================================
@@ -75,8 +75,6 @@ Build a time tree from the existing tree topology, the multiple sequence alignme
   >  --date-inference marginal \
   >  --clock-filter-iqd 4 \
   >  --seed 314159 > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
 Confirm that TreeTime trees match expected topology and branch lengths.
 
@@ -107,8 +105,6 @@ Infer ancestral sequences from the tree.
   >  --infer-ambiguous \
   >  --output-node-data "$TMP/out/nt_muts.json" \
   >  --inference joint > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
   $ diff -u --ignore-matching-lines version "results/nt_muts.json" "$TMP/out/nt_muts.json"
 


=====================================
tests/functional/ancestral.t
=====================================
@@ -11,8 +11,6 @@ The default is to infer ambiguous bases, so there should not be N bases in the i
   >  --alignment ancestral/aligned.fasta \
   >  --output-node-data "$TMP/ancestral_mutations.json" \
   >  --output-sequences "$TMP/ancestral_sequences.fasta" > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
   $ grep N "$TMP/ancestral_sequences.fasta"
   >NODE_0000000
@@ -26,8 +24,6 @@ There should not be N bases in the inferred output sequences.
   >  --infer-ambiguous \
   >  --output-node-data "$TMP/ancestral_mutations.json" \
   >  --output-sequences "$TMP/ancestral_sequences.fasta" > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
   $ grep N "$TMP/ancestral_sequences.fasta"
   >NODE_0000000
@@ -41,8 +37,6 @@ There be N bases in the inferred output sequences.
   >  --keep-ambiguous \
   >  --output-node-data "$TMP/ancestral_mutations.json" \
   >  --output-sequences "$TMP/ancestral_sequences.fasta" > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
   $ grep N "$TMP/ancestral_sequences.fasta"
   >NODE_0000000


=====================================
tests/functional/refine.t
=====================================
@@ -17,8 +17,6 @@ Try building a time tree.
   >  --date-inference marginal \
   >  --clock-filter-iqd 4 \
   >  --seed 314159 > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
 Confirm that TreeTime trees match expected topology and branch lengths.
 
@@ -40,8 +38,6 @@ Build a time tree with mutations as the reported divergence unit.
   >  --clock-filter-iqd 4 \
   >  --seed 314159 \
   >  --divergence-units mutations > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
 Confirm that TreeTime trees match expected topology and branch lengths.
 
@@ -62,8 +58,6 @@ This is one way to get named internal nodes for downstream analyses and does not
   >  --clock-filter-iqd 4 \
   >  --seed 314159 \
   >  --divergence-units mutations-per-site > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
 Confirm that trees match expected topology and branch lengths, given that the output should not be a time tree.
 
@@ -87,8 +81,6 @@ This approach only works when we provide an alignment FASTA.
   >  --clock-filter-iqd 4 \
   >  --seed 314159 \
   >  --divergence-units mutations > /dev/null
-  */treetime/aa_models.py:108: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray (glob)
-    [0.800038509951648, 1.20274751778601, 1.55207513886163, 1.46600946033173, 0.830022143283238, 1.5416250309563, 1.53255698189437, 1.41208067821187, 1.47469999960758, 0.351200119909572, 0.570542199221932, 1.21378822764856, 0.609532859331199, 0.692733248746636, 1.40887880416009, 1.02015839286433, 0.807404666228614, 1.268589159299, 0.933095433689795]
 
 Confirm that trees match expected topology and branch lengths, given that the output should not be a time tree.
 


=====================================
tests/test_utils.py
=====================================
@@ -91,7 +91,7 @@ class TestUtils:
             fh.write("\n".join(bed_lines))
         with pytest.raises(Exception):
             utils.read_bed_file(bed_file)
-    
+
     def test_read_mask_file_drm_file(self, tmpdir):
         """read_mask_file should handle drm files as well"""
         drm_file = str(tmpdir / "temp.drm")
@@ -100,3 +100,32 @@ class TestUtils:
         with open(drm_file, "w") as fh:
             fh.write("\n".join(drm_lines))
         assert utils.read_mask_file(drm_file) == expected_sites
+
+    def test_is_date_ambiguous(self):
+        """is_date_ambiguous should return true for ambiguous dates and false for valid dates."""
+        # Test complete date strings with ambiguous values.
+        assert utils.is_date_ambiguous("2019-0X-0X", "any")
+        assert utils.is_date_ambiguous("2019-XX-09", "month")
+        assert utils.is_date_ambiguous("2019-03-XX", "day")
+        assert utils.is_date_ambiguous("201X-03-09", "year")
+        assert utils.is_date_ambiguous("20XX-01-09", "month")
+        assert utils.is_date_ambiguous("2019-XX-03", "day")
+        assert utils.is_date_ambiguous("20XX-01-03", "day")
+
+        # Test incomplete date strings with ambiguous values.
+        assert utils.is_date_ambiguous("2019", "any")
+        assert utils.is_date_ambiguous("201X", "year")
+        assert utils.is_date_ambiguous("2019-XX", "month")
+        assert utils.is_date_ambiguous("2019-10", "day")
+        assert utils.is_date_ambiguous("2019-XX", "any")
+        assert utils.is_date_ambiguous("2019-XX", "day")
+
+        # Test complete date strings without ambiguous dates for the requested field.
+        assert not utils.is_date_ambiguous("2019-09-03", "any")
+        assert not utils.is_date_ambiguous("2019-03-XX", "month")
+        assert not utils.is_date_ambiguous("2019-09-03", "day")
+        assert not utils.is_date_ambiguous("2019-XX-XX", "year")
+
+        # Test incomplete date strings without ambiguous dates for the requested fields.
+        assert not utils.is_date_ambiguous("2019", "year")
+        assert not utils.is_date_ambiguous("2019-10", "month")
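
For readers who only see the test hunk above: the assertions imply a hierarchical rule, where an ambiguous year makes every field ambiguous and an ambiguous month also makes the day ambiguous. The following is a minimal sketch inferred from those assertions alone. It is not the upstream augur/utils.py code; the name is_date_ambiguous_sketch and the default value of ambiguous_by are assumptions made for illustration.

```python
def is_date_ambiguous_sketch(date, ambiguous_by="any"):
    """Hypothetical stand-in for utils.is_date_ambiguous, inferred from the tests.

    `date` is a YYYY-MM-DD string that may be truncated ("2019", "2019-10")
    or masked with "X" characters ("2019-XX-09").
    `ambiguous_by` is one of "any", "year", "month", or "day".
    """
    parts = date.split("-", 2)
    # Treat missing trailing fields as fully masked, so "2019" behaves like "2019-XX-XX".
    year, month, day = parts + ["XX"] * (3 - len(parts))

    # Ambiguity propagates downward: a masked year counts for every field,
    # and a masked month also counts when the day is requested.
    return any((
        "X" in year,
        "X" in month and ambiguous_by in ("any", "month", "day"),
        "X" in day and ambiguous_by in ("any", "day"),
    ))


# A few cases taken from the test hunk above.
assert is_date_ambiguous_sketch("2019-XX-03", "day")
assert is_date_ambiguous_sketch("2019-10", "day")
assert not is_date_ambiguous_sketch("2019-03-XX", "month")
assert not is_date_ambiguous_sketch("2019-XX-XX", "year")
```

Padding truncated dates with "XX" keeps the three-field comparison uniform, which is one simple way to make "2019-10" ambiguous by day without a separate branch.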



View it on GitLab: https://salsa.debian.org/med-team/augur/-/compare/b59a1a7a9dcc7b11889c92a0c2605a20f37262b7...5230ab1d8a882d28c879b73e489ae063c140b293


