[med-svn] [Git][med-team/q2-metadata][upstream] New upstream version 2023.7.0+dfsg

Étienne Mollier (@emollier) gitlab@salsa.debian.org
Sat Aug 19 09:20:39 BST 2023



Étienne Mollier pushed to branch upstream at Debian Med / q2-metadata


Commits:
c1a6a697 by Étienne Mollier at 2023-08-18T15:27:44+02:00
New upstream version 2023.7.0+dfsg
- - - - -


21 changed files:

- + .github/workflows/ci-dev.yaml
- − .github/workflows/ci.yml
- + .github/workflows/join-release.yaml
- + .github/workflows/tag-release.yaml
- LICENSE
- ci/recipe/meta.yaml
- q2_metadata/__init__.py
- q2_metadata/_distance.py
- − q2_metadata/_examples.py
- + q2_metadata/_merge.py
- q2_metadata/_random.py
- q2_metadata/_tabulate.py
- q2_metadata/_version.py
- q2_metadata/plugin_setup.py
- q2_metadata/tests/__init__.py
- q2_metadata/tests/test_distance.py
- + q2_metadata/tests/test_merge.py
- q2_metadata/tests/test_plugin_setup.py
- q2_metadata/tests/test_random.py
- q2_metadata/tests/test_tabulate.py
- setup.py


Changes:

=====================================
.github/workflows/ci-dev.yaml
=====================================
@@ -0,0 +1,12 @@
+# Example of workflow trigger for calling workflow (the client).
+name: ci-dev
+on:
+  pull_request:
+    branches: ["dev"]
+  push:
+    branches: ["dev"]
+jobs:
+  ci:
+    uses: qiime2/distributions/.github/workflows/lib-ci-dev.yaml@dev
+    with:
+      distro: core
\ No newline at end of file


=====================================
.github/workflows/ci.yml deleted
=====================================
@@ -1,55 +0,0 @@
-# This file is automatically generated by busywork.qiime2.org and
-# template-repos - any manual edits made to this file will be erased when
-# busywork performs maintenance updates.
-
-name: ci
-
-on:
-  pull_request:
-  push:
-    branches:
-      - master
-
-jobs:
-  lint:
-    runs-on: ubuntu-latest
-    steps:
-    - name: checkout source
-      uses: actions/checkout at v2
-      uses: actions/checkout@v2
-
-    - name: set up python 3.8
-      uses: actions/setup-python@v1
-      with:
-        python-version: 3.8
-
-    - name: install dependencies
-      run: python -m pip install --upgrade pip
-
-    - name: lint
-      run: |
-        pip install -q https://github.com/qiime2/q2lint/archive/master.zip
-        q2lint
-        pip install -q flake8
-        flake8
-
-  build-and-test:
-    needs: lint
-    strategy:
-      matrix:
-        os: [ubuntu-latest, macos-latest]
-    runs-on: ${{ matrix.os }}
-    steps:
-    - name: checkout source
-      uses: actions/checkout@v2
-      with:
-        fetch-depth: 0
-
-    - name: set up git repo for versioneer
-      run: git fetch --depth=1 origin +refs/tags/*:refs/tags/*
-
-    - uses: qiime2/action-library-packaging@alpha1
-      with:
-        package-name: q2-metadata
-        build-target: dev
-        additional-tests: py.test --pyargs q2_metadata
-        library-token: ${{ secrets.LIBRARY_TOKEN }}


=====================================
.github/workflows/join-release.yaml
=====================================
@@ -0,0 +1,6 @@
+name: join-release
+on:
+  workflow_dispatch: {}
+jobs:
+  release:
+    uses: qiime2/distributions/.github/workflows/lib-join-release.yaml@dev
\ No newline at end of file


=====================================
.github/workflows/tag-release.yaml
=====================================
@@ -0,0 +1,7 @@
+name: tag-release
+on:
+  push:
+    branches: ["Release-*"]
+jobs:
+  tag:
+    uses: qiime2/distributions/.github/workflows/lib-tag-release.yaml@dev
\ No newline at end of file


=====================================
LICENSE
=====================================
@@ -1,6 +1,6 @@
 BSD 3-Clause License
 
-Copyright (c) 2017-2022, QIIME 2 development team.
+Copyright (c) 2017-2023, QIIME 2 development team.
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without


=====================================
ci/recipe/meta.yaml
=====================================
@@ -25,20 +25,21 @@ requirements:
     - qiime2 {{ qiime2_epoch }}.*
     - q2templates {{ qiime2_epoch }}.*
     - q2-types {{ qiime2_epoch }}.*
-    - q2-quality-filter {{ qiime2_epoch }}.*
 
 test:
   requires:
     - qiime2 >={{ qiime2 }}
     - q2templates >={{ q2templates }}
     - q2-types >={{ q2_types }}
-    - q2-quality-filter >={{ q2_quality_filter }}
     - pytest
 
   imports:
     - q2_metadata
     - qiime2.plugins.metadata
 
+  commands:
+    - py.test --pyargs q2_metadata
+
 about:
   home: https://qiime2.org
   license: BSD-3-Clause


=====================================
q2_metadata/__init__.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #
@@ -9,9 +9,10 @@
 from ._tabulate import tabulate
 from ._distance import distance_matrix
 from ._random import shuffle_groups
+from ._merge import merge
 from ._version import get_versions
 
 __version__ = get_versions()['version']
 del get_versions
 
-__all__ = ['tabulate', 'distance_matrix', 'shuffle_groups']
+__all__ = ['tabulate', 'distance_matrix', 'shuffle_groups', 'merge']


=====================================
q2_metadata/_distance.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #


=====================================
q2_metadata/_examples.py deleted
=====================================
@@ -1,61 +0,0 @@
-# ----------------------------------------------------------------------------
-# Copyright (c) 2016-2022, QIIME 2 development team.
-#
-# Distributed under the terms of the Modified BSD License.
-#
-# The full license is in the file LICENSE, distributed with this software.
-# ----------------------------------------------------------------------------
-
-import qiime2
-
-
-stats_url = ('https://data.qiime2.org/usage-examples/'
-             'moving-pictures/demux-filter-stats.qza')
-faith_pd_url = ('https://data.qiime2.org/usage-examples/moving-pictures/'
-                'core-metrics-results/faith_pd_vector.qza')
-
-metadata_url = (f'https://data.qiime2.org/{qiime2.__release__}/tutorials/'
-                'moving-pictures/sample_metadata.tsv')
-
-
-def tabulate_example(use):
-    stats = use.init_artifact_from_url('demux_stats', stats_url)
-    stats_md = use.view_as_metadata('stats_as_md', stats)
-
-    viz, = use.action(
-        use.UsageAction('metadata', 'tabulate'),
-        use.UsageInputs(
-            input=stats_md,
-        ),
-        use.UsageOutputNames(
-            visualization='demux_stats_viz',
-        )
-    )
-
-    viz.assert_output_type('Visualization')
-
-
-def tabulate_multiple_files_example(use):
-    md = use.init_metadata_from_url('sample-metadata', metadata_url)
-    faith_pd = use.init_artifact_from_url('faith_pd_vector', faith_pd_url)
-    faith_pd_as_md = use.view_as_metadata('faith_pd_as_metadata', faith_pd)
-
-    merged = use.merge_metadata('merged', md, faith_pd_as_md)
-
-    use.comment(
-        "Multiple metadata files or artifacts viewed as metadata can be merged"
-        " to make one tabular visualization. "
-        "This one displays only 25 metadata rows per page."
-    )
-    viz, = use.action(
-        use.UsageAction('metadata', 'tabulate'),
-        use.UsageInputs(
-            input=merged,
-            page_size=25,
-        ),
-        use.UsageOutputNames(
-            visualization='demux_stats_viz',
-        )
-    )
-
-    viz.assert_output_type('Visualization')


=====================================
q2_metadata/_merge.py
=====================================
@@ -0,0 +1,43 @@
+# ----------------------------------------------------------------------------
+# Copyright (c) 2017-2023, QIIME 2 development team.
+#
+# Distributed under the terms of the Modified BSD License.
+#
+# The full license is in the file LICENSE, distributed with this software.
+# ----------------------------------------------------------------------------
+
+import qiime2
+import pandas as pd
+
+
+def merge(metadata1: qiime2.Metadata,
+          metadata2: qiime2.Metadata) -> qiime2.Metadata:
+    # Ultimately it would make sense for this action to take
+    # List[qiime2.Metadata] as input, but this isn't possible right now
+    overlapping_ids = set(metadata1.ids) & set(metadata2.ids)
+    overlapping_columns = set(metadata1.columns) & set(metadata2.columns)
+    n_overlapping_ids = len(overlapping_ids)
+    n_overlapping_columns = len(overlapping_columns)
+
+    if len(overlapping_ids) > 0 and len(overlapping_columns) > 0:
+        raise ValueError(f"Merging can currently handle overlapping ids "
+                         f"or overlapping columns, but not both. "
+                         f"{n_overlapping_ids} overlapping ids were "
+                         f"identified ({', '.join(overlapping_ids)}) and "
+                         f"{n_overlapping_columns} overlapping columns "
+                         f"were identified {', '.join(overlapping_columns)}.")
+
+    df1 = metadata1.to_dataframe()
+    df2 = metadata2.to_dataframe()
+
+    if n_overlapping_columns == 0:
+        result = pd.merge(df1, df2, how='outer', left_index=True,
+                          right_index=True)
+    else:  # i.e., n_overlapping_ids == 0
+        result = pd.merge(df1, df2, how='outer', left_index=True,
+                          right_index=True, suffixes=('', '_'))
+        for c in overlapping_columns:
+            result[c] = result[c].combine_first(result[f"{c}_"])
+            result = result.drop(columns=[f"{c}_"])
+
+    return qiime2.Metadata(result)
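
For reference, the overlapping-columns branch above rests on pandas' outer
index merge plus combine_first; a minimal pandas-only sketch of that mechanic
follows (the frames and values are illustrative, not taken from this commit):

import pandas as pd

# Two frames that share the column 'col1' but no ids, mirroring the
# "overlapping columns, disjoint ids" branch handled above.
df1 = pd.DataFrame({'col1': ['a', 'b']}, index=['sample1', 'sample2'])
df2 = pd.DataFrame({'col1': ['c', 'd']}, index=['sample3', 'sample4'])

# Outer join on the index; the shared column coming from df2 gets a '_' suffix.
result = pd.merge(df1, df2, how='outer', left_index=True,
                  right_index=True, suffixes=('', '_'))

# combine_first fills the NaNs in 'col1' from 'col1_', after which the helper
# column can be dropped, leaving a single merged 'col1'.
result['col1'] = result['col1'].combine_first(result['col1_'])
result = result.drop(columns=['col1_'])
print(result)
# expected: one 'col1' column with values for sample1 through sample4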


=====================================
q2_metadata/_random.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #
@@ -13,24 +13,32 @@ import pandas as pd
 
 def shuffle_groups(metadata: qiime2.CategoricalMetadataColumn,
                    n_columns: int = 3,
-                   column_name_prefix: str = 'shuffled.grouping.',
-                   column_value_prefix: str = 'fake.group.') -> pd.DataFrame:
+                   md_column_name_prefix: str = 'shuffled.grouping.',
+                   md_column_values_prefix: str = 'fake.group.',
+                   encode_sample_size: bool = False
+                   ) -> pd.DataFrame:
 
     input_column_name = metadata.name
     df = metadata.to_dataframe()
+    group_sample_size = df[input_column_name].value_counts()
 
     value_mapping = {}
     for i, value in enumerate(df[input_column_name].unique()):
-        value_mapping[value] = '%s%d' % (column_value_prefix, i)
-
-    first_column_id = '%s0' % column_name_prefix
+        if encode_sample_size:
+            value_mapping[value] = '%s%d%s' % (md_column_values_prefix, i,
+                                               f'.n={group_sample_size[value]}'
+                                               )
+        else:
+            value_mapping[value] = '%s%d' % (md_column_values_prefix, i)
+
+    first_column_id = '%s0' % md_column_name_prefix
     df[first_column_id] = df[input_column_name].map(value_mapping)
 
     df[first_column_id] = \
         np.random.permutation(df[first_column_id].values)
 
     for i in range(1, n_columns):
-        column_id = '%s%d' % (column_name_prefix, i)
+        column_id = '%s%d' % (md_column_name_prefix, i)
         df[column_id] = \
             np.random.permutation(df[first_column_id].values)
 


=====================================
q2_metadata/_tabulate.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #


=====================================
q2_metadata/_version.py
=====================================
@@ -23,9 +23,9 @@ def get_keywords():
     # setup.py/versioneer.py will grep for the variable names, so they must
     # each be defined on a line of their own. _version.py will just call
     # get_keywords().
-    git_refnames = " (tag: 2022.11.1)"
-    git_full = "3fb78377c3aa05380d57e101e6146fb4099ea437"
-    git_date = "2022-12-21 22:06:24 +0000"
+    git_refnames = " (tag: 2023.7.0, Release-2023.7)"
+    git_full = "5247a8c0f3bf099449cd31fade7bd5af545820b6"
+    git_date = "2023-08-17 18:46:21 +0000"
     keywords = {"refnames": git_refnames, "full": git_full, "date": git_date}
     return keywords
 


=====================================
q2_metadata/plugin_setup.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #
@@ -9,13 +9,14 @@
 import pandas as pd
 from q2_types.distance_matrix import DistanceMatrix
 from q2_types.sample_data import SampleData
+from q2_types.metadata import ImmutableMetadata
 import qiime2.plugin
 from qiime2.plugin import (
     Int, Categorical, MetadataColumn, model, Numeric, Plugin, SemanticType,
-    Str, ValidationError,
+    Str, Bool, Metadata, ValidationError,
 )
 
-from . import _examples, tabulate, distance_matrix, shuffle_groups, __version__
+from . import tabulate, distance_matrix, shuffle_groups, merge, __version__
 
 plugin = Plugin(
     name='metadata',
@@ -60,10 +61,6 @@ plugin.visualizers.register_function(
     description='Generate a tabular view of Metadata. The output '
                 'visualization supports interactive filtering, sorting, and '
                 'exporting to common file formats.',
-    examples={
-        'basic_tabulate_usage': _examples.tabulate_example,
-        'tabulate_multiple_files': _examples.tabulate_multiple_files_example
-    },
 )
 
 ArtificialGrouping = \
@@ -117,15 +114,21 @@ plugin.methods.register_function(
     inputs={},
     parameters={'metadata': MetadataColumn[Categorical],
                 'n_columns': Int,
-                'column_name_prefix': Str,
-                'column_value_prefix': Str},
+                'md_column_name_prefix': Str,
+                'md_column_values_prefix': Str,
+                'encode_sample_size': Bool
+                },
     parameter_descriptions={
         'metadata': ('Categorical metadata column to shuffle.'),
         'n_columns': 'The number of shuffled metadata columns to create.',
-        'column_name_prefix': ('Prefix to use in naming the shuffled '
-                               'metadata columns.'),
-        'column_value_prefix': ('Prefix to use in naming the values in the '
-                                'shuffled metadata columns.')},
+        'md_column_name_prefix': ('Prefix to use in naming the shuffled '
+                                  'metadata columns.'),
+        'md_column_values_prefix': ('Prefix to use in naming the values in '
+                                    'the shuffled metadata columns.'),
+        'encode_sample_size': ('If true, the sample size of each metadata '
+                               'group will be appended to the shuffled '
+                               'metadata column values.'),
+        },
     output_descriptions={
         'shuffled_groups': 'Randomized metadata columns'},
     outputs=[('shuffled_groups', SampleData[ArtificialGrouping])],
@@ -139,3 +142,35 @@ plugin.methods.register_function(
                  'values with sample ids will be random. These data will be '
                  'written to an artifact that can be used as sample metadata.')
 )
+
+
+plugin.methods.register_function(
+    function=merge,
+    inputs={},
+    parameters={'metadata1': Metadata,
+                'metadata2': Metadata},
+    parameter_descriptions={
+        'metadata1': 'First metadata file to merge.',
+        'metadata2': 'Second metadata file to merge.'
+    },
+    outputs=[('merged_metadata', ImmutableMetadata)],
+    output_descriptions={
+        'merged_metadata': 'The merged metadata.'
+    },
+    name='Merge metadata',
+    description=('Merge metadata that contains overlapping ids or overlapping '
+                 'columns, but not both overlapping ids and overlapping '
+                 'columns. The result will be the union (i.e., outer join) '
+                 'of the ids and columns from the two metadata inputs.\n\n'
+                 'Attempting to merge metadata with both overlapping ids and '
+                 'overlapping columns will currently fail because we don\'t '
+                 'resolve conflicting column values for a sample. '
+                 'Merging metadata with neither overlapping ids nor '
+                 'overlapping columns is possible with this action.\n\n'
+                 'To merge more than two metadata objects, run this command '
+                 'multiple times, iteratively using the output of the '
+                 'previous run as one of the metadata inputs.\n\n'
+                 'The output, an ImmutableMetadata artifact, can be used '
+                 'anywhere that a metadata file can be used, or can be '
+                 'exported to a metadata tsv file in the typical format.')
+)
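
A sketch of how the newly registered merge action could be driven from the
QIIME 2 Artifact API; the parameter and output names come from the
registration above, while the module path and the export command mentioned in
the comments are assumptions about standard QIIME 2 tooling, not part of this
commit:

# Illustrative only: invocation path and export command are assumptions.
import pandas as pd
import qiime2
from qiime2.plugins import metadata as metadata_plugin

md1 = qiime2.Metadata(pd.DataFrame(
    {'col1': ['a', 'b']},
    index=pd.Index(['sample1', 'sample2'], name='id')))
md2 = qiime2.Metadata(pd.DataFrame(
    {'col2': ['c', 'd']},
    index=pd.Index(['sample1', 'sample2'], name='id')))

# Overlapping ids with disjoint columns: allowed by the action.
results = metadata_plugin.methods.merge(metadata1=md1, metadata2=md2)
merged_artifact = results.merged_metadata  # ImmutableMetadata artifact

# To fold in a third metadata object, view the artifact as Metadata again and
# rerun the action, as the description suggests:
# merged_md = merged_artifact.view(qiime2.Metadata)
# results = metadata_plugin.methods.merge(metadata1=merged_md, metadata2=md3)

# The saved artifact could later be exported to a metadata TSV, e.g. with
# `qiime tools export` (assumption: standard q2cli export tooling).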


=====================================
q2_metadata/tests/__init__.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #


=====================================
q2_metadata/tests/test_distance.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #


=====================================
q2_metadata/tests/test_merge.py
=====================================
@@ -0,0 +1,203 @@
+# ----------------------------------------------------------------------------
+# Copyright (c) 2017-2023, QIIME 2 development team.
+#
+# Distributed under the terms of the Modified BSD License.
+#
+# The full license is in the file LICENSE, distributed with this software.
+# ----------------------------------------------------------------------------
+
+import unittest
+
+import numpy as np
+import pandas as pd
+import qiime2
+
+from q2_metadata import merge
+
+
+class MergeTests(unittest.TestCase):
+
+    def test_merge_overlapping_samples_and_columns_errors(self):
+        index1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data1 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md1 = qiime2.Metadata(pd.DataFrame(data1, index=index1, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        index2 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data2 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+
+        md2 = qiime2.Metadata(pd.DataFrame(data2, index=index2, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        self.assertRaisesRegex(ValueError,
+                               "3 overl.*ids.*sample1.*3 overl.*col.*col1",
+                               merge, md1, md2)
+
+        index3 = pd.Index(['sample4', 'sample5', 'sample1'], name='id')
+        data3 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md3 = qiime2.Metadata(pd.DataFrame(data3, index=index3, dtype=object,
+                                           columns=['col4', 'col5', 'col1']))
+
+        self.assertRaisesRegex(ValueError,
+                               "1 overl.*ids.*sample1.*1 overl.*col.*col1",
+                               merge, md1, md3)
+
+    def test_merge_all_samples_overlapping(self):
+        index1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data1 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md1 = qiime2.Metadata(pd.DataFrame(data1, index=index1, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        index2 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data2 = [['k', 'n', 'q'],
+                 ['l', 'o', 'r'],
+                 ['m', 'p', 's']]
+        md2 = qiime2.Metadata(pd.DataFrame(data2, index=index2, dtype=object,
+                                           columns=['col4', 'col5', 'col6']))
+
+        obs1 = merge(md1, md2)
+
+        index_exp1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data_exp1 = [['a', 'd', 'h', 'k', 'n', 'q'],
+                     ['b', 'e', 'i', 'l', 'o', 'r'],
+                     ['c', 'f', 'j', 'm', 'p', 's']]
+        exp1 = qiime2.Metadata(
+            pd.DataFrame(data_exp1, index=index_exp1, dtype=object,
+                         columns=['col1', 'col2', 'col3',
+                                  'col4', 'col5', 'col6']))
+
+        self.assertEqual(obs1, exp1)
+
+    def test_merge_some_samples_overlapping(self):
+        index1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data1 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md1 = qiime2.Metadata(pd.DataFrame(data1, index=index1, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        index2 = pd.Index(['sample1', 'sample2', 'sample4'], name='id')
+        data2 = [['k', 'n'],
+                 ['l', 'o'],
+                 ['m', 'p']]
+        md2 = qiime2.Metadata(pd.DataFrame(data2, index=index2, dtype=object,
+                                           columns=['col4', 'col5']))
+
+        obs1 = merge(md1, md2)
+
+        index_exp1 = pd.Index(['sample1', 'sample2', 'sample3', 'sample4'],
+                              name='id')
+        data_exp1 = [['a', 'd', 'h', 'k', 'n'],
+                     ['b', 'e', 'i', 'l', 'o'],
+                     ['c', 'f', 'j', np.nan, np.nan],
+                     [np.nan, np.nan, np.nan, 'm', 'p']]
+        exp1 = qiime2.Metadata(
+            pd.DataFrame(data_exp1, index=index_exp1, dtype=object,
+                         columns=['col1', 'col2', 'col3',
+                                  'col4', 'col5']))
+
+        self.assertEqual(obs1, exp1)
+
+    def test_merge_all_columns_overlapping(self):
+        index1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data1 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md1 = qiime2.Metadata(pd.DataFrame(data1, index=index1, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        index2 = pd.Index(['sample4', 'sample5', 'sample6'], name='id')
+        data2 = [['k', 'n', 'q'],
+                 ['l', 'o', 'r'],
+                 ['m', 'p', 's']]
+        md2 = qiime2.Metadata(pd.DataFrame(data2, index=index2, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        obs1 = merge(md1, md2)
+        print(obs1.to_dataframe())
+
+        index_exp1 = pd.Index(['sample1', 'sample2', 'sample3',
+                               'sample4', 'sample5', 'sample6'], name='id')
+        data_exp1 = [['a', 'd', 'h'],
+                     ['b', 'e', 'i'],
+                     ['c', 'f', 'j'],
+                     ['k', 'n', 'q'],
+                     ['l', 'o', 'r'],
+                     ['m', 'p', 's']]
+        exp1 = qiime2.Metadata(
+            pd.DataFrame(data_exp1, index=index_exp1, dtype=object,
+                         columns=['col1', 'col2', 'col3']))
+
+        print(exp1.to_dataframe())
+        self.assertEqual(obs1, exp1)
+
+    def test_merge_some_columns_overlapping(self):
+        index1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data1 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md1 = qiime2.Metadata(pd.DataFrame(data1, index=index1, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        index2 = pd.Index(['sample4', 'sample5', 'sample6'], name='id')
+        data2 = [['k', 'n', 'q'],
+                 ['l', 'o', 'r'],
+                 ['m', 'p', 's']]
+        md2 = qiime2.Metadata(pd.DataFrame(data2, index=index2, dtype=object,
+                                           columns=['col1', 'col2', 'col4']))
+
+        obs1 = merge(md1, md2)
+
+        index_exp1 = pd.Index(['sample1', 'sample2', 'sample3',
+                               'sample4', 'sample5', 'sample6'], name='id')
+        data_exp1 = [['a', 'd', 'h', np.nan],
+                     ['b', 'e', 'i', np.nan],
+                     ['c', 'f', 'j', np.nan],
+                     ['k', 'n', np.nan, 'q'],
+                     ['l', 'o', np.nan, 'r'],
+                     ['m', 'p', np.nan, 's']]
+        exp1 = qiime2.Metadata(
+            pd.DataFrame(data_exp1, index=index_exp1, dtype=object,
+                         columns=['col1', 'col2', 'col3', 'col4']))
+
+        self.assertEqual(obs1, exp1)
+
+    def test_merge_no_samples_or_columns_overlapping(self):
+        index1 = pd.Index(['sample1', 'sample2', 'sample3'], name='id')
+        data1 = [['a', 'd', 'h'],
+                 ['b', 'e', 'i'],
+                 ['c', 'f', 'j']]
+        md1 = qiime2.Metadata(pd.DataFrame(data1, index=index1, dtype=object,
+                                           columns=['col1', 'col2', 'col3']))
+
+        index2 = pd.Index(['sample4', 'sample5', 'sample6'], name='id')
+        data2 = [['k', 'n', 'q'],
+                 ['l', 'o', 'r'],
+                 ['m', 'p', 's']]
+        md2 = qiime2.Metadata(pd.DataFrame(data2, index=index2, dtype=object,
+                                           columns=['col4', 'col5', 'col6']))
+
+        obs1 = merge(md1, md2)
+
+        index_exp1 = pd.Index(['sample1', 'sample2', 'sample3',
+                               'sample4', 'sample5', 'sample6'], name='id')
+        data_exp1 = [['a', 'd', 'h', np.nan, np.nan, np.nan],
+                     ['b', 'e', 'i', np.nan, np.nan, np.nan],
+                     ['c', 'f', 'j', np.nan, np.nan, np.nan],
+                     [np.nan, np.nan, np.nan, 'k', 'n', 'q'],
+                     [np.nan, np.nan, np.nan, 'l', 'o', 'r'],
+                     [np.nan, np.nan, np.nan, 'm', 'p', 's']]
+        exp1 = qiime2.Metadata(
+            pd.DataFrame(data_exp1, index=index_exp1, dtype=object,
+                         columns=['col1', 'col2', 'col3',
+                                  'col4', 'col5', 'col6']))
+
+        self.assertEqual(obs1, exp1)


=====================================
q2_metadata/tests/test_plugin_setup.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #


=====================================
q2_metadata/tests/test_random.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #
@@ -28,8 +28,8 @@ class ShuffleGroupsTests(unittest.TestCase):
 
         # expected number of rows and columns in result
         obs = shuffle_groups(md, n_columns=1,
-                             column_name_prefix='shuffled.grouping.',
-                             column_value_prefix='fake.group.')
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='fake.group.')
         self.assertEqual(obs.shape, (4, 1))
 
         # expected column names (the original should not be in the result)
@@ -52,8 +52,8 @@ class ShuffleGroupsTests(unittest.TestCase):
         random_check = []
         for i in range(self.n_iterations):
             obs2 = shuffle_groups(md, n_columns=1,
-                                  column_name_prefix='shuffled.grouping.',
-                                  column_value_prefix='fake.group.')
+                                  md_column_name_prefix='shuffled.grouping.',
+                                  md_column_values_prefix='fake.group.')
             random_check.append(
                 list(obs['shuffled.grouping.0']) ==
                 list(obs2['shuffled.grouping.0']))
@@ -74,8 +74,8 @@ class ShuffleGroupsTests(unittest.TestCase):
 
         # expected number of rows and columns
         obs = shuffle_groups(md, n_columns=3,
-                             column_name_prefix='shuffled.grouping.',
-                             column_value_prefix='fake.group.')
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='fake.group.')
         self.assertEqual(obs.shape, (9, 3))
 
         # original column name should not be in the result
@@ -131,8 +131,8 @@ class ShuffleGroupsTests(unittest.TestCase):
 
         # expected number of rows and columns in result
         obs = shuffle_groups(md, n_columns=1,
-                             column_name_prefix='shuffled.grouping.',
-                             column_value_prefix='fake.group.')
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='fake.group.')
         self.assertEqual(obs.shape, (4, 1))
 
         # expected column names (the original should not be in the result)
@@ -155,8 +155,8 @@ class ShuffleGroupsTests(unittest.TestCase):
 
         # expected number of rows and columns in result
         obs = shuffle_groups(md, n_columns=1,
-                             column_name_prefix='1',
-                             column_value_prefix='fake.group.')
+                             md_column_name_prefix='1',
+                             md_column_values_prefix='fake.group.')
         self.assertEqual(obs.shape, (4, 1))
 
         # expected column names (the original should not be in the result)
@@ -179,8 +179,8 @@ class ShuffleGroupsTests(unittest.TestCase):
 
         # expected number of rows and columns in result
         obs = shuffle_groups(md, n_columns=1,
-                             column_name_prefix='shuffled.grouping.',
-                             column_value_prefix='1')
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='1')
         self.assertEqual(obs.shape, (4, 1))
 
         # expected column names (the original should not be in the result)
@@ -194,3 +194,77 @@ class ShuffleGroupsTests(unittest.TestCase):
         self.assertEqual(
             set(obs['shuffled.grouping.0'].unique()),
             {'11', '10'})
+
+    def test_shuffle_groups_sample_size_column_id_flag_no_input(self):
+        md = qiime2.CategoricalMetadataColumn(
+            pd.Series(['a', 'b', 'a', 'b'], name='groups',
+                      index=pd.Index(['sample1', 'sample2', 'sample3', 's4'],
+                                     name='id'))
+        )
+
+        # expected number of rows and columns in result
+        obs = shuffle_groups(md, n_columns=1,
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='fake.group.',
+                             )
+
+        # correct group names in new column
+        self.assertEqual(
+            set(obs['shuffled.grouping.0'].unique()),
+            {'fake.group.1', 'fake.group.0'})
+
+    def test_shuffle_groups_sample_size_column_id_flag_true(self):
+        md = qiime2.CategoricalMetadataColumn(
+            pd.Series(['a', 'b', 'a', 'b'], name='groups',
+                      index=pd.Index(['sample1', 'sample2', 'sample3', 's4'],
+                                     name='id'))
+        )
+
+        # expected number of rows and columns in result
+        obs = shuffle_groups(md, n_columns=1,
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='fake.group.',
+                             encode_sample_size=True)
+
+        # correct group names in new column
+        self.assertEqual(
+            set(obs['shuffled.grouping.0'].unique()),
+            {'fake.group.1.n=2', 'fake.group.0.n=2'})
+
+    def test_shuffle_groups_sample_size_column_id_flag_false(self):
+        md = qiime2.CategoricalMetadataColumn(
+            pd.Series(['a', 'b', 'a', 'b'], name='groups',
+                      index=pd.Index(['sample1', 'sample2', 'sample3', 's4'],
+                                     name='id'))
+        )
+
+        # expected number of rows and columns in result
+        obs = shuffle_groups(md, n_columns=1,
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='1',
+                             encode_sample_size=False)
+
+        # correct group names in new column
+        self.assertEqual(
+            set(obs['shuffled.grouping.0'].unique()),
+            {'11', '10'})
+
+    def test_shuffle_groups_sample_size_equals_value_counts(self):
+        md = qiime2.CategoricalMetadataColumn(
+            pd.Series(['a', 'b', 'b', 'b', 'a', 'b'], name='groups',
+                      index=pd.Index(['sample1', 'sample2', 'sample3', 's4',
+                                      's5', 's6'],
+                                     name='id'))
+        )
+
+        # expected number of rows and columns in result
+        obs = shuffle_groups(md, n_columns=1,
+                             md_column_name_prefix='shuffled.grouping.',
+                             md_column_values_prefix='fake.group.',
+                             encode_sample_size=True)
+
+        # expected number of samples in each group in the new column
+        self.assertEqual(obs['shuffled.grouping.0'].value_counts()
+                         ['fake.group.0.n=2'], 2)
+        self.assertEqual(obs['shuffled.grouping.0'].value_counts()
+                            ['fake.group.1.n=4'], 4)
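
The labels these tests expect come from a value_counts lookup combined with
the value prefix; a standalone pandas sketch of that construction (prefix and
group layout mirror the last test above):

import pandas as pd

# Same group layout as the test above: two 'a' samples, four 'b' samples.
groups = pd.Series(['a', 'b', 'b', 'b', 'a', 'b'], name='groups')
group_sample_size = groups.value_counts()   # b -> 4, a -> 2

value_mapping = {}
for i, value in enumerate(groups.unique()):
    # With encode_sample_size=True, the group size is appended to the label.
    value_mapping[value] = 'fake.group.%d.n=%d' % (i, group_sample_size[value])

print(value_mapping)
# {'a': 'fake.group.0.n=2', 'b': 'fake.group.1.n=4'}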


=====================================
q2_metadata/tests/test_tabulate.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #
@@ -12,7 +12,6 @@ import tempfile
 
 import pandas as pd
 import qiime2
-from qiime2.plugin.testing import TestPluginBase
 
 from q2_metadata import tabulate
 
@@ -101,12 +100,5 @@ class TabulateTests(TestCase):
                 tabulate(output_dir, md, -1)
 
 
-class TestUsageExamples(TestPluginBase):
-    package = 'q2_metadata.tests'
-
-    def test_examples(self):
-        self.execute_examples()
-
-
 if __name__ == "__main__":
     main()


=====================================
setup.py
=====================================
@@ -1,5 +1,5 @@
 # ----------------------------------------------------------------------------
-# Copyright (c) 2017-2022, QIIME 2 development team.
+# Copyright (c) 2017-2023, QIIME 2 development team.
 #
 # Distributed under the terms of the Modified BSD License.
 #



View it on GitLab: https://salsa.debian.org/med-team/q2-metadata/-/commit/c1a6a697d5f8b80963fc329b1894cb4e2d012056
