[Git][debian-gis-team/xarray-safe-rcm][master] 5 commits: New upstream version 2024.11.0

Mon Nov 25 07:09:45 GMT 2024


Antonio Valentino pushed to branch master at Debian GIS Project / xarray-safe-rcm


Commits:
07e91c30 by Antonio Valentino at 2024-11-25T06:51:10+00:00
New upstream version 2024.11.0
- - - - -
ac6db5b0 by Antonio Valentino at 2024-11-25T06:51:10+00:00
Update upstream source from tag 'upstream/2024.11.0'

Update to upstream version '2024.11.0'
with Debian dir 32f1fab7063db30b5cd0921d07dea574981a2d5c
- - - - -
0433413e by Antonio Valentino at 2024-11-25T06:51:46+00:00
New upstream release

- - - - -
2c622ff3 by Antonio Valentino at 2024-11-25T06:57:24+00:00
Drop dependency on xarray-datatree

- - - - -
52de5b1c by Antonio Valentino at 2024-11-25T07:01:51+00:00
Set distribution to unstable

- - - - -


21 changed files:

- + .github/release.yml
- + .github/workflows/ci.yaml
- .github/workflows/pypi.yaml
- + .github/workflows/upstream-dev.yaml
- .gitignore
- .pre-commit-config.yaml
- README.md
- + ci/install-upstream-dev.sh
- ci/requirements/environment.yaml
- debian/changelog
- debian/control
- pyproject.toml
- safe_rcm/__init__.py
- safe_rcm/api.py
- safe_rcm/calibrations.py
- safe_rcm/manifest.py
- safe_rcm/product/reader.py
- safe_rcm/product/transformers.py
- safe_rcm/product/utils.py
- + safe_rcm/tests/test_xml.py
- safe_rcm/xml.py


Changes:

=====================================
.github/release.yml
=====================================
@@ -0,0 +1,5 @@
+changelog:
+  exclude:
+    authors:
+      - dependabot
+      - pre-commit-ci


=====================================
.github/workflows/ci.yaml
=====================================
@@ -0,0 +1,84 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  detect-skip-ci-trigger:
+    name: "Detect CI Trigger: [skip-ci]"
+    if: |
+      github.repository == 'umr-lops/xarray-safe-rcm'
+      && github.event_name == 'push'
+      || github.event_name == 'pull_request'
+    runs-on: ubuntu-latest
+    outputs:
+      triggered: ${{ steps.detect-trigger.outputs.trigger-found }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 2
+      - uses: xarray-contrib/ci-trigger@v1
+        id: detect-trigger
+        with:
+          keyword: "[skip-ci]"
+
+  ci:
+    name: ${{ matrix.os }} py${{ matrix.python-version }}
+    runs-on: ${{ matrix.os }}
+    needs: detect-skip-ci-trigger
+
+    if: needs.detect-skip-ci-trigger.outputs.triggered == 'false'
+
+    defaults:
+      run:
+        shell: bash -l {0}
+
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.10", "3.11", "3.12"]
+        os: ["ubuntu-latest", "macos-latest", "windows-latest"]
+
+    steps:
+      - name: Checkout the repository
+        uses: actions/checkout@v4
+        with:
+          # need to fetch all tags to get a correct version
+          fetch-depth: 0 # fetch all branches and tags
+
+      - name: Setup environment variables
+        run: |
+          echo "TODAY=$(date +'%Y-%m-%d')" >> $GITHUB_ENV
+
+          echo "CONDA_ENV_FILE=ci/requirements/environment.yaml" >> $GITHUB_ENV
+
+      - name: Setup micromamba
+        uses: mamba-org/setup-micromamba@v2
+        with:
+          environment-file: ${{ env.CONDA_ENV_FILE }}
+          environment-name: xarray-safe-rcm-tests
+          cache-environment: true
+          cache-environment-key: "${{runner.os}}-${{runner.arch}}-py${{matrix.python-version}}-${{env.TODAY}}-${{hashFiles(env.CONDA_ENV_FILE)}}"
+          create-args: >-
+            python=${{matrix.python-version}}
+            conda
+
+      - name: Install xarray-safe-rcm
+        run: |
+          python -m pip install --no-deps -e .
+
+      - name: Import xarray-safe-rcm
+        run: |
+          python -c "import safe_rcm"
+
+      - name: Run tests
+        run: |
+          python -m pytest --cov=safe_rcm


=====================================
.github/workflows/pypi.yaml
=====================================
@@ -51,4 +51,4 @@ jobs:
           path: dist/
 
       - name: Publish to PyPI
-        uses: pypa/gh-action-pypi-publish@2f6f737ca5f74c637829c0f5c3acd0e29ea5e8bf
+        uses: pypa/gh-action-pypi-publish@15c56dba361d8335944d31a2ecd17d700fc7bcbc


=====================================
.github/workflows/upstream-dev.yaml
=====================================
@@ -0,0 +1,99 @@
+name: upstream-dev CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+  schedule:
+    - cron: "0 18 * * 0" # Weekly "On Sundays at 18:00" UTC
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  detect-test-upstream-trigger:
+    name: "Detect CI Trigger: [test-upstream]"
+    if: github.event_name == 'push' || github.event_name == 'pull_request'
+    runs-on: ubuntu-latest
+    outputs:
+      triggered: ${{ steps.detect-trigger.outputs.trigger-found }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 2
+      - uses: xarray-contrib/ci-trigger@v1.2
+        id: detect-trigger
+        with:
+          keyword: "[test-upstream]"
+
+  upstream-dev:
+    name: upstream-dev
+    runs-on: ubuntu-latest
+    needs: detect-test-upstream-trigger
+
+    if: |
+      always()
+      && github.repository == 'umr-lops/xarray-safe-rcm'
+      && (
+        github.event_name == 'schedule'
+        || github.event_name == 'workflow_dispatch'
+        || needs.detect-test-upstream-trigger.outputs.triggered == 'true'
+        || contains(github.event.pull_request.labels.*.name, 'run-upstream')
+      )
+
+    defaults:
+      run:
+        shell: bash -l {0}
+
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.12"]
+
+    steps:
+      - name: checkout the repository
+        uses: actions/checkout@v4
+        with:
+          # need to fetch all tags to get a correct version
+          fetch-depth: 0 # fetch all branches and tags
+
+      - name: set up conda environment
+        uses: mamba-org/setup-micromamba@v2
+        with:
+          environment-file: ci/requirements/environment.yaml
+          environment-name: tests
+          create-args: >-
+            python=${{ matrix.python-version }}
+            pytest-reportlog
+            conda
+
+      - name: install upstream-dev dependencies
+        run: bash ci/install-upstream-dev.sh
+
+      - name: install the package
+        run: python -m pip install --no-deps -e .
+
+      - name: show versions
+        run: python -m pip list
+
+      - name: import
+        run: |
+          python -c 'import safe_rcm'
+
+      - name: run tests
+        if: success()
+        id: status
+        run: |
+          python -m pytest -rf --report-log=pytest-log.jsonl
+
+      - name: report failures
+        if: |
+          failure()
+          && steps.tests.outcome == 'failure'
+          && github.event_name == 'schedule'
+        uses: xarray-contrib/issue-from-pytest-log@v1
+        with:
+          log-path: pytest-log.jsonl


=====================================
.gitignore
=====================================
@@ -19,3 +19,4 @@ __pycache__/
 .coverage.*
 .cache
 /docs/_build/
+.prettier_cache


=====================================
.pre-commit-config.yaml
=====================================
@@ -4,36 +4,44 @@ ci:
 # https://pre-commit.com/
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.5.0
+    rev: v5.0.0
     hooks:
       - id: trailing-whitespace
       - id: end-of-file-fixer
       - id: check-docstring-first
-      - id: check-yaml
-      - id: check-toml
-  - repo: https://github.com/pycqa/isort
-    rev: 5.13.2
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.7.3
     hooks:
-      - id: isort
-  - repo: https://github.com/psf/black
-    rev: 24.2.0
+      - id: ruff
+        args: [--fix]
+  - repo: https://github.com/psf/black-pre-commit-mirror
+    rev: 24.10.0
     hooks:
-      - id: black
       - id: black-jupyter
   - repo: https://github.com/keewis/blackdoc
     rev: v0.3.9
     hooks:
       - id: blackdoc
-  - repo: https://github.com/pycqa/flake8
-    rev: 7.0.0
-    hooks:
-      - id: flake8
+        additional_dependencies: ["black==24.10.0"]
+      - id: blackdoc-autoupdate-black
   - repo: https://github.com/kynan/nbstripout
-    rev: 0.7.1
+    rev: 0.8.0
     hooks:
       - id: nbstripout
         args: [--extra-keys=metadata.kernelspec metadata.language_info.version]
-  - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: v4.0.0-alpha.8
+  - repo: https://github.com/rbubley/mirrors-prettier
+    rev: v3.3.3
     hooks:
       - id: prettier
+        args: [--cache-location=.prettier_cache]
+  - repo: https://github.com/ComPWA/taplo-pre-commit
+    rev: v0.9.3
+    hooks:
+      - id: taplo-format
+        args: [--option, array_auto_collapse=false]
+      - id: taplo-lint
+        args: [--no-schema]
+  - repo: https://github.com/abravalheri/validate-pyproject
+    rev: v0.23
+    hooks:
+      - id: validate-pyproject


=====================================
README.md
=====================================
@@ -1,6 +1,6 @@
 # xarray-safe-rcm
 
-Read RCM SAFE files into `datatree` objects.
+Read RCM SAFE files into `xarray.DataTree` objects.
 
 ## Usage
 


=====================================
ci/install-upstream-dev.sh
=====================================
@@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+
+if command -v micromamba >/dev/null; then
+  conda=micromamba
+elif command -v mamba >/dev/null; then
+  conda=mamba
+else
+  conda=conda
+fi
+conda remove -y --force cytoolz numpy xarray toolz fsspec python-dateutil pandas lxml xmlschema rioxarray
+python -m pip install \
+  -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple \
+  --no-deps \
+  --pre \
+  --upgrade \
+  numpy \
+  pandas \
+  xarray
+python -m pip install --upgrade \
+  git+https://github.com/pytoolz/toolz \
+  git+https://github.com/lxml/lxml \
+  git+https://github.com/sissaschool/xmlschema \
+  git+https://github.com/fsspec/filesystem_spec \
+  git+https://github.com/dateutil/dateutil \
+  git+https://github.com/corteva/rioxarray


=====================================
ci/requirements/environment.yaml
=====================================
@@ -2,7 +2,7 @@ name: xarray-safe-rcm-tests
 channels:
   - conda-forge
 dependencies:
-  - python=3.10
+  - python
   # development
   - ipython
   - pre-commit
@@ -14,6 +14,7 @@ dependencies:
   # testing
   - pytest
   - pytest-reportlog
+  - pytest-cov
   - hypothesis
   - coverage
   # I/O
@@ -23,7 +24,6 @@ dependencies:
   - scipy
   # data
   - xarray
-  - xarray-datatree
   - dask
   - numpy
   - pandas


=====================================
debian/changelog
=====================================
@@ -1,9 +1,15 @@
-xarray-safe-rcm (2024.02.0-2) UNRELEASED; urgency=medium
+xarray-safe-rcm (2024.11.0-1) unstable; urgency=medium
 
-  * Team upload.
+  [ Bas Couwenberg ]
   * Bump Standards-Version to 4.7.0, no changes.
 
- -- Bas Couwenberg <sebastic at debian.org>  Sun, 28 Jul 2024 20:07:08 +0200
+  [ Antonio Valentino ]
+  * New upstream version.
+  * debian/control:
+    - Drop dependency in xarray-datatree and require
+      xarray (>= 2024.10.0).
+
+ -- Antonio Valentino <antonio.valentino at tiscali.it>  Mon, 25 Nov 2024 07:01:40 +0000
 
 xarray-safe-rcm (2024.02.0-1) unstable; urgency=medium
 


=====================================
debian/control
=====================================
@@ -17,8 +17,7 @@ Build-Depends: debhelper-compat (= 13),
                python3-setuptools,
                python3-setuptools-scm,
                python3-toolz,
-               python3-xarray,
-               python3-xarray-datatree,
+               python3-xarray (>= 2024.10.0),
                python3-xmlschema
 Standards-Version: 4.7.0
 Testsuite: autopkgtest-pkg-pybuild


=====================================
pyproject.toml
=====================================
@@ -1,19 +1,18 @@
 [project]
 name = "xarray-safe-rcm"
 requires-python = ">= 3.10"
-license = {text = "MIT"}
+license = { text = "MIT" }
 description = "xarray reader for radarsat constellation mission (RCM) SAFE files"
 readme = "README.md"
 dependencies = [
-    "toolz",
-    "numpy",
-    "xarray",
-    "xarray-datatree",
-    "lxml",
-    "xmlschema",
-    "rioxarray",
-    "fsspec",
-    "exceptiongroup; python_version < '3.11'",
+  "toolz",
+  "numpy",
+  "xarray",
+  "lxml",
+  "xmlschema",
+  "rioxarray",
+  "fsspec",
+  "exceptiongroup; python_version < '3.11'",
 ]
 dynamic = ["version"]
 
@@ -23,16 +22,52 @@ build-backend = "setuptools.build_meta"
 
 [tool.setuptools.packages.find]
 include = [
-    "safe_rcm",
-    "safe_rcm.*",
+  "safe_rcm",
+  "safe_rcm.*",
 ]
 
 [tool.setuptools_scm]
-fallback_version = "999"
-
-[tool.isort]
-profile = "black"
-skip_gitignore = true
-float_to_top = true
-default_section = "THIRDPARTY"
-known_first_party = "safe_rcm"
+fallback_version = "9999"
+
+[tool.ruff]
+target-version = "py310"
+builtins = ["ellipsis"]
+exclude = [".git", ".eggs", "build", "dist", "__pycache__"]
+line-length = 100
+
+[tool.ruff.lint]
+ignore = [
+  "E402",  # module level import not at top of file
+  "E501",  # line too long - let black worry about that
+  "E731",  # do not assign a lambda expression, use a def
+  "UP038", # type union instead of tuple for isinstance etc
+]
+select = [
+  "F",   # Pyflakes
+  "E",   # Pycodestyle
+  "I",   # isort
+  "UP",  # Pyupgrade
+  "TID", # flake8-tidy-imports
+  "W",
+]
+extend-safe-fixes = [
+  "TID252", # absolute imports
+  "UP031",  # percent string interpolation
+]
+fixable = ["I", "TID252", "UP"]
+
+[tool.ruff.lint.isort]
+known-first-party = ["safe_rcm"]
+known-third-party = ["xarray", "tlz"]
+
+[tool.ruff.lint.flake8-tidy-imports]
+# Disallow all relative imports.
+ban-relative-imports = "all"
+
+[tool.coverage.run]
+source = ["safe_rcm"]
+branch = true
+
+[tool.coverage.report]
+show_missing = true
+exclude_lines = ["pragma: no cover", "if TYPE_CHECKING"]


=====================================
safe_rcm/__init__.py
=====================================
@@ -1,8 +1,8 @@
 from importlib.metadata import version
 
-from .api import open_rcm  # noqa: F401
+from safe_rcm.api import open_rcm  # noqa: F401
 
 try:
-    __version__ = version("safe_rcm")
+    __version__ = version("xarray-safe-rcm")
 except Exception:
-    __version__ = "999"
+    __version__ = "9999"
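
Note on the hunk above: `importlib.metadata.version` looks up the installed distribution name (`xarray-safe-rcm`), not the import name (`safe_rcm`), which is presumably why the argument changed. A minimal sketch of the lookup-with-fallback pattern follows; the helper name `installed_version` is hypothetical and not part of the package, and it catches only `PackageNotFoundError` where the upstream code catches any `Exception`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution, fallback="9999"):
    """Return the installed version of `distribution`, or `fallback`.

    `distribution` is the name used on PyPI / in pyproject.toml,
    which may differ from the importable module name.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        # Not installed (for example, running from a bare source checkout).
        return fallback


# Hypothetical distribution name, chosen so that the lookup fails.
missing = installed_version("this-distribution-should-not-exist-xyz")
```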


=====================================
safe_rcm/api.py
=====================================
@@ -2,19 +2,18 @@ import os
 import posixpath
 from fnmatch import fnmatchcase
 
-import datatree
 import fsspec
 import xarray as xr
 from fsspec.implementations.dirfs import DirFileSystem
 from tlz.dicttoolz import valmap
 from tlz.functoolz import compose_left, curry, juxt
 
-from .calibrations import read_noise_levels
-from .manifest import read_manifest
-from .product.reader import read_product
-from .product.transformers import extract_dataset
-from .product.utils import starcall
-from .xml import read_xml
+from safe_rcm.calibrations import read_noise_levels
+from safe_rcm.manifest import read_manifest
+from safe_rcm.product.reader import read_product
+from safe_rcm.product.transformers import extract_dataset
+from safe_rcm.product.utils import starcall
+from safe_rcm.xml import read_xml
 
 try:
     ExceptionGroup
@@ -128,6 +127,7 @@ def open_rcm(
                 lambda arr: arr.set_index({"stacked": ["sarCalibrationType", "pole"]}),
                 lambda arr: arr.unstack("stacked"),
                 lambda arr: arr.rename("lookup_tables"),
+                lambda arr: arr.to_dataset(),
             ),
         },
         "/noiseLevels": {
@@ -160,7 +160,7 @@ def open_rcm(
 
     return tree.assign(
         {
-            "lookupTables": datatree.DataTree.from_dict(calibration),
-            "imagery": datatree.DataTree(imagery),
+            "lookupTables": xr.DataTree.from_dict(calibration),
+            "imagery": xr.DataTree(imagery),
         }
     )


=====================================
safe_rcm/calibrations.py
=====================================
@@ -1,17 +1,15 @@
 import posixpath
 
-import datatree
 import numpy as np
 import xarray as xr
 from tlz.dicttoolz import itemmap, merge_with, valfilter, valmap
 from tlz.functoolz import compose_left, curry, flip
 from tlz.itertoolz import first
 
+from safe_rcm.product.dicttoolz import keysplit
 from safe_rcm.product.reader import execute
-
-from .product.dicttoolz import keysplit
-from .product.transformers import extract_dataset
-from .xml import read_xml
+from safe_rcm.product.transformers import extract_dataset
+from safe_rcm.xml import read_xml
 
 
 def move_attrs_to_coords(ds, names):
@@ -110,4 +108,4 @@ def read_noise_levels(mapper, root, fnames):
         merged,
     )
 
-    return datatree.DataTree.from_dict(combined)
+    return xr.DataTree.from_dict(combined)


=====================================
safe_rcm/manifest.py
=====================================
@@ -2,8 +2,8 @@ from tlz import filter
 from tlz.functoolz import compose_left, curry
 from tlz.itertoolz import concat, get
 
-from .product.dicttoolz import query
-from .xml import read_xml
+from safe_rcm.product.dicttoolz import query
+from safe_rcm.xml import read_xml
 
 
 def merge_location(loc):


=====================================
safe_rcm/product/reader.py
=====================================
@@ -1,14 +1,13 @@
-import datatree
 import xarray as xr
 from tlz.dicttoolz import keyfilter, merge, merge_with, valfilter, valmap
 from tlz.functoolz import compose_left, curry, juxt
 from tlz.itertoolz import first, second
 
-from ..xml import read_xml
-from . import transformers
-from .dicttoolz import keysplit, query
-from .predicates import disjunction, is_nested_array, is_scalar_valued
-from .utils import dictfirst, starcall
+from safe_rcm.product import transformers
+from safe_rcm.product.dicttoolz import keysplit, query
+from safe_rcm.product.predicates import disjunction, is_nested_array, is_scalar_valued
+from safe_rcm.product.utils import dictfirst, starcall
+from safe_rcm.xml import read_xml
 
 
 @curry
@@ -276,4 +275,4 @@ def read_product(mapper, product_path):
         lambda x: execute(**x)(decoded),
         layout,
     )
-    return datatree.DataTree.from_dict(converted)
+    return xr.DataTree.from_dict(converted)


=====================================
safe_rcm/product/transformers.py
=====================================
@@ -1,4 +1,3 @@
-import datatree
 import numpy as np
 import xarray as xr
 from tlz.dicttoolz import (
@@ -13,8 +12,8 @@ from tlz.dicttoolz import (
 from tlz.functoolz import compose_left, curry, flip
 from tlz.itertoolz import concat, first, second
 
-from .dicttoolz import first_values, keysplit, valsplit
-from .predicates import (
+from safe_rcm.product.dicttoolz import first_values, keysplit, valsplit
+from safe_rcm.product.predicates import (
     is_array,
     is_attr,
     is_composite_value,
@@ -252,4 +251,4 @@ def extract_nested_datatree(obj, dims=None):
     datasets = merge_with(list, *obj)
     tree = valmap(curry(extract_nested_dataset)(dims=dims), datasets)
 
-    return datatree.DataTree.from_dict(tree)
+    return xr.DataTree.from_dict(tree)


=====================================
safe_rcm/product/utils.py
=====================================
@@ -26,7 +26,9 @@ def strip_namespaces(name, namespaces):
     trimmed : str
         The string without prefix and without leading colon.
     """
-    funcs = [flip(str.removeprefix, ns) for ns in namespaces]
+    funcs = [
+        flip(str.removeprefix, ns) for ns in sorted(namespaces, key=len, reverse=True)
+    ]
     return pipe(name, *funcs).lstrip(":")
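
Note on the hunk above: when one namespace is itself a prefix of another, the order of removal matters, so the change strips the longest candidates first. The sketch below reproduces the idea with plain loops instead of `pipe` and `flip`; the function name `strip_prefixes` and the example prefixes are hypothetical and chosen only to show the difference.

```python
def strip_prefixes(name, prefixes, longest_first=True):
    """Remove each prefix in turn, then drop any leading colons."""
    if longest_first:
        ordered = sorted(prefixes, key=len, reverse=True)
    else:
        ordered = list(prefixes)

    for prefix in ordered:
        name = name.removeprefix(prefix)
    return name.lstrip(":")


# Hypothetical prefixes: the first is a prefix of the second.
prefixes = ["rcm", "rcm-ext"]
name = "rcm-ext:sceneId"

in_given_order = strip_prefixes(name, prefixes, longest_first=False)
longest_first = strip_prefixes(name, prefixes)
```

In the given order the short prefix is removed first, which leaves a remainder that the longer prefix no longer matches; sorting by length avoids that.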
 
 


=====================================
safe_rcm/tests/test_xml.py
=====================================
@@ -0,0 +1,299 @@
+import collections
+import textwrap
+
+import fsspec
+import pytest
+
+from safe_rcm import xml
+
+
+def dedent(text):
+    return textwrap.dedent(text.removeprefix("\n").rstrip())
+
+
+schemas = [
+    dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+        </xsd:schema>
+        """
+    ),
+    dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema2.xsd"/>
+        </xsd:schema>
+        """
+    ),
+    dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema1.xsd"/>
+          <xsd:include schemaLocation="schema2.xsd"/>
+        </xsd:schema>
+        """
+    ),
+]
+
+
+Container = collections.namedtuple("SchemaSetup", ["mapper", "path", "expected"])
+SchemaProperties = collections.namedtuple(
+    "SchemaProperties", ["root_elements", "simple_types", "complex_types"]
+)
+
+
+@pytest.fixture(params=enumerate(schemas))
+def schema_setup(request):
+    schema_index, schema = request.param
+
+    mapper = fsspec.get_mapper("memory")
+    mapper["schemas/root.xsd"] = schema.encode()
+    mapper["schemas/schema1.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema3.xsd"/>
+          <xsd:element name="manifest" type="manifest"/>
+        </xsd:schema>
+        """
+    ).encode()
+    mapper["schemas/schema2.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema4.xsd"/>
+          <xsd:element name="count" type="count"/>
+        </xsd:schema>
+        """
+    ).encode()
+    mapper["schemas/schema3.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema3.xsd"/>
+          <xsd:complexType name="manifest">
+            <xsd:sequence>
+              <xsd:element name="quantity_a" type="count"/>
+              <xsd:element name="quantity_b" type="count"/>
+            </xsd:sequence>
+          </xsd:complexType>
+        </xsd:schema>
+        """
+    ).encode()
+    mapper["schemas/schema4.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:simpleType name="count">
+            <xsd:restriction base="xsd:integer">
+              <xsd:minInclusive value="0"/>
+              <xsd:maxInclusive value="10"/>
+            </xsd:restriction>
+          </xsd:simpleType>
+        </xsd:schema>
+        """
+    ).encode()
+
+    return schema_index, mapper
+
+
+@pytest.fixture
+def schema_paths_setup(schema_setup):
+    schema_index, mapper = schema_setup
+
+    expected = [
+        ["schemas/root.xsd"],
+        ["schemas/root.xsd", "schemas/schema2.xsd", "schemas/schema4.xsd"],
+        [
+            "schemas/root.xsd",
+            "schemas/schema1.xsd",
+            "schemas/schema2.xsd",
+            "schemas/schema3.xsd",
+            "schemas/schema4.xsd",
+        ],
+    ]
+
+    return Container(mapper, "schemas/root.xsd", expected[schema_index])
+
+
+@pytest.fixture
+def schema_content_setup(schema_setup):
+    schema_index, mapper = schema_setup
+
+    count_type = {"name": "count", "type": "simple", "base_type": "integer"}
+    manifest_type = {"name": "manifest", "type": "complex"}
+
+    manifest_element = {"name": "manifest", "type": manifest_type}
+    count_element = {"name": "count", "type": count_type}
+    expected = [
+        SchemaProperties([], [], []),
+        SchemaProperties([count_element], [count_type], []),
+        SchemaProperties(
+            [manifest_element, count_element], [count_type], [manifest_type]
+        ),
+    ]
+
+    return Container(mapper, "schemas/root.xsd", expected[schema_index])
+
+
+@pytest.fixture(params=["data.xml", "data/file.xml"])
+def data_file_setup(request):
+    path = request.param
+    mapper = fsspec.get_mapper("memory")
+
+    mapper["schemas/root.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema1.xsd"/>
+          <xsd:include schemaLocation="schema2.xsd"/>
+          <xsd:complexType name="elements">
+            <xsd:sequence>
+              <xsd:element name="summary" type="manifest"/>
+              <xsd:element name="count" type="count"/>
+            </xsd:sequence>
+          </xsd:complexType>
+          <xsd:element name="elements" type="elements"/>
+        </xsd:schema>
+        """
+    ).encode()
+    mapper["schemas/schema1.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:include schemaLocation="schema2.xsd"/>
+          <xsd:complexType name="manifest">
+            <xsd:sequence>
+              <xsd:element name="quantity_a" type="count"/>
+              <xsd:element name="quantity_b" type="count"/>
+            </xsd:sequence>
+          </xsd:complexType>
+        </xsd:schema>
+        """
+    ).encode()
+    mapper["schemas/schema2.xsd"] = dedent(
+        """
+        <?xml version="1.0" encoding="UTF-8"?>
+        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+          <xsd:simpleType name="count">
+            <xsd:restriction base="xsd:integer">
+              <xsd:minInclusive value="0"/>
+              <xsd:maxInclusive value="10"/>
+            </xsd:restriction>
+          </xsd:simpleType>
+        </xsd:schema>
+        """
+    ).encode()
+
+    schema_path = "schemas/root.xsd" if "/" not in path else "../schemas/root.xsd"
+    mapper[path] = dedent(
+        f"""
+        <?xml version="1.0" encoding="UTF-8"?>
+        <elements xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="schema {schema_path}">
+          <summary>
+            <quantity_a>1</quantity_a>
+            <quantity_b>2</quantity_b>
+          </summary>
+          <count>3</count>
+        </elements>
+        """
+    ).encode()
+
+    expected = {
+        "@xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
+        "@xsi:schemaLocation": f"schema {schema_path}",
+        "summary": {"quantity_a": 1, "quantity_b": 2},
+        "count": 3,
+    }
+
+    return Container(mapper, path, expected)
+
+
+def convert_type(t):
+    def strip_namespace(name):
+        return name.split("}", maxsplit=1)[1]
+
+    if hasattr(t, "content"):
+        # complex type
+        return {"name": t.name, "type": "complex"}
+    elif hasattr(t, "base_type"):
+        # simple type, only restriction
+        return {
+            "name": t.name,
+            "base_type": strip_namespace(t.base_type.name),
+            "type": "simple",
+        }
+
+
+def convert_element(el):
+    return {"name": el.name, "type": convert_type(el.type)}
+
+
+def extract_schema_properties(schema):
+    return SchemaProperties(
+        [convert_element(v) for v in schema.root_elements],
+        [convert_type(v) for v in schema.simple_types],
+        [convert_type(v) for v in schema.complex_types],
+    )
+
+
+def test_remove_includes():
+    expected = schemas[0]
+    actual = xml.remove_includes(schemas[1])
+
+    assert actual == expected
+
+
+@pytest.mark.parametrize(
+    ["schema", "expected"],
+    (
+        (schemas[0], []),
+        (schemas[1], ["schema2.xsd"]),
+        (schemas[2], ["schema1.xsd", "schema2.xsd"]),
+    ),
+)
+def test_extract_includes(schema, expected):
+    actual = xml.extract_includes(schema)
+
+    assert actual == expected
+
+
+@pytest.mark.parametrize(
+    ["root", "path", "expected"],
+    (
+        ("", "file.xml", "file.xml"),
+        ("/root", "file.xml", "/root/file.xml"),
+        ("/root", "/other_root/file.xml", "/other_root/file.xml"),
+    ),
+)
+def test_normalize(root, path, expected):
+    actual = xml.normalize(root, path)
+
+    assert actual == expected
+
+
+def test_schema_paths(schema_paths_setup):
+    actual = xml.schema_paths(schema_paths_setup.mapper, schema_paths_setup.path)
+
+    expected = schema_paths_setup.expected
+
+    assert actual == expected
+
+
+def test_open_schemas(schema_content_setup):
+    container = schema_content_setup
+    actual = xml.open_schema(container.mapper, container.path)
+    expected = container.expected
+
+    assert extract_schema_properties(actual) == expected
+
+
+def test_read_xml(data_file_setup):
+    container = data_file_setup
+
+    actual = xml.read_xml(container.mapper, container.path)
+
+    assert actual == container.expected
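
Note on `test_normalize` above: the body of `xml.normalize` is not part of this diff, so the following is only one plausible implementation that satisfies the three parametrized cases, assuming POSIX-style paths. It relies on `posixpath.join` discarding the root when the second argument is absolute.

```python
import posixpath


def normalize(root, path):
    """Resolve `path` against `root`; an absolute `path` wins over `root`."""
    # posixpath.join drops `root` if `path` starts with "/";
    # normpath then collapses segments such as "..".
    return posixpath.normpath(posixpath.join(root, path))


# The three cases from the parametrized test.
cases = [
    normalize("", "file.xml"),
    normalize("/root", "file.xml"),
    normalize("/root", "/other_root/file.xml"),
]

# A relative schema location like the one built in data_file_setup.
relative = normalize("data", "../schemas/root.xsd")
```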


=====================================
safe_rcm/xml.py
=====================================
@@ -11,7 +11,7 @@ include_re = re.compile(r'\s*<xsd:include schemaLocation="(?P<location>[^"/]+)"\
 
 
 def remove_includes(text):
-    return io.StringIO(include_re.sub("", text))
+    return include_re.sub("", text)
 
 
 def extract_includes(text):
@@ -30,7 +30,8 @@ def schema_paths(mapper, root_schema):
     visited = []
     while unvisited:
         path = unvisited.popleft()
-        visited.append(path)
+        if path not in visited:
+            visited.append(path)
 
         text = mapper[path].decode()
         includes = extract_includes(text)
@@ -63,7 +64,7 @@ def open_schema(mapper, schema):
         The opened schema object
     """
     paths = schema_paths(mapper, schema)
-    preprocessed = [remove_includes(mapper[p].decode()) for p in paths]
+    preprocessed = [io.StringIO(remove_includes(mapper[p].decode())) for p in paths]
 
     return xmlschema.XMLSchema(preprocessed)
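
Note on the `schema_paths` hunk above: the added membership check keeps a schema that is included from several places (or that includes itself, as `schema3.xsd` does in the new tests) from being listed twice. The sketch below is a simplified stand-in, not the upstream function: it walks a plain dict of includes rather than reading files from a mapper, and the name `collect_paths` is hypothetical.

```python
from collections import deque


def collect_paths(includes, root):
    """Breadth-first walk over an include graph, listing each path once."""
    unvisited = deque([root])
    visited = []
    while unvisited:
        path = unvisited.popleft()
        if path in visited:
            # Already recorded via another include chain.
            continue
        visited.append(path)
        unvisited.extend(p for p in includes.get(path, []) if p not in visited)
    return visited


# Include graph mirroring the third schema fixture in test_xml.py.
includes = {
    "root.xsd": ["schema1.xsd", "schema2.xsd"],
    "schema1.xsd": ["schema3.xsd"],
    "schema2.xsd": ["schema4.xsd"],
    "schema3.xsd": ["schema3.xsd"],  # self-include
    "schema4.xsd": [],
}
paths = collect_paths(includes, "root.xsd")
```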
 



View it on GitLab: https://salsa.debian.org/debian-gis-team/xarray-safe-rcm/-/compare/2a1bdbafe66f2108e36a5463e2eeeb69c67235af...52de5b1c981e345165e92173e4631a2881746144
