[Git][debian-gis-team/stac-validator][master] 6 commits: New upstream version 3.5.0+ds

Antonio Valentino (@antonio.valentino) gitlab at salsa.debian.org
Fri Jan 17 08:11:29 GMT 2025



Antonio Valentino pushed to branch master at Debian GIS Project / stac-validator


Commits:
ecdb8180 by Antonio Valentino at 2025-01-17T07:46:29+00:00
New upstream version 3.5.0+ds
- - - - -
17cc15f2 by Antonio Valentino at 2025-01-17T07:46:31+00:00
Update upstream source from tag 'upstream/3.5.0+ds'

Update to upstream version '3.5.0+ds'
with Debian dir 00e965752ca040990414ac1af823cf4522187451
- - - - -
fc3d63a8 by Antonio Valentino at 2025-01-17T07:47:15+00:00
New upstream release

- - - - -
5641eb5a by Antonio Valentino at 2025-01-17T07:51:27+00:00
Update dependencies

- - - - -
0c22dc1c by Antonio Valentino at 2025-01-17T07:58:31+00:00
Skip tests requiring access to the internet

- - - - -
f5cd4a6f by Antonio Valentino at 2025-01-17T07:58:56+00:00
Set distribution to unstable

- - - - -


21 changed files:

- + .github/workflows/publish.yml
- .github/workflows/test-runner.yml
- .pre-commit-config.yaml
- CHANGELOG.md
- README.md
- debian/changelog
- debian/control
- debian/rules
- requirements-dev.txt
- setup.py
- stac_validator/stac_validator.py
- stac_validator/utilities.py
- stac_validator/validate.py
- tests/test_assets.py
- tests/test_core.py
- tests/test_custom.py
- + tests/test_header.py
- tests/test_links.py
- tests/test_recursion.py
- tox.ini
- tox/Dockerfile-tox


Changes:

=====================================
.github/workflows/publish.yml
=====================================
@@ -0,0 +1,35 @@
+name: Publish
+
+on:
+  push:
+    tags:
+      - "v*.*.*" # Triggers when a tag starting with 'v' followed by version numbers is pushed
+
+jobs:
+  build-and-publish:
+    name: Build and Publish to PyPI
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python 3.10
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.10"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install setuptools wheel twine
+
+      - name: Build package
+        run: |
+          python setup.py sdist bdist_wheel
+
+      - name: Publish package to PyPI
+        env:
+          TWINE_USERNAME: "__token__"
+          TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
+        run: |
+          twine upload dist/*


=====================================
.github/workflows/test-runner.yml
=====================================
@@ -16,7 +16,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.8", "3.9", "3.10", "3.11"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]
 
     steps:
       - uses: actions/checkout@v2
@@ -25,7 +25,7 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
 
-      - name: Run unit tests
+      - name: Run mypy
         run: |
           pip install .
           pip install -r requirements-dev.txt


=====================================
.pre-commit-config.yaml
=====================================
@@ -12,7 +12,7 @@ repos:
     rev: 24.1.1
     hooks:
       - id: black
-        language_version: python3.10
+        # language_version: python3.11
   - repo: https://github.com/pre-commit/mirrors-mypy
     rev: v1.8.0
     hooks:


=====================================
CHANGELOG.md
=====================================
@@ -6,6 +6,20 @@ The format is (loosely) based on [Keep a Changelog](http://keepachangelog.com/)
 
 ## [Unreleased]
 
+### Added
+
+## [v3.5.0] - 2025-01-10
+
+### Added
+
+- Added publish.yml to automatically publish new releases to PyPI [#236](https://github.com/stac-utils/stac-validator/pull/236)
+- Configure whether to open URLs when validating assets [#238](https://github.com/stac-utils/stac-validator/pull/238)
+- Allow to provide HTTP headers [#239](https://github.com/stac-utils/stac-validator/pull/239)
+
+### Changed
+
+- Switched to the referencing library for dynamic JSON schema validation and reference resolution [#241](https://github.com/stac-utils/stac-validator/pull/241)
+
 ## [v3.4.0] - 2024-10-08
 
 ### Added
@@ -207,7 +221,8 @@ The format is (loosely) based on [Keep a Changelog](http://keepachangelog.com/)
 - With the newest version - 1.0.0-beta.2 - items will run through jsonchema validation before the PySTAC validation. The reason for this is that jsonschema will give more informative error messages. This should be addressed better in the future. This is not the case with the --recursive option as time can be a concern here with larger collections.
 - Logging. Various additions were made here depending on the options selected. This was done to help assist people to update their STAC collections.
 
-[Unreleased]: https://github.com/sparkgeo/stac-validator/compare/v3.4.0..main
+[Unreleased]: https://github.com/sparkgeo/stac-validator/compare/v3.5.0..main
+[v3.5.0]: https://github.com/sparkgeo/stac-validator/compare/v3.4.0..v3.5.0
 [v3.4.0]: https://github.com/sparkgeo/stac-validator/compare/v3.3.2..v3.4.0
 [v3.3.2]: https://github.com/sparkgeo/stac-validator/compare/v3.3.1..v3.3.2
 [v3.3.1]: https://github.com/sparkgeo/stac-validator/compare/v3.3.0..v3.3.1


=====================================
README.md
=====================================
@@ -106,6 +106,10 @@ Options:
   --collections            Validate /collections response.
   --item-collection        Validate item collection response. Can be combined
                            with --pages. Defaults to one page.
+  --no-assets-urls         Disables the opening of href links when validating
+                           assets (enabled by default).
+  --header KEY VALUE       HTTP header to include in the requests. Can be used
+                           multiple times.
   -p, --pages INTEGER      Maximum number of pages to validate via --item-
                            collection. Defaults to one page.
   -v, --verbose            Enables verbose output for recursive mode.
@@ -330,3 +334,9 @@ stac-validator https://spot-canada-ortho.s3.amazonaws.com/catalog.json --recursi
 ```bash
 stac-validator https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a/items --item-collection --pages 2
 ```
+
+**--header**
+
+```bash
+stac-validator https://stac-catalog.eu/collections/sentinel-s2-l2a/items --header x-api-key $MY_API_KEY --header foo bar
+```


=====================================
debian/changelog
=====================================
@@ -1,3 +1,14 @@
+stac-validator (3.5.0+ds-1) unstable; urgency=medium
+
+  * New upstream release.
+  * debian/control:
+    - Add dependency in python3-referencing and
+      python3-requests-mock.
+  * debian/rules:
+    - Skip tests requiring access to the internet.
+
+ -- Antonio Valentino <antonio.valentino at tiscali.it>  Fri, 17 Jan 2025 07:58:40 +0000
+
 stac-validator (3.4.0+ds-1) unstable; urgency=medium
 
   [ Bas Couwenberg ]


=====================================
debian/control
=====================================
@@ -11,6 +11,8 @@ Build-Depends: debhelper-compat (= 13),
                python3-jsonschema,
                python3-pytest,
                python3-requests,
+               python3-requests-mock <!nocheck>,
+               python3-referencing,
                python3-setuptools
 Standards-Version: 4.7.0
 Testsuite: autopkgtest-pkg-pybuild


=====================================
debian/rules
=====================================
@@ -54,7 +54,8 @@ and not test_validate_item_collection_remote_pages \
 and not test_core_collection_local_v110 \
 and not test_core_item_local_v110 \
 and not test_validate_collections_remote \
-and not test_correct_validate_dict_return_method" \
+and not test_correct_validate_dict_return_method \
+and not test_header" \
 -vv $(CURDIR)/tests
 
 %:


=====================================
requirements-dev.txt
=====================================
@@ -2,4 +2,5 @@ black
 pytest
 pytest-mypy
 pre-commit
+requests-mock
 types-jsonschema


=====================================
setup.py
=====================================
@@ -2,7 +2,7 @@
 
 from setuptools import setup
 
-__version__ = "3.4.0"
+__version__ = "3.5.0"
 
 with open("README.md", "r") as fh:
     long_description = fh.read()
@@ -26,13 +26,15 @@ setup(
     long_description_content_type="text/markdown",
     url="https://github.com/stac-utils/stac-validator",
     install_requires=[
-        "requests>=2.19.1",
-        "jsonschema>=3.2.0",
-        "click>=8.0.0",
+        "requests>=2.32.3",
+        "jsonschema>=4.23.0",
+        "click>=8.1.8",
+        "referencing>=0.35.1",
     ],
     extras_require={
         "dev": [
             "pytest",
+            "requests-mock",
             "types-setuptools",
         ],
     },
@@ -41,5 +43,5 @@ setup(
         "console_scripts": ["stac-validator = stac_validator.stac_validator:main"]
     },
     python_requires=">=3.8",
-    tests_require=["pytest"],
+    tests_require=["pytest", "requests-mock"],
 )


=====================================
stac_validator/stac_validator.py
=====================================
@@ -109,6 +109,17 @@ def collections_summary(message: List[Dict[str, Any]]) -> None:
     is_flag=True,
     help="Validate item collection response. Can be combined with --pages. Defaults to one page.",
 )
+@click.option(
+    "--no-assets-urls",
+    is_flag=True,
+    help="Disables the opening of href links when validating assets (enabled by default).",
+)
+@click.option(
+    "--header",
+    type=(str, str),
+    multiple=True,
+    help="HTTP header to include in the requests. Can be used multiple times.",
+)
 @click.option(
     "--pages",
     "-p",
@@ -128,6 +139,8 @@ def main(
     stac_file: str,
     collections: bool,
     item_collection: bool,
+    no_assets_urls: bool,
+    header: list,
     pages: int,
     recursive: bool,
     max_depth: int,
@@ -147,6 +160,8 @@ def main(
         stac_file (str): Path to the STAC file to be validated.
         collections (bool): Validate response from /collections endpoint.
         item_collection (bool): Whether to validate item collection responses.
+        no_assets_urls (bool): Whether to open href links when validating assets (enabled by default).
+        headers (dict): HTTP headers to include in the requests.
         pages (int): Maximum number of pages to validate via `item_collection`.
         recursive (bool): Whether to recursively validate all related STAC objects.
         max_depth (int): Maximum depth to traverse when recursing.
@@ -177,6 +192,8 @@ def main(
         core=core,
         links=links,
         assets=assets,
+        assets_open_urls=not no_assets_urls,
+        headers=dict(header),
         extensions=extensions,
         custom=custom,
         verbose=verbose,
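
Note on the two new options above: with `type=(str, str)` and `multiple=True`, click hands `main` a tuple of `(key, value)` pairs, one per `--header` occurrence, and the diff turns that into a dict with `dict(header)`. The flag `--no-assets-urls` is inverted into `assets_open_urls`. A minimal sketch of that conversion, using only the standard library; the helper name `to_validator_kwargs` is hypothetical and not part of stac-validator:

```python
from typing import Any, Dict, Tuple


def to_validator_kwargs(
    no_assets_urls: bool, header: Tuple[Tuple[str, str], ...]
) -> Dict[str, Any]:
    """Mirror the two keyword arguments the diff passes on to StacValidate."""
    return {
        # The CLI flag disables URL opening, the keyword argument enables it.
        "assets_open_urls": not no_assets_urls,
        # dict() keeps the last value when the same key is given twice.
        "headers": dict(header),
    }


# Shape of the values for:
#   --header x-api-key secret --header foo bar
kwargs = to_validator_kwargs(False, (("x-api-key", "secret"), ("foo", "bar")))
print(kwargs)
```

Because `dict()` keeps only the last pair for a repeated key, passing the same header name twice on the command line would silently drop the earlier value.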


=====================================
stac_validator/utilities.py
=====================================
@@ -1,11 +1,17 @@
 import functools
 import json
 import ssl
-from typing import Dict
+from typing import Dict, Optional
 from urllib.parse import urlparse
-from urllib.request import urlopen
+from urllib.request import Request, urlopen
 
+import jsonschema
 import requests  # type: ignore
+from jsonschema import Draft202012Validator
+from referencing import Registry, Resource
+from referencing.jsonschema import DRAFT202012
+from referencing.retrieval import to_cached_resource
+from referencing.typing import URI
 
 NEW_VERSIONS = [
     "1.0.0-beta.2",
@@ -77,7 +83,7 @@ def get_stac_type(stac_content: Dict) -> str:
         return str(e)
 
 
-def fetch_and_parse_file(input_path: str) -> Dict:
+def fetch_and_parse_file(input_path: str, headers: Optional[Dict] = None) -> Dict:
     """Fetches and parses a JSON file from a URL or local file.
 
     Given a URL or local file path to a JSON file, this function fetches the file,
@@ -87,6 +93,7 @@ def fetch_and_parse_file(input_path: str) -> Dict:
 
     Args:
         input_path: A string representing the URL or local file path to the JSON file.
+        headers: For URLs: HTTP headers to include in the request
 
     Returns:
         A dictionary containing the parsed contents of the JSON file.
@@ -97,7 +104,7 @@ def fetch_and_parse_file(input_path: str) -> Dict:
     """
     try:
         if is_url(input_path):
-            resp = requests.get(input_path)
+            resp = requests.get(input_path, headers=headers)
             resp.raise_for_status()
             data = resp.json()
         else:
@@ -150,8 +157,7 @@ def set_schema_addr(version: str, stac_type: str) -> str:
 
 
 def link_request(
-    link: Dict,
-    initial_message: Dict,
+    link: Dict, initial_message: Dict, open_urls: bool = True, headers: Dict = {}
 ) -> None:
     """Makes a request to a URL and appends it to the relevant field of the initial message.
 
@@ -159,6 +165,8 @@ def link_request(
         link: A dictionary containing a "href" key which is a string representing a URL.
         initial_message: A dictionary containing lists for "request_valid", "request_invalid",
         "format_valid", and "format_invalid" URLs.
+        open_urls: Whether to open link href URL
+        headers: HTTP headers to include in the request
 
     Returns:
         None
@@ -166,17 +174,106 @@ def link_request(
     """
     if is_url(link["href"]):
         try:
-            if "s3" in link["href"]:
-                context = ssl._create_unverified_context()
-                response = urlopen(link["href"], context=context)
-            else:
-                response = urlopen(link["href"])
-            status_code = response.getcode()
-            if status_code == 200:
-                initial_message["request_valid"].append(link["href"])
+            if open_urls:
+                request = Request(link["href"], headers=headers)
+                if "s3" in link["href"]:
+                    context = ssl._create_unverified_context()
+                    response = urlopen(request, context=context)
+                else:
+                    response = urlopen(request)
+                status_code = response.getcode()
+                if status_code == 200:
+                    initial_message["request_valid"].append(link["href"])
         except Exception:
             initial_message["request_invalid"].append(link["href"])
         initial_message["format_valid"].append(link["href"])
     else:
         initial_message["request_invalid"].append(link["href"])
         initial_message["format_invalid"].append(link["href"])
+
+
+def fetch_remote_schema(uri: str) -> dict:
+    """
+    Fetch a remote schema from a URI.
+
+    Args:
+        uri (str): The URI of the schema to fetch.
+
+    Returns:
+        dict: The fetched schema content as a dictionary.
+
+    Raises:
+        requests.RequestException: If the request to fetch the schema fails.
+    """
+    response = requests.get(uri)
+    response.raise_for_status()
+    return response.json()
+
+
+@to_cached_resource()  # type: ignore
+def cached_retrieve(uri: URI) -> str:
+    """
+    Retrieve and cache a remote schema.
+
+    Args:
+        uri (str): The URI of the schema.
+
+    Returns:
+        str: The raw JSON string of the schema.
+
+    Raises:
+        requests.RequestException: If the request to fetch the schema fails.
+        Exception: For any other unexpected errors.
+    """
+    try:
+        response = requests.get(uri, timeout=10)  # Set a timeout for robustness
+        response.raise_for_status()  # Raise an error for HTTP response codes >= 400
+        return response.text
+    except requests.exceptions.RequestException as e:
+        raise requests.RequestException(
+            f"Failed to fetch schema from {uri}: {str(e)}"
+        ) from e
+    except Exception as e:
+        raise Exception(
+            f"Unexpected error while retrieving schema from {uri}: {str(e)}"
+        ) from e
+
+
+def validate_with_ref_resolver(schema_path: str, content: dict) -> None:
+    """
+    Validate a JSON document against a JSON Schema with dynamic reference resolution.
+
+    Args:
+        schema_path (str): Path or URI of the JSON Schema.
+        content (dict): JSON content to validate.
+
+    Raises:
+        jsonschema.exceptions.ValidationError: If validation fails.
+        requests.RequestException: If fetching a remote schema fails.
+        FileNotFoundError: If a local schema file is not found.
+        Exception: If any other error occurs during validation.
+    """
+    # Load the schema
+    if schema_path.startswith("http"):
+        schema = fetch_remote_schema(schema_path)
+    else:
+        try:
+            with open(schema_path, "r") as f:
+                schema = json.load(f)
+        except FileNotFoundError as e:
+            raise FileNotFoundError(f"Schema file not found: {schema_path}") from e
+
+    # Set up the resource and registry for schema resolution
+    resource: Resource = Resource(contents=schema, specification=DRAFT202012)  # type: ignore
+    registry: Registry = Registry(retrieve=cached_retrieve).with_resource(  # type: ignore
+        uri=schema_path, resource=resource
+    )  # type: ignore
+
+    # Validate the content against the schema
+    try:
+        validator = Draft202012Validator(schema, registry=registry)
+        validator.validate(content)
+    except jsonschema.exceptions.ValidationError as e:
+        raise jsonschema.exceptions.ValidationError(f"{e.message}") from e
+    except Exception as e:
+        raise Exception(f"Unexpected error during validation: {str(e)}") from e
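
Note on the `link_request` change above: the href is always format-checked, but it is only opened when `open_urls` is true, and the headers travel in a `urllib.request.Request`. The sketch below is a simplified, standalone approximation of that flow, not the upstream code. The names `check_link` and `opener` are hypothetical, the URL check is a stand-in for `is_url`, and the `s3` special case is left out. It also uses a `None` default for `headers` instead of the `{}` default seen in the diff, so that no dict is shared between calls.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def check_link(
    href: str,
    report: Dict[str, List[str]],
    open_urls: bool = True,
    headers: Optional[Dict[str, str]] = None,
    opener: Callable = urlopen,  # injectable so the sketch can run offline
) -> None:
    """Record whether href is well-formed and, optionally, reachable."""
    parsed = urlparse(href)
    if not (parsed.scheme and parsed.netloc):
        report["format_invalid"].append(href)
        report["request_invalid"].append(href)
        return
    report["format_valid"].append(href)
    if not open_urls:
        return  # format check only, no network traffic
    try:
        request = Request(href, headers=headers or {})
        if opener(request).getcode() == 200:
            report["request_valid"].append(href)
    except Exception:
        report["request_invalid"].append(href)
```

With `open_urls=False` an href lands in `format_valid` only, which matches the behaviour the `--no-assets-urls` option is described as enabling for assets.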


=====================================
stac_validator/validate.py
=====================================
@@ -6,7 +6,6 @@ from urllib.error import HTTPError, URLError
 
 import click  # type: ignore
 import jsonschema  # type: ignore
-from jsonschema.validators import validator_for
 from requests import exceptions  # type: ignore
 
 from .utilities import (
@@ -16,6 +15,7 @@ from .utilities import (
     is_valid_url,
     link_request,
     set_schema_addr,
+    validate_with_ref_resolver,
 )
 
 
@@ -33,6 +33,8 @@ class StacValidate:
         core (bool): Whether to only validate the core STAC object (without extensions).
         links (bool): Whether to additionally validate links (only works in default mode).
         assets (bool): Whether to additionally validate assets (only works in default mode).
+        assets_open_urls (bool): Whether to open assets URLs when validating assets.
+        headers (dict): HTTP headers to include in the requests.
         extensions (bool): Whether to only validate STAC object extensions.
         custom (str): The local filepath or remote URL of a custom JSON schema to validate the STAC object.
         verbose (bool): Whether to enable verbose output in recursive mode.
@@ -54,6 +56,8 @@ class StacValidate:
         core: bool = False,
         links: bool = False,
         assets: bool = False,
+        assets_open_urls: bool = True,
+        headers: dict = {},
         extensions: bool = False,
         custom: str = "",
         verbose: bool = False,
@@ -67,6 +71,8 @@ class StacValidate:
         self.schema = custom
         self.links = links
         self.assets = assets
+        self.assets_open_urls = assets_open_urls
+        self.headers: Dict = headers
         self.recursive = recursive
         self.max_depth = max_depth
         self.extensions = extensions
@@ -80,6 +86,16 @@ class StacValidate:
         self.log = log
 
     def create_err_msg(self, err_type: str, err_msg: str) -> Dict:
+        """
+        Create a standardized error message dictionary and mark validation as failed.
+
+        Args:
+            err_type (str): The type of error.
+            err_msg (str): The error message.
+
+        Returns:
+            dict: Dictionary containing error information.
+        """
         self.valid = False
         return {
             "version": self.version,
@@ -90,11 +106,17 @@ class StacValidate:
             "error_message": err_msg,
         }
 
-    def create_links_message(self):
-        format_valid = []
-        format_invalid = []
-        request_valid = []
-        request_invalid = []
+    def create_links_message(self) -> Dict:
+        """
+        Create an initial links validation message structure.
+
+        Returns:
+            dict: An empty validation structure for link checking.
+        """
+        format_valid: List = []
+        format_invalid: List = []
+        request_valid: List = []
+        request_invalid: List = []
         return {
             "format_valid": format_valid,
             "format_invalid": format_invalid,
@@ -103,6 +125,16 @@ class StacValidate:
         }
 
     def create_message(self, stac_type: str, val_type: str) -> Dict:
+        """
+        Create a standardized validation message dictionary.
+
+        Args:
+            stac_type (str): The STAC object type.
+            val_type (str): The type of validation (e.g., "default", "core").
+
+        Returns:
+            dict: Dictionary containing general validation information.
+        """
         return {
             "version": self.version,
             "path": self.stac_file,
@@ -113,81 +145,126 @@ class StacValidate:
         }
 
     def assets_validator(self) -> Dict:
-        """Validate assets.
+        """
+        Validate the 'assets' field in STAC content if present.
 
         Returns:
-            A dictionary containing the asset validation results.
+            dict: A dictionary containing the asset validation results.
         """
         initial_message = self.create_links_message()
         assets = self.stac_content.get("assets")
         if assets:
             for asset in assets.values():
-                link_request(asset, initial_message)
+                link_request(
+                    asset, initial_message, self.assets_open_urls, self.headers
+                )
         return initial_message
 
     def links_validator(self) -> Dict:
-        """Validate links.
+        """
+        Validate the 'links' field in STAC content.
 
         Returns:
-            A dictionary containing the link validation results.
+            dict: A dictionary containing the link validation results.
         """
         initial_message = self.create_links_message()
-        # get root_url for checking relative links
         root_url = ""
+
+        # Try to locate a self/alternate link that is a valid URL for root reference
         for link in self.stac_content["links"]:
             if link["rel"] in ["self", "alternate"] and is_valid_url(link["href"]):
                 root_url = (
                     link["href"].split("/")[0] + "//" + link["href"].split("/")[2]
                 )
+
+        # Validate each link, making it absolute if necessary
         for link in self.stac_content["links"]:
             if not is_valid_url(link["href"]):
                 link["href"] = root_url + link["href"][1:]
-            link_request(link, initial_message)
+            link_request(link, initial_message, True, self.headers)
 
         return initial_message
 
+    def custom_validator(self) -> None:
+        """
+        Validate a STAC JSON file against a custom or dynamically resolved JSON schema.
+
+        1. If `self.schema` is a valid URL, fetch and validate.
+        2. If it is a local file path, use it.
+        3. Otherwise, assume it is a relative path and resolve relative to the STAC file.
+
+        Returns:
+            None
+        """
+        if is_valid_url(self.schema):
+            validate_with_ref_resolver(self.schema, self.stac_content)
+        elif os.path.exists(self.schema):
+            validate_with_ref_resolver(self.schema, self.stac_content)
+        else:
+            file_directory = os.path.dirname(os.path.abspath(str(self.stac_file)))
+            self.schema = os.path.join(file_directory, self.schema)
+            self.schema = os.path.abspath(os.path.realpath(self.schema))
+            validate_with_ref_resolver(self.schema, self.stac_content)
+
+    def core_validator(self, stac_type: str) -> None:
+        """
+        Validate the STAC content against the core schema determined by stac_type and version.
+
+        Args:
+            stac_type (str): The type of the STAC object (e.g., "item", "collection").
+        """
+        stac_type = stac_type.lower()
+        self.schema = set_schema_addr(self.version, stac_type)
+        validate_with_ref_resolver(self.schema, self.stac_content)
+
     def extensions_validator(self, stac_type: str) -> Dict:
-        """Validate the STAC extensions according to their corresponding JSON schemas.
+        """
+        Validate STAC extensions for an ITEM or validate the core schema for a COLLECTION.
 
         Args:
             stac_type (str): The STAC object type ("ITEM" or "COLLECTION").
 
         Returns:
-            dict: A dictionary containing validation results.
-
-        Raises:
-            JSONSchemaValidationError: If there is a validation error in the JSON schema.
-            Exception: If there is an error in the STAC extension validation process.
+            dict: A dictionary containing extension (or core) validation results.
         """
         message = self.create_message(stac_type, "extensions")
         message["schema"] = []
         valid = True
+
         if stac_type == "ITEM":
             try:
                 if "stac_extensions" in self.stac_content:
-                    # error with the 'proj' extension not being 'projection' in older stac
+                    # Handle legacy "proj" to "projection" mapping
                     if "proj" in self.stac_content["stac_extensions"]:
                         index = self.stac_content["stac_extensions"].index("proj")
                         self.stac_content["stac_extensions"][index] = "projection"
+
                     schemas = self.stac_content["stac_extensions"]
                     for extension in schemas:
                         if not (is_valid_url(extension) or extension.endswith(".json")):
-                            # where are the extensions for 1.0.0-beta.2 on cdn.staclint.com?
                             if self.version == "1.0.0-beta.2":
                                 self.stac_content["stac_version"] = "1.0.0-beta.1"
                                 self.version = self.stac_content["stac_version"]
-                            extension = f"https://cdn.staclint.com/v{self.version}/extension/{extension}.json"
+                            extension = (
+                                f"https://cdn.staclint.com/v{self.version}/extension/"
+                                f"{extension}.json"
+                            )
                         self.schema = extension
                         self.custom_validator()
                         message["schema"].append(extension)
+
             except jsonschema.exceptions.ValidationError as e:
                 valid = False
                 if e.absolute_path:
-                    err_msg = f"{e.message}. Error is in {' -> '.join([str(i) for i in e.absolute_path])}"
+                    err_msg = (
+                        f"{e.message}. Error is in "
+                        f"{' -> '.join(map(str, e.absolute_path))}"
+                    )
                 else:
-                    err_msg = f"{e.message} of the root of the STAC object"
+                    err_msg = f"{e.message}"
                 message = self.create_err_msg("JSONSchemaValidationError", err_msg)
                 return message
+
             except Exception as e:
                 valid = False
                 err_msg = f"{e}. Error in Extensions."
@@ -195,128 +272,77 @@ class StacValidate:
         else:
             self.core_validator(stac_type)
             message["schema"] = [self.schema]
+
         self.valid = valid
         return message
 
-    def custom_validator(self) -> None:
-        """Validates a STAC JSON file against a JSON schema, which may be located
-        either online or locally.
-
-        The function checks whether the provided schema URL is valid and can be
-        fetched and parsed. If the schema is hosted online, the function uses the
-        fetched schema to validate the STAC JSON file. If the schema is local, the
-        function resolves any references in the schema and then validates the STAC
-        JSON file against the resolved schema. If the schema is specified as a
-        relative path, the function resolves the path relative to the STAC JSON file
-        being validated and uses the resolved schema to validate the STAC JSON file.
-
-        Returns:
-            None
-        """
-        # if schema is hosted online
-        if is_valid_url(self.schema):
-            schema = fetch_and_parse_schema(self.schema)
-            jsonschema.validate(self.stac_content, schema)
-        # in case the path to a json schema is local
-        elif os.path.exists(self.schema):
-            schema_dict = fetch_and_parse_schema(self.schema)
-            # determine the appropriate validator class for the schema
-            ValidatorClass = validator_for(schema_dict)
-            validator = ValidatorClass(schema_dict)
-            # validate the content
-            validator.validate(self.stac_content)
-
-        # deal with a relative path in the schema
-        else:
-            file_directory = os.path.dirname(os.path.abspath(str(self.stac_file)))
-            self.schema = os.path.join(str(file_directory), self.schema)
-            self.schema = os.path.abspath(os.path.realpath(self.schema))
-            schema = fetch_and_parse_schema(self.schema)
-            jsonschema.validate(self.stac_content, schema)
-
-    def core_validator(self, stac_type: str) -> None:
-        """Validate the STAC item or collection against the appropriate JSON schema.
-
-        Args:
-            stac_type (str): The type of STAC object being validated (either "item" or "collection").
-
-        Returns:
-            None
-
-        Raises:
-            ValidationError: If the STAC object fails to validate against the JSON schema.
-
-        The function first determines the appropriate JSON schema to use based on the STAC object's type and version.
-        If the version is one of the specified versions (0.8.0, 0.9.0, 1.0.0, 1.0.0-beta.1, 1.0.0-beta.2, or 1.0.0-rc.2),
-        it uses the corresponding schema stored locally. Otherwise, it retrieves the schema from the appropriate URL
-        using the `set_schema_addr` function. The function then calls the `custom_validator` method to validate the
-        STAC object against the schema.
-        """
-        stac_type = stac_type.lower()
-        self.schema = set_schema_addr(self.version, stac_type)
-        self.custom_validator()
-
     def default_validator(self, stac_type: str) -> Dict:
-        """Validate the STAC catalog or item against the core schema and its extensions.
+        """
+        Validate a STAC Catalog or Item against the core schema and its extensions.
 
         Args:
-            stac_type (str): The type of STAC object being validated. Must be either "catalog" or "item".
+            stac_type (str): The type of STAC object. Must be "catalog" or "item".
 
         Returns:
-            A dictionary containing the results of the default validation, including whether the STAC object is valid,
-            any validation errors encountered, and any links and assets that were validated.
+            dict: A dictionary with results of the default validation.
         """
         message = self.create_message(stac_type, "default")
         message["schema"] = []
+
+        # Validate core
         self.core_validator(stac_type)
         core_schema = self.schema
         message["schema"].append(core_schema)
-        stac_type = stac_type.upper()
-        if stac_type == "ITEM":
-            message = self.extensions_validator(stac_type)
+        stac_upper = stac_type.upper()
+
+        # Validate extensions if ITEM
+        if stac_upper == "ITEM":
+            message = self.extensions_validator(stac_upper)
             message["validation_method"] = "default"
             message["schema"].append(core_schema)
+
+        # Optionally validate links
         if self.links:
             message["links_validated"] = self.links_validator()
+
+        # Optionally validate assets
         if self.assets:
             message["assets_validated"] = self.assets_validator()
+
         return message
 
     def recursive_validator(self, stac_type: str) -> bool:
-        """Recursively validate a STAC JSON document against its JSON Schema.
+        """
+        Recursively validate a STAC JSON document and its children/items.
 
-        This method validates a STAC JSON document recursively against its JSON Schema by following its "child" and "item" links.
-        It uses the `default_validator` and `fetch_and_parse_file` functions to validate the current STAC document and retrieve the
-        next one to be validated, respectively.
+        Follows "child" and "item" links, calling `default_validator` on each.
 
         Args:
-            self: An instance of the STACValidator class.
-            stac_type: A string representing the STAC object type to validate.
+            stac_type (str): The STAC object type to validate.
 
         Returns:
-            A boolean indicating whether the validation was successful.
-
-        Raises:
-            jsonschema.exceptions.ValidationError: If the STAC document does not validate against its JSON Schema.
-
+            bool: True if all validations are successful, False otherwise.
         """
-        if self.skip_val is False:
+        if not self.skip_val:
             self.schema = set_schema_addr(self.version, stac_type.lower())
             message = self.create_message(stac_type, "recursive")
             message["valid_stac"] = False
+
             try:
                 _ = self.default_validator(stac_type)
-
             except jsonschema.exceptions.ValidationError as e:
                 if e.absolute_path:
-                    err_msg = f"{e.message}. Error is in {' -> '.join([str(i) for i in e.absolute_path])}"
+                    err_msg = (
+                        f"{e.message}. Error is in "
+                        f"{' -> '.join([str(i) for i in e.absolute_path])}"
+                    )
                 else:
-                    err_msg = f"{e.message} of the root of the STAC object"
+                    err_msg = f"{e.message}"
                 message.update(
                     self.create_err_msg("JSONSchemaValidationError", err_msg)
                 )
                 self.message.append(message)
-                if self.verbose is True:
+                if self.verbose:
                     click.echo(json.dumps(message, indent=4))
                 return False
 
@@ -324,25 +350,29 @@ class StacValidate:
             self.message.append(message)
             if self.verbose:
                 click.echo(json.dumps(message, indent=4))
+
             self.depth += 1
             if self.max_depth and self.depth >= self.max_depth:
                 self.skip_val = True
+
             base_url = self.stac_file
 
             for link in self.stac_content["links"]:
-                if link["rel"] == "child" or link["rel"] == "item":
+                if link["rel"] in ("child", "item"):
                     address = link["href"]
                     if not is_valid_url(address):
-                        x = str(base_url).split("/")
-                        x.pop(-1)
-                        st = x[0]
-                        for i in range(len(x)):
-                            if i > 0:
-                                st = st + "/" + x[i]
-                        self.stac_file = st + "/" + address
+                        path_parts = str(base_url).split("/")
+                        path_parts.pop(-1)
+                        root = path_parts[0]
+                        for i in range(1, len(path_parts)):
+                            root = f"{root}/{path_parts[i]}"
+                        self.stac_file = f"{root}/{address}"
                     else:
                         self.stac_file = address
-                    self.stac_content = fetch_and_parse_file(str(self.stac_file))
+
+                    self.stac_content = fetch_and_parse_file(
+                        str(self.stac_file), self.headers
+                    )
                     self.stac_content["stac_version"] = self.version
                     stac_type = get_stac_type(self.stac_content).lower()
 
@@ -354,7 +384,7 @@ class StacValidate:
                     message = self.create_message(stac_type, "recursive")
                     if self.version == "0.7.0":
                         schema = fetch_and_parse_schema(self.schema)
-                        # this next line prevents this: unknown url type: 'geojson.json' ??
+                        # Prevent unknown url type issue
                         schema["allOf"] = [{}]
                         jsonschema.validate(self.stac_content, schema)
                     else:
@@ -362,131 +392,110 @@ class StacValidate:
                         message["schema"] = msg["schema"]
                     message["valid_stac"] = True
 
-                    if self.log != "":
+                    if self.log:
                         self.message.append(message)
-                    if (
-                        not self.max_depth or self.max_depth < 5
-                    ):  # TODO this should be configurable, correct?
+                    if not self.max_depth or self.max_depth < 5:
                         self.message.append(message)
+
         return True
 
-    def validate_dict(self, stac_content) -> bool:
-        """Validate the contents of a dictionary representing a STAC object.
+    def validate_dict(self, stac_content: Dict) -> bool:
+        """
+        Validate the contents of a dictionary representing a STAC object.
 
         Args:
-            stac_content (dict): The dictionary representation of the STAC object to validate.
+            stac_content (dict): The dictionary representation of the STAC object.
 
         Returns:
-            A bool indicating if validation was successfull.
+            bool: True if validation succeeded, False otherwise.
         """
         self.stac_content = stac_content
         return self.run()
 
     def validate_item_collection_dict(self, item_collection: Dict) -> None:
-        """Validate the contents of an item collection.
+        """
+        Validate the contents of a STAC Item Collection.
 
         Args:
-            item_collection (dict): The dictionary representation of the item collection to validate.
-
-        Returns:
-            None
+            item_collection (dict): The dictionary representation of the item collection.
         """
         for item in item_collection["features"]:
             self.schema = ""
             self.validate_dict(item)
 
     def validate_collections(self) -> None:
-        """ "Validate STAC collections from a /collections endpoint.
+        """
+        Validate STAC Collections from a /collections endpoint.
 
         Raises:
-            URLError: If there is an issue with the URL used to fetch the item collection.
-            JSONDecodeError: If the item collection content cannot be parsed as JSON.
-            ValueError: If the item collection does not conform to the STAC specification.
-            TypeError: If the item collection content is not a dictionary or JSON object.
-            FileNotFoundError: If the item collection file cannot be found.
-            ConnectionError: If there is an issue with the internet connection used to fetch the item collection.
-            exceptions.SSLError: If there is an issue with the SSL connection used to fetch the item collection.
-            OSError: If there is an issue with the file system (e.g., read/write permissions) while trying to write to the log file.
-
-        Returns:
-            None
+            URLError, JSONDecodeError, ValueError, TypeError, FileNotFoundError,
+            ConnectionError, exceptions.SSLError, OSError: Various errors related
+            to fetching or parsing.
         """
-        collections = fetch_and_parse_file(str(self.stac_file))
+        collections = fetch_and_parse_file(str(self.stac_file), self.headers)
         for collection in collections["collections"]:
             self.schema = ""
             self.validate_dict(collection)
 
     def validate_item_collection(self) -> None:
-        """Validate a STAC item collection.
+        """
+        Validate a STAC Item Collection with optional pagination.
 
         Raises:
-            URLError: If there is an issue with the URL used to fetch the item collection.
-            JSONDecodeError: If the item collection content cannot be parsed as JSON.
-            ValueError: If the item collection does not conform to the STAC specification.
-            TypeError: If the item collection content is not a dictionary or JSON object.
-            FileNotFoundError: If the item collection file cannot be found.
-            ConnectionError: If there is an issue with the internet connection used to fetch the item collection.
-            exceptions.SSLError: If there is an issue with the SSL connection used to fetch the item collection.
-            OSError: If there is an issue with the file system (e.g., read/write permissions) while trying to write to the log file.
-
-        Returns:
-            None
+            URLError, JSONDecodeError, ValueError, TypeError, FileNotFoundError,
+            ConnectionError, exceptions.SSLError, OSError: Various errors related
+            to fetching or parsing.
         """
         page = 1
         print(f"processing page {page}")
-        item_collection = fetch_and_parse_file(str(self.stac_file))
+        item_collection = fetch_and_parse_file(str(self.stac_file), self.headers)
         self.validate_item_collection_dict(item_collection)
+
         try:
             if self.pages is not None:
                 for _ in range(self.pages - 1):
                     if "links" in item_collection:
                         for link in item_collection["links"]:
                             if link["rel"] == "next":
-                                page = page + 1
+                                page += 1
                                 print(f"processing page {page}")
                                 next_link = link["href"]
                                 self.stac_file = next_link
                                 item_collection = fetch_and_parse_file(
-                                    str(self.stac_file)
+                                    str(self.stac_file), self.headers
                                 )
                                 self.validate_item_collection_dict(item_collection)
                                 break
         except Exception as e:
-            message = {}
-            message["pagination_error"] = (
-                f"Validating the item collection failed on page {page}: {str(e)}"
-            )
+            message = {
+                "pagination_error": (
+                    f"Validating the item collection failed on page {page}: {str(e)}"
+                )
+            }
             self.message.append(message)
 
     def run(self) -> bool:
-        """Runs the STAC validation process based on the input parameters.
+        """
+        Run the STAC validation process based on the input parameters.
 
         Returns:
             bool: True if the STAC is valid, False otherwise.
 
         Raises:
-            URLError: If there is an error with the URL.
-            JSONDecodeError: If there is an error decoding the JSON content.
-            ValueError: If there is an invalid value.
-            TypeError: If there is an invalid type.
-            FileNotFoundError: If the file is not found.
-            ConnectionError: If there is an error with the connection.
-            exceptions.SSLError: If there is an SSL error.
-            OSError: If there is an error with the operating system.
-            jsonschema.exceptions.ValidationError: If the STAC content fails validation.
-            KeyError: If the specified key is not found.
-            HTTPError: If there is an error with the HTTP connection.
-            Exception: If there is any other type of error.
-
+            URLError, JSONDecodeError, ValueError, TypeError, FileNotFoundError,
+            ConnectionError, exceptions.SSLError, OSError, KeyError, HTTPError,
+            jsonschema.exceptions.ValidationError, Exception: Various errors
+            during fetching or parsing.
         """
         message = {}
         try:
+            # Fetch STAC content if not provided via item_collection/collections
             if (
                 self.stac_file is not None
                 and not self.item_collection
                 and not self.collections
             ):
-                self.stac_content = fetch_and_parse_file(self.stac_file)
+                self.stac_content = fetch_and_parse_file(self.stac_file, self.headers)
 
             stac_type = get_stac_type(self.stac_content).upper()
             self.version = self.stac_content["stac_version"]
@@ -496,24 +505,31 @@ class StacValidate:
                 self.core_validator(stac_type)
                 message["schema"] = [self.schema]
                 self.valid = True
-            elif self.schema != "":
+
+            elif self.schema:
                 message = self.create_message(stac_type, "custom")
                 message["schema"] = [self.schema]
                 self.custom_validator()
                 self.valid = True
+
             elif self.recursive:
                 self.valid = self.recursive_validator(stac_type)
+
             elif self.extensions:
                 message = self.extensions_validator(stac_type)
+
             else:
                 self.valid = True
                 message = self.default_validator(stac_type)
 
         except jsonschema.exceptions.ValidationError as e:
             if e.absolute_path:
-                err_msg = f"{e.message}. Error is in {' -> '.join([str(i) for i in e.absolute_path])} "
+                err_msg = (
+                    f"{e.message}. Error is in "
+                    f"{' -> '.join([str(i) for i in e.absolute_path])} "
+                )
             else:
-                err_msg = f"{e.message} of the root of the STAC object"
+                err_msg = f"{e.message}"
             message.update(self.create_err_msg("JSONSchemaValidationError", err_msg))
 
         except (
@@ -537,7 +553,8 @@ class StacValidate:
             message["valid_stac"] = self.valid
             self.message.append(message)
 
-        if self.log != "":
+        # Write out log if path is provided
+        if self.log:
             with open(self.log, "w") as f:
                 f.write(json.dumps(self.message, indent=4))
 


=====================================
tests/test_assets.py
=====================================
@@ -1,5 +1,5 @@
 """
-Description: Test --links option
+Description: Test --assets option
 
 """
 
@@ -22,7 +22,7 @@ def test_assets_v090():
             ],
             "valid_stac": False,
             "error_type": "JSONSchemaValidationError",
-            "error_message": "-0.00751271 is less than the minimum of 0. Error is in properties -> view:off_nadir",
+            "error_message": "-0.00751271 is less than the minimum of 0",
             "validation_method": "default",
             "assets_validated": {
                 "format_valid": [
@@ -78,6 +78,33 @@ def test_assets_v100():
     ]
 
 
+def test_assets_v100_no_links():
+    stac_file = "tests/test_data/v100/simple-item.json"
+    stac = stac_validator.StacValidate(stac_file, assets=True, assets_open_urls=False)
+    stac.run()
+    assert stac.message == [
+        {
+            "version": "1.0.0",
+            "path": "tests/test_data/v100/simple-item.json",
+            "schema": [
+                "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
+            ],
+            "valid_stac": True,
+            "asset_type": "ITEM",
+            "validation_method": "default",
+            "assets_validated": {
+                "format_valid": [
+                    "https://storage.googleapis.com/open-cogs/stac-examples/20201211_223832_CS2_test.tif",
+                    "https://storage.googleapis.com/open-cogs/stac-examples/20201211_223832_CS2_test.jpg",
+                ],
+                "format_invalid": [],
+                "request_valid": [],
+                "request_invalid": [],
+            },
+        }
+    ]
+
+
 def test_assets_on_collection_without_assets_ok():
     stac_file = "tests/test_data/v100/collection.json"
     stac = stac_validator.StacValidate(stac_file, assets=True)


=====================================
tests/test_core.py
=====================================
@@ -83,7 +83,7 @@ def test_core_bad_item_local_v090():
             "schema": ["https://cdn.staclint.com/v0.9.0/item.json"],
             "valid_stac": False,
             "error_type": "JSONSchemaValidationError",
-            "error_message": "'id' is a required property of the root of the STAC object",
+            "error_message": "'id' is a required property",
         }
     ]
 


=====================================
tests/test_custom.py
=====================================
@@ -20,7 +20,7 @@ def test_custom_item_remote_schema_v080():
             "validation_method": "custom",
             "valid_stac": False,
             "error_type": "JSONSchemaValidationError",
-            "error_message": "'bbox' is a required property of the root of the STAC object",
+            "error_message": "'bbox' is a required property",
         }
     ]
 
@@ -74,7 +74,7 @@ def test_custom_bad_item_remote_schema_v090():
             "schema": ["https://cdn.staclint.com/v0.9.0/item.json"],
             "valid_stac": False,
             "error_type": "JSONSchemaValidationError",
-            "error_message": "'id' is a required property of the root of the STAC object",
+            "error_message": "'id' is a required property",
         }
     ]
 
@@ -99,29 +99,6 @@ def test_custom_item_remote_schema_v1rc2():
     ]
 
 
-def test_custom_proj_error_v1rc2():
-    schema = "https://stac-extensions.github.io/projection/v1.0.0/schema.json"
-    stac_file = (
-        "tests/test_data/1rc2/extensions-collection/./proj-example/proj-example.json"
-    )
-    stac = stac_validator.StacValidate(stac_file, custom=schema)
-    stac.run()
-    assert stac.message == [
-        {
-            "version": "1.0.0-rc.2",
-            "path": "tests/test_data/1rc2/extensions-collection/./proj-example/proj-example.json",
-            "schema": [
-                "https://stac-extensions.github.io/projection/v1.0.0/schema.json"
-            ],
-            "valid_stac": False,
-            "asset_type": "ITEM",
-            "validation_method": "custom",
-            "error_type": "JSONSchemaValidationError",
-            "error_message": "'A' is not of type 'number'. Error is in properties -> proj:centroid -> lat ",
-        }
-    ]
-
-
 def test_custom_item_v100_relative_schema():
     schema = "../schema/v1.0.0/projection.json"
     stac_file = "tests/test_data/v100/extended-item-no-extensions.json"


=====================================
tests/test_header.py
=====================================
@@ -0,0 +1,50 @@
+"""
+Description: Test --header option
+
+"""
+
+import json
+
+import requests_mock
+
+from stac_validator import stac_validator
+
+
+def test_header():
+    stac_file = "tests/test_data/v110/simple-item.json"
+    url = "https://localhost/" + stac_file
+
+    no_headers = {}
+    valid_headers = {"x-api-key": "a-valid-api-key"}
+
+    with requests_mock.Mocker(real_http=True) as mock, open(stac_file) as json_data:
+        mock.get(url, request_headers=no_headers, status_code=403)
+        mock.get(url, request_headers=valid_headers, json=json.load(json_data))
+
+        stac = stac_validator.StacValidate(url, core=True, headers=valid_headers)
+        stac.run()
+        assert stac.message == [
+            {
+                "version": "1.1.0",
+                "path": "https://localhost/tests/test_data/v110/simple-item.json",
+                "schema": [
+                    "https://schemas.stacspec.org/v1.1.0/item-spec/json-schema/item.json"
+                ],
+                "valid_stac": True,
+                "asset_type": "ITEM",
+                "validation_method": "core",
+            }
+        ]
+
+        stac = stac_validator.StacValidate(url, core=True, headers=no_headers)
+        stac.run()
+        assert stac.message == [
+            {
+                "version": "",
+                "path": "https://localhost/tests/test_data/v110/simple-item.json",
+                "schema": [""],
+                "valid_stac": False,
+                "error_type": "HTTPError",
+                "error_message": "403 Client Error: None for url: https://localhost/tests/test_data/v110/simple-item.json",
+            }
+        ]
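
Note on the header test above: it depends on custom request headers reaching the server. As a rough illustration only, the sketch below attaches headers to a request using urllib from the standard library. It is not the project's fetch_and_parse_file (which lives in stac_validator/utilities.py and is not reproduced here), and build_request is a hypothetical name. No network access is performed.

```python
import urllib.request
from typing import Dict, Optional


def build_request(
    url: str, headers: Optional[Dict[str, str]] = None
) -> urllib.request.Request:
    """Hypothetical helper: prepare a GET request carrying optional headers."""
    # urllib stores header names capitalized, e.g. "x-api-key" -> "X-api-key".
    return urllib.request.Request(url, headers=headers or {})


req = build_request(
    "https://localhost/tests/test_data/v110/simple-item.json",
    {"x-api-key": "a-valid-api-key"},
)
# Inspect what would be sent; nothing is actually fetched here.
print(req.header_items())
```

Passing an empty dict (or None) yields a request with no extra headers, which corresponds to the 403 case the test mocks.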


=====================================
tests/test_links.py
=====================================
@@ -20,7 +20,7 @@ def test_poorly_formatted_v090():
             ],
             "valid_stac": False,
             "error_type": "JSONSchemaValidationError",
-            "error_message": "-0.00751271 is less than the minimum of 0. Error is in properties -> view:off_nadir",
+            "error_message": "-0.00751271 is less than the minimum of 0",
             "validation_method": "default",
             "links_validated": {
                 "format_valid": [


=====================================
tests/test_recursion.py
=====================================
@@ -328,7 +328,7 @@ def test_recursion_with_bad_item():
             ],
             "valid_stac": False,
             "error_type": "JSONSchemaValidationError",
-            "error_message": "'id' is a required property of the root of the STAC object",
+            "error_message": "'id' is a required property",
         },
     ]
 
@@ -350,6 +350,6 @@ def test_recursion_with_missing_collection_link():
             "valid_stac": False,
             "validation_method": "recursive",
             "error_type": "JSONSchemaValidationError",
-            "error_message": "'simple-collection' should not be valid under {}. Error is in collection",
+            "error_message": "'simple-collection' should not be valid under {}",
         },
     ]
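
Note on relative links: the recursive_validator hunk in stac_validator/validate.py resolves a relative "child" or "item" href by dropping the last segment of the current file path and appending the href. A minimal sketch of that idea follows. The name resolve_link is hypothetical, the urlparse check is only a stand-in for the project's is_valid_url, and the sketch assumes the base path contains at least one "/".

```python
from urllib.parse import urlparse


def resolve_link(base_url: str, href: str) -> str:
    """Hypothetical helper: resolve a link href against the current STAC file."""
    # Stand-in for is_valid_url: anything with a scheme and a host is absolute.
    parsed = urlparse(href)
    if parsed.scheme and parsed.netloc:
        return href
    # Drop the last path segment of the base (the file name), then append the href.
    directory = base_url.rsplit("/", 1)[0]
    return f"{directory}/{href}"


# The base file name below is made up for illustration.
print(resolve_link("tests/test_data/example/catalog.json", "./item/item.json"))
```

As in the diff, a leading "./" is kept rather than normalized, which is why paths containing "/./" can appear in reported messages.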


=====================================
tox.ini
=====================================
@@ -1,6 +1,8 @@
 [tox]
-envlist = py38,py39,py310,py311
+envlist = py38,py39,py310,py311,py312,py313
 
 [testenv]
-deps = pytest
+deps = 
+    pytest
+    requests-mock
 commands = pytest
\ No newline at end of file


=====================================
tox/Dockerfile-tox
=====================================
@@ -4,5 +4,5 @@ COPY . /code/
 RUN export LC_ALL=C.UTF-8 && \
     export LANG=C.UTF-8 && \
     pip3 install . && \
-    pip3 install tox==4.0.11 && \
+    pip3 install tox==4.23.2 && \
     tox
\ No newline at end of file
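
Note on the error_message changes: in the validate.py hunks above, a schema error at the root of the object is now reported with the bare message, without the former " of the root of the STAC object" suffix, while an error with a non-empty path still gets an "Error is in ..." trailer. The sketch below mirrors the branch shown in recursive_validator. The function name format_schema_error is hypothetical, and the expected messages in the updated tests may come from code paths not shown in this excerpt.

```python
from typing import Iterable, Union


def format_schema_error(
    message: str, absolute_path: Iterable[Union[str, int]] = ()
) -> str:
    """Hypothetical helper: build an error string from a message and a JSON path."""
    parts = [str(p) for p in absolute_path]
    if parts:
        # Non-root error: append the path to the failing element.
        return f"{message}. Error is in {' -> '.join(parts)}"
    # Root-level error: the message is reported as-is.
    return message


print(format_schema_error("'id' is a required property"))
print(format_schema_error("'A' is not of type 'number'", ["properties", "proj:centroid", "lat"]))
```

The run() method in the same file formats the path case the same way but leaves a trailing space after the path.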



View it on GitLab: https://salsa.debian.org/debian-gis-team/stac-validator/-/compare/f54338c5d62fafe87e46f3a23e17f4db5a89884a...f5cd4a6faa2ced4e7150aba31c88d4ea75ec50c3


