[Git][debian-gis-team/stac-validator][upstream] New upstream version 3.11.0

Antonio Valentino (@antonio.valentino) gitlab@salsa.debian.org
Sat Mar 28 18:33:02 GMT 2026



Antonio Valentino pushed to branch upstream at Debian GIS Project / stac-validator


Commits:
faedebc1 by Antonio Valentino at 2026-03-28T17:09:01+00:00
New upstream version 3.11.0
- - - - -


9 changed files:

- CHANGELOG.md
- README.md
- + assets/cloudferro-logo.png
- pyproject.toml
- stac_validator/stac_validator.py
- stac_validator/utilities.py
- + tests/test_schema_cache.py
- tests/test_sys_exit.py
- tests/test_validate_dict.py


Changes:

=====================================
CHANGELOG.md
=====================================
@@ -6,6 +6,33 @@ The format is (loosely) based on [Keep a Changelog](http://keepachangelog.com/)
 
 ## [Unreleased]
 
+### Added
+
+### Changed
+
+### Fixed
+
+### Removed
+
+### Updated
+
+## [v3.11.0] - 2026-03-27
+
+### Added 
+
+- Added a new CLI option --schema-cache-size to control the in-memory schema cache size at runtime.
+- Added support for setting schema cache size to 0 to disable schema caching for low-memory environments.
+
+### Changed 
+
+- Refactored schema caching to support runtime cache reconfiguration while preserving cache inspection and clearing behavior.
+- Added test coverage for:
+  - runtime schema cache size reconfiguration
+  - zero-size cache behavior
+  - negative cache size validation
+  - CLI acceptance of --schema-cache-size
+
+
 ## [v3.10.2] - 2025-11-16
 
 ### Fixed
@@ -310,7 +337,8 @@ The format is (loosely) based on [Keep a Changelog](http://keepachangelog.com/)
 - With the newest version - 1.0.0-beta.2 - items will run through jsonchema validation before the PySTAC validation. The reason for this is that jsonschema will give more informative error messages. This should be addressed better in the future. This is not the case with the --recursive option as time can be a concern here with larger collections.
 - Logging. Various additions were made here depending on the options selected. This was done to help assist people to update their STAC collections.
 
-[Unreleased]: https://github.com/sparkgeo/stac-validator/compare/v3.10.2..main
+[Unreleased]: https://github.com/sparkgeo/stac-validator/compare/v3.11.0..main
+[v3.11.0]: https://github.com/sparkgeo/stac-validator/compare/v3.10.2..v3.11.0
 [v3.10.2]: https://github.com/sparkgeo/stac-validator/compare/v3.10.1..v3.10.2
 [v3.10.1]: https://github.com/sparkgeo/stac-validator/compare/v3.10.0..v3.10.1
 [v3.10.0]: https://github.com/sparkgeo/stac-validator/compare/v3.9.3..v3.10.0


=====================================
README.md
=====================================
@@ -23,6 +23,7 @@
 - [Usage](#usage)
   - [CLI](#cli)
   - [Python](#python)
+- [Schema Cache Settings](#schema-cache-settings)
 - [Examples](#additional-examples)
   - [Core Validation](#--core)
   - [Custom Schema](#--custom)
@@ -178,6 +179,7 @@ Options:
                                   (local filepath).
   --pydantic                      Validate using stac-pydantic models for enhanced
                                   type checking and validation.
+  --schema-cache-size INTEGER     Max number of schema entries to cache in memory. Use 0 to disable schema caching. Defaults to 16.
   --schema-config TEXT            Path to a YAML or JSON schema config file.
   --verbose                       Enable verbose output. This will output
                                   additional information during validation.
@@ -244,6 +246,20 @@ stac.validate_dict(dictionary)
 print(stac.message)
 ```
 
+**Set schema cache size**
+```python
+from stac_validator import stac_validator
+from stac_validator.utilities import set_schema_cache_size
+
+# Set once at app startup (process-wide)
+set_schema_cache_size(16)  # use 0 to disable caching
+
+stac = stac_validator.StacValidate()
+stac.validate_dict(dictionary)
+print(stac.message)
+```
+
+
 **Item Collection**
 
 ```python
@@ -254,6 +270,29 @@ stac.validate_item_collection_dict(item_collection_dict)
 print(stac.message)
 ```
 
+
+### Schema Cache Settings
+
+- Default schema cache size is 16 entries.
+- Use `--schema-cache-size` in the CLI or `set_schema_cache_size(...)` in Python to override it.
+- Use `0` to disable schema caching.
+
+Use `set_schema_cache_size` once at application startup:
+
+```python
+from stac_validator.utilities import set_schema_cache_size
+
+# Examples:
+set_schema_cache_size(16)  # small cache for low-memory deployments
+set_schema_cache_size(64)  # moderate cache for long-running services
+set_schema_cache_size(0)   # disable schema caching
+```
+
+Notes:
+- `StacValidate()` and `validate_dict()` do not accept a cache-size parameter.
+- Changing cache size at runtime replaces the cache instance and drops existing cached entries.
+- In multi-worker deployments, configure cache size in each worker process.
+
 ## Deployment
 
 ### Docker
@@ -527,11 +566,13 @@ The following organizations have contributed time and/or funding to support the
 - [Healy Hyperspatial](https://healy-hyperspatial.github.io/)
 - [Radiant Earth Foundation](https://radiant.earth/)
 - [Sparkgeo](https://sparkgeo.com/)
+- [CloudFerro](https://cloudferro.com/)
 
 <p align="left">
   <a href="https://healy-hyperspatial.github.io/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/hh-logo-blue.png" alt="Healy Hyperspatial" height="100" hspace="20"></a>
   <a href="https://radiant.earth/"><img src="assets/radiant-earth.webp" alt="Radiant Earth Foundation" height="100" hspace="20"></a>
   <a href="https://sparkgeo.com/"><img src="assets/sparkgeo_logo.jpeg" alt="Sparkgeo" height="100" hspace="20"></a>
+  <a href="https://cloudferro.com/"><img src="assets/cloudferro-logo.png" alt="CloudFerro" height="110" hspace="20"></a>
 </p>
 
 


=====================================
assets/cloudferro-logo.png
=====================================
Binary files /dev/null and b/assets/cloudferro-logo.png differ


=====================================
pyproject.toml
=====================================
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "stac_validator"
-version = "3.10.2"
+version = "3.11.0"
 description = "A package to validate STAC files"
 authors = [
     {name = "James Banting"},


=====================================
stac_validator/stac_validator.py
=====================================
@@ -5,6 +5,7 @@ from typing import Any, Dict, List, Optional, Tuple
 
 import click  # type: ignore
 
+from .utilities import set_schema_cache_size
 from .validate import StacValidate
 
 
@@ -232,6 +233,12 @@ def recursive_validation_summary(message: List[Dict[str, Any]]) -> None:
     is_flag=True,
     help="Enable verbose output. This will output additional information during validation.",
 )
+@click.option(
+    "--schema-cache-size",
+    type=int,
+    default=None,
+    help="Max number of schema entries to cache in memory. Use 0 to disable schema caching. Defaults to 16.",
+)
 def main(
     stac_file: str,
     collections: bool,
@@ -253,6 +260,7 @@ def main(
     log_file: str,
     pydantic: bool,
     verbose: bool = False,
+    schema_cache_size: Optional[int] = None,
 ):
     """Main function for the `stac-validator` command line tool. Validates a STAC file
     against the STAC specification and prints the validation results to the console as JSON.
@@ -279,6 +287,7 @@ def main(
         log_file (str): Path to a log file to save full recursive output.
         pydantic (bool): Whether to validate using stac-pydantic models for enhanced type checking and validation.
         verbose (bool): Whether to enable verbose output. This will output additional information during validation.
+        schema_cache_size (Optional[int]): Maximum schema cache size. Use 0 to disable caching. Defaults to 16.
 
     Returns:
         None
@@ -295,6 +304,14 @@ def main(
     else:
         schema_map_dict = dict(schema_map)
 
+    if schema_cache_size is not None:
+        if schema_cache_size < 0:
+            raise click.BadParameter(
+                "must be greater than or equal to 0",
+                param_hint="--schema-cache-size",
+            )
+        set_schema_cache_size(schema_cache_size)
+
     stac = StacValidate(
         stac_file=stac_file,
         collections=collections,


=====================================
stac_validator/utilities.py
=====================================
@@ -181,7 +181,44 @@ def fetch_and_parse_file(input_path: str, headers: Optional[Dict] = None) -> Dic
         raise e
 
 
-@functools.lru_cache(maxsize=48)
+DEFAULT_SCHEMA_CACHE_SIZE = 16
+
+
+def _build_schema_cache(maxsize: int):
+    @functools.lru_cache(maxsize=maxsize)
+    def _cached_fetch(input_path: str) -> Dict:
+        return fetch_and_parse_file(input_path)
+
+    return _cached_fetch
+
+
+_schema_cache = _build_schema_cache(DEFAULT_SCHEMA_CACHE_SIZE)
+
+
+def set_schema_cache_size(maxsize: int) -> None:
+    """Reconfigure the schema cache max size at runtime.
+
+    Args:
+        maxsize: Maximum number of cached schema entries. Use 0 to disable caching.
+
+    Raises:
+        ValueError: If maxsize is negative.
+    """
+    if maxsize < 0:
+        raise ValueError("schema cache size must be greater than or equal to 0")
+
+    global _schema_cache
+    _schema_cache = _build_schema_cache(maxsize)
+
+
+def _fetch_and_parse_schema_cache_info():
+    return _schema_cache.cache_info()
+
+
+def _fetch_and_parse_schema_cache_clear() -> None:
+    _schema_cache.cache_clear()
+
+
 def fetch_and_parse_schema(input_path: str) -> Dict:
     """Fetches and parses a JSON schema file from a URL or local file using a cache.
 
@@ -201,7 +238,11 @@ def fetch_and_parse_schema(input_path: str) -> Dict:
         ValueError: If the input is not a valid URL or local file path.
         requests.exceptions.RequestException: If there is an error while downloading the file.
     """
-    return fetch_and_parse_file(input_path)
+    return _schema_cache(input_path)
+
+
+fetch_and_parse_schema.cache_info = _fetch_and_parse_schema_cache_info  # type: ignore[attr-defined]
+fetch_and_parse_schema.cache_clear = _fetch_and_parse_schema_cache_clear  # type: ignore[attr-defined]
 
 
 def set_schema_addr(version: str, stac_type: str) -> str:


=====================================
tests/test_schema_cache.py
=====================================
@@ -0,0 +1,91 @@
+"""Test schema caching behavior."""
+
+import pytest
+
+from stac_validator.utilities import (
+    DEFAULT_SCHEMA_CACHE_SIZE,
+    fetch_and_parse_schema,
+    set_schema_cache_size,
+)
+from stac_validator.validate import StacValidate
+
+
+def test_schema_cache_with_extensions():
+    """Test that extension schemas are cached across validations."""
+    original_cache_info = fetch_and_parse_schema.cache_info()
+    original_maxsize = original_cache_info.maxsize
+    try:
+        # Use a sufficiently large cache to avoid evictions during this test
+        set_schema_cache_size(256)
+        fetch_and_parse_schema.cache_clear()
+        stac_file = "tests/test_data/v100/extended-item.json"
+        # First validation
+        stac1 = StacValidate(stac_file)
+        stac1.run()
+        cache_info_1 = fetch_and_parse_schema.cache_info()
+        hits_after_first = cache_info_1.hits
+        misses_after_first = cache_info_1.misses
+        size_after_first = cache_info_1.currsize
+        # Second validation with same file
+        stac2 = StacValidate(stac_file)
+        stac2.run()
+        cache_info_2 = fetch_and_parse_schema.cache_info()
+        hits_after_second = cache_info_2.hits
+        misses_after_second = cache_info_2.misses
+        size_after_second = cache_info_2.currsize
+        # Verify cache is working
+        assert (
+            size_after_first > 0
+        ), "Cache should contain schemas after first validation"
+        assert (
+            size_after_second == size_after_first
+        ), "Cache size should not grow on second validation"
+        assert (
+            hits_after_second > hits_after_first
+        ), "Cache hits should increase on second validation"
+        assert (
+            misses_after_second == misses_after_first
+        ), "No new misses on second validation"
+    finally:
+        set_schema_cache_size(original_maxsize)
+        fetch_and_parse_schema.cache_clear()
+
+
+def test_schema_cache_size_can_be_reconfigured():
+    """Cache maxsize can be changed at runtime."""
+    try:
+        set_schema_cache_size(2)
+        fetch_and_parse_schema.cache_clear()
+
+        fetch_and_parse_schema("local_schemas/v1.0.0/catalog.json")
+        fetch_and_parse_schema("local_schemas/v1.0.0/collection.json")
+        fetch_and_parse_schema("local_schemas/v1.0.0/item.json")
+
+        cache_info = fetch_and_parse_schema.cache_info()
+        assert cache_info.maxsize == 2
+        assert cache_info.currsize == 2
+    finally:
+        set_schema_cache_size(DEFAULT_SCHEMA_CACHE_SIZE)
+        fetch_and_parse_schema.cache_clear()
+
+
+def test_schema_cache_size_rejects_negative():
+    with pytest.raises(ValueError):
+        set_schema_cache_size(-1)
+
+
+def test_schema_cache_size_zero_disables_cache():
+    try:
+        set_schema_cache_size(0)
+        fetch_and_parse_schema.cache_clear()
+
+        fetch_and_parse_schema("local_schemas/v1.0.0/catalog.json")
+        fetch_and_parse_schema("local_schemas/v1.0.0/catalog.json")
+
+        cache_info = fetch_and_parse_schema.cache_info()
+        assert cache_info.maxsize == 0
+        assert cache_info.hits == 0
+        assert cache_info.currsize == 0
+    finally:
+        set_schema_cache_size(DEFAULT_SCHEMA_CACHE_SIZE)
+        fetch_and_parse_schema.cache_clear()
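
The zero-size test above leans on stock `functools.lru_cache(maxsize=0)` semantics: every call is counted as a miss and nothing is ever stored. A minimal illustration, independent of stac-validator:

```python
import functools


@functools.lru_cache(maxsize=0)
def load(path: str) -> dict:
    # With maxsize=0 this body executes on every call; nothing is stored.
    return {"path": path}


load("catalog.json")
load("catalog.json")

info = load.cache_info()
print(info.hits, info.misses, info.currsize)  # 0 2 0
```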


=====================================
tests/test_sys_exit.py
=====================================
@@ -30,3 +30,15 @@ def test_false_sys_exit_error_python():
         ["stac-validator", "tests/test_data/v090/items/good_item_v090.json"],
         check=True,
     )
+
+
+def test_cli_schema_cache_size_option():
+    subprocess.run(
+        [
+            "stac-validator",
+            "tests/test_data/v090/items/good_item_v090.json",
+            "--schema-cache-size",
+            "8",
+        ],
+        check=True,
+    )


=====================================
tests/test_validate_dict.py
=====================================
@@ -6,6 +6,11 @@ Description: Test the validator
 import json
 
 from stac_validator import stac_validator
+from stac_validator.utilities import (
+    DEFAULT_SCHEMA_CACHE_SIZE,
+    fetch_and_parse_schema,
+    set_schema_cache_size,
+)
 
 
 def test_validate_dict_catalog_v1rc2():
@@ -65,3 +70,27 @@ def test_incorrect_validate_dict_return_method():
         good_stac = json.load(f)
         bad_stac = good_stac.pop("type", None)
     assert stac.validate_dict(bad_stac) is False
+
+
+def test_validate_dict_does_not_configure_schema_cache_size():
+    try:
+        set_schema_cache_size(7)
+        fetch_and_parse_schema.cache_clear()
+
+        stac = stac_validator.StacValidate()
+
+        # Instantiating StacValidate should not change cache configuration.
+        assert fetch_and_parse_schema.cache_info().maxsize == 7
+
+        with open(
+            "tests/test_data/1rc2/extensions-collection/collection.json", "r"
+        ) as f:
+            good_stac = json.load(f)
+
+        stac.validate_dict(good_stac)
+
+        # Running validate_dict should use the configured cache size, not override it.
+        assert fetch_and_parse_schema.cache_info().maxsize == 7
+    finally:
+        set_schema_cache_size(DEFAULT_SCHEMA_CACHE_SIZE)
+        fetch_and_parse_schema.cache_clear()



View it on GitLab: https://salsa.debian.org/debian-gis-team/stac-validator/-/commit/faedebc16d98b5c0b4b19fea36afab938fd73595
