[Git][debian-gis-team/trollsift][master] 5 commits: New upstream version 0.6.0
Antonio Valentino (@antonio.valentino)
gitlab@salsa.debian.org
Wed Sep 3 23:09:17 BST 2025
Antonio Valentino pushed to branch master at Debian GIS Project / trollsift
Commits:
5338a7f1 by Antonio Valentino at 2025-09-03T21:58:31+00:00
New upstream version 0.6.0
- - - - -
1481e072 by Antonio Valentino at 2025-09-03T21:58:32+00:00
Update upstream source from tag 'upstream/0.6.0'
Update to upstream version '0.6.0'
with Debian dir 49a3911f0e675094b14ef125a19a6dd0f3e8b2aa
- - - - -
28b69294 by Antonio Valentino at 2025-09-03T22:01:52+00:00
New upstream release
- - - - -
2fbe6828 by Antonio Valentino at 2025-09-03T22:05:25+00:00
Update dates in d/copyright
- - - - -
fb1a1241 by Antonio Valentino at 2025-09-03T22:05:25+00:00
Set distribution to unstable
- - - - -
23 changed files:
- .github/workflows/ci.yaml
- .github/workflows/deploy-sdist.yaml
- + .pre-commit-config.yaml
- .readthedocs.yaml
- AUTHORS.md
- CHANGELOG.md
- LICENSE.txt
- RELEASING.md
- debian/changelog
- debian/copyright
- doc/source/api.rst
- doc/source/conf.py
- doc/source/index.rst
- doc/source/installation.rst
- doc/source/usage.rst
- pyproject.toml
- − setup.cfg
- trollsift/__init__.py
- trollsift/parser.py
- + trollsift/py.typed
- trollsift/tests/integrationtests/test_parser.py
- trollsift/tests/regressiontests/test_parser.py
- trollsift/tests/unittests/test_parser.py
Changes:
=====================================
.github/workflows/ci.yaml
=====================================
@@ -9,7 +9,7 @@ jobs:
fail-fast: true
matrix:
os: ["windows-latest", "ubuntu-latest", "macos-latest"]
- python-version: ["3.10", "3.11", "3.12"]
+ python-version: ["3.10", "3.11", "3.13"]
env:
PYTHON_VERSION: ${{ matrix.python-version }}
@@ -18,7 +18,7 @@ jobs:
steps:
- name: Checkout source
- uses: actions/checkout@v4
+ uses: actions/checkout@v5
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
@@ -27,7 +27,7 @@ jobs:
- name: Install dependencies
run: |
- pip install -U codecov pytest pytest-cov
+ pip install -U codecov pytest pytest-cov mypy
- name: Install trollsift
run: |
@@ -37,10 +37,13 @@ jobs:
run: |
pytest --cov=trollsift trollsift/tests --cov-report=xml
+ - name: Run mypy
+ run: |
+ mypy trollsift
+
- name: Upload unittest coverage to Codecov
uses: codecov/codecov-action@v5
with:
flags: unittests
- file: ./coverage.xml
+ files: ./coverage.xml
env_vars: OS,PYTHON_VERSION
-
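
The workflow above now runs "mypy trollsift" after the test step. A minimal sketch of what that step can catch, given the annotations added in trollsift/parser.py further below; the variable names here are illustrative only:

    # With parse() annotated as parse(fmt: str, stri: str, ...) -> dict[str, Any],
    # mypy can reject non-string arguments at the call site.
    from trollsift import parse

    fields = parse("hrpt_{platform:4s}{platnum:2s}", "hrpt_noaa16")  # ok, typed result
    # parse(42, "hrpt_noaa16")  # mypy error: argument 1 has incompatible type "int"
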
=====================================
.github/workflows/deploy-sdist.yaml
=====================================
@@ -6,12 +6,12 @@ on:
- published
jobs:
- test:
+ sdist:
runs-on: ubuntu-latest
steps:
- name: Checkout source
- uses: actions/checkout@v4
+ uses: actions/checkout@v5
- name: Create sdist
shell: bash -l {0}
@@ -21,7 +21,7 @@ jobs:
- name: Publish package to PyPI
if: github.event.action == 'published'
- uses: pypa/gh-action-pypi-publish@v1.12.2
+ uses: pypa/gh-action-pypi-publish@v1.12.4
with:
user: __token__
- password: ${{ secrets.pypi_password }}
\ No newline at end of file
+ password: ${{ secrets.pypi_password }}
=====================================
.pre-commit-config.yaml
=====================================
@@ -0,0 +1,20 @@
+exclude: '^$'
+fail_fast: false
+repos:
+ - repo: https://github.com/astral-sh/ruff-pre-commit
+ rev: 'v0.12.7'
+ hooks:
+ - id: ruff
+ args: ["--fix"]
+ - id: ruff-format
+ - repo: https://github.com/pre-commit/pre-commit-hooks
+ rev: v5.0.0
+ hooks:
+ - id: trailing-whitespace
+ - id: end-of-file-fixer
+ - id: check-yaml
+ args: [--unsafe]
+ci:
+ # To trigger manually, comment on a pull request with "pre-commit.ci autofix"
+ autofix_prs: false
+ autoupdate_schedule: "monthly"
=====================================
.readthedocs.yaml
=====================================
@@ -15,4 +15,3 @@ build:
python: "mambaforge-4.10"
conda:
environment: doc/rtd_environment.yaml
-
=====================================
AUTHORS.md
=====================================
@@ -11,4 +11,4 @@ The following people have made contributions to this project:
- [Hrobjartur Thorsteinsson (thorsteinssonh)](https://github.com/thorsteinssonh)
- [Stephan Finkensieper (sfinkens)](https://github.com/sfinkens)
- [Paulo Medeiros (paulovcmedeiros)](https://github.com/paulovcmedeiros)
-- [Regan Koopmans (Regan-Koopmans)](https://github.com/Regan-Koopmans)
\ No newline at end of file
+- [Regan Koopmans (Regan-Koopmans)](https://github.com/Regan-Koopmans)
=====================================
CHANGELOG.md
=====================================
@@ -1,5 +1,25 @@
-## Version 0.5.3 (2024/12/03)
+## Version 0.6.0 (2025/09/03)
+
+### Issues Closed
+
+* [Issue 7](https://github.com/pytroll/trollsift/issues/7) - Switch accepted arguments for parser methods to args and kwargs
+
+In this release 1 issue was closed.
+
+### Pull Requests Merged
+
+#### Bugs fixed
+
+* [PR 81](https://github.com/pytroll/trollsift/pull/81) - Add pre-commit with ruff and ruff format
+
+#### Features added
+
+* [PR 80](https://github.com/pytroll/trollsift/pull/80) - Add type annotations
+In this release 2 pull requests were closed.
+
+
+## Version 0.5.3 (2024/12/03)
### Pull Requests Merged
=====================================
LICENSE.txt
=====================================
@@ -671,4 +671,4 @@ into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
-<http://www.gnu.org/philosophy/why-not-lgpl.html>.
\ No newline at end of file
+<http://www.gnu.org/philosophy/why-not-lgpl.html>.
=====================================
RELEASING.md
=====================================
@@ -28,4 +28,3 @@
the changelog (the portion under the version section header) in the
"Describe this release" box. Finally click "Publish release".
9. Verify the GitHub actions for deployment succeed and the release is on PyPI.
-
=====================================
debian/changelog
=====================================
@@ -1,9 +1,13 @@
-trollsift (0.5.3-2) UNRELEASED; urgency=medium
+trollsift (0.6.0-1) unstable; urgency=medium
- * Team upload.
+ [ Bas Couwenberg ]
* Bump Standards-Version to 4.7.2, no changes.
- -- Bas Couwenberg <sebastic@debian.org> Thu, 20 Mar 2025 06:24:53 +0100
+ [ Antonio Valentino ]
+ * New upstream release.
+ * Update dates in d/copyright.
+
+ -- Antonio Valentino <antonio.valentino@tiscali.it> Wed, 03 Sep 2025 22:02:25 +0000
trollsift (0.5.3-1) unstable; urgency=medium
=====================================
debian/copyright
=====================================
@@ -10,7 +10,7 @@ Copyright: 2014-2022, trollsift Developers
License: GPL-3+
Files: debian/*
-Copyright: 2018-2024, Antonio Valentino <antonio.valentino@tiscali.it>
+Copyright: 2018-2025, Antonio Valentino <antonio.valentino@tiscali.it>
License: GPL-3+
License: GPL-3+
=====================================
doc/source/api.rst
=====================================
@@ -7,5 +7,3 @@ trollsift parser
.. automodule:: trollsift.parser
:members:
:undoc-members:
-
-
=====================================
doc/source/conf.py
=====================================
@@ -1,22 +1,13 @@
# -*- coding: utf-8 -*-
-#
-# trollsift documentation build configuration file, created by
-# sphinx-quickstart on Wed Nov 27 13:05:45 2013.
-#
-# This file is execfile()d with the current directory set to its containing dir.
-#
-# Note that not all possible configuration values are present in this
-# autogenerated file.
-#
-# All configuration values have a default; values that are commented out
-# serve to show the default.
+"""Build configuration file for trollsift's documentation."""
-import sys, os
+import sys
+import os
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
-sys.path.insert(0, os.path.abspath('../../'))
+sys.path.insert(0, os.path.abspath("../../"))
# -- General configuration -----------------------------------------------------
@@ -25,33 +16,38 @@ sys.path.insert(0, os.path.abspath('../../'))
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.autodoc', 'sphinx.ext.doctest', 'sphinx.ext.intersphinx',
- 'sphinx.ext.napoleon', 'sphinx.ext.viewcode']
+extensions = [
+ "sphinx.ext.autodoc",
+ "sphinx.ext.doctest",
+ "sphinx.ext.intersphinx",
+ "sphinx.ext.napoleon",
+ "sphinx.ext.viewcode",
+]
# Add any paths that contain templates here, relative to this directory.
-templates_path = ['_templates']
+templates_path = ["_templates"]
# The suffix of source filenames.
-source_suffix = '.rst'
+source_suffix = ".rst"
# The encoding of source files.
# source_encoding = 'utf-8-sig'
# The master toctree document.
-master_doc = 'index'
+master_doc = "index"
# General information about the project.
-project = u'trollsift'
-copyright = u'2014, Panu Lahtinen, Hrobjartur Thorsteinsson'
+project = "trollsift"
+copyright = "2014, Panu Lahtinen, Hrobjartur Thorsteinsson"
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
-version = '0.1'
+version = "0.1"
# The full version, including alpha/beta/rc tags.
-release = '0.1.0'
+release = "0.1.0"
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
@@ -82,7 +78,7 @@ exclude_patterns = []
# show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
-pygments_style = 'sphinx'
+pygments_style = "sphinx"
# A list of ignored prefixes for module index sorting.
# modindex_common_prefix = []
@@ -92,7 +88,7 @@ pygments_style = 'sphinx'
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
-html_theme = 'default'
+html_theme = "default"
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
@@ -121,7 +117,7 @@ html_theme = 'default'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+html_static_path = ["_static"]
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
@@ -165,7 +161,7 @@ html_static_path = ['_static']
# html_file_suffix = None
# Output file base name for HTML help builder.
-htmlhelp_basename = 'trollsiftdoc'
+htmlhelp_basename = "trollsiftdoc"
# -- Options for LaTeX output --------------------------------------------------
@@ -173,20 +169,23 @@ htmlhelp_basename = 'trollsiftdoc'
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
# 'papersize': 'letterpaper',
-
# The font size ('10pt', '11pt' or '12pt').
# 'pointsize': '10pt',
-
# Additional stuff for the LaTeX preamble.
# 'preamble': '',
- }
+}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
- ('index', 'trollsift.tex', u'Trollsift Documentation',
- u'Hrobjartur Thorsteinsson', 'manual'),
- ]
+ (
+ "index",
+ "trollsift.tex",
+ "Trollsift Documentation",
+ "Hrobjartur Thorsteinsson",
+ "manual",
+ ),
+]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
@@ -214,8 +213,13 @@ latex_documents = [
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
- ('index', 'trollsift', u'Trollsift Documentation',
- [u'Panu Lahtinen', u'Hrobjartur Thorsteinsson'], 1)
+ (
+ "index",
+ "trollsift",
+ "Trollsift Documentation",
+ ["Panu Lahtinen", "Hrobjartur Thorsteinsson"],
+ 1,
+ )
]
# If true, show URL addresses after external links.
@@ -228,11 +232,17 @@ man_pages = [
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
- ('index', 'trollsift', u'Trollsift Documentation', u'Panu Lahtinen',
- u'Hrobjartur Thorsteinsson', 'trollsift',
- 'One line description of project.',
- 'Miscellaneous'),
- ]
+ (
+ "index",
+ "trollsift",
+ "Trollsift Documentation",
+ "Panu Lahtinen",
+ "Hrobjartur Thorsteinsson",
+ "trollsift",
+ "One line description of project.",
+ "Miscellaneous",
+ ),
+]
# Documents to append as an appendix to all manuals.
# texinfo_appendices = []
@@ -245,5 +255,5 @@ texinfo_documents = [
# How intersphinx should find links to other packages
intersphinx_mapping = {
- 'python': ('https://docs.python.org/3', None),
+ "python": ("https://docs.python.org/3", None),
}
=====================================
doc/source/index.rst
=====================================
@@ -11,7 +11,7 @@ Welcome to the trollsift documentation!
=========================================
Trollsift is a collection of modules that assist with formatting, parsing
-and filtering satellite granule file names. These modules are useful and necessary
+and filtering satellite granule file names. These modules are useful and necessary
for writing higher level applications and api's for satellite batch processing.
The source code of the package can be found at github, github_
@@ -34,4 +34,3 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
-
=====================================
doc/source/installation.rst
=====================================
@@ -38,4 +38,3 @@ To check if your python setup is compatible with trollsift,
you can run the test suite using pytest::
$ pytest trollsift/tests
-
=====================================
doc/source/usage.rst
=====================================
@@ -11,14 +11,14 @@ the library is useful for extracting typical information from granule filenames,
as observation time, platform and instrument names. The trollsift Parser can also
verify that the string formatting is invertible, i.e. specific enough to ensure that
parsing and composing of strings are bijective mappings ( aka one-to-one correspondence )
-which may be essential for some applications, such as predicting granule
+which may be essential for some applications, such as predicting granule filenames.
parsing
^^^^^^^
The Parser object holds a format string, allowing us to parse and compose strings:
>>> from trollsift import Parser
- >>>
+ >>>
>>> p = Parser("/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b")
>>> data = p.parse("/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b")
>>> print(data) # doctest: +NORMALIZE_WHITESPACE
@@ -82,6 +82,3 @@ depending on your requirements you can call,
'/somedir/otherdir/hrpt_noaa16_20120101_0101_69022.l1b'
And achieve the exact same result as in the Parse object example above.
-
-
-
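
The Parser shown in the usage document can also generate glob patterns. A hedged sketch of globify(), based on the parser.py implementation further below; the printed pattern is indicative:

    from trollsift import Parser

    p = Parser("/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b")
    # Fields supplied in the dict are filled in; the rest become glob wildcards.
    print(p.globify({"platform": "noaa", "platnum": "16"}))
    # e.g. '/somedir/*/hrpt_noaa16_????????_????_?????.l1b'
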
=====================================
pyproject.toml
=====================================
@@ -6,13 +6,15 @@ readme = "README.rst"
authors = [
{ name = "The Pytroll Team", email = "pytroll at googlegroups.com" }
]
+license = "GPL-3.0-or-later"
+license-files = ["LICENSE.txt"]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Science/Research",
- "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
"Operating System :: OS Independent",
"Programming Language :: Python",
- "Topic :: Scientific/Engineering"
+ "Topic :: Scientific/Engineering",
+ "Typing :: Typed",
]
keywords = ["string parsing", "string formatting", "pytroll"]
requires-python = ">=3.9"
@@ -38,3 +40,35 @@ version-file = "trollsift/version.py"
[tool.coverage.run]
relative_files = true
omit = ["trollsift/version.py"]
+
+[tool.pytest.ini_options]
+minversion = 6.0
+addopts = ["-ra", "--showlocals", "--strict-markers", "--strict-config"]
+xfail_strict = true
+log_cli_level = "info"
+testpaths = ["trollsift/tests"]
+filterwarnings = [
+ "error",
+ "ignore:numpy.ndarray size changed, may indicate binary incompatibility:RuntimeWarning",
+]
+
+[tool.ruff]
+line-length = 120
+
+[tool.ruff.lint]
+# See https://docs.astral.sh/ruff/rules/
+select = ["E", "W", "B", "D", "T10", "C90"]
+ignore = ["D101", "D102", "D103", "D104", "D105", "D106", "D107", "E203"]
+
+[tool.ruff.lint.per-file-ignores]
+"doc/source/conf.py" = ["E501"]
+"trollsift/tests/*.py" = ["D205", "D400", "D415", "S101"] # assert allowed in tests
+
+[tool.ruff.lint.pydocstyle]
+convention = "google"
+
+[tool.ruff.lint.mccabe]
+max-complexity = 10
+
+[tool.ruff.lint.isort]
+known-first-party = ["trollsift"]
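
The new [tool.pytest.ini_options] table above makes warnings fatal and enforces strict markers and config. A hypothetical test module that would run cleanly under this configuration; the test name and file are illustrative only:

    # Collected via testpaths = ["trollsift/tests"]; any unexpected warning
    # fails the run because of filterwarnings = ["error"].
    from trollsift import Parser

    def test_roundtrip():
        p = Parser("hrpt_{platform:4s}{platnum:2s}")
        data = p.parse("hrpt_noaa16")
        assert p.compose(data) == "hrpt_noaa16"
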
=====================================
setup.cfg deleted
=====================================
@@ -1,23 +0,0 @@
-[metadata]
-description-file = README.md
-
-[bdist_rpm]
-release=1
-
-[bdist_wheel]
-universal=1
-
-[flake8]
-max-line-length = 120
-
-[versioneer]
-VCS = git
-style = pep440
-versionfile_source = trollsift/version.py
-versionfile_build =
-tag_prefix = v
-
-[coverage:run]
-omit =
- trollsift/version.py
- versioneer.py
=====================================
trollsift/__init__.py
=====================================
@@ -1,5 +1,4 @@
-
-from .parser import *
+from .parser import Parser, StringFormatter, parse, compose, globify, purge, validate
try:
from trollsift.version import version as __version__ # noqa
@@ -7,4 +6,15 @@ except ModuleNotFoundError: # pragma: no cover
raise ModuleNotFoundError(
"No module named trollsift.version. This could mean "
"you didn't install 'trollsift' properly. Try reinstalling ('pip "
- "install').")
+ "install')."
+ ) from None
+
+__all__ = [
+ "Parser",
+ "StringFormatter",
+ "parse",
+ "compose",
+ "globify",
+ "purge",
+ "validate",
+]
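
With the star import gone, the package root now exposes exactly the names listed in __all__. A short sketch of that public surface; the assertions assume the parsing semantics documented in parser.py below:

    from trollsift import (
        Parser,
        StringFormatter,
        compose,
        globify,
        parse,
        purge,
        validate,
    )

    assert validate("{orbit:05d}", "69022")
    assert parse("{orbit:05d}", "69022") == {"orbit": 69022}
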
=====================================
trollsift/parser.py
=====================================
@@ -17,17 +17,25 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""Main parsing and formatting functionality."""
+from __future__ import annotations
+
import re
import datetime as dt
import random
import string
from functools import lru_cache
+import typing
+
+if typing.TYPE_CHECKING:
+ from _typeshed import StrOrLiteralStr
+ from typing import Any
+ from collections.abc import Iterable, Sequence, Mapping
-class Parser(object):
+class Parser:
"""Class-based interface to parsing and formatting functionality."""
- def __init__(self, fmt):
+ def __init__(self, fmt: str):
self.fmt = fmt
def __str__(self):
@@ -38,60 +46,57 @@ class Parser(object):
convert_dict = get_convert_dict(self.fmt)
return convert_dict.keys()
- def parse(self, stri, full_match=True):
- '''Parse keys and corresponding values from *stri* using format
- described in *fmt* string.
- '''
+ def parse(self, stri: str, full_match: bool = True) -> dict[str, Any]:
+ """Parse keys and values from ``stri`` using parser's format."""
return parse(self.fmt, stri, full_match=full_match)
- def compose(self, keyvals, allow_partial=False):
- """Compose format string *self.fmt* with parameters given in the *keyvals* dict.
+ def compose(self, keyvals: Mapping[str, Any], allow_partial: bool = False) -> str:
+ """Compose format string ``self.fmt`` with parameters given in the ``keyvals`` dict.
Args:
- keyvals (dict): "Parameter --> parameter value" map
- allow_partial (bool): If True, then partial composition is allowed, i.e.,
+ keyvals: "Parameter --> parameter value" map
+ allow_partial: If True, then partial composition is allowed, i.e.,
not all parameters present in `fmt` need to be specified in `keyvals`.
Unspecified parameters will, in this case, be left unchanged.
(Default value = False).
Returns:
- str: Result of formatting the *self.fmt* string with parameter values
- extracted from the corresponding items in the *keyvals* dictionary.
+ Result of formatting the *self.fmt* string with parameter values
+ extracted from the corresponding items in the *keyvals* dictionary.
"""
return compose(fmt=self.fmt, keyvals=keyvals, allow_partial=allow_partial)
format = compose
- def globify(self, keyvals=None):
- '''Generate a string useable with glob.glob() from format string
- *fmt* and *keyvals* dictionary.
- '''
+ def globify(self, keyvals: Mapping[str, Any] | None = None) -> str:
+ """Generate a string usable with glob.glob() from format string."""
return globify(self.fmt, keyvals)
- def validate(self, stri):
- """
- Validates that string *stri* is parsable and therefore complies with
- this string format definition. Useful for filtering strings, or to
- check if a string if compatible before passing it to the
+ def validate(self, stri: str) -> bool:
+ """Validate that string ``stri`` conforms to the parser's format definition.
+
+ Checks that the provided string is parsable and therefore complies with
+ this parser's string format definition. Useful for filtering strings,
+ or to check if a string is compatible before passing it to the
parser function.
"""
return validate(self.fmt, stri)
def is_one2one(self):
- """
- Runs a check to evaluate if this format string has a
- one to one correspondence. I.e. that successive composing and
- parsing opperations will result in the original data.
+ """Check if this parser's format string has a one to one correspondence.
+
+ That is, that successive composing and
+ parsing operations will result in the original data.
In other words, that input data maps to a string,
which then maps back to the original data without any change
or loss in information.
Note: This test only applies to sensible usage of the format string.
- If string or numeric data is causes overflow, e.g.
- if composing "abcd" into {3s}, one to one correspondence will always
- be broken in such cases. This off course also applies to precision
- losses when using datetime data.
+ If string or numeric data causes overflow, e.g.
+ if composing "abcd" into ``{3s}``, one to one correspondence will always
+ be broken in such cases. This of course also applies to precision
+ losses when using datetime data.
"""
return is_one2one(self.fmt)
@@ -119,26 +124,30 @@ class StringFormatter(string.Formatter):
- H: A combination of 'R' and 'u'.
"""
+
CONV_FUNCS = {
- 'c': 'capitalize',
- 'h': 'lower',
- 'H': 'upper',
- 'l': 'lower',
- 't': 'title',
- 'u': 'upper'
+ "c": "capitalize",
+ "h": "lower",
+ "H": "upper",
+ "l": "lower",
+ "t": "title",
+ "u": "upper",
}
- def convert_field(self, value, conversion):
- """Apply conversions mentioned above."""
- func = self.CONV_FUNCS.get(conversion)
+ def convert_field(self, value: str, conversion: str | None) -> str:
+ """Apply conversions mentioned in `StringFormatter.CONV_FUNCS`."""
+ if conversion is None:
+ func = None
+ else:
+ func = self.CONV_FUNCS.get(conversion)
if func is not None:
value = getattr(value, func)()
- elif conversion not in ['R']:
+ elif conversion not in ["R"]:
# default conversion ('r', 's')
return super(StringFormatter, self).convert_field(value, conversion)
- if conversion in ['h', 'H', 'R']:
- value = value.replace('-', '').replace('_', '').replace(':', '').replace(' ', '')
+ if conversion in ["h", "H", "R"]:
+ value = value.replace("-", "").replace("_", "").replace(":", "").replace(" ", "")
return value
@@ -147,39 +156,37 @@ formatter = StringFormatter()
# taken from https://docs.python.org/3/library/re.html#simulating-scanf
spec_regexes = {
- 'b': r'[-+]?[0-1]',
- 'c': r'.',
- 'd': r'[-+]?\d',
- 'f': {
- # Naive fixed point format specifier (e.g. {foo:f})
- 'naive': r'[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?',
- # Fixed point format specifier including width and precision
- # (e.g. {foo:4.2f}). The lookahead (?=.{width}) makes sure that the
- # subsequent pattern is only matched if the string has the required
- # (minimum) width.
- 'precision': r'(?=.{{{width}}})([-+]?([\d ]+(\.\d{{{decimals}}})+|\.\d{{{decimals}}})([eE][-+]?\d+)?)'
-
- },
- 'i': r'[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)',
- 'o': r'[-+]?[0-7]',
- 's': r'\S',
- 'x': r'[-+]?(0[xX])?[\dA-Fa-f]',
+ "b": r"[-+]?[0-1]",
+ "c": r".",
+ "d": r"[-+]?\d",
+ # Naive fixed point format specifier (e.g. {foo:f})
+ "f": r"[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?",
+ # Fixed point format specifier including width and precision
+ # (e.g. {foo:4.2f}). The lookahead (?=.{width}) makes sure that the
+ # subsequent pattern is only matched if the string has the required
+ # (minimum) width.
+ "f_with_precision": r"(?=.{{{width}}})([-+]?([\d ]+(\.\d{{{decimals}}})+|\.\d{{{decimals}}})([eE][-+]?\d+)?)",
+ "i": r"[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)",
+ "o": r"[-+]?[0-7]",
+ "s": r"\S",
+ "x": r"[-+]?(0[xX])?[\dA-Fa-f]",
}
-spec_regexes['e'] = spec_regexes['f']
-spec_regexes['E'] = spec_regexes['f']
-spec_regexes['g'] = spec_regexes['f']
-spec_regexes['X'] = spec_regexes['x']
-spec_regexes[''] = spec_regexes['s']
-allow_multiple = ['b', 'c', 'd', 'o', 's', '', 'x', 'X']
-fixed_point_types = ['f', 'e', 'E', 'g']
+spec_regexes["e"] = spec_regexes["f"]
+spec_regexes["E"] = spec_regexes["f"]
+spec_regexes["g"] = spec_regexes["f"]
+spec_regexes["X"] = spec_regexes["x"]
+spec_regexes[""] = spec_regexes["s"]
+allow_multiple = ["b", "c", "d", "o", "s", "", "x", "X"]
+fixed_point_types = ["f", "e", "E", "g"]
# format_spec ::= [[fill]align][sign][#][0][width][,][.precision][type]
# https://docs.python.org/3.4/library/string.html#format-specification-mini-language
fmt_spec_regex = re.compile(
- r'(?P<align>(?P<fill>.)?[<>=^])?(?P<sign>[\+\-\s])?(?P<pound>#)?(?P<zero>0)?(?P<width>\d+)?'
- r'(?P<comma>,)?(?P<precision>.\d+)?(?P<type>[bcdeEfFgGnosxX%]?)')
+ r"(?P<align>(?P<fill>.)?[<>=^])?(?P<sign>[\+\-\s])?(?P<pound>#)?(?P<zero>0)?(?P<width>\d+)?"
+ r"(?P<comma>,)?(?P<precision>.\d+)?(?P<type>[bcdeEfFgGnosxX%]?)"
+)
-def _get_fixed_point_regex(regex_dict, width, precision):
+def _get_fixed_point_regex(width: str | None, precision: str | None) -> str:
"""Get regular expression for fixed point numbers.
Args:
@@ -188,15 +195,14 @@ def _get_fixed_point_regex(regex_dict, width, precision):
"""
if width or precision:
if precision is None:
- precision = '0,'
+ precision = "0,"
else:
- precision = precision.strip('.')
+ precision = precision.strip(".")
if width is None:
- width = '1,'
- return regex_dict['precision'].format(
- width=width, decimals=precision)
+ width = "1,"
+ return spec_regexes["f_with_precision"].format(width=width, decimals=precision)
else:
- return regex_dict['naive']
+ return spec_regexes["f"]
class RegexFormatter(string.Formatter):
@@ -226,18 +232,18 @@ class RegexFormatter(string.Formatter):
"""
# special string to mark a parameter not being specified
- UNPROVIDED_VALUE = '<trollsift unprovided value>'
- ESCAPE_CHARACTERS = ['\\'] + [x for x in string.punctuation if x not in '\\%']
- ESCAPE_SETS = [(c, '\\' + c) for c in ESCAPE_CHARACTERS]
+ UNPROVIDED_VALUE = "<trollsift unprovided value>"
+ ESCAPE_CHARACTERS = ["\\"] + [x for x in string.punctuation if x not in "\\%"]
+ ESCAPE_SETS = [(c, "\\" + c) for c in ESCAPE_CHARACTERS]
def __init__(self):
# hold on to fields we've seen already so we can reuse their
# definitions in the regex
self._cached_fields = {}
+ self.format = lru_cache()(self._uncached_format)
super(RegexFormatter, self).__init__()
- @lru_cache()
- def format(*args, **kwargs):
+ def _uncached_format(*args, **kwargs):
try:
# super() doesn't seem to work here
ret_val = string.Formatter.format(*args, **kwargs)
@@ -246,7 +252,7 @@ class RegexFormatter(string.Formatter):
self._cached_fields.clear()
return ret_val
- def _escape(self, s):
+ def _escape(self, s: str) -> str:
"""Escape bad characters for regular expressions.
Similar to `re.escape` but allows '%' to pass through.
@@ -256,7 +262,16 @@ class RegexFormatter(string.Formatter):
s = s.replace(ch, r_ch)
return s
- def parse(self, format_string):
+ def parse(
+ self, format_string: StrOrLiteralStr
+ ) -> Iterable[
+ tuple[
+ StrOrLiteralStr,
+ StrOrLiteralStr | None,
+ StrOrLiteralStr | None,
+ StrOrLiteralStr | None,
+ ]
+ ]:
parse_ret = super(RegexFormatter, self).parse(format_string)
for literal_text, field_name, format_spec, conversion in parse_ret:
# the parent class will call parse multiple times moving
@@ -265,150 +280,153 @@ class RegexFormatter(string.Formatter):
literal_text = self._escape(literal_text)
yield literal_text, field_name, format_spec, conversion
- def get_value(self, key, args, kwargs):
+ def get_value(self, key: int | str, args: Sequence[Any], kwargs: Mapping[str, Any]) -> Any:
try:
return super(RegexFormatter, self).get_value(key, args, kwargs)
except (IndexError, KeyError):
return key, self.UNPROVIDED_VALUE
- def _regex_datetime(self, format_spec):
+ def _regex_datetime(self, format_spec: str) -> str:
replace_str = format_spec
for fmt_key, fmt_val in DT_FMT.items():
- if fmt_key == '%%':
+ if fmt_key == "%%":
# special case
- replace_str.replace('%%', '%')
+ replace_str.replace("%%", "%")
continue
- count = fmt_val.count('?')
+ count = fmt_val.count("?")
# either a series of numbers or letters/numbers
- regex = r'\d{{{:d}}}'.format(count) if count else r'[^ \t\n\r\f\v\-_:]+'
+ regex = r"\d{{{:d}}}".format(count) if count else r"[^ \t\n\r\f\v\-_:]+"
replace_str = replace_str.replace(fmt_key, regex)
return replace_str
- @staticmethod
- def format_spec_to_regex(field_name, format_spec):
- """Make an attempt at converting a format spec to a regular expression."""
- # NOTE: remove escaped backslashes so regex matches
- regex_match = fmt_spec_regex.match(format_spec.replace('\\', ''))
- if regex_match is None:
- raise ValueError("Invalid format specification: '{}'".format(format_spec))
- regex_dict = regex_match.groupdict()
- fill = regex_dict['fill']
- ftype = regex_dict['type']
- width = regex_dict['width']
- align = regex_dict['align']
- precision = regex_dict['precision']
- # NOTE: does not properly handle `=` alignment
- if fill is None:
- if width is not None and width[0] == '0':
- fill = '0'
- elif ftype in ['s', '', 'd', 'x', 'X', 'o', 'b']:
- fill = ' '
-
- char_type = spec_regexes[ftype]
- if ftype in fixed_point_types:
- char_type = _get_fixed_point_regex(
- char_type,
- width=width,
- precision=precision
- )
- if ftype in ('s', '') and align and align.endswith('='):
- raise ValueError("Invalid format specification: '{}'".format(format_spec))
- final_regex = char_type
- if ftype in allow_multiple and (not width or width == '0'):
- final_regex += r'*?'
- elif width and width != '0':
- if not fill and ftype not in fixed_point_types:
- # we know we have exactly this many characters
- final_regex += r'{{{}}}'.format(int(width))
- elif fill:
- # we don't know how many fill characters we have compared to
- # field characters so just match all characters and sort it out
- # later during type conversion.
- final_regex = r'.{{{}}}'.format(int(width))
- elif ftype in allow_multiple:
- final_regex += r'*?'
-
- return r'(?P<{}>{})'.format(field_name, final_regex)
-
- def regex_field(self, field_name, value, format_spec):
+ def regex_field(self, field_name: str, value: Any, format_spec: str) -> str:
if value != self.UNPROVIDED_VALUE:
return super(RegexFormatter, self).format_field(value, format_spec)
if self._cached_fields.get(field_name, format_spec) != format_spec:
- raise ValueError("Can't specify the same field_name with "
- "different formats: {}".format(field_name))
+ raise ValueError("Can't specify the same field_name with different formats: {}".format(field_name))
elif field_name in self._cached_fields:
- return r'(?P={})'.format(field_name)
+ return r"(?P={})".format(field_name)
else:
self._cached_fields[field_name] = format_spec
# Replace format spec with glob patterns (*, ?, etc)
if not format_spec:
- return r'(?P<{}>.*?)'.format(field_name)
- if '%' in format_spec:
- return r'(?P<{}>{})'.format(field_name, self._regex_datetime(format_spec))
- return self.format_spec_to_regex(field_name, format_spec)
+ return r"(?P<{}>.*?)".format(field_name)
+ if "%" in format_spec:
+ return r"(?P<{}>{})".format(field_name, self._regex_datetime(format_spec))
+ return format_spec_to_regex(field_name, format_spec)
- def format_field(self, value, format_spec):
+ def format_field(self, value: Any, format_spec: str) -> str:
if not isinstance(value, tuple) or value[1] != self.UNPROVIDED_VALUE:
return super(RegexFormatter, self).format_field(value, format_spec)
field_name, value = value
return self.regex_field(field_name, value, format_spec)
+def format_spec_to_regex(field_name: str, format_spec: str) -> str:
+ """Make an attempt at converting a format spec to a regular expression."""
+ # NOTE: remove escaped backslashes so regex matches
+ regex_match = fmt_spec_regex.match(format_spec.replace("\\", ""))
+ if regex_match is None:
+ raise ValueError("Invalid format specification: '{}'".format(format_spec))
+ regex_dict = regex_match.groupdict()
+ ftype = regex_dict["type"]
+ width = regex_dict["width"]
+ align = regex_dict["align"]
+ precision = regex_dict["precision"]
+ fill = _get_fill(regex_dict["fill"], width, ftype)
+
+ char_type = spec_regexes[ftype]
+ if ftype in fixed_point_types:
+ char_type = _get_fixed_point_regex(width=width, precision=precision)
+ if ftype in ("s", "") and align and align.endswith("="):
+ raise ValueError("Invalid format specification: '{}'".format(format_spec))
+ final_regex = char_type
+ if ftype in allow_multiple and (not width or width == "0"):
+ final_regex += r"*?"
+ elif width and width != "0":
+ if not fill and ftype not in fixed_point_types:
+ # we know we have exactly this many characters
+ final_regex += r"{{{}}}".format(int(width))
+ elif fill:
+ # we don't know how many fill characters we have compared to
+ # field characters so just match all characters and sort it out
+ # later during type conversion.
+ final_regex = r".{{{}}}".format(int(width))
+ elif ftype in allow_multiple:
+ final_regex += r"*?"
+
+ return r"(?P<{}>{})".format(field_name, final_regex)
+
+
+def _get_fill(fill: str | None, width: str | None, ftype: str | None) -> str | None:
+ # NOTE: does not properly handle `=` alignment
+ if fill is None:
+ if width is not None and width[0] == "0":
+ fill = "0"
+ elif ftype in ["s", "", "d", "x", "X", "o", "b"]:
+ fill = " "
+ return fill
+
+
@lru_cache()
-def regex_format(fmt):
+def regex_format(fmt: str) -> str:
# We create a new instance of RegexFormatter here to prevent concurrent calls to
# format interfering with one another.
return RegexFormatter().format(fmt)
-def extract_values(fmt, stri, full_match=True):
+def extract_values(fmt: str, stri: str, full_match: bool = True) -> dict[str, Any]:
"""Extract information from string matching format.
Args:
- fmt (str): Python format string to match against
- stri (str): String to extract information from
- full_match (bool): Force the match of the whole string. Default
+ fmt: Python format string to match against
+ stri: String to extract information from
+ full_match: Force the match of the whole string. Default
to ``True``.
"""
regex = regex_format(fmt)
if full_match:
- regex = '^' + regex + '$'
+ regex = "^" + regex + "$"
match = re.match(regex, stri)
if match is None:
raise ValueError("String does not match pattern.")
return match.groupdict()
-def _get_number_from_fmt(fmt):
+def _get_number_from_fmt(fmt: str) -> int:
"""Helper function for extract_values.
Figures out string length from format string.
"""
- if '%' in fmt:
+ if "%" in fmt:
# its datetime
return len(("{0:" + fmt + "}").format(dt.datetime.now()))
else:
# its something else
- fmt = fmt.lstrip('0')
- return int(re.search('[0-9]+', fmt).group(0))
+ fmt = fmt.lstrip("0")
+ fmt_digits_match = re.search("[0-9]+", fmt)
+ if fmt_digits_match is None:
+ raise ValueError(f"No number specified in format string: {fmt}")
+ return int(fmt_digits_match.group(0))
-def _convert(convdef, stri):
+def _convert(convdef: str, stri: str) -> Any:
"""Convert the string *stri* to the given conversion definition *convdef*."""
- if '%' in convdef:
+ result: Any # force mypy type
+ if "%" in convdef:
result = dt.datetime.strptime(stri, convdef)
else:
result = _strip_padding(convdef, stri)
- if 'd' in convdef:
+ if "d" in convdef:
result = int(result)
- elif 'x' in convdef or 'X' in convdef:
+ elif "x" in convdef or "X" in convdef:
result = int(result, 16)
- elif 'o' in convdef:
+ elif "o" in convdef:
result = int(result, 8)
- elif 'b' in convdef:
+ elif "b" in convdef:
result = int(result, 2)
elif any(float_type_marker in convdef for float_type_marker in fixed_point_types):
result = float(result)
@@ -416,7 +434,7 @@ def _convert(convdef, stri):
return result
-def _strip_padding(convdef, stri):
+def _strip_padding(convdef: str, stri: str) -> str:
"""Strip padding from the given string.
Args:
@@ -425,41 +443,41 @@ def _strip_padding(convdef, stri):
"""
regex_match = fmt_spec_regex.match(convdef)
match_dict = regex_match.groupdict() if regex_match else {}
- align = match_dict.get('align')
- pad = match_dict.get('fill')
+ align = match_dict.get("align")
+ pad = match_dict.get("fill")
if align:
# align character is the last one
align = align[-1]
- if align and align in '<>^' and not pad:
- pad = ' '
- if align == '>':
+ if align and align in "<>^" and not pad:
+ pad = " "
+ if align == ">":
stri = stri.lstrip(pad)
- elif align == '<':
+ elif align == "<":
stri = stri.rstrip(pad)
- elif align == '^':
+ elif align == "^":
stri = stri.strip(pad)
return stri
+
@lru_cache()
-def get_convert_dict(fmt):
+def get_convert_dict(fmt: str) -> dict[str, str]:
"""Retrieve parse definition from the format string `fmt`."""
convdef = {}
- for literal_text, field_name, format_spec, conversion in formatter.parse(fmt):
- if field_name is None:
+ for _literal_text, field_name, format_spec, _conversion in formatter.parse(fmt):
+ if field_name is None or format_spec is None:
continue
# XXX: Do I need to include 'conversion'?
convdef[field_name] = format_spec
return convdef
-def parse(fmt, stri, full_match=True):
+def parse(fmt: str, stri: str, full_match: bool = True) -> dict[str, Any]:
"""Parse keys and corresponding values from *stri* using format described in *fmt* string.
Args:
- fmt (str): Python format string to match against
- stri (str): String to extract information from
- full_match (bool): Force the match of the whole string. Default
- True.
+ fmt: Python format string to match against
+ stri: String to extract information from
+ full_match: Force the match of the whole string. Default True.
"""
convdef = get_convert_dict(fmt)
@@ -470,20 +488,20 @@ def parse(fmt, stri, full_match=True):
return keyvals
-def compose(fmt, keyvals, allow_partial=False):
+def compose(fmt: str, keyvals: Mapping[str, Any], allow_partial: bool = False) -> str:
"""Compose format string *self.fmt* with parameters given in the *keyvals* dict.
Args:
- fmt (str): Python format string to match against
- keyvals (dict): "Parameter --> parameter value" map
- allow_partial (bool): If True, then partial composition is allowed, i.e.,
+ fmt: Python format string to match against
+ keyvals: "Parameter --> parameter value" map
+ allow_partial: If True, then partial composition is allowed, i.e.,
not all parameters present in `fmt` need to be specified in `keyvals`.
Unspecified parameters will, in this case, be left unchanged.
(Default value = False).
Returns:
- str: Result of formatting the *self.fmt* string with parameter values
- extracted from the corresponding items in the *keyvals* dictionary.
+ Result of formatting the *self.fmt* string with parameter values
+ extracted from the corresponding items in the *keyvals* dictionary.
"""
if allow_partial:
@@ -515,23 +533,22 @@ DT_FMT = {
"%c": "*",
"%x": "*",
"%X": "*",
- "%%": "?"
+ "%%": "?",
}
class GlobifyFormatter(string.Formatter):
-
# special string to mark a parameter not being specified
- UNPROVIDED_VALUE = '<trollsift unprovided value>'
+ UNPROVIDED_VALUE = "<trollsift unprovided value>"
- def get_value(self, key, args, kwargs):
+ def get_value(self, key: str | int, args: Sequence[Any], kwargs: Mapping[str, Any]) -> Any:
try:
return super(GlobifyFormatter, self).get_value(key, args, kwargs)
except (IndexError, KeyError):
# assumes that
return self.UNPROVIDED_VALUE
- def format_field(self, value, format_spec):
+ def format_field(self, value: Any, format_spec: str) -> str:
if not isinstance(value, (list, tuple)) and value != self.UNPROVIDED_VALUE:
return super(GlobifyFormatter, self).format_field(value, format_spec)
elif value != self.UNPROVIDED_VALUE:
@@ -540,41 +557,38 @@ class GlobifyFormatter(string.Formatter):
# (value, partial format string)
value, dt_fmt = value
for fmt_letter in dt_fmt:
- fmt = '%' + fmt_letter
+ fmt = "%" + fmt_letter
format_spec = format_spec.replace(fmt, value.strftime(fmt))
# Replace format spec with glob patterns (*, ?, etc)
if not format_spec:
- return '*'
- if '%' in format_spec:
+ return "*"
+ if "%" in format_spec:
replace_str = format_spec
for fmt_key, fmt_val in DT_FMT.items():
replace_str = replace_str.replace(fmt_key, fmt_val)
return replace_str
- if not re.search('[0-9]+', format_spec):
+ if not re.search("[0-9]+", format_spec):
# non-integer type
- return '*'
- return '?' * _get_number_from_fmt(format_spec)
+ return "*"
+ return "?" * _get_number_from_fmt(format_spec)
globify_formatter = GlobifyFormatter()
-def globify(fmt, keyvals=None):
- """Generate a string usable with glob.glob() from format string
- *fmt* and *keyvals* dictionary.
- """
+def globify(fmt: str, keyvals: Mapping[str, Any] | None = None) -> Any:
+ """Generate a string usable with glob.glob() from format string and provided information."""
if keyvals is None:
keyvals = {}
return globify_formatter.format(fmt, **keyvals)
-def validate(fmt, stri):
- """
- Validates that string *stri* is parsable and therefore complies with
- the format string, *fmt*. Useful for filtering string, or to
- check if string if compatible before passing the string to the
- parser function.
+def validate(fmt: str, stri: str) -> bool:
+ """Validates that string ``stri`` conforms to ``fmt``.
+
+ Useful for filtering string, or to check if string is compatible before
+ passing the string to the parser function.
"""
try:
parse(fmt, stri)
@@ -583,7 +597,7 @@ def validate(fmt, stri):
return False
-def _generate_data_for_format(fmt):
+def _generate_data_for_format(fmt: str) -> dict[str, Any]:
"""Generate a fake data dictionary to fill in the provided format string."""
# finally try some data, create some random data for the fmt.
data = {}
@@ -591,7 +605,7 @@ def _generate_data_for_format(fmt):
# if we get two in a row then we know the pattern is invalid, meaning
# we'll never be able to match the second wildcard field
free_size_start = False
- for literal_text, field_name, format_spec, conversion in formatter.parse(fmt):
+ for literal_text, field_name, format_spec, _conversion in formatter.parse(fmt):
if literal_text:
free_size_start = False
@@ -603,46 +617,50 @@ def _generate_data_for_format(fmt):
# e.g. {:s}{:s} or {:s}{:4s}{:d}
if not format_spec or format_spec == "s" or format_spec == "d":
if free_size_start:
- return None
+ raise ValueError("Can't generate data for spec with two or more fields with no size specifier.")
else:
free_size_start = True
# make some data for this key and format
- if format_spec and '%' in format_spec:
- # some datetime
- t = dt.datetime.now()
- # run once through format to limit precision
- t = parse(
- "{t:" + format_spec + "}", compose("{t:" + format_spec + "}", {'t': t}))['t']
- data[field_name] = t
- elif format_spec and 'd' in format_spec:
- # random number (with n sign. figures)
- if not format_spec.isalpha():
- n = _get_number_from_fmt(format_spec)
- else:
- # clearly bad
- return None
- data[field_name] = random.randint(0, 99999999999999999) % (10 ** n)
- else:
- # string type
- if format_spec is None:
- n = 4
- elif format_spec.isalnum():
- n = _get_number_from_fmt(format_spec)
- else:
- n = 4
- randstri = ''
- for x in range(n):
- randstri += random.choice(string.ascii_letters)
- data[field_name] = randstri
+ data[field_name] = _gen_data_for_spec(format_spec)
return data
-def is_one2one(fmt):
- """
- Runs a check to evaluate if the format string has a
- one to one correspondence. I.e. that successive composing and
- parsing opperations will result in the original data.
+def _gen_data_for_spec(format_spec: str | None) -> int | str | dt.datetime:
+ if format_spec and "%" in format_spec:
+ # some datetime
+ t = dt.datetime.now()
+ # run once through format to limit precision
+ t = parse("{t:" + format_spec + "}", compose("{t:" + format_spec + "}", {"t": t}))["t"]
+ return t
+
+ if format_spec and "d" in format_spec:
+ # random number (with n sign. figures)
+ if not format_spec.isalpha():
+ n = _get_number_from_fmt(format_spec)
+ else:
+ # clearly bad
+ raise ValueError(f"Bad format specification: {format_spec!r}")
+ return random.randint(0, 99999999999999999) % (10**n)
+
+ # string type
+ if format_spec is None:
+ n = 4
+ elif format_spec.isalnum():
+ n = _get_number_from_fmt(format_spec)
+ else:
+ n = 4
+ randstri = ""
+ for _ in range(n):
+ randstri += random.choice(string.ascii_letters)
+ return randstri
+
+
+def is_one2one(fmt: str) -> bool:
+ """Check if the format string has a one to one correspondence.
+
+ That is, that successive composing and
+ parsing operations will result in the original data.
In other words, that input data maps to a string,
which then maps back to the original data without any change
or loss in information.
@@ -653,8 +671,9 @@ def is_one2one(fmt):
be broken in such cases. This of course also applies to precision
losses when using datetime data.
"""
- data = _generate_data_for_format(fmt)
- if data is None:
+ try:
+ data = _generate_data_for_format(fmt)
+ except ValueError:
return False
# run data forward once and back to data
@@ -672,7 +691,7 @@ def is_one2one(fmt):
return True
-def purge():
+def purge() -> None:
"""Clear internal caches.
Not needed normally, but can be used to force cache clear when memory
@@ -683,12 +702,12 @@ def purge():
get_convert_dict.cache_clear()
-def _strict_compose(fmt, keyvals):
+def _strict_compose(fmt: str, keyvals: Mapping[str, Any]) -> str:
"""Convert parameters in `keyvals` to a string based on `fmt` string."""
return formatter.format(fmt, **keyvals)
-def _partial_compose(fmt, keyvals):
+def _partial_compose(fmt: str, keyvals: Mapping[str, Any]) -> str:
"""Convert parameters in `keyvals` to a string based on `fmt` string.
Similar to _strict_compose, but accepts partial composing, i.e., not all
@@ -708,19 +727,18 @@ def _partial_compose(fmt, keyvals):
return composed_string
-def _replace_undefined_params_with_placeholders(fmt, keyvals=None):
+def _replace_undefined_params_with_placeholders(
+ fmt: str, keyvals: Mapping[str, Any] | None = None
+) -> tuple[str, dict[str, Any]]:
"""Replace with placeholders params in `fmt` not specified in `keyvals`."""
- vars_left_undefined = get_convert_dict(fmt).keys()
+ vars_left_undefined = set(get_convert_dict(fmt).keys())
if keyvals is not None:
vars_left_undefined -= keyvals.keys()
undefined_vars_placeholders_dict = {}
new_fmt = fmt
for var in sorted(vars_left_undefined):
- matches = set(
- match.group()
- for match in re.finditer(rf"{{{re.escape(var)}([^\w{{}}].*?)*}}", new_fmt)
- )
+ matches = set(match.group() for match in re.finditer(rf"{{{re.escape(var)}([^\w{{}}].*?)*}}", new_fmt))
if len(matches) == 0:
raise ValueError(f"Could not capture definitions for {var} from {fmt}")
for var_specification in matches:
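
format_spec_to_regex is now a module-level function rather than a staticmethod on RegexFormatter, so it can be exercised directly. A hedged sketch; the patterns in the comments follow the logic in the diff above and are indicative rather than verified output:

    from trollsift.parser import format_spec_to_regex, regex_format

    # Builds the named-group regex for a single field; a width with implied
    # fill matches any characters, sorted out later during type conversion.
    print(format_spec_to_regex("orbit", "5d"))  # e.g. '(?P<orbit>.{5})'
    # Builds the full pattern for a format string (cached via lru_cache).
    print(regex_format("hrpt_{platform:4s}"))   # e.g. 'hrpt_(?P<platform>.{4})'
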
=====================================
trollsift/py.typed
=====================================
=====================================
trollsift/tests/integrationtests/test_parser.py
=====================================
@@ -16,6 +16,7 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""Parser integration tests."""
+
import os
import unittest
import datetime as dt
@@ -24,14 +25,16 @@ from trollsift.parser import Parser
class TestParser(unittest.TestCase):
-
def setUp(self):
- self.fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}" +\
- "_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
+ self.fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}" + "_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
self.string = "/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b"
- self.data = {'directory': 'otherdir', 'platform': 'noaa',
- 'platnum': '16',
- 'time': dt.datetime(2014, 2, 10, 10, 4), 'orbit': 69022}
+ self.data = {
+ "directory": "otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": dt.datetime(2014, 2, 10, 10, 4),
+ "orbit": 69022,
+ }
self.p = Parser(self.fmt)
def test_parse(self):
@@ -43,6 +46,7 @@ class TestParser(unittest.TestCase):
def test_cache_clear(self):
"""Test we can clear the internal cache properly"""
from trollsift.parser import purge, regex_format
+
# Run
result = self.p.parse(self.string)
# Assert
@@ -59,11 +63,9 @@ class TestParser(unittest.TestCase):
def test_validate(self):
# These cases are True
- self.assertTrue(
- self.p.validate("/somedir/avhrr/2014/hrpt_noaa19_20140212_1412_12345.l1b"))
+ self.assertTrue(self.p.validate("/somedir/avhrr/2014/hrpt_noaa19_20140212_1412_12345.l1b"))
# These cases are False
- self.assertFalse(
- self.p.validate("/somedir/bla/bla/hrpt_noaa19_20140212__1412_00000.l1b"))
+ self.assertFalse(self.p.validate("/somedir/bla/bla/hrpt_noaa19_20140212__1412_00000.l1b"))
def assertDictEqual(self, a, b):
for key in a:
@@ -82,30 +84,44 @@ class TestParser(unittest.TestCase):
class TestParserVariousFormats(unittest.TestCase):
-
def test_parse_viirs_sdr(self):
- fmt = 'SVI01_{platform_shortname}_d{start_time:%Y%m%d_t%H%M%S%f}_e{end_time:%H%M%S%f}_b{orbit:5d}_c{creation_time:%Y%m%d%H%M%S%f}_{source}.h5'
- filename = 'SVI01_npp_d20120225_t1801245_e1802487_b01708_c20120226002130255476_noaa_ops.h5'
- data = {'platform_shortname': 'npp',
- 'start_time': dt.datetime(2012, 2, 25, 18, 1, 24, 500000), 'orbit': 1708,
- 'end_time': dt.datetime(1900, 1, 1, 18, 2, 48, 700000),
- 'source': 'noaa_ops',
- 'creation_time': dt.datetime(2012, 2, 26, 0, 21, 30, 255476)}
+ fmt = (
+ "SVI01_{platform_shortname}_d{start_time:%Y%m%d_t%H%M%S%f}_"
+ "e{end_time:%H%M%S%f}_b{orbit:5d}_c{creation_time:%Y%m%d%H%M%S%f}_{source}.h5"
+ )
+ filename = "SVI01_npp_d20120225_t1801245_e1802487_b01708_c20120226002130255476_noaa_ops.h5"
+ data = {
+ "platform_shortname": "npp",
+ "start_time": dt.datetime(2012, 2, 25, 18, 1, 24, 500000),
+ "orbit": 1708,
+ "end_time": dt.datetime(1900, 1, 1, 18, 2, 48, 700000),
+ "source": "noaa_ops",
+ "creation_time": dt.datetime(2012, 2, 26, 0, 21, 30, 255476),
+ }
p = Parser(fmt)
result = p.parse(filename)
self.assertDictEqual(result, data)
def test_parse_iasi_l2(self):
- fmt = "W_XX-EUMETSAT-{reception_location},{instrument},{long_platform_id}+{processing_location}_C_EUMS_{processing_time:%Y%m%d%H%M%S}_IASI_PW3_02_{platform_id}_{start_time:%Y%m%d-%H%M%S}Z_{end_time:%Y%m%d.%H%M%S}Z.hdf"
- filename = "W_XX-EUMETSAT-kan,iasi,metopb+kan_C_EUMS_20170920103559_IASI_PW3_02_M01_20170920-102217Z_20170920.102912Z.hdf"
- data = {'reception_location': 'kan',
- 'instrument': 'iasi',
- 'long_platform_id': 'metopb',
- 'processing_location': 'kan',
- 'processing_time': dt.datetime(2017, 9, 20, 10, 35, 59),
- 'platform_id': 'M01',
- 'start_time': dt.datetime(2017, 9, 20, 10, 22, 17),
- 'end_time': dt.datetime(2017, 9, 20, 10, 29, 12)}
+ fmt = (
+ "W_XX-EUMETSAT-{reception_location},{instrument},{long_platform_id}+{processing_location}_"
+ "C_EUMS_{processing_time:%Y%m%d%H%M%S}_IASI_PW3_02_{platform_id}_{start_time:%Y%m%d-%H%M%S}Z_"
+ "{end_time:%Y%m%d.%H%M%S}Z.hdf"
+ )
+ filename = (
+ "W_XX-EUMETSAT-kan,iasi,metopb+kan_C_EUMS_20170920103559_IASI_PW3_02_"
+ "M01_20170920-102217Z_20170920.102912Z.hdf"
+ )
+ data = {
+ "reception_location": "kan",
+ "instrument": "iasi",
+ "long_platform_id": "metopb",
+ "processing_location": "kan",
+ "processing_time": dt.datetime(2017, 9, 20, 10, 35, 59),
+ "platform_id": "M01",
+ "start_time": dt.datetime(2017, 9, 20, 10, 22, 17),
+ "end_time": dt.datetime(2017, 9, 20, 10, 29, 12),
+ }
p = Parser(fmt)
result = p.parse(filename)
self.assertDictEqual(result, data)
@@ -116,37 +132,37 @@ class TestParserVariousFormats(unittest.TestCase):
"{end_time:%Y%m%dT%H%M%S}_{creation_time:%Y%m%dT%H%M%S}_{duration:4d}_"
"{cycle:3d}_{relative_orbit:3d}_{frame:4d}_{centre:3s}_{platform_mode:1s}_"
"{timeliness:2s}_{collection:3s}.SEN3",
- "{dataset_name}_radiance.nc")
+ "{dataset_name}_radiance.nc",
+ )
# made up:
filename = os.path.join(
- "S3A_OL_1_EFR____20180916T090539_"
- "20180916T090839_20180916T090539_0001_"
- "001_001_0001_CEN_M_"
- "AA_AAA.SEN3",
- "Oa21_radiance.nc")
- data = {'mission_id': 'S3A',
- 'datatype_id': 'EFR',
- 'start_time': dt.datetime(2018, 9, 16, 9, 5, 39),
- 'end_time': dt.datetime(2018, 9, 16, 9, 8, 39),
- 'creation_time': dt.datetime(2018, 9, 16, 9, 5, 39),
- 'duration': 1,
- 'cycle': 1,
- 'relative_orbit': 1,
- 'frame': 1,
- 'centre': 'CEN',
- 'platform_mode': 'M',
- 'timeliness': 'AA',
- 'collection': 'AAA',
- 'dataset_name': 'Oa21',
- }
+ "S3A_OL_1_EFR____20180916T090539_20180916T090839_20180916T090539_0001_001_001_0001_CEN_M_AA_AAA.SEN3",
+ "Oa21_radiance.nc",
+ )
+ data = {
+ "mission_id": "S3A",
+ "datatype_id": "EFR",
+ "start_time": dt.datetime(2018, 9, 16, 9, 5, 39),
+ "end_time": dt.datetime(2018, 9, 16, 9, 8, 39),
+ "creation_time": dt.datetime(2018, 9, 16, 9, 5, 39),
+ "duration": 1,
+ "cycle": 1,
+ "relative_orbit": 1,
+ "frame": 1,
+ "centre": "CEN",
+ "platform_mode": "M",
+ "timeliness": "AA",
+ "collection": "AAA",
+ "dataset_name": "Oa21",
+ }
p = Parser(fmt)
result = p.parse(filename)
self.assertDictEqual(result, data)
def test_parse_duplicate_fields(self):
"""Test parsing a pattern that has duplicate fields."""
- fmt = '{version_number:1s}/filename_with_version_number_{version_number:1s}.tif'
- filename = '1/filename_with_version_number_1.tif'
+ fmt = "{version_number:1s}/filename_with_version_number_{version_number:1s}.tif"
+ filename = "1/filename_with_version_number_1.tif"
p = Parser(fmt)
result = p.parse(filename)
- self.assertEqual(result['version_number'], '1')
+ self.assertEqual(result["version_number"], "1")
=====================================
trollsift/tests/regressiontests/test_parser.py
=====================================
@@ -24,10 +24,16 @@ from trollsift.parser import parse
class TestParser(unittest.TestCase):
-
def test_002(self):
- res = parse('hrpt16_{satellite:7s}_{start_time:%d-%b-%Y_%H:%M:%S.000}_{orbit_number:5d}',
- "hrpt16_NOAA-19_26-NOV-2014_10:12:00.000_29889")
- self.assertEqual(res, {'orbit_number': 29889,
- 'satellite': 'NOAA-19',
- 'start_time': dt.datetime(2014, 11, 26, 10, 12)})
+ res = parse(
+ "hrpt16_{satellite:7s}_{start_time:%d-%b-%Y_%H:%M:%S.000}_{orbit_number:5d}",
+ "hrpt16_NOAA-19_26-NOV-2014_10:12:00.000_29889",
+ )
+ self.assertEqual(
+ res,
+ {
+ "orbit_number": 29889,
+ "satellite": "NOAA-19",
+ "start_time": dt.datetime(2014, 11, 26, 10, 12),
+ },
+ )
=====================================
trollsift/tests/unittests/test_parser.py
=====================================
@@ -1,3 +1,5 @@
+"""Basic unit tests for the parser module."""
+
import unittest
import datetime as dt
import pytest
@@ -8,10 +10,8 @@ from trollsift.parser import parse, globify, validate, is_one2one, compose, Pars
class TestParser(unittest.TestCase):
-
def setUp(self):
- self.fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}" +\
- "_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
+ self.fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}" + "_{time:%Y%m%d_%H%M}_{orbit:05d}.l1b"
self.string = "/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b"
self.string2 = "/somedir/otherdir/hrpt_noaa16_20140210_1004_00022.l1b"
self.string3 = "/somedir/otherdir/hrpt_noaa16_20140210_1004_69022"
@@ -20,73 +20,116 @@ class TestParser(unittest.TestCase):
def test_parser_keys(self):
parser = Parser(self.fmt)
keys = {"directory", "platform", "platnum", "time", "orbit"}
- self.assertTrue(keys.issubset(parser.keys())
- and keys.issuperset(parser.keys()))
+ self.assertTrue(keys.issubset(parser.keys()) and keys.issuperset(parser.keys()))
def test_get_convert_dict(self):
# Run
result = get_convert_dict(self.fmt)
# Assert
- self.assertDictEqual(result, {
- 'directory': '',
- 'platform': '4s',
- 'platnum': '2s',
- 'time': '%Y%m%d_%H%M',
- 'orbit': '05d',
- })
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "",
+ "platform": "4s",
+ "platnum": "2s",
+ "time": "%Y%m%d_%H%M",
+ "orbit": "05d",
+ },
+ )
def test_extract_values(self):
fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:d}.l1b"
result = extract_values(fmt, self.string)
- self.assertDictEqual(result, {'directory': 'otherdir',
- 'platform': 'noaa', 'platnum': '16',
- 'time': '20140210_1004', 'orbit': '69022'})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": "20140210_1004",
+ "orbit": "69022",
+ },
+ )
def test_extract_values_end(self):
fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:d}"
result = extract_values(fmt, self.string3)
- self.assertDictEqual(result, {'directory': 'otherdir',
- 'platform': 'noaa', 'platnum': '16',
- 'time': '20140210_1004', 'orbit': '69022'})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": "20140210_1004",
+ "orbit": "69022",
+ },
+ )
def test_extract_values_beginning(self):
fmt = "{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:d}"
result = extract_values(fmt, self.string4)
- self.assertDictEqual(result, {'directory': '/somedir/otherdir',
- 'platform': 'noaa', 'platnum': '16',
- 'time': '20140210_1004', 'orbit': '69022'})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "/somedir/otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": "20140210_1004",
+ "orbit": "69022",
+ },
+ )
def test_extract_values_s4spair(self):
fmt = "{directory}/hrpt_{platform:4s}{platnum:s}_{time:%Y%m%d_%H%M}_{orbit:d}"
result = extract_values(fmt, self.string4)
- self.assertDictEqual(result, {'directory': '/somedir/otherdir',
- 'platform': 'noaa', 'platnum': '16',
- 'time': '20140210_1004', 'orbit': '69022'})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "/somedir/otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": "20140210_1004",
+ "orbit": "69022",
+ },
+ )
def test_extract_values_ss2pair(self):
fmt = "{directory}/hrpt_{platform:s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:d}"
result = extract_values(fmt, self.string4)
- self.assertDictEqual(result, {'directory': '/somedir/otherdir',
- 'platform': 'noaa', 'platnum': '16',
- 'time': '20140210_1004', 'orbit': '69022'})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "/somedir/otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": "20140210_1004",
+ "orbit": "69022",
+ },
+ )
def test_extract_values_ss2pair_end(self):
fmt = "{directory}/hrpt_{platform:s}{platnum:2s}"
result = extract_values(fmt, "/somedir/otherdir/hrpt_noaa16")
- self.assertDictEqual(result, {'directory': '/somedir/otherdir',
- 'platform': 'noaa', 'platnum': '16'})
+ self.assertDictEqual(
+ result,
+ {"directory": "/somedir/otherdir", "platform": "noaa", "platnum": "16"},
+ )
def test_extract_values_sdatetimepair_end(self):
fmt = "{directory}/hrpt_{platform:s}{date:%Y%m%d}"
result = extract_values(fmt, "/somedir/otherdir/hrpt_noaa20140212")
- self.assertDictEqual(result, {'directory': '/somedir/otherdir',
- 'platform': 'noaa', 'date': '20140212'})
+ self.assertDictEqual(
+ result,
+ {"directory": "/somedir/otherdir", "platform": "noaa", "date": "20140212"},
+ )
def test_extract_values_everything(self):
fmt = "{everything}"
result = extract_values(fmt, self.string)
self.assertDictEqual(
- result, {'everything': '/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b'})
+ result,
+ {"everything": "/somedir/otherdir/hrpt_noaa16_20140210_1004_69022.l1b"},
+ )
def test_extract_values_padding2(self):
fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:0>5d}.l1b"
@@ -96,41 +139,52 @@ class TestParser(unittest.TestCase):
# {'orbit': '0>5d'}, '.l1b']
result = extract_values(fmt, self.string2)
# Assert
- self.assertDictEqual(result, {'directory': 'otherdir',
- 'platform': 'noaa', 'platnum': '16',
- 'time': '20140210_1004', 'orbit': '00022'})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "otherdir",
+ "platform": "noaa",
+ "platnum": "16",
+ "time": "20140210_1004",
+ "orbit": "00022",
+ },
+ )
def test_extract_values_fails(self):
- fmt = '/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:4d}.l1b'
+ fmt = "/somedir/{directory}/hrpt_{platform:4s}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:4d}.l1b"
self.assertRaises(ValueError, extract_values, fmt, self.string)
def test_extract_values_full_match(self):
"""Test that a string must completely match."""
- fmt = '{orbit:05d}'
- val = extract_values(fmt, '12345')
- self.assertEqual(val, {'orbit': '12345'})
- self.assertRaises(ValueError, extract_values, fmt, '12345abc')
- val = extract_values(fmt, '12345abc', full_match=False)
- self.assertEqual(val, {'orbit': '12345'})
+ fmt = "{orbit:05d}"
+ val = extract_values(fmt, "12345")
+ self.assertEqual(val, {"orbit": "12345"})
+ self.assertRaises(ValueError, extract_values, fmt, "12345abc")
+ val = extract_values(fmt, "12345abc", full_match=False)
+ self.assertEqual(val, {"orbit": "12345"})
def test_convert_digits(self):
- self.assertEqual(_convert('d', '69022'), 69022)
- self.assertRaises(ValueError, _convert, 'd', '69dsf')
- self.assertEqual(_convert('d', '00022'), 22)
- self.assertEqual(_convert('4d', '69022'), 69022)
- self.assertEqual(_convert('_>10d', '_____69022'), 69022)
- self.assertEqual(_convert('%Y%m%d_%H%M', '20140210_1004'),
- dt.datetime(2014, 2, 10, 10, 4))
+ self.assertEqual(_convert("d", "69022"), 69022)
+ self.assertRaises(ValueError, _convert, "d", "69dsf")
+ self.assertEqual(_convert("d", "00022"), 22)
+ self.assertEqual(_convert("4d", "69022"), 69022)
+ self.assertEqual(_convert("_>10d", "_____69022"), 69022)
+ self.assertEqual(_convert("%Y%m%d_%H%M", "20140210_1004"), dt.datetime(2014, 2, 10, 10, 4))
def test_parse(self):
# Run
- result = parse(
- self.fmt, "/somedir/avhrr/2014/hrpt_noaa19_20140212_1412_12345.l1b")
+ result = parse(self.fmt, "/somedir/avhrr/2014/hrpt_noaa19_20140212_1412_12345.l1b")
# Assert
- self.assertDictEqual(result, {'directory': 'avhrr/2014',
- 'platform': 'noaa', 'platnum': '19',
- 'time': dt.datetime(2014, 2, 12, 14, 12),
- 'orbit': 12345})
+ self.assertDictEqual(
+ result,
+ {
+ "directory": "avhrr/2014",
+ "platform": "noaa",
+ "platnum": "19",
+ "time": dt.datetime(2014, 2, 12, 14, 12),
+ "orbit": 12345,
+ },
+ )
def test_parse_string_padding_syntax_with_and_without_s(self):
"""Test that, in string padding syntax, '' is equivalent to 's'.
@@ -139,8 +193,8 @@ class TestParser(unittest.TestCase):
* Type 's': String format. This is the default type for strings and may be omitted.
* Type None: The same as 's'.
"""
- result = parse('{foo}/{bar:_<8}', 'baz/qux_____')
- expected_result = parse('{foo}/{bar:_<8s}', 'baz/qux_____')
+ result = parse("{foo}/{bar:_<8}", "baz/qux_____")
+ expected_result = parse("{foo}/{bar:_<8s}", "baz/qux_____")
self.assertEqual(expected_result["foo"], "baz")
self.assertEqual(expected_result["bar"], "qux")
self.assertEqual(result, expected_result)
@@ -149,147 +203,175 @@ class TestParser(unittest.TestCase):
# Run
result = parse(
"hrpt_{platform}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}{ext}",
- "hrpt_noaa19_20140212_1412_12345.l1b")
+ "hrpt_noaa19_20140212_1412_12345.l1b",
+ )
# Assert
- self.assertDictEqual(result, {'platform': 'noaa', 'platnum': '19',
- 'time': dt.datetime(2014, 2, 12, 14, 12),
- 'orbit': 12345,
- 'ext': '.l1b'})
+ self.assertDictEqual(
+ result,
+ {
+ "platform": "noaa",
+ "platnum": "19",
+ "time": dt.datetime(2014, 2, 12, 14, 12),
+ "orbit": 12345,
+ "ext": ".l1b",
+ },
+ )
def test_parse_align(self):
- filepattern="H-000-{hrit_format:4s}__-{platform_name:4s}________-{channel_name:_<9s}-{segment:_<9s}-{start_time:%Y%m%d%H%M}-__"
+ filepattern = (
+ "H-000-{hrit_format:4s}__-{platform_name:4s}________-"
+ "{channel_name:_<9s}-{segment:_<9s}-{start_time:%Y%m%d%H%M}-__"
+ )
result = parse(filepattern, "H-000-MSG3__-MSG3________-IR_039___-000007___-201506051700-__")
- self.assertDictEqual(result, {'channel_name': 'IR_039',
- 'hrit_format': 'MSG3',
- 'platform_name': 'MSG3',
- 'segment': '000007',
- 'start_time': dt.datetime(2015, 6, 5, 17, 0)})
+ self.assertDictEqual(
+ result,
+ {
+ "channel_name": "IR_039",
+ "hrit_format": "MSG3",
+ "platform_name": "MSG3",
+ "segment": "000007",
+ "start_time": dt.datetime(2015, 6, 5, 17, 0),
+ },
+ )
def test_parse_digits(self):
"""Test when a digit field is shorter than the format spec."""
result = parse(
"hrpt_{platform}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:05d}{ext}",
- "hrpt_noaa19_20140212_1412_02345.l1b")
- self.assertDictEqual(result, {'platform': 'noaa', 'platnum': '19',
- 'time': dt.datetime(2014, 2, 12, 14, 12),
- 'orbit': 2345,
- 'ext': '.l1b'})
+ "hrpt_noaa19_20140212_1412_02345.l1b",
+ )
+ self.assertDictEqual(
+ result,
+ {
+ "platform": "noaa",
+ "platnum": "19",
+ "time": dt.datetime(2014, 2, 12, 14, 12),
+ "orbit": 2345,
+ "ext": ".l1b",
+ },
+ )
result = parse(
"hrpt_{platform}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:5d}{ext}",
- "hrpt_noaa19_20140212_1412_ 2345.l1b")
- self.assertDictEqual(result, {'platform': 'noaa', 'platnum': '19',
- 'time': dt.datetime(2014, 2, 12, 14, 12),
- 'orbit': 2345,
- 'ext': '.l1b'})
+ "hrpt_noaa19_20140212_1412_ 2345.l1b",
+ )
+ self.assertDictEqual(
+ result,
+ {
+ "platform": "noaa",
+ "platnum": "19",
+ "time": dt.datetime(2014, 2, 12, 14, 12),
+ "orbit": 2345,
+ "ext": ".l1b",
+ },
+ )
result = parse(
"hrpt_{platform}{platnum:2s}_{time:%Y%m%d_%H%M}_{orbit:_>5d}{ext}",
- "hrpt_noaa19_20140212_1412___345.l1b")
- self.assertDictEqual(result, {'platform': 'noaa', 'platnum': '19',
- 'time': dt.datetime(2014, 2, 12, 14, 12),
- 'orbit': 345,
- 'ext': '.l1b'})
+ "hrpt_noaa19_20140212_1412___345.l1b",
+ )
+ self.assertDictEqual(
+ result,
+ {
+ "platform": "noaa",
+ "platnum": "19",
+ "time": dt.datetime(2014, 2, 12, 14, 12),
+ "orbit": 345,
+ "ext": ".l1b",
+ },
+ )
def test_parse_bad_pattern(self):
"""Test when a digit field is shorter than the format spec."""
- self.assertRaises(ValueError, parse,
- "hrpt_{platform}{platnum:-=2s}_{time:%Y%m%d_%H%M}_{orbit:05d}{ext}",
- "hrpt_noaa19_20140212_1412_02345.l1b")
+ self.assertRaises(
+ ValueError,
+ parse,
+ "hrpt_{platform}{platnum:-=2s}_{time:%Y%m%d_%H%M}_{orbit:05d}{ext}",
+ "hrpt_noaa19_20140212_1412_02345.l1b",
+ )
def test_globify_simple(self):
# Run
- result = globify('{a}_{b}.end', {'a': 'a', 'b': 'b'})
+ result = globify("{a}_{b}.end", {"a": "a", "b": "b"})
# Assert
- self.assertEqual(result, 'a_b.end')
+ self.assertEqual(result, "a_b.end")
def test_globify_empty(self):
# Run
- result = globify('{a}_{b:4d}.end', {})
+ result = globify("{a}_{b:4d}.end", {})
# Assert
- self.assertEqual(result, '*_????.end')
+ self.assertEqual(result, "*_????.end")
def test_globify_noarg(self):
# Run
- result = globify('{a}_{b:4d}.end')
+ result = globify("{a}_{b:4d}.end")
# Assert
- self.assertEqual(result, '*_????.end')
+ self.assertEqual(result, "*_????.end")
def test_globify_known_lengths(self):
# Run
- result = globify('{directory}/{platform:4s}{satnum:2d}/{orbit:05d}',
- {'directory': 'otherdir',
- 'platform': 'noaa'})
+ result = globify(
+ "{directory}/{platform:4s}{satnum:2d}/{orbit:05d}",
+ {"directory": "otherdir", "platform": "noaa"},
+ )
# Assert
- self.assertEqual(result, 'otherdir/noaa??/?????')
+ self.assertEqual(result, "otherdir/noaa??/?????")
def test_globify_unknown_lengths(self):
# Run
- result = globify('hrpt_{platform_and_num}_' +
- '{date}_{time}_{orbit}.l1b',
- {'platform_and_num': 'noaa16'})
+ result = globify(
+ "hrpt_{platform_and_num}_" + "{date}_{time}_{orbit}.l1b",
+ {"platform_and_num": "noaa16"},
+ )
# Assert
- self.assertEqual(result, 'hrpt_noaa16_*_*_*.l1b')
+ self.assertEqual(result, "hrpt_noaa16_*_*_*.l1b")
def test_globify_datetime(self):
# Run
- result = globify('hrpt_{platform}{satnum}_' +
- '{time:%Y%m%d_%H%M}_{orbit}.l1b',
- {'platform': 'noaa',
- 'time': dt.datetime(2014, 2, 10, 12, 12)})
+ result = globify(
+ "hrpt_{platform}{satnum}_" + "{time:%Y%m%d_%H%M}_{orbit}.l1b",
+ {"platform": "noaa", "time": dt.datetime(2014, 2, 10, 12, 12)},
+ )
# Assert
- self.assertEqual(result, 'hrpt_noaa*_20140210_1212_*.l1b')
+ self.assertEqual(result, "hrpt_noaa*_20140210_1212_*.l1b")
def test_globify_partial_datetime(self):
# Run
- result = globify('hrpt_{platform:4s}{satnum:2d}_' +
- '{time:%Y%m%d_%H%M}_{orbit}.l1b',
- {'platform': 'noaa',
- 'time': (dt.datetime(2014, 2, 10, 12, 12),
- 'Ymd')})
+ result = globify(
+ "hrpt_{platform:4s}{satnum:2d}_" + "{time:%Y%m%d_%H%M}_{orbit}.l1b",
+ {"platform": "noaa", "time": (dt.datetime(2014, 2, 10, 12, 12), "Ymd")},
+ )
# Assert
- self.assertEqual(result, 'hrpt_noaa??_20140210_????_*.l1b')
+ self.assertEqual(result, "hrpt_noaa??_20140210_????_*.l1b")
def test_globify_datetime_nosub(self):
# Run
- result = globify('hrpt_{platform:4s}{satnum:2d}_' +
- '{time:%Y%m%d_%H%M}_{orbit}.l1b',
- {'platform': 'noaa'})
+ result = globify(
+ "hrpt_{platform:4s}{satnum:2d}_" + "{time:%Y%m%d_%H%M}_{orbit}.l1b",
+ {"platform": "noaa"},
+ )
# Assert
- self.assertEqual(result, 'hrpt_noaa??_????????_????_*.l1b')
+ self.assertEqual(result, "hrpt_noaa??_????????_????_*.l1b")
def test_validate(self):
# These cases are True
- self.assertTrue(
- validate(self.fmt, "/somedir/avhrr/2014/hrpt_noaa19_20140212_1412_12345.l1b"))
- self.assertTrue(
- validate(self.fmt, "/somedir/avhrr/2014/hrpt_noaa01_19790530_0705_00000.l1b"))
- self.assertTrue(validate(
- self.fmt, "/somedir/funny-char$dir/hrpt_noaa19_20140212_1412_12345.l1b"))
- self.assertTrue(
- validate(self.fmt, "/somedir//hrpt_noaa19_20140212_1412_12345.l1b"))
+ self.assertTrue(validate(self.fmt, "/somedir/avhrr/2014/hrpt_noaa19_20140212_1412_12345.l1b"))
+ self.assertTrue(validate(self.fmt, "/somedir/avhrr/2014/hrpt_noaa01_19790530_0705_00000.l1b"))
+ self.assertTrue(validate(self.fmt, "/somedir/funny-char$dir/hrpt_noaa19_20140212_1412_12345.l1b"))
+ self.assertTrue(validate(self.fmt, "/somedir//hrpt_noaa19_20140212_1412_12345.l1b"))
# These cases are False
- self.assertFalse(
- validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212_1412_1A345.l1b"))
- self.assertFalse(
- validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_2014021_1412_00000.l1b"))
- self.assertFalse(
- validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212__412_00000.l1b"))
- self.assertFalse(
- validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212__1412_00000.l1b"))
- self.assertFalse(
- validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212_1412_00000.l1"))
- self.assertFalse(
- validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212_1412_00000"))
- self.assertFalse(
- validate(self.fmt, "{}/somedir/bla/bla/hrpt_noaa19_20140212_1412_00000.l1b"))
+ self.assertFalse(validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212_1412_1A345.l1b"))
+ self.assertFalse(validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_2014021_1412_00000.l1b"))
+ self.assertFalse(validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212__412_00000.l1b"))
+ self.assertFalse(validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212__1412_00000.l1b"))
+ self.assertFalse(validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212_1412_00000.l1"))
+ self.assertFalse(validate(self.fmt, "/somedir/bla/bla/hrpt_noaa19_20140212_1412_00000"))
+ self.assertFalse(validate(self.fmt, "{}/somedir/bla/bla/hrpt_noaa19_20140212_1412_00000.l1b"))
def test_is_one2one(self):
# These cases are True
- self.assertTrue(is_one2one(
- "/somedir/{directory}/somedata_{platform:4s}_{time:%Y%d%m-%H%M}_{orbit:5d}.l1b"))
+ self.assertTrue(is_one2one("/somedir/{directory}/somedata_{platform:4s}_{time:%Y%d%m-%H%M}_{orbit:5d}.l1b"))
# These cases are False
- self.assertFalse(is_one2one(
- "/somedir/{directory}/somedata_{platform:4s}_{time:%Y%d%m-%H%M}_{orbit:d}.l1b"))
+ self.assertFalse(is_one2one("/somedir/{directory}/somedata_{platform:4s}_{time:%Y%d%m-%H%M}_{orbit:d}.l1b"))
def test_greediness(self):
"""Test that the minimum match is parsed out.
@@ -297,18 +379,19 @@ class TestParser(unittest.TestCase):
See GH #18.
"""
from trollsift import parse
- template = '{band_type}_{polarization_extracted}_{unit}_{s1_fname}'
- fname = 'Amplitude_VH_db_S1A_IW_GRDH_1SDV_20160528T171628_20160528T171653_011462_011752_0EED.tif'
+
+ template = "{band_type}_{polarization_extracted}_{unit}_{s1_fname}"
+ fname = "Amplitude_VH_db_S1A_IW_GRDH_1SDV_20160528T171628_20160528T171653_011462_011752_0EED.tif"
res_dict = parse(template, fname)
exp = {
- 'band_type': 'Amplitude',
- 'polarization_extracted': 'VH',
- 'unit': 'db',
- 's1_fname': 'S1A_IW_GRDH_1SDV_20160528T171628_20160528T171653_011462_011752_0EED.tif',
+ "band_type": "Amplitude",
+ "polarization_extracted": "VH",
+ "unit": "db",
+ "s1_fname": "S1A_IW_GRDH_1SDV_20160528T171628_20160528T171653_011462_011752_0EED.tif",
}
self.assertEqual(exp, res_dict)
- template = '{band_type:s}_{polarization_extracted}_{unit}_{s1_fname}'
+ template = "{band_type:s}_{polarization_extracted}_{unit}_{s1_fname}"
res_dict = parse(template, fname)
self.assertEqual(exp, res_dict)
@@ -316,7 +399,7 @@ class TestParser(unittest.TestCase):
class TestCompose:
"""Test routines related to `compose` methods."""
- @pytest.mark.parametrize('allow_partial', [False, True])
+ @pytest.mark.parametrize("allow_partial", [False, True])
def test_compose(self, allow_partial):
"""Test the compose method's custom conversion options."""
key_vals = {"a": "this Is A-Test b_test c test"}
@@ -358,7 +441,7 @@ class TestCompose:
composed = compose(
fmt=fmt,
keyvals={"platform_name": "foo", "format": "bar"},
- allow_partial=True
+ allow_partial=True,
)
assert composed == "{variant:s}/foo_{start_time:%Y%m%d_%H%M}_{product}.bar"
@@ -375,8 +458,8 @@ class TestCompose:
assert composed == "/foo/{start_time:%Y%m}/bar/{baz}_{start_time:%Y%m%d_%H%M}.qux"
@pytest.mark.parametrize(
- 'original_fmt',
- ["{}_{}", "{foo}{afooo}{fooo}.{bar}/{baz:%Y}/{baz:%Y%m%d_%H}/{baz:%Y}/{bar:d}"]
+ "original_fmt",
+ ["{}_{}", "{foo}{afooo}{fooo}.{bar}/{baz:%Y}/{baz:%Y%m%d_%H}/{baz:%Y}/{bar:d}"],
)
def test_partial_compose_is_identity_with_empty_keyvals(self, original_fmt):
"""Test that partial compose leaves the input untouched if no keyvals at all."""
@@ -392,66 +475,65 @@ class TestCompose:
class TestParserFixedPoint:
"""Test parsing of fixed point numbers."""
- @pytest.mark.parametrize('allow_partial_compose', [False, True])
+ @pytest.mark.parametrize("allow_partial_compose", [False, True])
@pytest.mark.parametrize(
- ('fmt', 'string', 'expected'),
+ ("fmt", "string", "expected"),
[
# Naive
- ('{foo:f}', '12.34', 12.34),
+ ("{foo:f}", "12.34", 12.34),
# Including width and precision
- ('{foo:5.2f}', '12.34', 12.34),
- ('{foo:5.2f}', '-1.23', -1.23),
- ('{foo:5.2f}', '12.34', 12.34),
- ('{foo:5.2f}', '123.45', 123.45),
+ ("{foo:5.2f}", "12.34", 12.34),
+ ("{foo:5.2f}", "-1.23", -1.23),
+ ("{foo:5.2f}", "12.34", 12.34),
+ ("{foo:5.2f}", "123.45", 123.45),
# Whitespace padded
- ('{foo:5.2f}', ' 1.23', 1.23),
- ('{foo:5.2f}', ' 12.34', 12.34),
+ ("{foo:5.2f}", " 1.23", 1.23),
+ ("{foo:5.2f}", " 12.34", 12.34),
# Zero padded
- ('{foo:05.2f}', '01.23', 1.23),
- ('{foo:05.2f}', '012.34', 12.34),
+ ("{foo:05.2f}", "01.23", 1.23),
+ ("{foo:05.2f}", "012.34", 12.34),
# Only precision, no width
- ('{foo:.2f}', '12.34', 12.34),
+ ("{foo:.2f}", "12.34", 12.34),
# Only width, no precision
- ('{foo:16f}', ' 1.12', 1.12),
+ ("{foo:16f}", " 1.12", 1.12),
# No digits before decimal point
- ('{foo:3.2f}', '.12', 0.12),
- ('{foo:4.2f}', '-.12', -0.12),
- ('{foo:4.2f}', ' .12', 0.12),
- ('{foo:4.2f}', ' .12', 0.12),
- ('{foo:16f}', ' .12', 0.12),
+ ("{foo:3.2f}", ".12", 0.12),
+ ("{foo:4.2f}", "-.12", -0.12),
+ ("{foo:4.2f}", " .12", 0.12),
+ ("{foo:4.2f}", " .12", 0.12),
+ ("{foo:16f}", " .12", 0.12),
# Exponential format
- ('{foo:7.2e}', '-1.23e4', -1.23e4)
- ]
+ ("{foo:7.2e}", "-1.23e4", -1.23e4),
+ ],
)
def test_match(self, allow_partial_compose, fmt, string, expected):
"""Test cases expected to be matched."""
-
# Test parsed value
parsed = parse(fmt, string)
- assert parsed['foo'] == expected
+ assert parsed["foo"] == expected
# Test round trip
- composed = compose(fmt, {'foo': expected}, allow_partial=allow_partial_compose)
+ composed = compose(fmt, {"foo": expected}, allow_partial=allow_partial_compose)
parsed = parse(fmt, composed)
- assert parsed['foo'] == expected
+ assert parsed["foo"] == expected
@pytest.mark.parametrize(
- ('fmt', 'string'),
+ ("fmt", "string"),
[
# Decimals incorrect
- ('{foo:5.2f}', '12345'),
- ('{foo:5.2f}', '1234.'),
- ('{foo:5.2f}', '1.234'),
- ('{foo:5.2f}', '123.4'),
- ('{foo:.2f}', '12.345'),
+ ("{foo:5.2f}", "12345"),
+ ("{foo:5.2f}", "1234."),
+ ("{foo:5.2f}", "1.234"),
+ ("{foo:5.2f}", "123.4"),
+ ("{foo:.2f}", "12.345"),
# Decimals correct, but width too short
- ('{foo:5.2f}', '1.23'),
- ('{foo:5.2f}', '.23'),
- ('{foo:10.2e}', '1.23e4'),
+ ("{foo:5.2f}", "1.23"),
+ ("{foo:5.2f}", ".23"),
+ ("{foo:10.2e}", "1.23e4"),
# Invalid
- ('{foo:5.2f}', '12_34'),
- ('{foo:5.2f}', 'aBcD'),
- ]
+ ("{foo:5.2f}", "12_34"),
+ ("{foo:5.2f}", "aBcD"),
+ ],
)
def test_no_match(self, fmt, string):
"""Test cases expected to not be matched."""
@@ -460,7 +542,7 @@ class TestParserFixedPoint:
@pytest.mark.parametrize(
- ('fmt', 'string', 'expected'),
+ ("fmt", "string", "expected"),
[
# Decimal
("{foo:d}", "123", 123),
@@ -484,7 +566,7 @@ class TestParserFixedPoint:
# Fixed length with binary
("{foo:8b}", " 1111011", 123),
("{foo:_>8b}", "_1111011", 123),
- ]
+ ],
)
def test_parse_integers(fmt, string, expected):
assert parse(fmt, string)["foo"] == expected
View it on GitLab: https://salsa.debian.org/debian-gis-team/trollsift/-/compare/afffcaad8db486b8661e480b7e07cc75ee56ac79...fb1a12416cf9b31e4daf73a68316f593f39e8679