[med-svn] [Git][med-team/python-ciso8601][upstream] New upstream version 2.2.0
Nilesh Patra (@nilesh)
gitlab at salsa.debian.org
Sat Aug 7 20:17:16 BST 2021
Nilesh Patra pushed to branch upstream at Debian Med / python-ciso8601
Commits:
3faec6b5 by Nilesh Patra at 2021-08-08T00:21:26+05:30
New upstream version 2.2.0
- - - - -
21 changed files:
- .circleci/config.yml
- CHANGELOG.md
- MANIFEST.in
- README.rst
- + benchmarking/Dockerfile
- benchmarking/README.rst
- benchmarking/format_results.py
- benchmarking/perform_comparison.py
- benchmarking/rst_include_replace.py
- benchmarking/run_benchmarks.sh
- benchmarking/tox.ini
- generate_test_timestamps.py
- module.c
- + pyproject.toml
- setup.py
- + tests/__init__.py
- + tests/test_timezone.py
- tests.py → tests/tests.py
- + timezone.c
- + timezone.h
- tox.ini
Changes:
=====================================
.circleci/config.yml
=====================================
@@ -1,67 +1,57 @@
-version: 2
+version: 2.1
workflows:
- version: 2
workflow:
jobs:
- - test-2.7
- - test-3.4
- - test-3.5
- - test-3.6
- - test-3.7
- - test-3.8
+ - test:
+ matrix:
+ parameters:
+ python_version: ["2.7", "3.4", "3.5", "3.6", "3.7", "3.8", "3.9"]
+ - test_pypy:
+ matrix:
+ parameters:
+ python_version: ["2.7", "3.7"]
- lint-rst
-defaults: &defaults
- working_directory: ~/code
- environment:
- STRICT_WARNINGS: '1'
- steps:
- - checkout
- - run:
- name: Test
- command: python setup.py test
-
jobs:
- test-2.7:
- <<: *defaults
- docker:
- - image: circleci/python:2.7
- test-3.4:
- <<: *defaults
- docker:
- - image: circleci/python:3.4
- test-3.5:
- <<: *defaults
- docker:
- - image: circleci/python:3.5
- test-3.6:
- <<: *defaults
- docker:
- - image: circleci/python:3.6
- test-3.7:
- <<: *defaults
+ test:
+ parameters:
+ python_version:
+ type: string
+ steps:
+ - checkout
+ - run:
+ name: Test
+ command: python setup.py test
docker:
- - image: circleci/python:3.7
- test-3.8:
- <<: *defaults
+ - image: circleci/python:<<parameters.python_version>>
+
+ test_pypy:
+ parameters:
+ python_version:
+ type: string
+ steps:
+ - checkout
+ - run:
+ name: Test
+ command: pypy setup.py test
docker:
- - image: circleci/python:3.8
+ - image: pypy:<<parameters.python_version>>
lint-rst:
working_directory: ~/code
steps:
- - checkout
- - run:
- name: Install lint tools
- command: |
+ - checkout
+ - run:
+ name: Install lint tools
+ command: |
python3 -m venv venv
. venv/bin/activate
pip install Pygments restructuredtext-lint
- - run:
- name: Lint
- command: |
+ - run:
+ name: Lint
+ command: |
. venv/bin/activate
rst-lint --encoding=utf-8 README.rst
docker:
- - image: circleci/python:3.8
+ - image: circleci/python:3.9
=====================================
CHANGELOG.md
=====================================
@@ -3,6 +3,7 @@
- [Unreleased](#unreleased)
- [2.x.x](#2xx)
+ - [Version 2.2.0](#version-220)
- [Version 2.1.3](#version-213)
- [Version 2.1.2](#version-212)
- [Version 2.1.1](#version-211)
@@ -14,16 +15,29 @@
- [v1.x.x -> 2.0.0 Migration guide](#v1xx---200-migration-guide)
- [ValueError instead of None](#valueerror-instead-of-none)
- [Tightened ISO 8601 conformance](#tightened-iso-8601-conformance)
- - [`parse_datetime_unaware` has been renamed](#parsedatetimeunaware-has-been-renamed)
+ - [`parse_datetime_unaware` has been renamed](#parse_datetime_unaware-has-been-renamed)
<!-- /TOC -->
# Unreleased
-* N/A
+*
# 2.x.x
+## Version 2.2.0
+
+* Added Python 3.9 support
+* Switched to using a C implementation of `timezone` objects.
+ * Much faster parse times for timestamps with timezone information
+ * ~2.5x faster on Python 2.7, ~10% faster on Python 3.9
+ * Thanks to [`pendulum`](https://github.com/sdispater/pendulum) and @sdispater for the code.
+ * Python 2.7 users no longer need to install `pytz` dependency :smiley:
+* Added caching of tzinfo objects
+ * Parsing is ~1.1x faster for subsequent timestamps that have the same time zone offset.
+ * Caching can be disabled at compile time by setting the `CISO8601_CACHING_ENABLED=0` environment variable
+* Fixed a memory leak in the case where an invalid timestamp had a non-UTC timezone and extra characters
+
## Version 2.1.3
* Fixed a problem where non-ASCII characters would give bad error messages (#84). Thanks @olliemath.
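
The tzinfo caching described in the 2.2.0 entry above can be illustrated with a short pure-Python sketch. This is not the C code that ciso8601 ships; `cached_fixed_offset` and `attach_offset` are hypothetical helpers, and the standard library `datetime.timezone` stands in for the new C `timezone` type. The idea is only that one tzinfo object is created per UTC offset and then reused for later timestamps with the same offset.

```python
from datetime import datetime, timedelta, timezone
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_fixed_offset(offset_minutes):
    """Return one shared tzinfo object per UTC offset (hypothetical helper)."""
    return timezone(timedelta(minutes=offset_minutes))


def attach_offset(naive_dt, offset_minutes):
    """Attach the cached fixed-offset tzinfo to a naive datetime."""
    return naive_dt.replace(tzinfo=cached_fixed_offset(offset_minutes))


# Two different timestamps with the same -05:30 offset.
first = attach_offset(datetime(2014, 1, 9, 21, 48), -330)
second = attach_offset(datetime(2014, 1, 10, 8, 15), -330)

# The second call reuses the object created by the first call.
same_object = first.tzinfo is second.tzinfo
```

Reusing the object avoids an allocation per parsed timestamp, which is presumably where the speed-up for repeated offsets comes from.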
=====================================
MANIFEST.in
=====================================
@@ -1,3 +1,4 @@
include LICENSE
include README.rst
include CHANGELOG.md
+include timezone.h
=====================================
README.rst
=====================================
@@ -14,7 +14,7 @@ ciso8601
``ciso8601`` converts `ISO 8601`_ or `RFC 3339`_ date time strings into Python datetime objects.
Since it's written as a C module, it is much faster than other Python libraries.
-Tested with Python 2.7, 3.4, 3.5, 3.6, 3.7, 3.8.
+Tested with CPython 2.7, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9.
**Note:** ciso8601 doesn't support the entirety of the ISO 8601 spec, `only a popular subset`_.
@@ -76,7 +76,7 @@ Parsing a timestamp with no time zone information (ex. ``2014-01-09T21:48:00``):
.. <include:benchmark_with_no_time_zone.rst>
-.. table::
+.. table::
+---------------+----------+----------+----------+----------+----------+-------------------------------+-----------------------------------------------+
| Module |Python 3.8|Python 3.7|Python 3.6|Python 3.5|Python 3.4| Python 2.7 |Relative Slowdown (versus ciso8601, Python 3.8)|
@@ -118,7 +118,7 @@ Parsing a timestamp with time zone information (ex. ``2014-01-09T21:48:00-05:30`
.. <include:benchmark_with_time_zone.rst>
-.. table::
+.. table::
+---------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+----------+-------------------------------+-----------------------------------------------+
| Module | Python 3.8 | Python 3.7 | Python 3.6 | Python 3.5 |Python 3.4| Python 2.7 |Relative Slowdown (versus ciso8601, Python 3.8)|
@@ -185,29 +185,6 @@ For full benchmarking details (or to run the benchmark yourself), see `benchmark
.. _`benchmarking/README.rst`: https://github.com/closeio/ciso8601/blob/master/benchmarking/README.rst
-Dependency on pytz (Python 2)
------------------------------
-
-In Python 2, ``ciso8601`` uses the `pytz`_ library while parsing timestamps with time zone information. This means that if you wish to parse such timestamps, you must first install ``pytz``:
-
-.. _pytz: http://pytz.sourceforge.net/
-
-.. code:: python
-
- pip install pytz
-
-Otherwise, ``ciso8601`` will raise an exception when you try to parse a timestamp with time zone information:
-
-.. code:: python
-
- In [2]: ciso8601.parse_datetime('2014-12-05T12:30:45.123456-05:30')
- Out[2]: ImportError: Cannot parse a timestamp with time zone information without the pytz dependency. Install it with `pip install pytz`.
-
-``pytz`` is intentionally not an explicit dependency of ``ciso8601``. This is because many users use ``ciso8601`` to parse only naive timestamps, and therefore don't need this extra dependency.
-In Python 3, ``ciso8601`` makes use of the built-in `datetime.timezone`_ class instead, so ``pytz`` is not necessary.
-
-.. _datetime.timezone: https://docs.python.org/3/library/datetime.html#timezone-objects
-
Supported Subset of ISO 8601
----------------------------
@@ -227,11 +204,11 @@ The following date formats are supported:
``YYYY-MM-DD`` ``2018-04-29`` ✅
``YYYY-MM`` ``2018-04`` ✅
``YYYYMMDD`` ``20180429`` ✅
- ``--MM-DD`` (omitted year) ``--04-29`` ❌
+ ``--MM-DD`` (omitted year) ``--04-29`` ❌
``--MMDD`` (omitted year) ``--0429`` ❌
- ``±YYYYY-MM`` (>4 digit year) ``+10000-04`` ❌
- ``+YYYY-MM`` (leading +) ``+2018-04`` ❌
- ``-YYYY-MM`` (negative -) ``-2018-04`` ❌
+ ``±YYYYY-MM`` (>4 digit year) ``+10000-04`` ❌
+ ``+YYYY-MM`` (leading +) ``+2018-04`` ❌
+ ``-YYYY-MM`` (negative -) ``-2018-04`` ❌
============================= ============== ==================
Week dates or ordinal dates are not currently supported.
@@ -247,7 +224,7 @@ Week dates or ordinal dates are not currently supported.
``YYYY-Www-D`` (week date) ``2009-W01-1`` ❌
``YYYYWwwD`` (week date) ``2009W011`` ❌
``YYYY-DDD`` (ordinal date) ``1981-095`` ❌
- ``YYYYDDD`` (ordinal date) ``1981095`` ❌
+ ``YYYYDDD`` (ordinal date) ``1981095`` ❌
============================= ============== ==================
Time Formats
@@ -264,22 +241,22 @@ The following time formats are supported:
.. table::
:widths: auto
- =================================== =================== ==============
- Format Example Supported
- =================================== =================== ==============
- ``hh`` ``11`` ✅
- ``hhmm`` ``1130`` ✅
- ``hh:mm`` ``11:30`` ✅
- ``hhmmss`` ``113059`` ✅
- ``hh:mm:ss`` ``11:30:59`` ✅
- ``hhmmss.ssssss`` ``113059.123456`` ✅
- ``hh:mm:ss.ssssss`` ``11:30:59.123456`` ✅
- ``hhmmss,ssssss`` ``113059,123456`` ✅
- ``hh:mm:ss,ssssss`` ``11:30:59,123456`` ✅
- Midnight (special case) ``24:00:00`` ✅
- ``hh.hhh`` (fractional hours) ``11.5`` ❌
- ``hh:mm.mmm`` (fractional minutes) ``11:30.5`` ❌
- =================================== =================== ==============
+ =================================== =================== ==============
+ Format Example Supported
+ =================================== =================== ==============
+ ``hh`` ``11`` ✅
+ ``hhmm`` ``1130`` ✅
+ ``hh:mm`` ``11:30`` ✅
+ ``hhmmss`` ``113059`` ✅
+ ``hh:mm:ss`` ``11:30:59`` ✅
+ ``hhmmss.ssssss`` ``113059.123456`` ✅
+ ``hh:mm:ss.ssssss`` ``11:30:59.123456`` ✅
+ ``hhmmss,ssssss`` ``113059,123456`` ✅
+ ``hh:mm:ss,ssssss`` ``11:30:59,123456`` ✅
+ Midnight (special case) ``24:00:00`` ✅
+ ``hh.hhh`` (fractional hours) ``11.5`` ❌
+ ``hh:mm.mmm`` (fractional minutes) ``11:30.5`` ❌
+ =================================== =================== ==============
**Note:** Python datetime objects only have microsecond precision (6 digits). Any additional precision will be truncated.
@@ -291,9 +268,9 @@ Time zone information may be provided in one of the following formats:
.. table::
:widths: auto
- ========== ========== ===========
- Format Example Supported
- ========== ========== ===========
+ ========== ========== ===========
+ Format Example Supported
+ ========== ========== ===========
``Z`` ``Z`` ✅
``z`` ``z`` ✅
``±hh`` ``+11`` ✅
=====================================
benchmarking/Dockerfile
=====================================
@@ -0,0 +1,40 @@
+FROM ubuntu
+
+RUN apt-get update && \
+ apt install -y software-properties-common && \
+ add-apt-repository ppa:deadsnakes/ppa && \
+ apt-get update
+
+# Install the Python versions
+RUN apt install -y python python-dev && \
+ apt install -y python3.5 python3.5-dev python3.5-venv && \
+ apt install -y python3.6 python3.6-dev python3.6-venv && \
+ apt install -y python3.7 python3.7-dev python3.7-venv && \
+ apt install -y python3.8 python3.8-dev python3.8-venv && \
+ apt install -y python3.9 python3.9-dev python3.9-venv
+
+# Install the other dependencies
+RUN apt-get install -y git curl gcc build-essential
+
+# Make Python 3.9 the default `python`
+RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.9 10
+
+# Get pip
+RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
+ python get-pip.py
+
+ADD requirements.txt requirements.txt
+
+# Install benchmarking dependencies
+RUN pip install -r requirements.txt
+
+# Work around https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/1899343, which messes with `moment`
+RUN echo "Etc/UTC" | tee /etc/timezone && \
+ dpkg-reconfigure --frontend noninteractive tzdata
+
+# Clone the upstream. If you want to use your local copy, run the container with a volume that overwrites this.
+RUN git clone https://github.com/closeio/ciso8601.git && \
+ chmod +x /ciso8601/benchmarking/run_benchmarks.sh
+
+WORKDIR /ciso8601/benchmarking
+ENTRYPOINT ./run_benchmarks.sh
=====================================
benchmarking/README.rst
=====================================
@@ -33,10 +33,20 @@ This runs the benchmarks and generates reStructuredText files. The contents of t
.. _`README.rst`: https://github.com/closeio/ciso8601/blob/master/README.rst
+Running benchmarks for all Python versions
+------------------------------------------
+
+To make it easier to run the benchmarks for all supported Python versions, there is a Dockerfile you can build:
+
+.. code:: bash
+
+ % docker build -t ciso8601_benchmarking .
+ % docker run -it --rm=true -v $(dirname `pwd`):/ciso8601 ciso8601_benchmarking
+
Running custom benchmarks
-------------------------
-Running a custom benchmark is done by supplying `tox`_ with your custom timestamp:
+Running a custom benchmark is done by supplying `tox`_ with your custom timestamp:
.. code:: bash
@@ -46,7 +56,7 @@ Running a custom benchmark is done by supplying `tox`_ with your custom timestam
% tox '2014-01-09T21:48:00'
It calls `perform_comparison.py`_ in each of the supported Python interpreters on your machine.
-This in turn calls `timeit`_ for each of the modules defined in ``ISO_8601_MODULES``.
+This in turn calls `timeit`_ for each of the modules defined in ``ISO_8601_MODULES``.
.. _`tox`: https://tox.readthedocs.io/en/latest/index.html
.. _`timeit`: https://docs.python.org/3/library/timeit.html
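
As a rough illustration of the `timeit` step described above, the sketch below times a single statement with `Timer.autorange` (available in Python 3.6+). It is a minimal example, not the logic of `perform_comparison.py`: `time_statement` is a hypothetical helper, and the standard library `datetime.fromisoformat` (Python 3.7+) is used as the statement so the example runs without any third-party parser installed.

```python
import timeit


def time_statement(setup, stmt):
    """Time `stmt` and return (iteration count, seconds per call)."""
    timer = timeit.Timer(stmt=stmt, setup=setup)
    # autorange() picks an iteration count so the total run takes >= 0.2 s
    # and returns (number_of_loops, total_time_in_seconds).
    count, time_taken = timer.autorange()
    return count, time_taken / count


count, per_call = time_statement(
    "from datetime import datetime",
    "datetime.fromisoformat('2014-01-09T21:48:00')",
)
print("%d loops, %.3g sec per call" % (count, per_call))
```

Dividing the total time by the iteration count gives a per-call figure that can be compared between modules within one Python version.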
@@ -70,6 +80,15 @@ Disclaimer
Because of the way that ``tox`` works (and the way the benchmark is structured more generally), it doesn't make sense to compare the results for a given module across different Python versions.
Comparisons between modules within the same Python version are still valid, and indeed, are the goal of the benchmarks.
+Caching
+-------
+
+``ciso8601`` caches the ``tzinfo`` objects it creates, allowing it to reuse those objects for faster creation of subsequent ``datetime`` objects.
+For some use cases (for example, some types of profiling), it makes sense not to have a cache.
+Caching can be disabled by modifying the `tox.ini`_ and changing ``CISO8601_CACHING_ENABLED`` to ``0``.
+
+.. _`tox.ini`: https://github.com/closeio/ciso8601/blob/master/benchmarking/tox.ini
+
FAQs
----
=====================================
benchmarking/format_results.py
=====================================
@@ -2,13 +2,35 @@ import argparse
import csv
import os
import platform
-import pytablewriter
import re
-import sys
-from collections import defaultdict, namedtuple
+from collections import defaultdict, UserDict
+
+import pytablewriter
+
+class Result:
+ def __init__(self, timing, parsed_value, exception, matched_expected):
+ self.timing = timing
+ self.parsed_value = parsed_value
+ self.exception = exception
+ self.matched_expected = matched_expected
+
+ def formatted_timing(self):
+ return format_duration(self.timing) if self.timing is not None else ""
+
+ def __str__(self):
+ if self.exception:
+ return f"Raised ``{self.exception}`` Exception"
+ elif not self.matched_expected:
+ return f"**Incorrect Result** (``{self.parsed_value}``)"
+ else:
+ return self.formatted_timing()
+
-Result = namedtuple('Result', ['timing', 'parsed_value', 'exception', 'matched_expected'])
+class ModuleResults(UserDict):
+ def most_modern_result(self):
+ non_exception_results = [(_python_version, result) for _python_version, result in self.data.items() if result.exception is None]
+ return sorted(non_exception_results, key=lambda kvp: kvp[0], reverse=True)[0][1]
FILENAME_REGEX_RAW = r"benchmark_timings_python(\d)(\d).csv"
FILENAME_REGEX = re.compile(FILENAME_REGEX_RAW)
@@ -19,7 +41,8 @@ MODULE_VERSION_FILENAME_REGEX = re.compile(MODULE_VERSION_FILENAME_REGEX_RAW)
UNITS = {"nsec": 1e-9, "usec": 1e-6, "msec": 1e-3, "sec": 1.0}
SCALES = sorted([(scale, unit) for unit, scale in UNITS.items()], reverse=True)
-NOT_APPLICABLE = 'N/A'
+NOT_APPLICABLE = "N/A"
+
def format_duration(duration):
# Based on cPython's `timeit` CLI formatting
@@ -28,26 +51,11 @@ def format_duration(duration):
return "%.*g %s" % (precision, duration / scale, unit)
-def format_relative(d1, d2):
- if d1 is None or d2 is None:
+def format_relative(duration1, duration2):
+ if duration1 is None or duration2 is None:
return NOT_APPLICABLE
precision = 1
- return "%.*fx" % (precision, d1 / d2)
-
-
-def determine_used_module_versions(results_directory):
- module_versions_used = defaultdict(dict)
- for parent, _dirs, files in os.walk(results_directory):
- files_to_process = [f for f in files if MODULE_VERSION_FILENAME_REGEX.match(f)]
- for csv_file in files_to_process:
- with open(os.path.join(parent, csv_file), 'r') as fin:
- reader = csv.reader(fin, delimiter=",", quotechar='"')
- major, minor = next(reader)
- for module, version in reader:
- if version not in module_versions_used[module]:
- module_versions_used[module][version] = set()
- module_versions_used[module][version].add('.'.join((major, minor)))
- return module_versions_used
+ return "%.*fx" % (precision, duration1 / duration2)
def format_used_module_versions(module_versions_used):
@@ -60,91 +68,120 @@ def format_used_module_versions(module_versions_used):
return results
-def format_result(result):
- if result == NOT_APPLICABLE:
- return NOT_APPLICABLE
- elif result.exception:
- return f"Raised ``{result.exception}`` Exception"
- elif not result.matched_expected:
- return f"**Incorrect Result** (``{result.parsed_value}``)"
- else:
- return format_duration(result.timing)
+def relative_slowdown(subject, comparison):
+ most_modern_common_version = next(iter(sorted(set(subject.keys()).intersection(set(comparison)), reverse=True)), None)
+
+ if not most_modern_common_version:
+ raise ValueError("No common Python version found")
+
+ return format_relative(subject[most_modern_common_version].timing, comparison[most_modern_common_version].timing)
-def main(results_directory, output_file, compare_to, include_call, module_version_output):
+def filepaths(directory, condition):
+ return [os.path.join(parent, f) for parent, _dirs, files in os.walk(directory) for f in files if condition(f)]
+
+def load_benchmarking_results(results_directory):
calling_code = {}
timestamps = set()
- all_results = defaultdict(dict)
- timing_results = defaultdict(dict)
-
- for parent, _dirs, files in os.walk(results_directory):
- files_to_process = [f for f in files if FILENAME_REGEX.match(f)]
- for csv_file in files_to_process:
- try:
- with open(os.path.join(parent, csv_file), 'r') as fin:
- reader = csv.reader(fin, delimiter=",", quotechar='"')
- major, minor, timestamp = next(reader)
- timestamps.add(timestamp)
- for module, _setup, stmt, parse_result, count, time_taken, matched, exception in reader:
- all_results[(major, minor)][module] = Result(float(time_taken) / int(count),
- parse_result,
- exception,
- True if matched == "True" else False
- )
- timing_results[(major, minor)][module] = all_results[(major, minor)][module].timing
- calling_code[module] = f"``{stmt.format(timestamp=timestamp)}``"
- except:
- print(f"Problem while parsing `{os.path.join(parent, csv_file)}`")
- raise
+ python_versions = set()
+ results = defaultdict(ModuleResults)
+ files_to_process = filepaths(results_directory, FILENAME_REGEX.match)
+ for csv_file in files_to_process:
+ try:
+ with open(csv_file, "r") as fin:
+ reader = csv.reader(fin, delimiter=",", quotechar='"')
+ major, minor, timestamp = next(reader)
+ timestamps.add(timestamp)
+ for module, _setup, stmt, parse_result, count, time_taken, matched, exception in reader:
+ timing = float(time_taken) / int(count) if exception == "" else None
+ exception = exception if exception != "" else None
+ results[module][(major, minor)] = Result(
+ timing,
+ parse_result,
+ exception,
+ matched == "True"
+ )
+ python_versions.add((major, minor))
+ calling_code[module] = f"``{stmt.format(timestamp=timestamp)}``"
+ except Exception:
+ print(f"Problem while parsing `{csv_file}`")
+ raise
if len(timestamps) > 1:
raise NotImplementedError(f"Found a mix of files in the results directory. Found files that represent the parsing of {timestamps}. Support for handling multiple timestamps is not implemented.")
- all_modules = set([module for value in timing_results.values() for module in value.keys()])
- python_versions_by_modernity = sorted(timing_results.keys(), reverse=True)
- most_modern_python = python_versions_by_modernity[0]
- modules_by_modern_speed = sorted(all_modules, key=lambda module: timing_results[most_modern_python][module])
+ python_versions_by_modernity = sorted(python_versions, reverse=True)
+ return results, python_versions_by_modernity, calling_code
+
+
+def write_benchmarking_results(results_directory, output_file, baseline_module, include_call):
+ results, python_versions_by_modernity, calling_code = load_benchmarking_results(results_directory)
+ modules_by_modern_speed = [module for module, results in sorted([*results.items()], key=lambda kvp: kvp[1].most_modern_result().timing)]
writer = pytablewriter.RstGridTableWriter()
formatted_python_versions = ["Python {}".format(".".join(key)) for key in python_versions_by_modernity]
- writer.header_list = ["Module"] + (["Call"] if include_call else []) + formatted_python_versions + [f"Relative Slowdown (versus {compare_to}, {formatted_python_versions[0]})"]
+ writer.header_list = ["Module"] + (["Call"] if include_call else []) + formatted_python_versions + [f"Relative Slowdown (versus {baseline_module}, latest Python)"]
writer.type_hint_list = [pytablewriter.String] * len(writer.header_list)
-
calling_codes = [calling_code[module] for module in modules_by_modern_speed]
- performance_results = [[format_result(all_results[python_version].get(module, NOT_APPLICABLE)) for python_version in python_versions_by_modernity] for module in modules_by_modern_speed]
- relative_slowdowns = [format_relative(timing_results[most_modern_python].get(module), timing_results[most_modern_python].get(compare_to)) if module != compare_to else NOT_APPLICABLE for module in modules_by_modern_speed]
-
+ performance_results = [[results[module].get(python_version, NOT_APPLICABLE) for python_version in python_versions_by_modernity] for module in modules_by_modern_speed]
+ relative_slowdowns = [relative_slowdown(results[module], results[baseline_module]) if module != baseline_module else NOT_APPLICABLE for module in modules_by_modern_speed]
+
writer.value_matrix = [
[module] + ([calling_code[module]] if include_call else []) + performance_by_version + [relative_slowdown] for module, calling_code, performance_by_version, relative_slowdown in zip(modules_by_modern_speed, calling_codes, performance_results, relative_slowdowns)
]
- with open(output_file, 'w') as fout:
+ with open(output_file, "w") as fout:
writer.stream = fout
writer.write_table()
- fout.write('\n')
+ fout.write("\n")
- if modules_by_modern_speed[0] == compare_to:
- fout.write(f"{compare_to} takes {format_duration(timing_results[most_modern_python][compare_to])}, which is **{format_relative(timing_results[most_modern_python][modules_by_modern_speed[1]], timing_results[most_modern_python][compare_to])} faster than {modules_by_modern_speed[1]}**, the next fastest ISO 8601 parser in this comparison.\n")
- else:
- fout.write(f"{compare_to} takes {format_duration(timing_results[most_modern_python][compare_to])}, which is **{format_relative(timing_results[most_modern_python][compare_to], timing_results[most_modern_python][modules_by_modern_speed[0]])} slower than {modules_by_modern_speed[0]}**, the fastest ISO 8601 parser in this comparison.\n")
+ if len(modules_by_modern_speed) > 1:
+ baseline_module_timing = results[baseline_module].most_modern_result().formatted_timing()
+
+ fastest_module, next_fastest_module = modules_by_modern_speed[0:2]
+ if fastest_module == baseline_module:
+ fout.write(f"{baseline_module} takes {baseline_module_timing}, which is **{relative_slowdown(results[next_fastest_module], results[baseline_module])} faster than {next_fastest_module}**, the next fastest ISO 8601 parser in this comparison.\n")
+ else:
+ fout.write(f"{baseline_module} takes {baseline_module_timing}, which is **{relative_slowdown(results[baseline_module], results[fastest_module])} slower than {fastest_module}**, the fastest ISO 8601 parser in this comparison.\n")
- with open(os.path.join(os.path.dirname(output_file), module_version_output), 'w') as fout:
+
+def load_module_version_info(results_directory):
+ module_versions_used = defaultdict(dict)
+ files_to_process = filepaths(results_directory, MODULE_VERSION_FILENAME_REGEX.match)
+ for csv_file in files_to_process:
+ with open(csv_file, "r") as fin:
+ reader = csv.reader(fin, delimiter=",", quotechar='"')
+ major, minor = next(reader)
+ for module, version in reader:
+ if version not in module_versions_used[module]:
+ module_versions_used[module][version] = set()
+ module_versions_used[module][version].add(".".join((major, minor)))
+ return module_versions_used
+
+
+def write_module_version_info(results_directory, output_file):
+ with open(output_file, "w") as fout:
fout.write(f"Tested on {platform.system()} {platform.release()} using the following modules:\n")
- fout.write('\n')
+ fout.write("\n")
fout.write(".. code:: python\n")
- fout.write('\n')
- for module_version_line in format_used_module_versions(determine_used_module_versions(results_directory)):
+ fout.write("\n")
+ for module_version_line in format_used_module_versions(load_module_version_info(results_directory)):
fout.write(f" {module_version_line}\n")
-if __name__ == '__main__':
+def main(results_directory, output_file, baseline_module, include_call, module_version_output):
+ write_benchmarking_results(results_directory, output_file, baseline_module, include_call)
+ write_module_version_info(results_directory, os.path.join(os.path.dirname(output_file), module_version_output))
+
+
+if __name__ == "__main__":
OUTPUT_FILE_HELP = "The filepath to use when outputting the reStructuredText results."
RESULTS_DIR_HELP = f"Which directory the script should look in to find benchmarking results. Will process any file that match the regexes '{FILENAME_REGEX_RAW}' and '{MODULE_VERSION_FILENAME_REGEX_RAW}'."
- BASE_LIBRARY_DEFAULT = "ciso8601"
- BASE_LIBRARY_HELP = f"The module to make all relative calculations relative to (default: \"{BASE_LIBRARY_DEFAULT}\")."
+ BASELINE_LIBRARY_DEFAULT = "ciso8601"
+ BASELINE_LIBRARY_HELP = f'The module to make all relative calculations relative to (default: "{BASELINE_LIBRARY_DEFAULT}").'
INCLUDE_CALL_DEFAULT = False
INCLUDE_CALL_HELP = f"Whether or not to include a column showing the actual code call (default: {INCLUDE_CALL_DEFAULT})."
@@ -155,7 +192,7 @@ if __name__ == '__main__':
parser = argparse.ArgumentParser("Formats the benchmarking results into a nicely formatted block of reStructuredText for use in the README.")
parser.add_argument("RESULTS", help=RESULTS_DIR_HELP)
parser.add_argument("OUTPUT", help=OUTPUT_FILE_HELP)
- parser.add_argument("--base-module", required=False, default=BASE_LIBRARY_DEFAULT, help=BASE_LIBRARY_HELP)
+ parser.add_argument("--baseline-module", required=False, default=BASELINE_LIBRARY_DEFAULT, help=BASELINE_LIBRARY_HELP)
parser.add_argument("--include-call", required=False, type=bool, default=INCLUDE_CALL_DEFAULT, help=INCLUDE_CALL_HELP)
parser.add_argument("--module-version-output", required=False, default=MODULE_VERSION_OUTPUT_FILE_DEFAULT, help=MODULE_VERSION_OUTPUT_FILE_HELP)
@@ -164,4 +201,4 @@ if __name__ == '__main__':
if not os.path.exists(args.RESULTS):
raise ValueError(f'Results directory "{args.RESULTS}" does not exist.')
- main(args.RESULTS, args.OUTPUT, args.base_module, args.include_call, args.module_version_output)
+ main(args.RESULTS, args.OUTPUT, args.baseline_module, args.include_call, args.module_version_output)
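
The two formatting helpers touched by this diff can be exercised on their own. The sketch below copies `UNITS`, `SCALES`, `NOT_APPLICABLE` and `format_relative` from the hunks above; the unit-selection loop inside `format_duration` is an assumption, because that part of the function falls outside the hunks shown here.

```python
UNITS = {"nsec": 1e-9, "usec": 1e-6, "msec": 1e-3, "sec": 1.0}
SCALES = sorted([(scale, unit) for unit, scale in UNITS.items()], reverse=True)
NOT_APPLICABLE = "N/A"


def format_duration(duration):
    # Assumed selection logic: use the largest unit that does not exceed
    # the duration, falling back to the smallest unit (nsec).
    for scale, unit in SCALES:
        if duration >= scale:
            break
    precision = 3
    return "%.*g %s" % (precision, duration / scale, unit)


def format_relative(duration1, duration2):
    # A missing timing on either side (e.g. the module raised an
    # exception) cannot be expressed as a ratio.
    if duration1 is None or duration2 is None:
        return NOT_APPLICABLE
    precision = 1
    return "%.*fx" % (precision, duration1 / duration2)


print(format_duration(0.5))         # 500 msec
print(format_relative(3.0, 2.0))    # 1.5x
print(format_relative(None, 2.0))   # N/A
```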
=====================================
benchmarking/perform_comparison.py
=====================================
@@ -1,43 +1,72 @@
import argparse
import csv
import os
-import pytz
import sys
import timeit
from datetime import datetime
+import pytz
+
+
try:
from importlib.metadata import version as get_module_version
except ImportError:
from importlib_metadata import version as get_module_version
ISO_8601_MODULES = {
- "aniso8601": ('import aniso8601', "aniso8601.parse_datetime('{timestamp}')"),
- "ciso8601": ('import ciso8601', "ciso8601.parse_datetime('{timestamp}')"),
- "python-dateutil": ('import dateutil.parser', "dateutil.parser.parse('{timestamp}')"),
- "iso8601": ('import iso8601', "iso8601.parse_date('{timestamp}')"),
- "iso8601utils": ('from iso8601utils import parsers', "parsers.datetime('{timestamp}')"),
- "isodate": ('import isodate', "isodate.parse_datetime('{timestamp}')"),
- "maya": ('import maya', "maya.parse('{timestamp}').datetime()"),
- "pendulum": ('from pendulum.parsing import parse_iso8601', "parse_iso8601('{timestamp}')"),
- "PySO8601": ('import PySO8601', "PySO8601.parse('{timestamp}')"),
- "str2date": ('from str2date import str2date', "str2date('{timestamp}')"),
+ "aniso8601": ("import aniso8601", "aniso8601.parse_datetime('{timestamp}')"),
+ "ciso8601": ("import ciso8601", "ciso8601.parse_datetime('{timestamp}')"),
+ "python-dateutil": ("import dateutil.parser", "dateutil.parser.parse('{timestamp}')"),
+ "iso8601": ("import iso8601", "iso8601.parse_date('{timestamp}')"),
+ "isodate": ("import isodate", "isodate.parse_datetime('{timestamp}')"),
+ "maya": ("import maya", "maya.parse('{timestamp}').datetime()"),
+ "pendulum": ("from pendulum.parsing import parse_iso8601", "parse_iso8601('{timestamp}')"),
+ "PySO8601": ("import PySO8601", "PySO8601.parse('{timestamp}')"),
+ "str2date": ("from str2date import str2date", "str2date('{timestamp}')"),
}
-if os.name != 'nt':
+if os.name != "nt" and (sys.version_info.major, sys.version_info.minor) < (3, 9):
# udatetime doesn't support Windows.
- ISO_8601_MODULES["udatetime"] = ('import udatetime', "udatetime.from_string('{timestamp}')")
+ ISO_8601_MODULES["udatetime"] = ("import udatetime", "udatetime.from_string('{timestamp}')")
+
+if (sys.version_info.major, sys.version_info.minor) >= (3, 6):
+ # zulu v2.0.0+ no longer supports Python < 3.6
+ ISO_8601_MODULES["zulu"] = ("import zulu", "zulu.parse('{timestamp}')")
-if sys.version_info.major > 2:
- # zulu no longer supports Python 2.7
- ISO_8601_MODULES["zulu"] = ('import zulu', "zulu.parse('{timestamp}')")
+if (sys.version_info.major, sys.version_info.minor) != (3, 6):
+ # iso8601utils installs enum34, which messes with tox in Python 3.6
+ # https://stackoverflow.com/q/43124775
+ ISO_8601_MODULES["iso8601utils"] = ("from iso8601utils import parsers", "parsers.datetime('{timestamp}')")
if (sys.version_info.major, sys.version_info.minor) != (3, 4):
# arrow no longer supports Python 3.4
- ISO_8601_MODULES["arrow"] = ('import arrow', "arrow.get('{timestamp}').datetime")
+ ISO_8601_MODULES["arrow"] = ("import arrow", "arrow.get('{timestamp}').datetime")
# moment is built on `times`, which is built on `arrow`, which no longer supports Python 3.4
- ISO_8601_MODULES["moment"] = ('import moment', "moment.date('{timestamp}').date")
+ ISO_8601_MODULES["moment"] = ("import moment", "moment.date('{timestamp}').date")
+
+class Result:
+ def __init__(self, module, setup, stmt, parse_result, count, time_taken, matched, exception):
+ self.module = module
+ self.setup = setup
+ self.stmt = stmt
+ self.parse_result = parse_result
+ self.count = count
+ self.time_taken = time_taken
+ self.matched = matched
+ self.exception = exception
+
+ def to_row(self):
+ return [
+ self.module,
+ self.setup,
+ self.stmt,
+ self.parse_result,
+ self.count,
+ self.time_taken,
+ self.matched,
+ self.exception
+ ]
def check_roughly_equivalent(dt1, dt2):
# For the purposes of our benchmarking, we don't care if the datetime
@@ -46,72 +75,102 @@ def check_roughly_equivalent(dt1, dt2):
dt2 = dt2.replace(tzinfo=pytz.UTC) if isinstance(dt2, datetime) and dt2.tzinfo is None else dt2
return dt1 == dt2
+def auto_range_counts(filepath):
+    results = {}
+    if os.path.exists(filepath):
+        with open(filepath, "r") as fin:
+            reader = csv.reader(fin, delimiter=",", quotechar='"')
+            for module, count in reader:
+                results[module] = int(count)
+    return results
+
+def update_auto_range_counts(filepath, results):
+    new_counts = dict([[result.module, result.count] for result in results if result.count is not None])
+    new_auto_range_counts = auto_range_counts(filepath)
+    new_auto_range_counts.update(new_counts)
+    with open(filepath, "w") as fout:
+        auto_range_file_writer = csv.writer(fout, delimiter=",", quotechar='"', lineterminator="\n")
+        for module, count in sorted(new_auto_range_counts.items()):
+            auto_range_file_writer.writerow([module, count])
+
+def write_results(filepath, timestamp, results):
+    with open(filepath, "w") as fout:
+        writer = csv.writer(fout, delimiter=",", quotechar='"', lineterminator="\n")
+        writer.writerow([sys.version_info.major, sys.version_info.minor, timestamp])
+        for result in results:
+            writer.writerow(result.to_row())
+
+def write_module_versions(filepath):
+    with open(filepath, "w") as fout:
+        module_version_writer = csv.writer(fout, delimiter=",", quotechar='"', lineterminator="\n")
+        module_version_writer.writerow([sys.version_info.major, sys.version_info.minor])
+        for module, (_setup, _stmt) in sorted(ISO_8601_MODULES.items(), key=lambda x: x[0].lower()):
+            module_version_writer.writerow([module, get_module_version(module)])
def run_tests(timestamp, results_directory, compare_to):
# `Timer.autorange` only exists in Python 3.6+. We want the tests to run in a reasonable amount of time,
# but we don't want to have to hard-code how many times to run each test.
# So we make sure to call Python 3.6+ versions first. They output a file that the others use to know how many iterations to run.
- test_interation_counts = {}
- auto_range_file_obj = None
- auto_range_file_writer = None
- try:
- if (sys.version_info.major == 3 and sys.version_info.minor >= 6) or sys.version_info.major > 3:
- auto_range_file_obj = open(os.path.join(results_directory, "auto_range_counts.csv"), 'w')
- auto_range_file_writer = csv.writer(auto_range_file_obj, delimiter=',', quotechar='"', lineterminator='\n')
- else:
- with open(os.path.join(results_directory, "auto_range_counts.csv"), "r") as fin:
- reader = csv.reader(fin, delimiter=',', quotechar='"')
- for module, count in reader:
- test_interation_counts[module] = int(count)
-
- exec(ISO_8601_MODULES[compare_to][0])
- expected_parse_result = eval(ISO_8601_MODULES[compare_to][1].format(timestamp=timestamp))
-
- with open(os.path.join(results_directory, "benchmark_timings_python{major}{minor}.csv".format(major=sys.version_info.major, minor=sys.version_info.minor)), 'w') as fout:
- writer = csv.writer(fout, delimiter=',', quotechar='"', lineterminator='\n')
- writer.writerow([sys.version_info.major, sys.version_info.minor, timestamp])
- for module, (setup, stmt) in ISO_8601_MODULES.items():
- count = None
- time_taken = None
- exception = None
- try:
- exec(setup)
- parse_result = eval(stmt.format(timestamp=timestamp))
-
- if module in test_interation_counts:
- count = test_interation_counts[module]
- timer = timeit.Timer(stmt=stmt.format(timestamp=timestamp), setup=setup)
- time_taken = timer.timeit(number=count)
- else:
- timer = timeit.Timer(stmt=stmt.format(timestamp=timestamp), setup=setup)
- count, time_taken = timer.autorange()
- except Exception as exc:
- parse_result = None
- exception = type(exc)
-
- writer.writerow([module, setup, stmt.format(timestamp=timestamp), parse_result if parse_result is not None else "None", count, time_taken, check_roughly_equivalent(parse_result, expected_parse_result), exception])
-
- if auto_range_file_writer is not None:
- auto_range_file_writer.writerow([module, count])
- finally:
- if auto_range_file_obj is not None:
- auto_range_file_obj.close()
-
- with open(os.path.join(results_directory, "module_versions_python{major}{minor}.csv".format(major=sys.version_info.major, minor=sys.version_info.minor)), 'w') as fout:
- module_version_writer = csv.writer(fout, delimiter=',', quotechar='"', lineterminator='\n')
- module_version_writer.writerow([sys.version_info.major, sys.version_info.minor])
- for module, (setup, stmt) in sorted(ISO_8601_MODULES.items(), key=lambda x: x[0].lower()):
- module_version_writer.writerow([module, get_module_version(module)])
-
-
-if __name__ == '__main__':
+ auto_range_count_filepath = os.path.join(results_directory, "auto_range_counts.csv")
+ test_interation_counts = auto_range_counts(auto_range_count_filepath)
+
+ exec(ISO_8601_MODULES[compare_to][0])
+ expected_parse_result = eval(ISO_8601_MODULES[compare_to][1].format(timestamp=timestamp))
+
+ results = []
+
+ for module, (setup, stmt) in ISO_8601_MODULES.items():
+ count = None
+ time_taken = None
+ exception = None
+ try:
+ exec(setup)
+ parse_result = eval(stmt.format(timestamp=timestamp))
+
+ timer = timeit.Timer(stmt=stmt.format(timestamp=timestamp), setup=setup)
+ if hasattr(timer, 'autorange'):
+ count, time_taken = timer.autorange()
+ else:
+ count = test_interation_counts[module]
+ time_taken = timer.timeit(number=count)
+ except Exception as exc:
+ count = None
+ time_taken = None
+ parse_result = None
+ exception = type(exc)
+
+ results.append(
+ Result(
+ module,
+ setup,
+ stmt.format(timestamp=timestamp),
+ parse_result if parse_result is not None else "None",
+ count,
+ time_taken,
+ check_roughly_equivalent(parse_result, expected_parse_result),
+ exception,
+ )
+ )
+
+ update_auto_range_counts(auto_range_count_filepath, results)
+
+ results_filepath = os.path.join(results_directory, "benchmark_timings_python{major}{minor}.csv".format(major=sys.version_info.major, minor=sys.version_info.minor))
+ write_results(results_filepath, timestamp, results)
+
+ module_versions_filepath = os.path.join(results_directory, "module_versions_python{major}{minor}.csv".format(major=sys.version_info.major, minor=sys.version_info.minor))
+ write_module_versions(module_versions_filepath)
+
+def sanitize_timestamp_as_filename(timestamp):
+ return timestamp.replace(":", "")
+
+if __name__ == "__main__":
TIMESTAMP_HELP = "Which ISO 8601 timestamp to parse"
BASE_LIBRARY_DEFAULT = "ciso8601"
- BASE_LIBRARY_HELP = "The module to make correctness decisions relative to (default: \"{default}\").".format(default=BASE_LIBRARY_DEFAULT)
+ BASE_LIBRARY_HELP = 'The module to make correctness decisions relative to (default: "{default}").'.format(default=BASE_LIBRARY_DEFAULT)
RESULTS_DIR_DEFAULT = "benchmark_results"
- RESULTS_DIR_HELP = "Which directory the script should output benchmarking results. (default: \"{0}\")".format(RESULTS_DIR_DEFAULT)
+ RESULTS_DIR_HELP = 'Which directory the script should output benchmarking results. (default: "{0}")'.format(RESULTS_DIR_DEFAULT)
parser = argparse.ArgumentParser("Runs `timeit` to benchmark a variety of ISO 8601 parsers.")
parser.add_argument("TIMESTAMP", help=TIMESTAMP_HELP)
@@ -119,7 +178,7 @@ if __name__ == '__main__':
parser.add_argument("--results", required=False, default=RESULTS_DIR_DEFAULT, help=RESULTS_DIR_HELP)
args = parser.parse_args()
- output_dir = os.path.join(args.results, args.TIMESTAMP.replace(":", ""))
+ output_dir = os.path.join(args.results, sanitize_timestamp_as_filename(args.TIMESTAMP))
if not os.path.exists(output_dir):
os.makedirs(output_dir)
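
Note on the `perform_comparison.py` change above: the refactored `run_tests` now checks for `Timer.autorange` with `hasattr` instead of comparing interpreter versions, and falls back to a previously recorded iteration count otherwise. A minimal standalone sketch of that pattern follows. The helper name `time_statement` and the `fallback_count` parameter are illustrative assumptions and do not appear in the commit.

```python
import timeit


def time_statement(stmt, setup="pass", fallback_count=1000):
    """Return (count, seconds) for running `stmt` repeatedly.

    Hypothetical helper, not part of the commit: it only mirrors the
    hasattr-based dispatch used in the refactored run_tests().
    """
    timer = timeit.Timer(stmt=stmt, setup=setup)
    if hasattr(timer, "autorange"):
        # Python 3.6+: timeit chooses the iteration count itself.
        return timer.autorange()
    # Older interpreters: reuse a count recorded by a newer interpreter.
    return fallback_count, timer.timeit(number=fallback_count)


count, seconds = time_statement("sum(range(100))")
print(count, seconds)
```

In the commit itself the fallback count comes from `auto_range_counts.csv`, which is why the Python 3.6+ environments have to run first.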
=====================================
benchmarking/rst_include_replace.py
=====================================
@@ -20,12 +20,12 @@ def replace_include(target_filepath, include_file, source_filepath):
start_block_regex = re.compile(INCLUDE_BLOCK_START.format(filename=include_file))
end_block_regex = re.compile(INCLUDE_BLOCK_END.format(filename=include_file))
- with open(source_filepath, 'r') as fin:
+ with open(source_filepath, "r") as fin:
replacement_lines = iter(fin.readlines())
- with open(target_filepath, 'r') as fin:
+ with open(target_filepath, "r") as fin:
target_lines = iter(fin.readlines())
- with open(target_filepath, 'w') as fout:
+ with open(target_filepath, "w") as fout:
for line in target_lines:
if start_block_regex.match(line):
fout.write(line)
@@ -44,7 +44,7 @@ def replace_include(target_filepath, include_file, source_filepath):
fout.write(line)
-if __name__ == '__main__':
+if __name__ == "__main__":
TARGET_HELP = "The filepath you wish to replace tags within."
INCLUDE_TAG_HELP = "The filename within the tag you are hoping to replace. (ex. 'benchmark_with_time_zone.rst')"
SOURCE_HELP = "The filepath whose contents should be included into the TARGET file."
@@ -57,9 +57,9 @@ if __name__ == '__main__':
args = parser.parse_args()
if not os.path.exists(args.TARGET):
- raise ValueError(f'TARGET path {args.TARGET} does not exist')
+ raise ValueError(f"TARGET path {args.TARGET} does not exist")
if not os.path.exists(args.SOURCE):
- raise ValueError(f'SOURCE path {args.SOURCE} does not exist')
+ raise ValueError(f"SOURCE path {args.SOURCE} does not exist")
replace_include(args.TARGET, args.INCLUDE_TAG, args.SOURCE)
=====================================
benchmarking/run_benchmarks.sh
=====================================
@@ -4,4 +4,4 @@ python format_results.py benchmark_results/2014-01-09T214800 benchmark_results/b
python format_results.py benchmark_results/2014-01-09T214800-0530 benchmark_results/benchmark_with_time_zone.rst
python rst_include_replace.py ../README.rst 'benchmark_with_no_time_zone.rst' benchmark_results/benchmark_with_no_time_zone.rst
python rst_include_replace.py ../README.rst 'benchmark_with_time_zone.rst' benchmark_results/benchmark_with_time_zone.rst
-python rst_include_replace.py ../README.rst 'benchmark_module_versions.rst' benchmark_results/benchmark_module_versions.rst
\ No newline at end of file
+python rst_include_replace.py ../README.rst 'benchmark_module_versions.rst' benchmark_results/benchmark_module_versions.rst
=====================================
benchmarking/tox.ini
=====================================
@@ -1,30 +1,35 @@
[tox]
-envlist = py38,py37,py36,py35,py34,py27
+envlist = py39,py38,py37,py36,py35,py34,py27
setupdir=..
[testenv]
+setenv =
+ CISO8601_CACHING_ENABLED = 1
deps=
; The libraries needed to run the benchmarking itself
-rrequirements.txt
; The actual ISO 8601 parsing libraries
aniso8601
- ; arrow no longer supports Python 3.4
+ ; `arrow` no longer supports Python 3.4
arrow; python_version != '3.4'
iso8601
- iso8601utils
+ ; `iso8601utils` installs `enum34`, which messes with tox in Python 3.6
+ ; https://stackoverflow.com/q/43124775
+ iso8601utils; python_version != '3.6'
isodate
maya
- ; moment is built on `times`, which is built on `arrow`, which no longer supports Python 3.4
+ ; `moment` is built on `times`, which is built on `arrow`, which no longer supports Python 3.4
moment; python_version != '3.4'
pendulum
pyso8601
python-dateutil
str2date
- ; udatetime doesn't support Windows
- udatetime; os_name != 'nt'
- ; zulu no longer supports Python 2.7
- zulu; python_version > '2.7'
+ ; `udatetime` doesn't support Windows
+ ; `udatetime` doesn't compile on Python 3.9 (https://github.com/freach/udatetime/issues/32)
+ udatetime; os_name != 'nt' and python_version < '3.9'
+ ; `zulu` v2.0.0+ no longer supports Python < 3.6
+ zulu; python_version >= '3.6'
pytz
commands=
- python -W ignore perform_comparison.py {posargs:DEFAULTS}
\ No newline at end of file
+ python -W ignore perform_comparison.py {posargs:DEFAULTS}
=====================================
generate_test_timestamps.py
=====================================
@@ -27,11 +27,14 @@ NUMBER_FIELDS = {
"second": NumberField(2, 2, 0, 60), # 60 = Leap second
"microsecond": NumberField(1, None, 0, None), # Can have unbounded characters
"tzhour": NumberField(2, 2, 0, 23),
- "tzminute": NumberField(2, 2, 0, 59)
+ "tzminute": NumberField(2, 2, 0, 59),
}
PADDED_NUMBER_FIELD_FORMATS = {
- field_name: "{{{field_name}:0>{max_width}}}".format(field_name=field_name, max_width=field.max_width if field.max_width is not None else 1)
+ field_name: "{{{field_name}:0>{max_width}}}".format(
+ field_name=field_name,
+ max_width=field.max_width if field.max_width is not None else 1,
+ )
for field_name, field in NUMBER_FIELDS.items()
}
@@ -50,12 +53,7 @@ def __generate_valid_formats(year=2014, month=2, day=3, hour=1, minute=23, secon
("{year}-{month}-{day}", set(["year", "month", "day"]), {"year": year, "month": month, "day": day}),
]
- valid_date_and_time_separators = [
- None,
- 'T',
- 't',
- ' '
- ]
+ valid_date_and_time_separators = [None, "T", "t", " "]
valid_basic_time_formats = [
("{hour}", set(["hour"]), {"hour": hour}),
@@ -135,7 +133,7 @@ def generate_valid_timestamp_and_datetime(year=2014, month=2, day=3, hour=1, min
"second": second,
"microsecond": microsecond,
"tzhour": tzhour,
- "tzminute": tzminute
+ "tzminute": tzminute,
}
for timestamp_format, _fields, datetime_params in __generate_valid_formats(**kwargs):
# Pad each field to the appropriate width
@@ -176,7 +174,7 @@ def generate_invalid_timestamp(year=2014, month=2, day=3, hour=1, minute=23, sec
"second": second,
"microsecond": microsecond,
"tzhour": tzhour,
- "tzminute": tzminute
+ "tzminute": tzminute,
}
for timestamp_format, fields, _datetime_params in __generate_valid_formats(**kwargs):
=====================================
module.c
=====================================
@@ -1,12 +1,11 @@
#include <Python.h>
#include <ctype.h>
#include <datetime.h>
+#include "timezone.h"
#define STRINGIZE(x) #x
#define EXPAND_AND_STRINGIZE(x) STRINGIZE(x)
-#define PY_VERSION_AT_LEAST_32 \
- ((PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 2) || PY_MAJOR_VERSION > 3)
#define PY_VERSION_AT_LEAST_33 \
((PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 3) || PY_MAJOR_VERSION > 3)
#define PY_VERSION_AT_LEAST_36 \
@@ -14,12 +13,29 @@
#define PY_VERSION_AT_LEAST_37 \
((PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 7) || PY_MAJOR_VERSION > 3)
-#if !PY_VERSION_AT_LEAST_37
-static PyObject *fixed_offset;
+// PyPy compatibility for cPython 3.7's Timezone API was added to PyPy 7.3.6
+// https://foss.heptapod.net/pypy/pypy/-/merge_requests/826
+#ifdef PYPY_VERSION
+ #define SUPPORTS_37_TIMEZONE_API \
+ (PYPY_VERSION_NUM >= 0x07030600)
+#else
+ #define SUPPORTS_37_TIMEZONE_API \
+ PY_VERSION_AT_LEAST_37
#endif
static PyObject *utc;
+#if CISO8601_CACHING_ENABLED
+/* 2879 = (1439 * 2) + 1, number of offsets from UTC possible in
+ * Python (ie. [-1439, 1439]).
+ *
+ * 0 - 1438 = Negative offsets [-1439..-1]
+ * 1439 = Zero offset
+ * 1440 - 2878 = Positive offsets [1...1439]
+ */
+static PyObject *tz_cache[2879] = {NULL};
+#endif
+
#define PARSE_INTEGER(field, length, field_name) \
for (i = 0; i < length; i++) { \
if (*c >= '0' && *c <= '9') { \
@@ -144,6 +160,9 @@ _parse(PyObject *self, PyObject *args, int parse_any_tzinfo, int rfc3339_only)
usecond = 0;
int time_is_midnight = 0;
int tzhour = 0, tzminute = 0, tzsign = 0;
+#if CISO8601_CACHING_ENABLED
+ int tz_index = 0;
+#endif
PyObject *delta;
PyObject *temp;
int extended_date_format = 0;
@@ -427,34 +446,47 @@ _parse(PyObject *self, PyObject *args, int parse_any_tzinfo, int rfc3339_only)
tzminute += 60 * tzhour;
tzminute *= tzsign;
-#if !PY_VERSION_AT_LEAST_32
- if (fixed_offset == NULL || utc == NULL) {
- PyErr_SetString(PyExc_ImportError,
- "Cannot parse a timestamp with time zone "
- "information without the pytz dependency. "
- "Install it with `pip install pytz`.");
- return NULL;
- }
-#endif
-
if (tzminute == 0) {
tzinfo = utc;
}
- else {
-#if PY_VERSION_AT_LEAST_37
- delta = PyDelta_FromDSU(0, 60 * tzminute, 0);
- tzinfo = PyTimeZone_FromOffset(delta);
+ else if (abs(tzminute) >= 1440) {
+ /* Format the error message as if we were still using pytz
+ * for Python 2 and datetime.timezone for Python 3.
+ * This is done to maintain complete backwards
+ * compatibility with ciso8601 2.0.x. Perhaps change to a
+ * simpler message in ciso8601 v3.0.0.
+ */
+#if PY_MAJOR_VERSION >= 3
+ delta = PyDelta_FromDSU(0, tzminute * 60, 0);
+ PyErr_Format(PyExc_ValueError,
+ "offset must be a timedelta"
+ " strictly between -timedelta(hours=24) and"
+ " timedelta(hours=24),"
+ " not %R.",
+ delta);
Py_DECREF(delta);
-#elif PY_VERSION_AT_LEAST_32
- tzinfo = PyObject_CallFunction(
- fixed_offset, "N",
- PyDelta_FromDSU(0, 60 * tzminute, 0));
#else
- tzinfo =
- PyObject_CallFunction(fixed_offset, "i", tzminute);
+ PyErr_Format(PyExc_ValueError,
+ "('absolute offset is too large', %d)",
+ tzminute);
#endif
+ return NULL;
+ }
+ else {
+#if CISO8601_CACHING_ENABLED
+ tz_index = tzminute + 1439;
+ if ((tzinfo = tz_cache[tz_index]) == NULL) {
+ tzinfo = new_fixed_offset(60 * tzminute);
+
+ if (tzinfo == NULL) /* ie. PyErr_Occurred() */
+ return NULL;
+ tz_cache[tz_index] = tzinfo;
+ }
+#else
+ tzinfo = new_fixed_offset(60 * tzminute);
if (tzinfo == NULL) /* ie. PyErr_Occurred() */
return NULL;
+#endif
}
}
}
@@ -473,6 +505,10 @@ _parse(PyObject *self, PyObject *args, int parse_any_tzinfo, int rfc3339_only)
/* Make sure that there is no more to parse. */
if (*c != '\0') {
PyErr_Format(PyExc_ValueError, "unconverted data remains: '%s'", c);
+#if !CISO8601_CACHING_ENABLED
+ if (tzinfo != Py_None && tzinfo != utc)
+ Py_DECREF(tzinfo);
+#endif
return NULL;
}
@@ -480,8 +516,10 @@ _parse(PyObject *self, PyObject *args, int parse_any_tzinfo, int rfc3339_only)
year, month, day, hour, minute, second, usecond, tzinfo,
PyDateTimeAPI->DateTimeType);
+#if !CISO8601_CACHING_ENABLED
if (tzinfo != Py_None && tzinfo != utc)
Py_DECREF(tzinfo);
+#endif
if (obj && time_is_midnight) {
delta = PyDelta_FromDSU(1, 0, 0); /* 1 day */
@@ -542,12 +580,6 @@ PyInit_ciso8601(void)
initciso8601(void)
#endif
{
-#if !PY_VERSION_AT_LEAST_32
- PyObject *pytz;
-#elif !PY_VERSION_AT_LEAST_37
- PyObject *datetime;
-#endif
-
#if PY_MAJOR_VERSION >= 3
PyObject *module = PyModule_Create(&moduledef);
#else
@@ -558,28 +590,23 @@ initciso8601(void)
EXPAND_AND_STRINGIZE(CISO8601_VERSION));
PyDateTime_IMPORT;
-#if PY_VERSION_AT_LEAST_37
- utc = PyDateTime_TimeZone_UTC;
-#elif PY_VERSION_AT_LEAST_32
- datetime = PyImport_ImportModule("datetime");
- if (datetime == NULL)
- return NULL;
- fixed_offset = PyObject_GetAttrString(datetime, "timezone");
- if (fixed_offset == NULL)
- return NULL;
- utc = PyObject_GetAttrString(fixed_offset, "utc");
- if (utc == NULL)
+
+ // PyMODINIT_FUNC is void in Python 2, returns PyObject* in Python 3
+ if (initialize_timezone_code(module) < 0) {
+#if PY_MAJOR_VERSION >= 3
return NULL;
#else
- pytz = PyImport_ImportModule("pytz");
- if (pytz == NULL) {
- PyErr_Clear();
- }
- else {
- fixed_offset = PyObject_GetAttrString(pytz, "FixedOffset");
- utc = PyObject_GetAttrString(pytz, "UTC");
+ return;
+#endif
}
+
+#if SUPPORTS_37_TIMEZONE_API
+ utc = PyDateTime_TimeZone_UTC;
+#else
+ utc = new_fixed_offset(0);
#endif
+
+// PyMODINIT_FUNC is void in Python 2, returns PyObject* in Python 3
#if PY_MAJOR_VERSION >= 3
return module;
#endif
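
Note on the `module.c` change above: the new `tz_cache` array maps every legal UTC offset in minutes, [-1439, 1439], to a slot in [0, 2878] by adding 1439, and offsets of 24 hours or more are rejected before the lookup. The following Python sketch illustrates the same index arithmetic under the assumption that `datetime.timezone` stands in for the C-level `new_fixed_offset`. The names `cache_index` and `cached_timezone` are hypothetical.

```python
from datetime import timedelta, timezone

CACHE_SIZE = 2879  # (1439 * 2) + 1, as in the C comment

# One slot per possible offset; filled lazily on first use.
tz_cache = [None] * CACHE_SIZE


def cache_index(offset_minutes):
    """Map an offset in minutes to its cache slot (hypothetical helper)."""
    if abs(offset_minutes) >= 1440:
        raise ValueError("offset must be strictly between -24h and +24h")
    return offset_minutes + 1439


def cached_timezone(offset_minutes):
    """Return a shared tzinfo object for the given offset."""
    index = cache_index(offset_minutes)
    tz = tz_cache[index]
    if tz is None:
        # Stand-in for new_fixed_offset(60 * tzminute) in module.c.
        tz = timezone(timedelta(minutes=offset_minutes))
        tz_cache[index] = tz
    return tz


print(cache_index(-1439), cache_index(0), cache_index(1439))
```

Unlike this sketch, the C code short-circuits a zero offset to the shared `utc` object before consulting the cache. Because cached objects are kept alive for the life of the module, the diff also skips the `Py_DECREF(tzinfo)` calls when caching is enabled.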
=====================================
pyproject.toml
=====================================
@@ -0,0 +1,10 @@
+[tool.pylint.'MESSAGES CONTROL']
+max-line-length = 120
+disable = "C0114, C0115, C0116, C0301"
+
+[tool.autopep8]
+max_line_length = 120
+ignore = ["E501"]
+in-place = true
+recursive = true
+aggressive = 3
=====================================
setup.py
=====================================
@@ -1,68 +1,77 @@
import os
from setuptools import setup, Extension
+
# workaround for open() with encoding='' python2/3 compatibility
from io import open
-with open('README.rst', encoding='utf-8') as file:
+with open("README.rst", encoding="utf-8") as file:
long_description = file.read()
# We want to force all warnings to be considered errors. That way we get to catch potential issues during
# development and at PR review time.
# But since ciso8601 is a source distribution, exotic compiler configurations can cause spurious warnings that
# would fail the installation. So we only want to treat warnings as errors during development.
-if os.environ.get("STRICT_WARNINGS", '0') == '1':
+if os.environ.get("STRICT_WARNINGS", "0") == "1":
# We can't use `extra_compile_args`, since the cl.exe (Windows) and gcc compilers don't use the same flags.
# Further, there is not an easy way to tell which compiler is being used.
# Instead we rely on each compiler looking at their appropriate environment variable.
# GCC/Clang
try:
- _ = os.environ['CFLAGS']
+ _ = os.environ["CFLAGS"]
except KeyError:
- os.environ['CFLAGS'] = ""
- os.environ['CFLAGS'] += " -Werror"
+ os.environ["CFLAGS"] = ""
+ os.environ["CFLAGS"] += " -Werror"
# cl.exe
try:
- _ = os.environ['_CL_']
+ _ = os.environ["_CL_"]
except KeyError:
- os.environ['_CL_'] = ""
- os.environ['_CL_'] += " /WX"
+ os.environ["_CL_"] = ""
+ os.environ["_CL_"] += " /WX"
-VERSION = "2.1.3"
+VERSION = "2.2.0"
+CISO8601_CACHING_ENABLED = int(os.environ.get('CISO8601_CACHING_ENABLED', '1') == '1')
setup(
name="ciso8601",
version=VERSION,
- description='Fast ISO8601 date time parser for Python written in C',
+ description="Fast ISO8601 date time parser for Python written in C",
long_description=long_description,
url="https://github.com/closeio/ciso8601",
license="MIT",
- ext_modules=[Extension("ciso8601",
- sources=["module.c"],
- define_macros=[("CISO8601_VERSION", VERSION)]
- )],
+ ext_modules=[
+ Extension(
+ "ciso8601",
+ sources=["module.c", "timezone.c"],
+ define_macros=[
+ ("CISO8601_VERSION", VERSION),
+ ("CISO8601_CACHING_ENABLED", CISO8601_CACHING_ENABLED),
+ ],
+ )
+ ],
packages=["ciso8601"],
package_data={"ciso8601": ["__init__.pyi", "py.typed"]},
- test_suite='tests',
+ test_suite="tests",
tests_require=[
- 'pytz',
- "unittest2 ; python_version < '3'"
+ "pytz",
+ "unittest2 ; python_version < '3'",
],
classifiers=[
- 'Intended Audience :: Developers',
- 'License :: OSI Approved :: MIT License',
- 'Operating System :: OS Independent',
- 'Programming Language :: Python',
- 'Programming Language :: Python :: 2',
- 'Programming Language :: Python :: 2.7',
- 'Programming Language :: Python :: 3',
- 'Programming Language :: Python :: 3.4',
- 'Programming Language :: Python :: 3.5',
- 'Programming Language :: Python :: 3.6',
- 'Programming Language :: Python :: 3.7',
- 'Programming Language :: Python :: 3.8',
- 'Topic :: Software Development :: Libraries :: Python Modules'
- ]
+ "Intended Audience :: Developers",
+ "License :: OSI Approved :: MIT License",
+ "Operating System :: OS Independent",
+ "Programming Language :: Python",
+ "Programming Language :: Python :: 2",
+ "Programming Language :: Python :: 2.7",
+ "Programming Language :: Python :: 3",
+ "Programming Language :: Python :: 3.4",
+ "Programming Language :: Python :: 3.5",
+ "Programming Language :: Python :: 3.6",
+ "Programming Language :: Python :: 3.7",
+ "Programming Language :: Python :: 3.8",
+ "Programming Language :: Python :: 3.9",
+ "Topic :: Software Development :: Libraries :: Python Modules",
+ ],
)
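
Note on the `setup.py` change above: `CISO8601_CACHING_ENABLED` is read from the environment and converted to 0 or 1 before being passed to the compiler as a macro. A small sketch of how that expression evaluates is shown below. The wrapper function `caching_flag` and its `environ` parameter are assumptions added for illustration only.

```python
import os


def caching_flag(environ=None):
    """Evaluate the same expression setup.py uses (hypothetical wrapper)."""
    environ = os.environ if environ is None else environ
    # Only the exact string "1" enables caching; the default is enabled.
    return int(environ.get("CISO8601_CACHING_ENABLED", "1") == "1")


print(caching_flag({}))
print(caching_flag({"CISO8601_CACHING_ENABLED": "0"}))
print(caching_flag({"CISO8601_CACHING_ENABLED": "true"}))
```

One consequence worth noting is that any value other than the literal `1`, including `true` or `yes`, disables the cache at build time.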
=====================================
tests/__init__.py
=====================================
=====================================
tests/test_timezone.py
=====================================
@@ -0,0 +1,68 @@
+# -*- coding: utf-8 -*-
+
+import sys
+
+from datetime import datetime, timedelta
+
+from ciso8601 import FixedOffset
+
+if sys.version_info.major == 2:
+ # We use unittest2 since it has a backport of the `unittest.TestCase.assertRaisesRegex` method,
+ # which is called `assertRaisesRegexp` in Python 2. This saves us the hassle of monkey-patching
+ # the class ourselves.
+ import unittest2 as unittest
+else:
+ import unittest
+
+
+class TimezoneTestCase(unittest.TestCase):
+ def test_utcoffset(self):
+ if sys.version_info >= (3, 2):
+ from datetime import timezone
+ for minutes in range(-1439, 1440):
+ td = timedelta(minutes=minutes)
+ tz = timezone(td)
+ built_in_dt = datetime(2014, 2, 3, 10, 35, 27, 234567, tzinfo=tz)
+ our_dt = datetime(2014, 2, 3, 10, 35, 27, 234567, tzinfo=FixedOffset(minutes * 60))
+ self.assertEqual(built_in_dt.utcoffset(), our_dt.utcoffset(), "`utcoffset` output did not match for offset: {minutes}".format(minutes=minutes))
+ else:
+ self.assertEqual(FixedOffset(0).utcoffset(), timedelta(minutes=0))
+ self.assertEqual(FixedOffset(+0).utcoffset(), timedelta(minutes=0))
+ self.assertEqual(FixedOffset(-0).utcoffset(), timedelta(minutes=0))
+ self.assertEqual(FixedOffset(-4980).utcoffset(), timedelta(hours=-1, minutes=-23))
+ self.assertEqual(FixedOffset(+45240).utcoffset(), timedelta(hours=12, minutes=34))
+
+ def test_dst(self):
+ if sys.version_info >= (3, 2):
+ from datetime import timezone
+ for minutes in range(-1439, 1440):
+ td = timedelta(minutes=minutes)
+ tz = timezone(td)
+ built_in_dt = datetime(2014, 2, 3, 10, 35, 27, 234567, tzinfo=tz)
+ our_dt = datetime(2014, 2, 3, 10, 35, 27, 234567, tzinfo=FixedOffset(minutes * 60))
+ self.assertEqual(built_in_dt.dst(), our_dt.dst(), "`dst` output did not match for offset: {minutes}".format(minutes=minutes))
+ else:
+ self.assertIsNone(FixedOffset(0).dst(), "UTC")
+ self.assertIsNone(FixedOffset(+0).dst(), "UTC")
+ self.assertIsNone(FixedOffset(-0).dst(), "UTC")
+ self.assertIsNone(FixedOffset(-4980).dst(), "UTC-01:23")
+ self.assertIsNone(FixedOffset(+45240).dst(), "UTC+12:34")
+
+ def test_tzname(self):
+ if sys.version_info >= (3, 2):
+ from datetime import timezone
+ for minutes in range(-1439, 1440):
+ td = timedelta(minutes=minutes)
+ tz = timezone(td)
+ built_in_dt = datetime(2014, 2, 3, 10, 35, 27, 234567, tzinfo=tz)
+ our_dt = datetime(2014, 2, 3, 10, 35, 27, 234567, tzinfo=FixedOffset(minutes * 60))
+ self.assertEqual(built_in_dt.tzname(), our_dt.tzname(), "`tzname` output did not match for offset: {minutes}".format(minutes=minutes))
+ else:
+ self.assertEqual(FixedOffset(0).tzname(), "UTC+00:00")
+ self.assertEqual(FixedOffset(+0).tzname(), "UTC+00:00")
+ self.assertEqual(FixedOffset(-0).tzname(), "UTC+00:00")
+ self.assertEqual(FixedOffset(-4980).tzname(), "UTC-01:23")
+ self.assertEqual(FixedOffset(+45240).tzname(), "UTC+12:34")
+
+if __name__ == '__main__':
+ unittest.main()
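
Note on `tests/test_timezone.py` above: the `else` branches expect `FixedOffset` to take an offset in seconds, return `None` from `dst()`, and name itself `UTC+HH:MM` or `UTC-HH:MM`. The pure-Python class below is a rough stand-in that satisfies those particular expectations. It is not the C implementation added in `timezone.c`, and the class name `SketchFixedOffset` is made up for this illustration.

```python
from datetime import datetime, timedelta, tzinfo


class SketchFixedOffset(tzinfo):
    """Illustrative fixed-offset tzinfo; not ciso8601's implementation."""

    def __init__(self, offset_seconds):
        self._seconds = offset_seconds

    def utcoffset(self, dt=None):
        return timedelta(seconds=self._seconds)

    def dst(self, dt=None):
        # A fixed offset carries no daylight-saving information.
        return None

    def tzname(self, dt=None):
        sign = "-" if self._seconds < 0 else "+"
        hours, minutes = divmod(abs(self._seconds) // 60, 60)
        return "UTC{0}{1:02d}:{2:02d}".format(sign, hours, minutes)


dt = datetime(2014, 2, 3, 10, 35, 27, tzinfo=SketchFixedOffset(-4980))
print(dt.tzname(), dt.utcoffset())
```

The sketch follows only the Python 2 branch of the tests. On Python 3.2+ the tests compare against `datetime.timezone` directly, whose naming of a zero offset differs between Python versions.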
=====================================
tests.py → tests/tests.py
=====================================
@@ -1,9 +1,13 @@
# -*- coding: utf-8 -*-
-import ciso8601
+import copy
import datetime
+import pickle
+import platform
+import re
import sys
+from ciso8601 import FixedOffset, parse_datetime, parse_datetime_as_naive, parse_rfc3339
from generate_test_timestamps import generate_valid_timestamp_and_datetime, generate_invalid_timestamp
if sys.version_info.major == 2:
@@ -19,7 +23,7 @@ class ValidTimestampTestCase(unittest.TestCase):
def test_auto_generated_valid_formats(self):
for (timestamp, expected_datetime) in generate_valid_timestamp_and_datetime():
try:
- self.assertEqual(ciso8601.parse_datetime(timestamp), expected_datetime)
+ self.assertEqual(parse_datetime(timestamp), expected_datetime)
except Exception:
print("Had problems parsing: {timestamp}".format(timestamp=timestamp))
raise
@@ -27,15 +31,15 @@ class ValidTimestampTestCase(unittest.TestCase):
def test_parse_as_naive_auto_generated_valid_formats(self):
for (timestamp, expected_datetime) in generate_valid_timestamp_and_datetime():
try:
- self.assertEqual(ciso8601.parse_datetime_as_naive(timestamp), expected_datetime.replace(tzinfo=None))
+ self.assertEqual(parse_datetime_as_naive(timestamp), expected_datetime.replace(tzinfo=None))
except Exception:
print("Had problems parsing: {timestamp}".format(timestamp=timestamp))
raise
def test_excessive_subsecond_precision(self):
self.assertEqual(
- ciso8601.parse_datetime('20140203T103527.234567891234'),
- datetime.datetime(2014, 2, 3, 10, 35, 27, 234567)
+ parse_datetime("20140203T103527.234567891234"),
+ datetime.datetime(2014, 2, 3, 10, 35, 27, 234567),
)
def test_leap_year(self):
@@ -43,16 +47,26 @@ class ValidTimestampTestCase(unittest.TestCase):
# We just want to make sure that they work in general.
for leap_year in (1600, 2000, 2016):
self.assertEqual(
- ciso8601.parse_datetime('{}-02-29'.format(leap_year)),
- datetime.datetime(leap_year, 2, 29, 0, 0, 0, 0)
+ parse_datetime("{}-02-29".format(leap_year)),
+ datetime.datetime(leap_year, 2, 29, 0, 0, 0, 0),
)
def test_special_midnight(self):
self.assertEqual(
- ciso8601.parse_datetime('2014-02-03T24:00:00'),
- datetime.datetime(2014, 2, 4, 0, 0, 0)
+ parse_datetime("2014-02-03T24:00:00"),
+ datetime.datetime(2014, 2, 4, 0, 0, 0),
)
+ def test_returns_built_in_utc_if_available(self):
+ # Python 3.7 added a built-in UTC object at the C level (`PyDateTime_TimeZone_UTC`)
+ # PyPy added support for it in 7.3.6
+
+ timestamp = '2018-01-01T00:00:00.00Z'
+ if (platform.python_implementation() == 'CPython' and sys.version_info >= (3, 7)) or \
+ (platform.python_implementation() == 'PyPy' and sys.pypy_version_info >= (7, 3, 6)):
+ self.assertIs(parse_datetime(timestamp).tzinfo, datetime.timezone.utc)
+ else:
+ self.assertIsInstance(parse_datetime(timestamp).tzinfo, FixedOffset)
class InvalidTimestampTestCase(unittest.TestCase):
# Many invalid test cases are covered by `test_parse_auto_generated_invalid_formats`,
@@ -63,7 +77,7 @@ class InvalidTimestampTestCase(unittest.TestCase):
for timestamp in generate_invalid_timestamp():
try:
with self.assertRaises(ValueError, msg="Timestamp '{0}' was supposed to be invalid, but parsing it didn't raise ValueError.".format(timestamp)):
- ciso8601.parse_datetime(timestamp)
+ parse_datetime(timestamp)
except Exception as exc:
print("Timestamp '{0}' was supposed to raise ValueError, but raised {1} instead".format(timestamp, type(exc).__name__))
raise
@@ -73,166 +87,196 @@ class InvalidTimestampTestCase(unittest.TestCase):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing date separator \('-'\) \('🐵', Index: 7\)",
- ciso8601.parse_datetime,
- '2019-01🐵01',
+ parse_datetime,
+ "2019-01🐵01",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing day \('🐵', Index: 8\)",
- ciso8601.parse_datetime,
- '2019-01-🐵',
+ parse_datetime,
+ "2019-01-🐵",
)
else:
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing date separator \('-'\) \(Index: 7\)",
- ciso8601.parse_datetime,
- '2019-01🐵01',
+ parse_datetime,
+ "2019-01🐵01",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing day \(Index: 8\)",
- ciso8601.parse_datetime,
- '2019-01-🐵',
+ parse_datetime,
+ "2019-01-🐵",
)
def test_invalid_calendar_separator(self):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing month",
- ciso8601.parse_datetime,
- '2018=01=01',
+ parse_datetime,
+ "2018=01=01",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing date separator \('-'\) \('=', Index: 7\)",
- ciso8601.parse_datetime,
- '2018-01=01',
+ parse_datetime,
+ "2018-01=01",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing date separator \('-'\) \('0', Index: 7\)",
- ciso8601.parse_datetime,
- '2018-0101',
+ parse_datetime,
+ "2018-0101",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing day \('-', Index: 6\)",
- ciso8601.parse_datetime,
- '201801-01',
+ parse_datetime,
+ "201801-01",
)
def test_invalid_empty_but_required_fields(self):
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing year. Expected 4 more characters",
- ciso8601.parse_datetime,
- '',
+ parse_datetime,
+ "",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing month. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-',
+ parse_datetime,
+ "2018-",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing day. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-01-',
+ parse_datetime,
+ "2018-01-",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing hour. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-01-01T',
+ parse_datetime,
+ "2018-01-01T",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing minute. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-01-01T00:',
+ parse_datetime,
+ "2018-01-01T00:",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing second. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:',
+ parse_datetime,
+ "2018-01-01T00:00:",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing subsecond. Expected 1 more character",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:00.',
+ parse_datetime,
+ "2018-01-01T00:00:00.",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing tz hour. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:00.00+',
+ parse_datetime,
+ "2018-01-01T00:00:00.00+",
)
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing tz minute. Expected 2 more characters",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:00.00-00:',
+ parse_datetime,
+ "2018-01-01T00:00:00.00-00:",
)
def test_invalid_day_for_month(self):
- for non_leap_year in (1700, 1800, 1900, 2014):
+ if platform.python_implementation() == 'PyPy' and sys.version_info.major >= 3:
+ for non_leap_year in (1700, 1800, 1900, 2014):
+ self.assertRaisesRegex(
+ ValueError,
+ r"('day must be in 1..28', 29)",
+ parse_datetime,
+ "{}-02-29".format(non_leap_year),
+ )
+
self.assertRaisesRegex(
ValueError,
- r"day is out of range for month",
- ciso8601.parse_datetime,
- '{}-02-29'.format(non_leap_year)
+ r"('day must be in 1..31', 32)",
+ parse_datetime,
+ "2014-01-32",
)
- self.assertRaisesRegex(
- ValueError,
- r"day is out of range for month",
- ciso8601.parse_datetime,
- '2014-01-32',
- )
+ self.assertRaisesRegex(
+ ValueError,
+ r"('day must be in 1..30', 31)",
+ parse_datetime,
+ "2014-06-31",
+ )
- self.assertRaisesRegex(
- ValueError,
- r"day is out of range for month",
- ciso8601.parse_datetime,
- '2014-06-31',
- )
+ self.assertRaisesRegex(
+ ValueError,
+ r"('day must be in 1..30', 0)",
+ parse_datetime,
+ "2014-06-00",
+ )
+ else:
+ for non_leap_year in (1700, 1800, 1900, 2014):
+ self.assertRaisesRegex(
+ ValueError,
+ r"day is out of range for month",
+ parse_datetime,
+ "{}-02-29".format(non_leap_year),
+ )
- self.assertRaisesRegex(
- ValueError,
- r"day is out of range for month",
- ciso8601.parse_datetime,
- '2014-06-00',
- )
+ self.assertRaisesRegex(
+ ValueError,
+ r"day is out of range for month",
+ parse_datetime,
+ "2014-01-32",
+ )
+
+ self.assertRaisesRegex(
+ ValueError,
+ r"day is out of range for month",
+ parse_datetime,
+ "2014-06-31",
+ )
+
+ self.assertRaisesRegex(
+ ValueError,
+ r"day is out of range for month",
+ parse_datetime,
+ "2014-06-00",
+ )
def test_invalid_yyyymm_format(self):
self.assertRaisesRegex(
ValueError,
r"Unexpected end of string while parsing day. Expected 2 more characters",
- ciso8601.parse_datetime,
- '201406',
+ parse_datetime,
+ "201406",
)
def test_invalid_date_and_time_separator(self):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing date and time separator \(ie. 'T' or ' '\) \('_', Index: 10\)",
- ciso8601.parse_datetime,
- '2018-01-01_00:00:00',
+ parse_datetime,
+ "2018-01-01_00:00:00",
)
def test_invalid_hour_24(self):
@@ -240,56 +284,67 @@ class InvalidTimestampTestCase(unittest.TestCase):
self.assertRaisesRegex(
ValueError,
r"hour must be in 0..23",
- ciso8601.parse_datetime,
- '2014-02-03T24:35:27',
+ parse_datetime,
+ "2014-02-03T24:35:27",
)
def test_invalid_time_separator(self):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing time separator \(':'\) \('=', Index: 16\)",
- ciso8601.parse_datetime,
- '2018-01-01T00:00=00'
+ parse_datetime,
+ "2018-01-01T00:00=00",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing time separator \(':'\) \('0', Index: 16\)",
- ciso8601.parse_datetime,
- '2018-01-01T00:0000'
+ parse_datetime,
+ "2018-01-01T00:0000",
)
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing second \(':', Index: 15\)",
- ciso8601.parse_datetime,
- '2018-01-01T0000:00'
+ parse_datetime,
+ "2018-01-01T0000:00",
)
def test_invalid_tz_minute(self):
self.assertRaisesRegex(
ValueError,
r"tzminute must be in 0..59",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:00.00-00:99',
+ parse_datetime,
+ "2018-01-01T00:00:00.00-00:99",
)
def test_invalid_tz_offsets_too_large(self):
- # The Python interpreter crashes if you give the datetime constructor a TZ offset with an absolute value >= 1440
- # TODO: Determine whether these are valid ISO 8601 values and therefore whether ciso8601 should support them.
- self.assertRaisesRegex(
- ValueError,
+ # The TZ offsets with an absolute value >= 1440 minutes are not supported by the tzinfo spec.
+ # See https://docs.python.org/3/library/datetime.html#datetime.tzinfo.utcoffset
+
+ invalid_offsets = [("-24", -1440), ("+24", 1440), ("-99", -5940), ("+99", 5940)]
+ for offset_string, offset_minutes in invalid_offsets:
# Error message differs whether or not we are using pytz or datetime.timezone
- r"^offset must be a timedelta strictly between" if sys.version_info.major >= 3 else r"\('absolute offset is too large', -5940\)",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:00.00-99',
- )
+ # (and also by which Python version. Python 3.7 has different timedelta.repr())
+ # Of course we no longer use either, but for backwards compatibility
+ # with v2.0.x, we did not change the error messages.
+ if sys.version_info.major >= 3:
+ expected_error_message = re.escape("offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24), not {0}.".format(repr(datetime.timedelta(minutes=offset_minutes))))
+ else:
+ expected_error_message = re.escape("'absolute offset is too large', {0}".format(offset_minutes))
+
+ self.assertRaisesRegex(
+ ValueError,
+ expected_error_message,
+ parse_datetime,
+ "2018-01-01T00:00:00.00{0}".format(offset_string),
+ )
self.assertRaisesRegex(
ValueError,
r"tzminute must be in 0..59",
- ciso8601.parse_datetime,
- '2018-01-01T00:00:00.00-23:60',
+ parse_datetime,
+ "2018-01-01T00:00:00.00-23:60",
)
def test_mixed_basic_and_extended_formats(self):
@@ -301,15 +356,15 @@ class InvalidTimestampTestCase(unittest.TestCase):
self.assertRaisesRegex(
ValueError,
r"Cannot combine \"extended\" date format with \"basic\" time format",
- ciso8601.parse_datetime,
- '2014-01-02T010203',
+ parse_datetime,
+ "2014-01-02T010203",
),
self.assertRaisesRegex(
ValueError,
r"Cannot combine \"basic\" date format with \"extended\" time format",
- ciso8601.parse_datetime,
- '20140102T01:02:03',
+ parse_datetime,
+ "20140102T01:02:03",
)
@@ -320,18 +375,19 @@ class Rfc3339TestCase(unittest.TestCase):
and produce the same result as parse_datetime.
"""
for string in [
- '2018-01-02T03:04:05Z',
- '2018-01-02t03:04:05z',
- '2018-01-02 03:04:05z',
- '2018-01-02T03:04:05+00:00',
- '2018-01-02T03:04:05-00:00',
- '2018-01-02T03:04:05.12345Z',
- '2018-01-02T03:04:05+01:23',
- '2018-01-02T03:04:05-12:34',
- '2018-01-02T03:04:05-12:34',
+ "2018-01-02T03:04:05Z",
+ "2018-01-02t03:04:05z",
+ "2018-01-02 03:04:05z",
+ "2018-01-02T03:04:05+00:00",
+ "2018-01-02T03:04:05-00:00",
+ "2018-01-02T03:04:05.12345Z",
+ "2018-01-02T03:04:05+01:23",
+ "2018-01-02T03:04:05-12:34",
+ "2018-01-02T03:04:05-12:34",
]:
- self.assertEqual(ciso8601.parse_datetime(string),
- ciso8601.parse_rfc3339(string))
+ self.assertEqual(
+ parse_datetime(string), parse_rfc3339(string)
+ )
def test_invalid_rfc3339_timestamps(self):
"""
@@ -340,22 +396,50 @@ class Rfc3339TestCase(unittest.TestCase):
ValueError explicitly mentions RFC 3339.
"""
for timestamp in [
- "2018-01-02", # Missing mandatory time
- "2018-01-02T03", # Missing mandatory minute and second
- "2018-01-02T03Z", # Missing mandatory minute and second
- "2018-01-02T03:04", # Missing mandatory minute and second
- "2018-01-02T03:04Z", # Missing mandatory minute and second
- "2018-01-02T03:04:01+04", # Missing mandatory offset minute
- "2018-01-02T03:04:05", # Missing mandatory offset
- "2018-01-02T03:04:05.12345", # Missing mandatory offset
- "2018-01-02T24:00:00Z", # 24:00:00 is not valid in RFC 3339
- '20180102T03:04:05-12:34', # Missing mandatory date separators
- '2018-01-02T030405-12:34', # Missing mandatory time separators
- '2018-01-02T03:04:05-1234', # Missing mandatory offset separator
- '2018-01-02T03:04:05,12345Z' # Invalid comma fractional second separator
+ "2018-01-02", # Missing mandatory time
+ "2018-01-02T03", # Missing mandatory minute and second
+ "2018-01-02T03Z", # Missing mandatory minute and second
+ "2018-01-02T03:04", # Missing mandatory minute and second
+ "2018-01-02T03:04Z", # Missing mandatory minute and second
+ "2018-01-02T03:04:01+04", # Missing mandatory offset minute
+ "2018-01-02T03:04:05", # Missing mandatory offset
+ "2018-01-02T03:04:05.12345", # Missing mandatory offset
+ "2018-01-02T24:00:00Z", # 24:00:00 is not valid in RFC 3339
+ "20180102T03:04:05-12:34", # Missing mandatory date separators
+ "2018-01-02T030405-12:34", # Missing mandatory time separators
+ "2018-01-02T03:04:05-1234", # Missing mandatory offset separator
+ "2018-01-02T03:04:05,12345Z", # Invalid comma fractional second separator
]:
with self.assertRaisesRegex(ValueError, r"RFC 3339", msg="Timestamp '{0}' was supposed to be invalid, but parsing it didn't raise ValueError.".format(timestamp)):
- ciso8601.parse_rfc3339(timestamp)
+ parse_rfc3339(timestamp)
+
+
+class FixedOffsetTestCase(unittest.TestCase):
+ def test_all_valid_offsets(self):
+ [FixedOffset(i * 60) for i in range(-1439, 1440)]
+
+ def test_offsets_outside_valid_range(self):
+ invalid_offsets = [-1440, 1440, 10000, -10000]
+ for invalid_offset in invalid_offsets:
+ with self.assertRaises(ValueError, msg="Fixed offset of {0} minutes was supposed to be invalid, but it didn't raise ValueError.".format(invalid_offset)):
+ FixedOffset(invalid_offset * 60)
+
+
+class PicklingTestCase(unittest.TestCase):
+ # Found as a result of https://github.com/movermeyer/backports.datetime_fromisoformat/issues/12
+ def test_basic_pickle_and_copy(self):
+ dt = parse_datetime('2018-11-01 20:42:09')
+ dt2 = pickle.loads(pickle.dumps(dt))
+ self.assertEqual(dt, dt2)
+ dt3 = copy.deepcopy(dt)
+ self.assertEqual(dt, dt3)
+
+ # FixedOffset
+ dt = parse_datetime('2018-11-01 20:42:09+01:30')
+ dt2 = pickle.loads(pickle.dumps(dt))
+ self.assertEqual(dt, dt2)
+ dt3 = copy.deepcopy(dt)
+ self.assertEqual(dt, dt3)
class GithubIssueRegressionTestCase(unittest.TestCase):
@@ -367,80 +451,96 @@ class GithubIssueRegressionTestCase(unittest.TestCase):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing minute \(':', Index: 14\)",
- ciso8601.parse_datetime,
- '2014-02-03T10::27',
+ parse_datetime,
+ "2014-02-03T10::27",
)
def test_issue_6(self):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing second \('.', Index: 17\)",
- ciso8601.parse_datetime,
- '2014-02-03 04:05:.123456',
+ parse_datetime,
+ "2014-02-03 04:05:.123456",
)
def test_issue_8(self):
self.assertRaisesRegex(
ValueError,
r"hour must be in 0..23",
- ciso8601.parse_datetime,
- '2001-01-01T24:01:01',
+ parse_datetime,
+ "2001-01-01T24:01:01",
)
self.assertRaisesRegex(
ValueError,
r"month must be in 1..12",
- ciso8601.parse_datetime,
- '07722968',
+ parse_datetime,
+ "07722968",
)
def test_issue_13(self):
self.assertRaisesRegex(
ValueError,
r"month must be in 1..12",
- ciso8601.parse_datetime,
- '2014-13-01',
+ parse_datetime,
+ "2014-13-01",
)
def test_issue_22(self):
- self.assertRaisesRegex(
- ValueError,
- r"day is out of range for month",
- ciso8601.parse_datetime,
- '2016-11-31T12:34:34.521059',
- )
+ if platform.python_implementation() == 'PyPy' and sys.version_info.major >= 3:
+ self.assertRaisesRegex(
+ ValueError,
+ r"('day must be in 1..30', 31)",
+ parse_datetime,
+ "2016-11-31T12:34:34.521059",
+ )
+ else:
+ self.assertRaisesRegex(
+ ValueError,
+ r"day is out of range for month",
+ parse_datetime,
+ "2016-11-31T12:34:34.521059",
+ )
def test_issue_35(self):
self.assertRaisesRegex(
ValueError,
r"Invalid character while parsing date separator \('-'\) \('1', Index: 7\)",
- ciso8601.parse_datetime,
- '2017-0012-27T13:35:19+0200',
+ parse_datetime,
+ "2017-0012-27T13:35:19+0200",
)
def test_issue_42(self):
- self.assertRaisesRegex(
- ValueError,
- r"day is out of range for month",
- ciso8601.parse_datetime,
- '20140200',
- )
+ if platform.python_implementation() == 'PyPy' and sys.version_info.major >= 3:
+ self.assertRaisesRegex(
+ ValueError,
+ r"('day must be in 1..28', 0)",
+ parse_datetime,
+ "20140200",
+ )
+ else:
+ self.assertRaisesRegex(
+ ValueError,
+ r"day is out of range for month",
+ parse_datetime,
+ "20140200",
+ )
def test_issue_71(self):
self.assertRaisesRegex(
ValueError,
r"Cannot combine \"basic\" date format with \"extended\" time format",
- ciso8601.parse_datetime,
- '20010203T04:05:06Z',
+ parse_datetime,
+ "20010203T04:05:06Z",
)
self.assertRaisesRegex(
ValueError,
r"Cannot combine \"basic\" date format with \"extended\" time format",
- ciso8601.parse_datetime,
- '20010203T04:05',
+ parse_datetime,
+ "20010203T04:05",
)
-if __name__ == '__main__':
+if __name__ == "__main__":
unittest.main()
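
Note on test_invalid_tz_offsets_too_large above: the test comments say that offsets with an absolute value of 1440 minutes or more are not supported by the tzinfo spec. The sketch below checks that boundary against the standard library's datetime.timezone rather than against ciso8601 itself. The helper name offset_is_supported is hypothetical and is not part of this commit.

```python
import datetime


def offset_is_supported(minutes):
    """Return True if the stdlib accepts a fixed UTC offset of `minutes`.

    Hypothetical helper for illustration only; it exercises
    datetime.timezone, not ciso8601's own FixedOffset type.
    """
    try:
        datetime.timezone(datetime.timedelta(minutes=minutes))
    except ValueError:
        # Raised when the offset is not strictly between -24h and +24h.
        return False
    return True


if __name__ == "__main__":
    # Same boundary values as the invalid_offsets list in the test,
    # plus the nearest valid neighbours.
    for minutes in (-5940, -1440, -1439, 0, 1439, 1440, 5940):
        print(minutes, offset_is_supported(minutes))
```

The valid range is therefore -1439 to 1439 minutes inclusive, which matches the range(-1439, 1440) loop in FixedOffsetTestCase.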
=====================================
timezone.c
=====================================
@@ -0,0 +1,239 @@
+/* This code was originally copied from Pendulum
+(https://github.com/sdispater/pendulum/blob/13ff4a0250177f77e4ff2e7bd1f442d954e66b22/pendulum/parsing/_iso8601.c#L176)
+Pendulum (like ciso8601) is MIT licensed, so we have included a copy of its
+license here.
+*/
+
+/*
+Copyright (c) 2015 Sébastien Eustace
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+#include "timezone.h"
+
+#include <Python.h>
+#include <datetime.h>
+#include <structmember.h>
+
+#define SECS_PER_MIN 60
+#define SECS_PER_HOUR (60 * SECS_PER_MIN)
+#define TWENTY_FOUR_HOURS_IN_SECONDS 86400
+
+#define PY_VERSION_AT_LEAST_36 \
+ ((PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 6) || PY_MAJOR_VERSION > 3)
+
+/*
+ * class FixedOffset(tzinfo):
+ */
+typedef struct {
+ // Seconds offset from UTC.
+ // Must be in range (-86400, 86400) seconds exclusive.
+ // ie. (-1440, 1440) minutes exclusive.
+ PyObject_HEAD int offset;
+} FixedOffset;
+
+/*
+ * def __init__(self, offset):
+ * self.offset = offset
+ */
+static int
+FixedOffset_init(FixedOffset *self, PyObject *args, PyObject *kwargs)
+{
+ int offset;
+ if (!PyArg_ParseTuple(args, "i", &offset))
+ return -1;
+
+ if (abs(offset) >= TWENTY_FOUR_HOURS_IN_SECONDS) {
+ PyErr_Format(PyExc_ValueError,
+ "offset must be an integer in the range (-86400, 86400), "
+ "exclusive");
+ return -1;
+ }
+
+ self->offset = offset;
+ return 0;
+}
+
+/*
+ * def utcoffset(self, dt):
+ * return timedelta(seconds=self.offset)
+ */
+static PyObject *
+FixedOffset_utcoffset(FixedOffset *self, PyObject *args)
+{
+ return PyDelta_FromDSU(0, self->offset, 0);
+}
+
+/*
+ * def dst(self, dt):
+ * return None
+ */
+static PyObject *
+FixedOffset_dst(FixedOffset *self, PyObject *args)
+{
+ Py_RETURN_NONE;
+}
+
+/*
+ * def tzname(self, dt):
+ * sign = '+'
+ * if self.offset < 0:
+ * sign = '-'
+ * return "UTC%s%02d:%02d" % (sign, abs(self.offset) // 3600, abs(self.offset) // 60 % 60)
+ */
+static PyObject *
+FixedOffset_tzname(FixedOffset *self, PyObject *args)
+{
+
+ int offset = self->offset;
+
+ if (offset == 0){
+#if PY_VERSION_AT_LEAST_36
+ return PyUnicode_FromString("UTC");
+#else
+ return PyUnicode_FromString("UTC+00:00");
+#endif
+ } else {
+ char result_tzname[10] = {0};
+ char sign = '+';
+
+ if (offset < 0) {
+ sign = '-';
+ offset *= -1;
+ }
+ snprintf(result_tzname, 10, "UTC%c%02u:%02u", sign,
+ (offset / SECS_PER_HOUR) & 31,
+ offset / SECS_PER_MIN % SECS_PER_MIN);
+ return PyUnicode_FromString(result_tzname);
+ }
+}
+
+/*
+ * def __repr__(self):
+ * return self.tzname()
+ */
+static PyObject *
+FixedOffset_repr(FixedOffset *self)
+{
+ return FixedOffset_tzname(self, NULL);
+}
+
+/*
+ * def __getinitargs__(self):
+ * return (self.offset,)
+ */
+static PyObject *
+FixedOffset_getinitargs(FixedOffset *self)
+{
+ PyObject *args = PyTuple_Pack(1, PyLong_FromLong(self->offset));
+ return args;
+}
+
+/*
+ * Class member / class attributes
+ */
+static PyMemberDef FixedOffset_members[] = {
+ {"offset", T_INT, offsetof(FixedOffset, offset), 0, "UTC offset"}, {NULL}};
+
+/*
+ * Class methods
+ */
+static PyMethodDef FixedOffset_methods[] = {
+ {"utcoffset", (PyCFunction)FixedOffset_utcoffset, METH_VARARGS, ""},
+ {"dst", (PyCFunction)FixedOffset_dst, METH_VARARGS, ""},
+ {"tzname", (PyCFunction)FixedOffset_tzname, METH_VARARGS, ""},
+ {"__getinitargs__", (PyCFunction)FixedOffset_getinitargs, METH_VARARGS,
+ ""},
+ {NULL}};
+
+static PyTypeObject FixedOffset_type = {
+ PyVarObject_HEAD_INIT(NULL, 0) "ciso8601.FixedOffset", /* tp_name */
+ sizeof(FixedOffset), /* tp_basicsize */
+ 0, /* tp_itemsize */
+ 0, /* tp_dealloc */
+ 0, /* tp_print */
+ 0, /* tp_getattr */
+ 0, /* tp_setattr */
+ 0, /* tp_as_async */
+ (reprfunc)FixedOffset_repr, /* tp_repr */
+ 0, /* tp_as_number */
+ 0, /* tp_as_sequence */
+ 0, /* tp_as_mapping */
+ 0, /* tp_hash */
+ 0, /* tp_call */
+ (reprfunc)FixedOffset_repr, /* tp_str */
+ 0, /* tp_getattro */
+ 0, /* tp_setattro */
+ 0, /* tp_as_buffer */
+ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */
+ "TZInfo with fixed offset", /* tp_doc */
+};
+
+/*
+ * Instantiate new FixedOffset_type object
+ * Skip overhead of calling PyObject_New and PyObject_Init.
+ * Directly allocate object.
+ * Note that this also doesn't do any validation of the offset parameter.
+ * Callers must ensure that offset is within
+ * the range (-86400, 86400), exclusive.
+ */
+PyObject *
+new_fixed_offset_ex(int offset, PyTypeObject *type)
+{
+ FixedOffset *self = (FixedOffset *)(type->tp_alloc(type, 0));
+
+ if (self != NULL)
+ self->offset = offset;
+
+ return (PyObject *)self;
+}
+
+PyObject *
+new_fixed_offset(int offset)
+{
+ return new_fixed_offset_ex(offset, &FixedOffset_type);
+}
+
+/* ------------------------------------------------------------- */
+
+int
+initialize_timezone_code(PyObject *module)
+{
+ PyDateTime_IMPORT;
+ FixedOffset_type.tp_new = PyType_GenericNew;
+ FixedOffset_type.tp_base = PyDateTimeAPI->TZInfoType;
+ FixedOffset_type.tp_methods = FixedOffset_methods;
+ FixedOffset_type.tp_members = FixedOffset_members;
+ FixedOffset_type.tp_init = (initproc)FixedOffset_init;
+
+ if (PyType_Ready(&FixedOffset_type) < 0)
+ return -1;
+
+ Py_INCREF(&FixedOffset_type);
+ if (PyModule_AddObject(module, "FixedOffset",
+ (PyObject *)&FixedOffset_type) < 0) {
+ Py_DECREF(module);
+ Py_DECREF(&FixedOffset_type);
+ return -1;
+ }
+
+ return 0;
+}
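
Note on timezone.c above: for readers who find the C type easier to follow in Python, here is a rough pure-Python analogue of its behaviour. It is a sketch under assumptions, not the upstream implementation: the class name FixedOffsetSketch is hypothetical, and the tzname branch assumes Python 3.6 or later (the C code returns "UTC+00:00" for a zero offset on older versions).

```python
import datetime

TWENTY_FOUR_HOURS_IN_SECONDS = 86400


class FixedOffsetSketch(datetime.tzinfo):
    """Hypothetical pure-Python analogue of the C FixedOffset type."""

    def __init__(self, offset):
        # offset is in seconds and must lie strictly inside (-86400, 86400).
        if abs(offset) >= TWENTY_FOUR_HOURS_IN_SECONDS:
            raise ValueError(
                "offset must be an integer in the range (-86400, 86400), "
                "exclusive"
            )
        self.offset = offset

    def utcoffset(self, dt):
        return datetime.timedelta(seconds=self.offset)

    def dst(self, dt):
        # A fixed offset carries no daylight-saving information.
        return None

    def tzname(self, dt):
        if self.offset == 0:
            return "UTC"
        sign = "-" if self.offset < 0 else "+"
        total = abs(self.offset)
        return "UTC%s%02d:%02d" % (sign, total // 3600, total // 60 % 60)

    def __getinitargs__(self):
        # Mirrors the C type: supplies constructor arguments for pickling.
        return (self.offset,)


if __name__ == "__main__":
    tz = FixedOffsetSketch(90 * 60)  # +01:30, expressed in seconds
    dt = datetime.datetime(2018, 11, 1, 20, 42, 9, tzinfo=tz)
    print(dt.isoformat(), tz.tzname(None))
```

Storing the offset in seconds matches the struct comment in timezone.c, which is why FixedOffsetTestCase multiplies its minute values by 60 before constructing the object.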
=====================================
timezone.h
=====================================
@@ -0,0 +1,12 @@
+#ifndef CISO_TZINFO_H
+#define CISO_TZINFO_H
+
+#include <Python.h>
+
+PyObject *
+new_fixed_offset(int offset);
+
+int
+initialize_timezone_code(PyObject *module);
+
+#endif
=====================================
tox.ini
=====================================
@@ -1,10 +1,12 @@
[tox]
-envlist = py27,py34,py35,py36,py37,py38
+envlist = {py27,py34,py35,py36,py37,py38,py39}-caching_{enabled,disabled}
[testenv]
-setenv=
- STRICT_WARNINGS=1
-deps=
+setenv =
+ STRICT_WARNINGS = 1
+ caching_enabled: CISO8601_CACHING_ENABLED = 1
+ caching_disabled: CISO8601_CACHING_ENABLED = 0
+deps =
pytz
nose
unittest2
View it on GitLab: https://salsa.debian.org/med-team/python-ciso8601/-/commit/3faec6b5f5f53d4c5f7751c44f081045e073f826
--
You're receiving this email because of your account on salsa.debian.org.