[Git][debian-gis-team/pyogrio][master] 7 commits: Update Files-Excluded.

Bas Couwenberg (@sebastic) gitlab at salsa.debian.org
Wed Nov 26 13:33:54 GMT 2025



Bas Couwenberg pushed to branch master at Debian GIS Project / pyogrio


Commits:
55f553c3 by Bas Couwenberg at 2025-11-26T14:21:45+01:00
Update Files-Excluded.

- - - - -
b26bb6a8 by Bas Couwenberg at 2025-11-26T14:21:48+01:00
New upstream version 0.12.0+ds
- - - - -
668e021f by Bas Couwenberg at 2025-11-26T14:21:49+01:00
Update upstream source from tag 'upstream/0.12.0+ds'

Update to upstream version '0.12.0+ds'
with Debian dir 935452d8048d09865afc300c6f33a37d3bb61e50
- - - - -
d7f4a05e by Bas Couwenberg at 2025-11-26T14:22:11+01:00
New upstream release.

- - - - -
f44a8ae3 by Bas Couwenberg at 2025-11-26T14:24:02+01:00
Drop spelling-errors.patch, applied upstream.

- - - - -
1eb9fc7f by Bas Couwenberg at 2025-11-26T14:27:29+01:00
Make pytest output extra verbose.

- - - - -
d862e405 by Bas Couwenberg at 2025-11-26T14:27:42+01:00
Set distribution to unstable.

- - - - -


29 changed files:

- CHANGES.md
- README.md
- debian/changelog
- debian/copyright
- − debian/patches/series
- − debian/patches/spelling-errors.patch
- debian/rules
- docs/environment.yml
- docs/source/install.md
- docs/source/introduction.md
- pyogrio/_compat.py
- pyogrio/_err.pyx
- pyogrio/_io.pyx
- pyogrio/_ogr.pxd
- pyogrio/_ogr.pyx
- pyogrio/_version.py
- pyogrio/core.py
- pyogrio/geopandas.py
- pyogrio/raw.py
- pyogrio/tests/conftest.py
- + pyogrio/tests/fixtures/list_field_values_file.parquet
- + pyogrio/tests/fixtures/list_nested_struct_file.parquet
- pyogrio/tests/test_arrow.py
- pyogrio/tests/test_core.py
- pyogrio/tests/test_geopandas_io.py
- pyogrio/tests/test_raw_io.py
- pyogrio/util.py
- pyproject.toml
- setup.py


Changes:

=====================================
CHANGES.md
=====================================
@@ -1,5 +1,41 @@
 # CHANGELOG
 
+## 0.12.0 (2025-11-26)
+
+### Potentially breaking changes
+
+-   Return JSON fields (as identified by GDAL) as dicts/lists in `read_dataframe`;
+    these were previously returned as strings (#556).
+-   Drop support for GDAL 3.4 and 3.5 (#584).
+
+### Improvements
+
+-   Add `datetime_as_string` and `mixed_offsets_as_utc` parameters to `read_dataframe`
+    to choose the way datetime columns are returned + several fixes when reading and
+    writing datetimes (#486).
+-   Add listing of GDAL data types and subtypes to `read_info` (#556).
+-   Add support to read list fields without arrow (#558, #597).
+
+### Bug fixes
+
+-   Fix decode error reading an sqlite file on Windows (#568).
+-   Fix wrong layer name when creating .gpkg.zip file (#570).
+-   Fix segfault on providing an invalid value for `layer` in `read_info` (#564).
+-   Fix error when reading data with ``use_arrow=True`` after having used the
+    Parquet driver with GDAL>=3.12 (#601).
+
+### Packaging
+
+-   Wheels are now available for Python 3.14 (#579).
+-   The GDAL library included in the wheels is upgraded from 3.10.3 to 3.11.4 (#578).
+-   Add libkml driver to the wheels for more recent Linux platforms supported
+    by manylinux_2_28, macOS, and Windows (#561).
+-   Add libspatialite to the wheels (#546).
+-   Minimum required Python version is now 3.10 (#557).
+-   Initial support for free-threaded Python builds, with the extension module
+    declaring free-threaded support and wheels for Python 3.13t and 3.14t being
+    built (#562).
+
 ## 0.11.1 (2025-08-02)
 
 ### Bug fixes
@@ -150,7 +186,7 @@
 
 ### Improvements
 
--   Support reading and writing datetimes with timezones (#253).
+-   Support reading and writing datetimes with time zones (#253).
 -   Support writing dataframes without geometry column (#267).
 -   Calculate feature count by iterating over features if GDAL returns an
     unknown count for a data layer (e.g., OSM driver); this may have signficant


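To make the 0.12.0 reading changes above concrete, here is a minimal sketch; the file name and column names are placeholders (not taken from the pyogrio test data) and assume a layer that has a JSON-subtyped field and an OGR string-list field.

```python
# Minimal sketch of the 0.12.0 read_dataframe() behaviour for JSON and list
# fields. "example.gpkg", "properties_json" and "tags" are assumed names.
import pyogrio

df = pyogrio.read_dataframe("example.gpkg")

# Fields GDAL marks as JSON (OFSTJSON) are now parsed per row with json.loads,
# so values arrive as dicts/lists instead of raw strings (#556).
print(type(df["properties_json"].iloc[0]))    # dict (or list)

# OGR list fields (OFTStringList, OFTIntegerList, ...) are now readable without
# use_arrow=True; each row holds a numpy array of values (#558, #597).
print(df["tags"].iloc[0])                     # e.g. array(['a', 'b'], dtype='<U1')
```
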
=====================================
README.md
=====================================
@@ -26,7 +26,7 @@ Read the documentation for more information:
 
 ## Requirements
 
-Supports Python 3.9 - 3.13 and GDAL 3.4.x - 3.9.x.
+Supports Python 3.10 - 3.14 and GDAL 3.6.x - 3.11.x.
 
 Reading to GeoDataFrames requires `geopandas>=0.12` with `shapely>=2`.
 


=====================================
debian/changelog
=====================================
@@ -1,11 +1,14 @@
-pyogrio (0.11.1+ds-2) UNRELEASED; urgency=medium
+pyogrio (0.12.0+ds-1) unstable; urgency=medium
 
   * Team upload.
+  * New upstream release.
   * Update lintian overrides.
   * Drop Rules-Requires-Root: no, default since dpkg 1.22.13.
   * Use test-build-validate-cleanup instead of test-build-twice.
+  * Drop spelling-errors.patch, applied upstream.
+  * Make pytest output extra verbose.
 
- -- Bas Couwenberg <sebastic at debian.org>  Fri, 12 Sep 2025 17:45:39 +0200
+ -- Bas Couwenberg <sebastic at debian.org>  Wed, 26 Nov 2025 14:27:30 +0100
 
 pyogrio (0.11.1+ds-1) unstable; urgency=medium
 


=====================================
debian/copyright
=====================================
@@ -6,8 +6,6 @@ Comment: The upstream release is repacked in order to exclude
  unneeded CI files and binary data.
 Files-Excluded: .github/workflows
                 ci
-                pyogrio/tests/fixtures/poly_not_enough_points.shp.zip
-                pyogrio/tests/fixtures/test_fgdb.gdb.zip
 
 Files: *
 Copyright: 2020-2024, Brendan C. Ward and pyogrio contributors


=====================================
debian/patches/series deleted
=====================================
@@ -1 +0,0 @@
-spelling-errors.patch


=====================================
debian/patches/spelling-errors.patch deleted
=====================================
@@ -1,17 +0,0 @@
-Description: Fix spelling errors:
- * occuring -> occurring
-Author: Bas Couwenberg <sebastic at debian.org>
-Forwarded: https://github.com/geopandas/pyogrio/pull/554
-Applied-Upstream: https://github.com/geopandas/pyogrio/commit/4ca3db0bc099f22bbae4c12c43951cd87415a1f6
-
---- a/pyogrio/_err.pyx
-+++ b/pyogrio/_err.pyx
-@@ -418,7 +418,7 @@ cdef void stacking_error_handler(
- 
- @contextlib.contextmanager
- def capture_errors():
--    """A context manager that captures all GDAL non-fatal errors occuring.
-+    """A context manager that captures all GDAL non-fatal errors occurring.
- 
-     It adds all errors to a single stack, so it assumes that no more than one
-     GDAL function is called.


=====================================
debian/rules
=====================================
@@ -1,8 +1,9 @@
 #!/usr/bin/make -f
 
 export DEB_BUILD_MAINT_OPTIONS=hardening=+all
+
 export PYBUILD_NAME=pyogrio
-export PYBUILD_TEST_ARGS=\
+export PYBUILD_TEST_ARGS=-vv \
 -k "not test_url \
 and not test_url_dataframe \
 and not test_url_with_zip \


=====================================
docs/environment.yml
=====================================
@@ -2,9 +2,9 @@ name: pyogrio
 channels:
   - conda-forge
 dependencies:
-  - python==3.9.*
+  - python==3.10.*
   - gdal
-  - numpy==1.19.*
+  - numpy==1.24.*
   - numpydoc==1.1.*
   - Cython==0.29.*
   - docutils==0.16.*


=====================================
docs/source/install.md
=====================================
@@ -2,7 +2,7 @@
 
 ## Requirements
 
-Supports Python 3.9 - 3.13 and GDAL 3.4.x - 3.9.x
+Supports Python 3.10 - 3.14 and GDAL 3.6.x - 3.11.x
 
 Reading to GeoDataFrames requires `geopandas>=0.12` with `shapely>=2`.
 
@@ -132,20 +132,20 @@ To build on Windows, you need to provide additional environment variables or
 command-line parameters because the location of the GDAL binaries and headers
 cannot be automatically determined.
 
-Assuming GDAL 3.4.1 is installed to `c:\GDAL`, you can set the `GDAL_INCLUDE_PATH`,
+Assuming GDAL 3.8.3 is installed to `c:\GDAL`, you can set the `GDAL_INCLUDE_PATH`,
 `GDAL_LIBRARY_PATH` and `GDAL_VERSION` environment variables and build as follows:
 
 ```bash
 set GDAL_INCLUDE_PATH=C:\GDAL\include
 set GDAL_LIBRARY_PATH=C:\GDAL\lib
-set GDAL_VERSION=3.4.1
+set GDAL_VERSION=3.8.3
 python -m pip install --no-deps --force-reinstall --no-use-pep517 -e . -v
 ```
 
 Alternatively, you can pass those options also as command-line parameters:
 
 ```bash
-python -m pip install --install-option=build_ext --install-option="-IC:\GDAL\include" --install-option="-lgdal_i" --install-option="-LC:\GDAL\lib" --install-option="--gdalversion=3.4.1" --no-deps --force-reinstall --no-use-pep517 -e . -v
+python -m pip install --install-option=build_ext --install-option="-IC:\GDAL\include" --install-option="-lgdal_i" --install-option="-LC:\GDAL\lib" --install-option="--gdalversion=3.8.3" --no-deps --force-reinstall --no-use-pep517 -e . -v
 ```
 
 The location of the GDAL DLLs must be on your system `PATH`.


=====================================
docs/source/introduction.md
=====================================
@@ -481,13 +481,17 @@ Not all file formats have dedicated support to store datetime data, like ESRI
 Shapefile. For such formats, or if you require precision > ms, a workaround is to
 convert the datetimes to string.
 
-Timezone information is preserved where possible, however GDAL only represents
-time zones as UTC offsets, whilst pandas uses IANA time zones (via `pytz` or
-`zoneinfo`). This means that dataframes with columns containing multiple offsets
-(e.g. when switching from standard time to summer time) will be written correctly,
-but when read via `pyogrio.read_dataframe()` will be returned as a UTC datetime
-column, as there is no way to reconstruct the original timezone from the individual
-offsets present.
+When you have datetime columns with time zone information, it is important to
+note that GDAL only represents time zones as UTC offsets, whilst pandas uses
+IANA time zones (via `pytz` or `zoneinfo`). As a result, even if a column in a
+DataFrame contains datetimes in a single time zone, this will often still result
+in mixed time zone offsets being written for time zones where daylight saving
+time is used (e.g. +01:00 and +02:00 offsets for time zone Europe/Brussels).
+When roundtripping through GDAL, the information about the original time zone
+is lost, only the offsets can be preserved. By default,
+{func}`pyogrio.read_dataframe()` will convert columns with mixed offsets to UTC
+to return a datetime64 column. If you want to preserve the original offsets,
+you can use `datetime_as_string=True` or `mixed_offsets_as_utc=False`.
 
 ## Dataset and layer creation options
 


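Below is a minimal sketch of the two new parameters this rewritten section refers to; "example.gpkg" is a placeholder for any dataset with time-zone-aware datetime columns.

```python
# Minimal sketch of the datetime options added in pyogrio 0.12.0.
import pyogrio

# Default: columns with mixed UTC offsets (e.g. +01:00/+02:00 around DST
# changes) are converted to UTC and returned as a datetime64 column.
df_utc = pyogrio.read_dataframe("example.gpkg")

# Preserve the individual offsets: mixed-offset columns come back as object
# columns of datetime values carrying fixed offsets.
df_offsets = pyogrio.read_dataframe("example.gpkg", mixed_offsets_as_utc=False)

# Skip parsing entirely: datetime columns are returned as ISO 8601 strings
# and mixed_offsets_as_utc is ignored.
df_str = pyogrio.read_dataframe("example.gpkg", datetime_as_string=True)
```
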
=====================================
pyogrio/_compat.py
=====================================
@@ -29,7 +29,6 @@ except ImportError:
     pandas = None
 
 
-HAS_ARROW_API = __gdal_version__ >= (3, 6, 0)
 HAS_ARROW_WRITE_API = __gdal_version__ >= (3, 8, 0)
 HAS_PYARROW = pyarrow is not None
 HAS_PYPROJ = pyproj is not None
@@ -42,9 +41,9 @@ HAS_GEOPANDAS = geopandas is not None
 PANDAS_GE_15 = pandas is not None and Version(pandas.__version__) >= Version("1.5.0")
 PANDAS_GE_20 = pandas is not None and Version(pandas.__version__) >= Version("2.0.0")
 PANDAS_GE_22 = pandas is not None and Version(pandas.__version__) >= Version("2.2.0")
+PANDAS_GE_23 = pandas is not None and Version(pandas.__version__) >= Version("2.3.0")
 PANDAS_GE_30 = pandas is not None and Version(pandas.__version__) >= Version("3.0.0dev")
 
-GDAL_GE_352 = __gdal_version__ >= (3, 5, 2)
 GDAL_GE_37 = __gdal_version__ >= (3, 7, 0)
 GDAL_GE_38 = __gdal_version__ >= (3, 8, 0)
 GDAL_GE_311 = __gdal_version__ >= (3, 11, 0)


=====================================
pyogrio/_err.pyx
=====================================
@@ -418,7 +418,7 @@ cdef void stacking_error_handler(
 
 @contextlib.contextmanager
 def capture_errors():
-    """A context manager that captures all GDAL non-fatal errors occuring.
+    """A context manager that captures all GDAL non-fatal errors occurring.
 
     It adds all errors to a single stack, so it assumes that no more than one
     GDAL function is called.


=====================================
pyogrio/_io.pyx
=====================================
@@ -24,11 +24,9 @@ from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
 
 import numpy as np
 
-from pyogrio._ogr cimport *
 from pyogrio._err cimport (
     check_last_error, check_int, check_pointer, ErrorHandler
 )
-from pyogrio._vsi cimport *
 from pyogrio._err import (
     CPLE_AppDefinedError,
     CPLE_BaseError,
@@ -38,6 +36,9 @@ from pyogrio._err import (
     capture_errors,
 )
 from pyogrio._geometry cimport get_geometry_type, get_geometry_type_code
+from pyogrio._ogr cimport *
+from pyogrio._ogr import MULTI_EXTENSIONS
+from pyogrio._vsi cimport *
 from pyogrio.errors import (
     CRSError, DataSourceError, DataLayerError, GeometryError, FieldError, FeatureError
 )
@@ -49,11 +50,11 @@ log = logging.getLogger(__name__)
 # (index in array is the integer field type)
 FIELD_TYPES = [
     "int32",           # OFTInteger, Simple 32bit integer
-    None,              # OFTIntegerList, List of 32bit integers, not supported
+    "list(int32)",     # OFTIntegerList, List of 32bit integers
     "float64",         # OFTReal, Double Precision floating point
-    None,              # OFTRealList, List of doubles, not supported
+    "list(float64)",   # OFTRealList, List of doubles
     "object",          # OFTString, String of UTF-8 chars
-    None,              # OFTStringList, Array of strings, not supported
+    "list(str)",       # OFTStringList, Array of strings
     None,              # OFTWideString, deprecated, not supported
     None,              # OFTWideStringList, deprecated, not supported
     "object",          # OFTBinary, Raw Binary data
@@ -61,9 +62,28 @@ FIELD_TYPES = [
     None,              # OFTTime, Time, NOTE: not directly supported in numpy
     "datetime64[ms]",  # OFTDateTime, Date and Time
     "int64",           # OFTInteger64, Single 64bit integer
-    None               # OFTInteger64List, List of 64bit integers, not supported
+    "list(int64)"      # OFTInteger64List, List of 64bit integers, not supported
 ]
 
+# Mapping of OGR integer field types to OGR type names
+# (index in array is the integer field type)
+FIELD_TYPE_NAMES = {
+    OFTInteger: "OFTInteger",                # Simple 32bit integer
+    OFTIntegerList: "OFTIntegerList",        # List of 32bit integers, not supported
+    OFTReal: "OFTReal",                      # Double Precision floating point
+    OFTRealList: "OFTRealList",              # List of doubles, not supported
+    OFTString: "OFTString",                  # String of UTF-8 chars
+    OFTStringList: "OFTStringList",          # Array of strings, not supported
+    OFTWideString: "OFTWideString",          # deprecated, not supported
+    OFTWideStringList: "OFTWideStringList",  # deprecated, not supported
+    OFTBinary: "OFTBinary",                  # Raw Binary data
+    OFTDate: "OFTDate",                      # Date
+    OFTTime: "OFTTime",                      # Time: not directly supported in numpy
+    OFTDateTime: "OFTDateTime",              # Date and Time
+    OFTInteger64: "OFTInteger64",            # Single 64bit integer
+    OFTInteger64List: "OFTInteger64List",    # List of 64bit integers, not supported
+}
+
 FIELD_SUBTYPES = {
     OFSTNone: None,           # No subtype
     OFSTBoolean: "bool",      # Boolean integer
@@ -71,6 +91,16 @@ FIELD_SUBTYPES = {
     OFSTFloat32: "float32",   # Single precision (32 bit) floating point
 }
 
+FIELD_SUBTYPE_NAMES = {
+    OFSTNone: "OFSTNone",             # No subtype
+    OFSTBoolean: "OFSTBoolean",       # Boolean integer
+    OFSTInt16: "OFSTInt16",           # Signed 16-bit integer
+    OFSTFloat32: "OFSTFloat32",       # Single precision (32 bit) floating point
+    OFSTJSON: "OFSTJSON",
+    OFSTUUID: "OFSTUUID",
+    OFSTMaxSubType: "OFSTMaxSubType",
+}
+
 # Mapping of numpy ndarray dtypes to (field type, subtype)
 DTYPE_OGR_FIELD_TYPES = {
     "int8": (OFTInteger, OFSTInt16),
@@ -274,6 +304,10 @@ cdef OGRLayerH get_ogr_layer(GDALDatasetH ogr_dataset, layer) except NULL:
 
         elif isinstance(layer, int):
             ogr_layer = check_pointer(GDALDatasetGetLayer(ogr_dataset, layer))
+        else:
+            raise ValueError(
+                f"'layer' parameter must be a str or int, got {type(layer)}"
+            )
 
     # GDAL does not always raise exception messages in this case
     except NullPointerError:
@@ -610,6 +644,11 @@ cdef detect_encoding(OGRDataSourceH ogr_dataset, OGRLayerH ogr_layer):
         # In old gdal versions, OLCStringsAsUTF8 wasn't advertised yet.
         return "UTF-8"
 
+    if driver == "SQLite":
+        # TestCapability for OLCStringsAsUTF8 returns False for SQLite in GDAL 3.11.3.
+        # Issue opened: https://github.com/OSGeo/gdal/issues/12962
+        return "UTF-8"
+
     return locale.getpreferredencoding()
 
 
@@ -627,8 +666,8 @@ cdef get_fields(OGRLayerH ogr_layer, str encoding, use_arrow=False):
 
     Returns
     -------
-    ndarray(n, 4)
-        array of index, ogr type, name, numpy type
+    ndarray(n, 5)
+        array of index, ogr type, name, numpy type, ogr subtype
     """
     cdef int i
     cdef int field_count
@@ -648,7 +687,7 @@ cdef get_fields(OGRLayerH ogr_layer, str encoding, use_arrow=False):
 
     field_count = OGR_FD_GetFieldCount(ogr_featuredef)
 
-    fields = np.empty(shape=(field_count, 4), dtype=object)
+    fields = np.empty(shape=(field_count, 5), dtype=object)
     fields_view = fields[:, :]
 
     skipped_fields = False
@@ -685,6 +724,7 @@ cdef get_fields(OGRLayerH ogr_layer, str encoding, use_arrow=False):
         fields_view[i, 1] = field_type
         fields_view[i, 2] = field_name
         fields_view[i, 3] = np_type
+        fields_view[i, 4] = field_subtype
 
     if skipped_fields:
         # filter out skipped fields
@@ -879,6 +919,10 @@ cdef process_fields(
     cdef int success
     cdef int field_index
     cdef int ret_length
+    cdef int *ints_c
+    cdef GIntBig *int64s_c
+    cdef double *doubles_c
+    cdef char **strings_c
     cdef GByte *bin_value
     cdef int year = 0
     cdef int month = 0
@@ -936,10 +980,16 @@ cdef process_fields(
 
             if datetime_as_string:
                 # defer datetime parsing to user/ pandas layer
-                # Update to OGR_F_GetFieldAsISO8601DateTime when GDAL 3.7+ only
-                data[i] = get_string(
-                    OGR_F_GetFieldAsString(ogr_feature, field_index), encoding=encoding
-                )
+                IF CTE_GDAL_VERSION >= (3, 7, 0):
+                    data[i] = get_string(
+                        OGR_F_GetFieldAsISO8601DateTime(ogr_feature, field_index, NULL),
+                        encoding=encoding,
+                    )
+                ELSE:
+                    data[i] = get_string(
+                        OGR_F_GetFieldAsString(ogr_feature, field_index),
+                        encoding=encoding,
+                    )
             else:
                 success = OGR_F_GetFieldAsDateTimeEx(
                     ogr_feature,
@@ -969,6 +1019,55 @@ cdef process_fields(
                         year, month, day, hour, minute, second, microsecond
                     ).isoformat()
 
+        elif field_type == OFTIntegerList:
+            # According to GDAL doc, this can return NULL for an empty list, which is a
+            # valid result. So don't use check_pointer as it would throw an exception.
+            ints_c = OGR_F_GetFieldAsIntegerList(ogr_feature, field_index, &ret_length)
+            int_arr = np.ndarray(shape=(ret_length,), dtype=np.int32)
+            for j in range(ret_length):
+                int_arr[j] = ints_c[j]
+            data[i] = int_arr
+
+        elif field_type == OFTInteger64List:
+            # According to GDAL doc, this can return NULL for an empty list, which is a
+            # valid result. So don't use check_pointer as it would throw an exception.
+            int64s_c = OGR_F_GetFieldAsInteger64List(
+                ogr_feature, field_index, &ret_length
+            )
+
+            int_arr = np.ndarray(shape=(ret_length,), dtype=np.int64)
+            for j in range(ret_length):
+                int_arr[j] = int64s_c[j]
+            data[i] = int_arr
+
+        elif field_type == OFTRealList:
+            # According to GDAL doc, this can return NULL for an empty list, which is a
+            # valid result. So don't use check_pointer as it would throw an exception.
+            doubles_c = OGR_F_GetFieldAsDoubleList(
+                ogr_feature, field_index, &ret_length
+            )
+
+            double_arr = np.ndarray(shape=(ret_length,), dtype=np.float64)
+            for j in range(ret_length):
+                double_arr[j] = doubles_c[j]
+            data[i] = double_arr
+
+        elif field_type == OFTStringList:
+            # According to GDAL doc, this can return NULL for an empty list, which is a
+            # valid result. So don't use check_pointer as it would throw an exception.
+            strings_c = OGR_F_GetFieldAsStringList(ogr_feature, field_index)
+
+            string_list_index = 0
+            vals = []
+            if strings_c != NULL:
+                # According to GDAL doc, the list is terminated by a NULL pointer.
+                while strings_c[string_list_index] != NULL:
+                    val = strings_c[string_list_index]
+                    vals.append(get_string(val, encoding=encoding))
+                    string_list_index += 1
+
+            data[i] = np.array(vals)
+
 
 @cython.boundscheck(False)  # Deactivate bounds checking
 @cython.wraparound(False)   # Deactivate negative indexing.
@@ -1012,16 +1111,16 @@ cdef get_features(
     field_indexes = fields[:, 0]
     field_ogr_types = fields[:, 1]
 
-    field_data = [
-        np.empty(
-            shape=(num_features, ),
-            dtype = (
-                "object"
-                if datetime_as_string and fields[field_index, 3].startswith("datetime")
-                else fields[field_index, 3]
-            )
-        ) for field_index in range(n_fields)
-    ]
+    field_data = []
+    for field_index in range(n_fields):
+        if datetime_as_string and fields[field_index, 3].startswith("datetime"):
+            dtype = "object"
+        elif fields[field_index, 3].startswith("list"):
+            dtype = "object"
+        else:
+            dtype = fields[field_index, 3]
+
+        field_data.append(np.empty(shape=(num_features, ), dtype=dtype))
 
     field_data_view = [field_data[field_index][:] for field_index in range(n_fields)]
 
@@ -1413,11 +1512,18 @@ def ogr_read(
                 datetime_as_string=datetime_as_string
             )
 
+        ogr_types = [FIELD_TYPE_NAMES.get(field[1], "Unknown") for field in fields]
+        ogr_subtypes = [
+            FIELD_SUBTYPE_NAMES.get(field[4], "Unknown") for field in fields
+        ]
+
         meta = {
             "crs": crs,
             "encoding": encoding,
             "fields": fields[:, 2],
             "dtypes": fields[:, 3],
+            "ogr_types": ogr_types,
+            "ogr_subtypes": ogr_subtypes,
             "geometry_type": geometry_type,
         }
 
@@ -1502,6 +1608,7 @@ def ogr_open_arrow(
     int return_fids=False,
     int batch_size=0,
     use_pyarrow=False,
+    datetime_as_string=False,
 ):
 
     cdef int err = 0
@@ -1520,9 +1627,6 @@ def ogr_open_arrow(
     cdef ArrowArrayStream* stream
     cdef ArrowSchema schema
 
-    IF CTE_GDAL_VERSION < (3, 6, 0):
-        raise RuntimeError("Need GDAL>=3.6 for Arrow support")
-
     if force_2d:
         raise ValueError("forcing 2D is not supported for Arrow")
 
@@ -1722,6 +1826,12 @@ def ogr_open_arrow(
                 "GEOARROW".encode("UTF-8")
             )
 
+        # Read DateTime fields as strings, as the Arrow DateTime column type is
+        # quite limited regarding support for mixed time zones,...
+        IF CTE_GDAL_VERSION >= (3, 11, 0):
+            if datetime_as_string:
+                options = CSLSetNameValue(options, "DATETIME_AS_STRING", "YES")
+
         # make sure layer is read from beginning
         OGR_L_ResetReading(ogr_layer)
 
@@ -1745,10 +1855,18 @@ def ogr_open_arrow(
         else:
             reader = _ArrowStream(capsule)
 
+        ogr_types = [FIELD_TYPE_NAMES.get(field[1], "Unknown") for field in fields]
+        ogr_subtypes = [
+            FIELD_SUBTYPE_NAMES.get(field[4], "Unknown") for field in fields
+        ]
+
         meta = {
             "crs": crs,
             "encoding": encoding,
             "fields": fields[:, 2],
+            "dtypes": fields[:, 3],
+            "ogr_types": ogr_types,
+            "ogr_subtypes": ogr_subtypes,
             "geometry_type": geometry_type,
             "geometry_name": geometry_name,
             "fid_column": fid_column,
@@ -1905,6 +2023,10 @@ def ogr_read_info(
             encoding = encoding or detect_encoding(ogr_dataset, ogr_layer)
 
         fields = get_fields(ogr_layer, encoding)
+        ogr_types = [FIELD_TYPE_NAMES.get(field[1], "Unknown") for field in fields]
+        ogr_subtypes = [
+            FIELD_SUBTYPE_NAMES.get(field[4], "Unknown") for field in fields
+        ]
 
         meta = {
             "layer_name": get_string(OGR_L_GetName(ogr_layer)),
@@ -1912,6 +2034,8 @@ def ogr_read_info(
             "encoding": encoding,
             "fields": fields[:, 2],
             "dtypes": fields[:, 3],
+            "ogr_types": ogr_types,
+            "ogr_subtypes": ogr_subtypes,
             "fid_column": get_string(OGR_L_GetFIDColumn(ogr_layer)),
             "geometry_name": get_string(OGR_L_GetGeometryColumn(ogr_layer)),
             "geometry_type": get_geometry_type(ogr_layer),
@@ -2236,7 +2360,15 @@ cdef create_ogr_dataset_layer(
     path_exists = os.path.exists(path) if not use_tmp_vsimem else False
 
     if not layer:
-        layer = os.path.splitext(os.path.split(path)[1])[0]
+        # For multi extensions (e.g. ".shp.zip"), strip the full extension
+        for multi_ext in MULTI_EXTENSIONS:
+            if path.endswith(multi_ext):
+                layer = os.path.split(path)[1][:-len(multi_ext)]
+                break
+
+        # If it wasn't a multi-extension, use the file stem
+        if not layer:
+            layer = os.path.splitext(os.path.split(path)[1])[0]
 
     # if shapefile, GeoJSON, or FlatGeobuf, always delete first
     # for other types, check if we can create layers


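For the raw (non-dataframe) path touched above, a small sketch of how the new list-field support surfaces; the file name is a placeholder and the 4-tuple unpacking assumes read() keeps its existing (meta, fids, geometry, field_data) return shape.

```python
# Minimal sketch (placeholder path; return-tuple shape assumed from the
# existing pyogrio.raw.read API) of reading OGR list fields without arrow.
from pyogrio.raw import read

meta, fids, geometry, field_data = read("example_with_lists.gpkg")

# List-typed fields now map to "list(int32)" / "list(float64)" / "list(str)"
# in the metadata instead of being skipped.
print(meta["fields"])
print(meta["dtypes"])

# Assuming the first field is list-typed: the column is a numpy object array
# holding one numpy array of values per feature.
col = field_data[0]
print(col.dtype)    # object
print(col[0])       # per-feature array of list values
```
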
=====================================
pyogrio/_ogr.pxd
=====================================
@@ -185,6 +185,9 @@ cdef extern from "ogr_core.h":
         OFSTBoolean
         OFSTInt16
         OFSTFloat32
+        OFSTJSON
+        OFSTUUID
+        OFSTMaxSubType
 
     ctypedef void* OGRDataSourceH
     ctypedef void* OGRFeatureDefnH
@@ -256,6 +259,7 @@ cdef extern from "arrow_bridge.h" nogil:
 
 
 cdef extern from "ogr_api.h":
+    ctypedef signed long long GIntBig
     int             OGRGetDriverCount()
     OGRSFDriverH    OGRGetDriver(int)
 
@@ -283,6 +287,14 @@ cdef extern from "ogr_api.h":
     int             OGR_F_GetFieldAsInteger(OGRFeatureH feature, int n)
     int64_t         OGR_F_GetFieldAsInteger64(OGRFeatureH feature, int n)
     const char*     OGR_F_GetFieldAsString(OGRFeatureH feature, int n)
+    char **         OGR_F_GetFieldAsStringList(OGRFeatureH feature, int n)
+    const int *     OGR_F_GetFieldAsIntegerList(
+                        OGRFeatureH feature, int n, int* pnCount)
+    const GIntBig * OGR_F_GetFieldAsInteger64List(
+                        OGRFeatureH feature, int n, int* pnCount)
+    const double *  OGR_F_GetFieldAsDoubleList(
+                        OGRFeatureH feature, int n, int* pnCount)
+
     int             OGR_F_IsFieldSetAndNotNull(OGRFeatureH feature, int n)
 
     void OGR_F_SetFieldDateTime(OGRFeatureH feature,
@@ -406,12 +418,16 @@ cdef extern from "ogr_api.h":
     const char*     OLCFastGetExtent
     const char*     OLCTransactions
 
+cdef extern from "ogr_api.h":
+    bint OGR_L_GetArrowStream(
+        OGRLayerH hLayer, ArrowArrayStream *out_stream, char** papszOptions
+    )
 
-IF CTE_GDAL_VERSION >= (3, 6, 0):
+IF CTE_GDAL_VERSION >= (3, 7, 0):
 
     cdef extern from "ogr_api.h":
-        bint OGR_L_GetArrowStream(
-            OGRLayerH hLayer, ArrowArrayStream *out_stream, char** papszOptions
+        const char* OGR_F_GetFieldAsISO8601DateTime(
+            OGRFeatureH feature, int n, char** papszOptions
         )
 
 


=====================================
pyogrio/_ogr.pyx
=====================================
@@ -1,12 +1,13 @@
 import os
 import sys
-from uuid import uuid4
 import warnings
 
 from pyogrio._err cimport check_pointer
 from pyogrio._err import CPLE_BaseError, NullPointerError
 from pyogrio.errors import DataSourceError
 
+MULTI_EXTENSIONS = (".gpkg.zip", ".shp.zip")
+
 
 cdef get_string(const char *c_str, str encoding="UTF-8"):
     """Get Python string from a char *.
@@ -42,21 +43,16 @@ def get_gdal_version_string():
     return get_string(version)
 
 
-IF CTE_GDAL_VERSION >= (3, 4, 0):
-
-    cdef extern from "ogr_api.h":
-        bint OGRGetGEOSVersion(int *pnMajor, int *pnMinor, int *pnPatch)
+cdef extern from "ogr_api.h":
+    bint OGRGetGEOSVersion(int *pnMajor, int *pnMinor, int *pnPatch)
 
 
 def get_gdal_geos_version():
     cdef int major, minor, revision
 
-    IF CTE_GDAL_VERSION >= (3, 4, 0):
-        if not OGRGetGEOSVersion(&major, &minor, &revision):
-            return None
-        return (major, minor, revision)
-    ELSE:
+    if not OGRGetGEOSVersion(&major, &minor, &revision):
         return None
+    return (major, minor, revision)
 
 
 def set_gdal_config_options(dict options):
@@ -165,7 +161,7 @@ def get_gdal_data_path():
     """
     cdef const char *path_c = CPLFindFile("gdal", "header.dxf")
     if path_c != NULL:
-        return get_string(path_c).rstrip("header.dxf")
+        return get_string(path_c).replace("header.dxf", "")
     return None
 
 
@@ -336,10 +332,10 @@ def _get_drivers_for_path(path):
 
     # allow specific drivers to have a .zip extension to match GDAL behavior
     if ext == "zip":
-        if path.endswith(".shp.zip"):
-            ext = "shp.zip"
-        elif path.endswith(".gpkg.zip"):
-            ext = "gpkg.zip"
+        for multi_ext in MULTI_EXTENSIONS:
+            if path.endswith(multi_ext):
+                ext = multi_ext[1:]  # strip leading dot
+                break
 
     drivers = []
     for i in range(OGRGetDriverCount()):


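The get_gdal_data_path() change above swaps str.rstrip for str.replace; a short self-contained demonstration of the difference in string semantics:

```python
# str.rstrip() strips any trailing characters drawn from its argument (a
# character set), it does not strip a suffix, so it can remove more than
# intended; str.replace() (or str.removesuffix()) expresses the actual intent.
p = "shaded.dxf"
print(p.rstrip(".dxf"))        # 'shade'  -- the final 'd' of "shaded" is eaten too
print(p.replace(".dxf", ""))   # 'shaded' -- removes exactly the suffix
```
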
=====================================
pyogrio/_version.py
=====================================
@@ -25,9 +25,9 @@ def get_keywords():
     # setup.py/versioneer.py will grep for the variable names, so they must
     # each be defined on a line of their own. _version.py will just call
     # get_keywords().
-    git_refnames = " (HEAD -> main, tag: v0.11.1)"
-    git_full = "d3ff55ba80ea5f1744d40f7502adec3658d91b15"
-    git_date = "2025-08-02 21:41:37 +0200"
+    git_refnames = " (HEAD -> main, tag: v0.12.0)"
+    git_full = "ea9a97b6aef45c921ea36b599666e7e83b84070c"
+    git_date = "2025-11-26 10:18:55 +0100"
     keywords = {"refnames": git_refnames, "full": git_full, "date": git_date}
     return keywords
 


=====================================
pyogrio/core.py
=====================================
@@ -1,7 +1,6 @@
 """Core functions to interact with OGR data sources."""
 
 from pathlib import Path
-from typing import Optional, Union
 
 from pyogrio._env import GDALEnv
 from pyogrio.util import (
@@ -237,9 +236,9 @@ def read_info(
     ----------
     path_or_buffer : str, pathlib.Path, bytes, or file-like
         A dataset path or URI, raw buffer, or file-like object with a read method.
-    layer : [type], optional
+    layer : str or int, optional
         Name or index of layer in data source.  Reads the first layer by default.
-    encoding : [type], optional (default: None)
+    encoding : str, optional (default: None)
         If present, will be used as the encoding for reading string values from
         the data source, unless encoding can be inferred directly from the data
         source.
@@ -261,6 +260,8 @@ def read_info(
                 "crs": "<crs>",
                 "fields": <ndarray of field names>,
                 "dtypes": <ndarray of field dtypes>,
+                "ogr_types": <ndarray of OGR field types>,
+                "ogr_subtypes": <ndarray of OGR field subtypes>,
                 "encoding": "<encoding>",
                 "fid_column": "<fid column name or "">",
                 "geometry_name": "<geometry column name or "">",
@@ -336,7 +337,7 @@ def get_gdal_data_path():
     return _get_gdal_data_path()
 
 
-def vsi_listtree(path: Union[str, Path], pattern: Optional[str] = None):
+def vsi_listtree(path: str | Path, pattern: str | None = None):
     """Recursively list the contents of a VSI directory.
 
     An fnmatch pattern can be specified to filter the directories/files
@@ -356,7 +357,7 @@ def vsi_listtree(path: Union[str, Path], pattern: Optional[str] = None):
     return ogr_vsi_listtree(path, pattern=pattern)
 
 
-def vsi_rmtree(path: Union[str, Path]):
+def vsi_rmtree(path: str | Path):
     """Recursively remove VSI directory.
 
     Parameters
@@ -371,7 +372,7 @@ def vsi_rmtree(path: Union[str, Path]):
     ogr_vsi_rmtree(path)
 
 
-def vsi_unlink(path: Union[str, Path]):
+def vsi_unlink(path: str | Path):
     """Remove a VSI file.
 
     Parameters


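The read_info() docstring above now lists two extra keys; a quick sketch of what they contain (the path is a placeholder):

```python
# Minimal sketch: read_info() in 0.12.0 reports raw OGR field types and
# subtypes alongside the numpy-style dtypes. "example.gpkg" is a placeholder.
import pyogrio

info = pyogrio.read_info("example.gpkg")
print(info["fields"])        # field names
print(info["dtypes"])        # e.g. 'int32', 'object', 'list(str)'
print(info["ogr_types"])     # e.g. 'OFTInteger', 'OFTString', 'OFTStringList'
print(info["ogr_subtypes"])  # e.g. 'OFSTNone', 'OFSTJSON', 'OFSTBoolean'
```
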
=====================================
pyogrio/geopandas.py
=====================================
@@ -1,17 +1,21 @@
 """Functions for reading and writing GeoPandas dataframes."""
 
+import json
 import os
 import warnings
+from datetime import datetime
 
 import numpy as np
 
 from pyogrio._compat import (
     HAS_GEOPANDAS,
+    HAS_PYARROW,
     PANDAS_GE_15,
     PANDAS_GE_20,
     PANDAS_GE_22,
     PANDAS_GE_30,
     PYARROW_GE_19,
+    __gdal_version__,
 )
 from pyogrio.errors import DataSourceError
 from pyogrio.raw import (
@@ -37,33 +41,87 @@ def _stringify_path(path):
     return path
 
 
-def _try_parse_datetime(ser):
+def _try_parse_datetime(ser, datetime_as_string: bool, mixed_offsets_as_utc: bool):
     import pandas as pd  # only called when pandas is known to be installed
+    from pandas.api.types import is_string_dtype
+
+    datetime_kwargs = {}
+    if datetime_as_string:
+        if not is_string_dtype(ser.dtype):
+            # Support to return datetimes as strings using arrow only available for
+            # GDAL >= 3.11, so convert to string here if needed.
+            res = ser.astype("str")
+            if not PANDAS_GE_30:
+                # astype("str") also stringifies missing values in pandas < 3
+                res[ser.isna()] = None
+            res = res.str.replace(" ", "T")
+            return res
+        if __gdal_version__ < (3, 7, 0):
+            # GDAL < 3.7 doesn't return datetimes in ISO8601 format, so fix that
+            return ser.str.replace(" ", "T").str.replace("/", "-")
+        return ser
 
     if PANDAS_GE_22:
-        datetime_kwargs = {"format": "ISO8601"}
+        datetime_kwargs["format"] = "ISO8601"
     elif PANDAS_GE_20:
-        datetime_kwargs = {"format": "ISO8601", "errors": "ignore"}
+        datetime_kwargs["format"] = "ISO8601"
+        datetime_kwargs["errors"] = "ignore"
     else:
-        datetime_kwargs = {"yearfirst": True}
+        datetime_kwargs["yearfirst"] = True
+
     with warnings.catch_warnings():
         warnings.filterwarnings(
             "ignore",
             ".*parsing datetimes with mixed time zones will raise.*",
             FutureWarning,
         )
-        # pre-emptive try catch for when pandas will raise
-        # (can tighten the exception type in future when it does)
+
+        warning = "Error parsing datetimes, original strings are returned: {message}"
         try:
             res = pd.to_datetime(ser, **datetime_kwargs)
-        except Exception:
-            res = ser
-    # if object dtype, try parse as utc instead
-    if res.dtype in ("object", "string"):
+
+            # With pandas >2 and <3, mixed time zones were returned as pandas
+            # Timestamps, so convert them to datetime objects.
+            if not mixed_offsets_as_utc and PANDAS_GE_20 and res.dtype == "object":
+                res = res.map(lambda x: x.to_pydatetime(), na_action="ignore")
+
+        except Exception as ex:
+            if isinstance(ex, ValueError) and "Mixed timezones detected" in str(ex):
+                # Parsing mixed time zones with to_datetime is not supported
+                # anymore in pandas >= 3.0, leading to a ValueError.
+                if mixed_offsets_as_utc:
+                    # Convert mixed time zone datetimes to UTC.
+                    try:
+                        res = pd.to_datetime(ser, utc=True, **datetime_kwargs)
+                    except Exception as ex:
+                        warnings.warn(warning.format(message=str(ex)), stacklevel=3)
+                        return ser
+                else:
+                    # Using map seems to be the fastest way to convert the strings to
+                    # datetime objects.
+                    try:
+                        res = ser.map(datetime.fromisoformat, na_action="ignore")
+                    except Exception as ex:
+                        warnings.warn(warning.format(message=str(ex)), stacklevel=3)
+                        return ser
+
+            else:
+                # If the error is not related to mixed time zones, log it and return
+                # the original series.
+                warnings.warn(warning.format(message=str(ex)), stacklevel=3)
+                if __gdal_version__ < (3, 7, 0):
+                    # GDAL < 3.7 doesn't return datetimes in ISO8601 format, so fix that
+                    return ser.str.replace(" ", "T").str.replace("/", "-")
+
+                return ser
+
+    # For pandas < 3.0, to_datetime converted mixed time zone data to datetime objects.
+    # For mixed_offsets_as_utc they should be converted to UTC though...
+    if mixed_offsets_as_utc and res.dtype in ("object", "string"):
         try:
             res = pd.to_datetime(ser, utc=True, **datetime_kwargs)
-        except Exception:
-            pass
+        except Exception as ex:
+            warnings.warn(warning.format(message=str(ex)), stacklevel=3)
 
     if res.dtype.kind == "M":  # any datetime64
         # GDAL only supports ms precision, convert outputs to match.
@@ -73,6 +131,7 @@ def _try_parse_datetime(ser):
             res = res.dt.as_unit("ms")
         else:
             res = res.dt.round(freq="ms")
+
     return res
 
 
@@ -96,6 +155,8 @@ def read_dataframe(
     use_arrow=None,
     on_invalid="raise",
     arrow_to_pandas_kwargs=None,
+    datetime_as_string=False,
+    mixed_offsets_as_utc=True,
     **kwargs,
 ):
     """Read from an OGR data source to a GeoPandas GeoDataFrame or Pandas DataFrame.
@@ -103,6 +164,9 @@ def read_dataframe(
     If the data source does not have a geometry column or ``read_geometry`` is False,
     a DataFrame will be returned.
 
+    If you read data with datetime columns containing time zone information, check out
+    the notes below.
+
     Requires ``geopandas`` >= 0.8.
 
     Parameters
@@ -223,14 +287,55 @@ def read_dataframe(
     arrow_to_pandas_kwargs : dict, optional (default: None)
         When `use_arrow` is True, these kwargs will be passed to the `to_pandas`_
         call for the arrow to pandas conversion.
+    datetime_as_string : bool, optional (default: False)
+        If True, will return datetime columns as detected by GDAL as ISO8601
+        strings and ``mixed_offsets_as_utc`` will be ignored.
+    mixed_offsets_as_utc: bool, optional (default: True)
+        By default, datetime columns are read as the pandas datetime64 dtype.
+        This can represent the data as-is in the case that the column contains
+        only naive datetimes (without time zone information), only UTC datetimes,
+        or if all datetimes in the column have the same time zone offset. Note
+        that in time zones with daylight saving time, datetimes will have
+        different offsets throughout the year!
+
+        For columns that don't comply with the above, i.e. columns that contain
+        mixed offsets, the behavior depends on the value of this parameter:
+
+        - If ``True`` (default), such datetimes are converted to UTC. In the case
+          of a mixture of time zone aware and naive datetimes, the naive
+          datetimes are assumed to be in UTC already. Datetime columns returned
+          will always be pandas datetime64.
+        - If ``False``, such datetimes with mixed offsets are returned with
+          those offsets preserved. Because pandas datetime64 columns don't
+          support mixed time zone offsets, such columns are returned as object
+          columns with python datetime values with fixed offsets. If you want
+          to roundtrip datetimes without data loss, this is the recommended
+          option, but you lose the functionality of a datetime64 column.
+
+        If ``datetime_as_string`` is True, this option is ignored.
+
     **kwargs
-        Additional driver-specific dataset open options passed to OGR.  Invalid
+        Additional driver-specific dataset open options passed to OGR. Invalid
         options will trigger a warning.
 
     Returns
     -------
     GeoDataFrame or DataFrame (if no geometry is present)
 
+    Notes
+    -----
+    When you have datetime columns with time zone information, it is important to
+    note that GDAL only represents time zones as UTC offsets, whilst pandas uses
+    IANA time zones (via `pytz` or `zoneinfo`). As a result, even if a column in a
+    DataFrame contains datetimes in a single time zone, this will often still result
+    in mixed time zone offsets being written for time zones where daylight saving
+    time is used (e.g. +01:00 and +02:00 offsets for time zone Europe/Brussels). When
+    roundtripping through GDAL, the information about the original time zone is
+    lost, only the offsets can be preserved. By default, `pyogrio.read_dataframe()`
+    will convert columns with mixed offsets to UTC to return a datetime64 column. If
+    you want to preserve the original offsets, you can use `datetime_as_string=True`
+    or `mixed_offsets_as_utc=False`.
+
     .. _OGRSQL:
 
         https://gdal.org/user/ogr_sql_dialect.html#ogr-sql-dialect
@@ -267,11 +372,13 @@ def read_dataframe(
 
     read_func = read_arrow if use_arrow else read
     gdal_force_2d = False if use_arrow else force_2d
-    if not use_arrow:
-        # For arrow, datetimes are read as is.
-        # For numpy IO, datetimes are read as string values to preserve timezone info
-        # as numpy does not directly support timezones.
-        kwargs["datetime_as_string"] = True
+
+    # Always read datetimes as string values to preserve (mixed) time zone info
+    # correctly. If arrow is not used, it is needed because numpy does not
+    # directly support time zones + performance is also a lot better. If arrow
+    # is used, needed because datetime columns don't support mixed time zone
+    # offsets + e.g. for .fgb files time zone info isn't handled correctly even
+    # for unique time zone offsets if datetimes are not read as string.
     result = read_func(
         path_or_buffer,
         layer=layer,
@@ -288,6 +395,7 @@ def read_dataframe(
         sql=sql,
         sql_dialect=sql_dialect,
         return_fids=fid_as_index,
+        datetime_as_string=True,
         **kwargs,
     )
 
@@ -330,6 +438,26 @@ def read_dataframe(
 
         del table
 
+        # convert datetime columns that were read as string to datetime
+        for dtype, column in zip(meta["dtypes"], meta["fields"]):
+            if dtype is not None and dtype.startswith("datetime"):
+                df[column] = _try_parse_datetime(
+                    df[column], datetime_as_string, mixed_offsets_as_utc
+                )
+        for ogr_subtype, c in zip(meta["ogr_subtypes"], meta["fields"]):
+            if ogr_subtype == "OFSTJSON":
+                # When reading .parquet files with arrow, JSON fields are already
+                # parsed, so only parse if strings.
+                dtype = pd.api.types.infer_dtype(df[c])
+                if dtype == "string":
+                    try:
+                        df[c] = df[c].map(json.loads, na_action="ignore")
+                    except Exception:
+                        warnings.warn(
+                            f"Could not parse column '{c}' as JSON; leaving as string",
+                            stacklevel=2,
+                        )
+
         if fid_as_index:
             df = df.set_index(meta["fid_column"])
             df.index.names = ["fid"]
@@ -341,8 +469,18 @@ def read_dataframe(
         elif geometry_name in df.columns:
             wkb_values = df.pop(geometry_name)
             if PANDAS_GE_15 and wkb_values.dtype != object:
-                # for example ArrowDtype will otherwise create numpy array with pd.NA
-                wkb_values = wkb_values.to_numpy(na_value=None)
+                if (
+                    HAS_PYARROW
+                    and isinstance(wkb_values.dtype, pd.ArrowDtype)
+                    and isinstance(wkb_values.dtype.pyarrow_dtype, pa.BaseExtensionType)
+                ):
+                    # handle BaseExtensionType(extension<geoarrow.wkb>)
+                    wkb_values = pa.array(wkb_values.array).to_numpy(
+                        zero_copy_only=False
+                    )
+                else:
+                    # for example ArrowDtype will otherwise give numpy array with pd.NA
+                    wkb_values = wkb_values.to_numpy(na_value=None)
             df["geometry"] = shapely.from_wkb(wkb_values, on_invalid=on_invalid)
             if force_2d:
                 df["geometry"] = shapely.force_2d(df["geometry"])
@@ -361,7 +499,18 @@ def read_dataframe(
     df = pd.DataFrame(data, columns=columns, index=index)
     for dtype, c in zip(meta["dtypes"], df.columns):
         if dtype.startswith("datetime"):
-            df[c] = _try_parse_datetime(df[c])
+            df[c] = _try_parse_datetime(df[c], datetime_as_string, mixed_offsets_as_utc)
+    for ogr_subtype, c in zip(meta["ogr_subtypes"], meta["fields"]):
+        if ogr_subtype == "OFSTJSON":
+            dtype = pd.api.types.infer_dtype(df[c])
+            if dtype == "string":
+                try:
+                    df[c] = df[c].map(json.loads, na_action="ignore")
+                except Exception:
+                    warnings.warn(
+                        f"Could not parse column '{c}' as JSON; leaving as string",
+                        stacklevel=2,
+                    )
 
     if geometry is None or not read_geometry:
         return df
@@ -480,6 +629,18 @@ def write_dataframe(
         do this (for example if an option exists as both dataset and layer
         option).
 
+    Notes
+    -----
+    When you have datetime columns with time zone information, it is important to
+    note that GDAL only represents time zones as UTC offsets, whilst pandas uses
+    IANA time zones (via `pytz` or `zoneinfo`). As a result, even if a column in a
+    DataFrame contains datetimes in a single time zone, this will often still result
+    in mixed time zone offsets being written for time zones where daylight saving
+    time is used (e.g. +01:00 and +02:00 offsets for time zone Europe/Brussels).
+
+    Object dtype columns containing `datetime` or `pandas.Timestamp` objects will
+    also be written as datetime fields, preserving time zone information where possible.
+
     """
     # TODO: add examples to the docstring (e.g. OGR kwargs)
 
@@ -584,6 +745,7 @@ def write_dataframe(
             crs = geometry.crs.to_wkt("WKT1_GDAL")
 
     if use_arrow:
+        import pandas as pd  # only called when pandas is known to be installed
         import pyarrow as pa
 
         from pyogrio.raw import write_arrow
@@ -619,8 +781,35 @@ def write_dataframe(
             df = pd.DataFrame(df, copy=False)
             df[geometry_column] = geometry
 
+        # Arrow doesn't support datetime columns with mixed time zones, and GDAL only
+        # supports time zone offsets. Hence, to avoid data loss, convert columns that
+        # can contain datetime values with different offsets to strings.
+        # Also pass a list of these columns on to GDAL so it can still treat them as
+        # datetime columns when writing the dataset.
+        datetime_cols = []
+        for name, dtype in df.dtypes.items():
+            if dtype == "object":
+                # An object column with datetimes can contain multiple offsets.
+                if pd.api.types.infer_dtype(df[name]) == "datetime":
+                    df[name] = df[name].astype("string")
+                    datetime_cols.append(name)
+
+            elif isinstance(dtype, pd.DatetimeTZDtype) and str(dtype.tz) != "UTC":
+                # A pd.datetime64 column with a time zone different than UTC can contain
+                # data with different offsets because of summer/winter time.
+                df[name] = df[name].astype("string")
+                datetime_cols.append(name)
+
         table = pa.Table.from_pandas(df, preserve_index=False)
 
+        # Add metadata to datetime columns so GDAL knows they are datetimes.
+        table = _add_column_metadata(
+            table,
+            column_metadata={
+                col: {"GDAL:OGR:type": "DateTime"} for col in datetime_cols
+            },
+        )
+
         # Null arrow columns are not supported by GDAL, so convert to string
         for field_index, field in enumerate(table.schema):
             if field.type == pa.null():
@@ -678,26 +867,39 @@ def write_dataframe(
     gdal_tz_offsets = {}
     for name in fields:
         col = df[name]
+        values = None
+
         if isinstance(col.dtype, pd.DatetimeTZDtype):
-            # Deal with datetimes with timezones by passing down timezone separately
+            # Deal with datetimes with time zones by passing down time zone separately
             # pass down naive datetime
             naive = col.dt.tz_localize(None)
             values = naive.values
             # compute offset relative to UTC explicitly
             tz_offset = naive - col.dt.tz_convert("UTC").dt.tz_localize(None)
-            # Convert to GDAL timezone offset representation.
+            # Convert to GDAL time zone offset representation.
             # GMT is represented as 100 and offsets are represented by adding /
             # subtracting 1 for every 15 minutes different from GMT.
             # https://gdal.org/development/rfc/rfc56_millisecond_precision.html#core-changes
             # Convert each row offset to a signed multiple of 15m and add to GMT value
             gdal_offset_representation = tz_offset // pd.Timedelta("15m") + 100
             gdal_tz_offsets[name] = gdal_offset_representation.values
-        else:
+
+        elif col.dtype == "object":
+            # Column of Timestamp/datetime objects, split in naive datetime and tz.
+            if pd.api.types.infer_dtype(df[name]) == "datetime":
+                tz_offset = col.map(lambda x: x.utcoffset(), na_action="ignore")
+                gdal_offset_repr = tz_offset // pd.Timedelta("15m") + 100
+                gdal_tz_offsets[name] = gdal_offset_repr.values
+                naive = col.map(lambda x: x.replace(tzinfo=None), na_action="ignore")
+                values = naive.values
+
+        if values is None:
             values = col.values
+
         if isinstance(values, pd.api.extensions.ExtensionArray):
             from pandas.arrays import BooleanArray, FloatingArray, IntegerArray
 
-            if isinstance(values, (IntegerArray, FloatingArray, BooleanArray)):
+            if isinstance(values, IntegerArray | FloatingArray | BooleanArray):
                 field_data.append(values._data)
                 field_mask.append(values._mask)
             else:
@@ -729,3 +931,48 @@ def write_dataframe(
         gdal_tz_offsets=gdal_tz_offsets,
         **kwargs,
     )
+
+
+def _add_column_metadata(table, column_metadata: dict = {}):
+    """Add or update column-level metadata to an arrow table.
+
+    Parameters
+    ----------
+    table : pyarrow.Table
+        The table to add the column metadata to.
+    column_metadata : dict
+        A dictionary with column metadata in the form
+            {
+                "column_1": {"some": "data"},
+                "column_2": {"more": "stuff"},
+            }
+
+    Returns
+    -------
+    pyarrow.Table: table with the updated column metadata.
+    """
+    import pyarrow as pa
+
+    if not column_metadata:
+        return table
+
+    # Create updated column fields with new metadata
+    fields = []
+    for col in table.schema.names:
+        if col in column_metadata:
+            # Add/update column metadata
+            metadata = table.field(col).metadata or {}
+            for key, value in column_metadata[col].items():
+                metadata[key] = value
+            # Update field with updated metadata
+            fields.append(table.field(col).with_metadata(metadata))
+        else:
+            fields.append(table.field(col))
+
+    # Create new schema with the updated field metadata
+    schema = pa.schema(fields, metadata=table.schema.metadata)
+
+    # Build new table with updated schema (shouldn't copy data)
+    table = table.cast(schema)
+
+    return table


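The GDAL time zone flag used in write_dataframe() above (GMT encoded as 100, plus or minus 1 per 15 minutes of offset) is easiest to see with a few concrete values; this mirrors the tz_offset // pd.Timedelta("15m") + 100 expression in the diff:

```python
# Worked example of the GDAL time zone representation used by write_dataframe():
# 100 means GMT/UTC, and each 15-minute step away from GMT adds or subtracts 1.
import pandas as pd

offsets = {
    "+02:00": pd.Timedelta(hours=2),              # e.g. Europe/Brussels, summer
    "+01:00": pd.Timedelta(hours=1),              # e.g. Europe/Brussels, winter
    "+05:30": pd.Timedelta(hours=5, minutes=30),
    "-04:00": pd.Timedelta(hours=-4),
}
for label, delta in offsets.items():
    gdal_flag = delta // pd.Timedelta("15m") + 100
    print(label, "->", gdal_flag)   # 108, 104, 122, 84
```
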
=====================================
pyogrio/raw.py
=====================================
@@ -4,7 +4,7 @@ import warnings
 from io import BytesIO
 from pathlib import Path
 
-from pyogrio._compat import HAS_ARROW_API, HAS_ARROW_WRITE_API, HAS_PYARROW
+from pyogrio._compat import HAS_ARROW_WRITE_API, HAS_PYARROW
 from pyogrio._env import GDALEnv
 from pyogrio.core import detect_write_driver
 from pyogrio.errors import DataSourceError
@@ -151,7 +151,7 @@ def read(
         If True, will return the FIDs of the feature that were read.
     datetime_as_string : bool, optional (default: False)
         If True, will return datetime dtypes as detected by GDAL as a string
-        array (which can be used to extract timezone info), instead of
+        array (which can be used to extract time zone info), instead of
         a datetime64 array.
 
     **kwargs
@@ -171,9 +171,11 @@ def read(
         Meta is: {
             "crs": "<crs>",
             "fields": <ndarray of field names>,
-            "dtypes": <ndarray of numpy dtypes corresponding to fields>
+            "dtypes": <ndarray of numpy dtypes corresponding to fields>,
+            "ogr_types": <ndarray of OGR types corresponding to fields>,
+            "ogr_subtypes": <ndarray of OGR subtypes corresponding to fields>,
             "encoding": "<encoding>",
-            "geometry_type": "<geometry type>"
+            "geometry_type": "<geometry type>",
         }
 
     .. _OGRSQL:
@@ -233,6 +235,7 @@ def read_arrow(
     sql=None,
     sql_dialect=None,
     return_fids=False,
+    datetime_as_string=False,
     **kwargs,
 ):
     """Read OGR data source into a pyarrow Table.
@@ -249,9 +252,13 @@ def read_arrow(
         Meta is: {
             "crs": "<crs>",
             "fields": <ndarray of field names>,
+            "dtypes": <ndarray of numpy dtypes corresponding to fields>,
+            "ogr_types": <ndarray of OGR types corresponding to fields>,
+            "ogr_subtypes": <ndarray of OGR subtypes corresponding to fields>,
             "encoding": "<encoding>",
             "geometry_type": "<geometry_type>",
             "geometry_name": "<name of geometry column in arrow table>",
+            "fid_column": "<name of FID column in arrow table>"
         }
 
     """
@@ -303,6 +310,7 @@ def read_arrow(
         skip_features=gdal_skip_features,
         batch_size=batch_size,
         use_pyarrow=True,
+        datetime_as_string=datetime_as_string,
         **kwargs,
     ) as source:
         meta, reader = source
@@ -358,6 +366,7 @@ def open_arrow(
     return_fids=False,
     batch_size=65_536,
     use_pyarrow=False,
+    datetime_as_string=False,
     **kwargs,
 ):
     """Open OGR data source as a stream of Arrow record batches.
@@ -386,6 +395,9 @@ def open_arrow(
         ArrowStream object. In the default case, this stream object needs
         to be passed to another library supporting the Arrow PyCapsule
         Protocol to consume the stream of data.
+    datetime_as_string : bool, optional (default: False)
+        If True, will return datetime dtypes as detected by GDAL as strings,
+        as Arrow doesn't support e.g. mixed time zones.
 
     Examples
     --------
@@ -423,15 +435,16 @@ def open_arrow(
         Meta is: {
             "crs": "<crs>",
             "fields": <ndarray of field names>,
+            "dtypes": <ndarray of numpy dtypes corresponding to fields>,
+            "ogr_types": <ndarray of OGR types corresponding to fields>,
+            "ogr_subtypes": <ndarray of OGR subtypes corresponding to fields>,
             "encoding": "<encoding>",
             "geometry_type": "<geometry_type>",
             "geometry_name": "<name of geometry column in arrow table>",
+            "fid_column": "<name of FID column in arrow table>"
         }
 
     """
-    if not HAS_ARROW_API:
-        raise RuntimeError("GDAL>= 3.6 required to read using arrow")
-
     dataset_kwargs = _preprocess_options_key_value(kwargs) if kwargs else {}
 
     return ogr_open_arrow(
@@ -453,6 +466,7 @@ def open_arrow(
         dataset_kwargs=dataset_kwargs,
         batch_size=batch_size,
         use_pyarrow=use_pyarrow,
+        datetime_as_string=datetime_as_string,
     )
 
 
@@ -575,12 +589,6 @@ def _get_write_path_driver(path, driver, append=False):
             f"{get_gdal_version_string()}"
         )
 
-    # prevent segfault from: https://github.com/OSGeo/gdal/issues/5739
-    if append and driver == "FlatGeobuf" and get_gdal_version() <= (3, 5, 0):
-        raise RuntimeError(
-            "append to FlatGeobuf is not supported for GDAL <= 3.5.0 due to segfault"
-        )
-
     return path, driver
 
 
@@ -685,15 +693,17 @@ def write(
         Layer creation options (format specific) passed to OGR. Specify as
         a key-value dictionary.
     gdal_tz_offsets : dict, optional (default: None)
-        Used to handle GDAL timezone offsets for each field contained in dict.
+        Used to handle GDAL time zone offsets for each field contained in dict.
     **kwargs
         Additional driver-specific dataset creation options passed to OGR. Invalid
         options will trigger a warning.
 
     """
-    # if dtypes is given, remove it from kwargs (dtypes is included in meta returned by
+    # remove some unneeded kwargs (e.g. dtypes is included in meta returned by
     # read, and it is convenient to pass meta directly into write for round trip tests)
     kwargs.pop("dtypes", None)
+    kwargs.pop("ogr_types", None)
+    kwargs.pop("ogr_subtypes", None)
 
     path, driver = _get_write_path_driver(path, driver, append=append)
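
As a usage sketch for the documentation changes above (the file name is hypothetical,
not part of the patch): read_arrow and open_arrow now accept datetime_as_string, and
the returned meta dict gains dtypes, ogr_types, ogr_subtypes and fid_column entries.

    from pyogrio.raw import read_arrow

    meta, table = read_arrow("example.gpkg", datetime_as_string=True)
    # In addition to "crs", "fields", "encoding", "geometry_type" and "geometry_name":
    print(meta["dtypes"], meta["ogr_types"], meta["ogr_subtypes"], meta["fid_column"])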
 


=====================================
pyogrio/tests/conftest.py
=====================================
@@ -1,16 +1,14 @@
+"""Module with helper functions, fixtures, and common test data for pyogrio tests."""
+
 from io import BytesIO
 from pathlib import Path
 from zipfile import ZIP_DEFLATED, ZipFile
 
 import numpy as np
 
-from pyogrio import (
-    __gdal_version_string__,
-    __version__,
-    list_drivers,
-)
+from pyogrio import __gdal_version_string__, __version__, list_drivers
 from pyogrio._compat import (
-    HAS_ARROW_API,
+    GDAL_GE_37,
     HAS_ARROW_WRITE_API,
     HAS_GDAL_GEOS,
     HAS_PYARROW,
@@ -51,6 +49,8 @@ START_FID = {
     ".shp": 0,
 }
 
+GDAL_HAS_PARQUET_DRIVER = "Parquet" in list_drivers()
+
 
 def pytest_report_header(config):
     drivers = ", ".join(
@@ -65,10 +65,7 @@ def pytest_report_header(config):
 
 
 # marks to skip tests if optional dependencies are not present
-requires_arrow_api = pytest.mark.skipif(not HAS_ARROW_API, reason="GDAL>=3.6 required")
-requires_pyarrow_api = pytest.mark.skipif(
-    not HAS_ARROW_API or not HAS_PYARROW, reason="GDAL>=3.6 and pyarrow required"
-)
+requires_pyarrow_api = pytest.mark.skipif(not HAS_PYARROW, reason="pyarrow required")
 
 requires_pyproj = pytest.mark.skipif(not HAS_PYPROJ, reason="pyproj required")
 
@@ -85,6 +82,9 @@ requires_shapely = pytest.mark.skipif(not HAS_SHAPELY, reason="Shapely >= 2.0 re
 
 
 def prepare_testfile(testfile_path, dst_dir, ext):
+    if ext == ".gpkg.zip" and not GDAL_GE_37:
+        pytest.skip(".gpkg.zip support requires GDAL >= 3.7")
+
     if ext == testfile_path.suffix:
         return testfile_path
 
@@ -100,7 +100,7 @@ def prepare_testfile(testfile_path, dst_dir, ext):
         # allow mixed Polygons/MultiPolygons type
         meta["geometry_type"] = "Unknown"
 
-    elif ext == ".gpkg":
+    elif ext in (".gpkg", ".gpkg.zip"):
         # For .gpkg, spatial_index=False to avoid the rows being reordered
         meta["spatial_index"] = False
         meta["geometry_type"] = "MultiPolygon"
@@ -201,36 +201,70 @@ def no_geometry_file(tmp_path):
     return filename
 
 
-@pytest.fixture(scope="function")
-def list_field_values_file(tmp_path):
+def list_field_values_geojson_file(tmp_path):
     # Create a GeoJSON file with list values in a property
     list_geojson = """{
         "type": "FeatureCollection",
         "features": [
             {
                 "type": "Feature",
-                "properties": { "int64": 1, "list_int64": [0, 1] },
+                "properties": {
+                    "int": 1,
+                    "list_int": [0, 1],
+                    "list_double": [0.0, 1.0],
+                    "list_string": ["string1", "string2"],
+                    "list_int_with_null": [0, null],
+                    "list_string_with_null": ["string1", null]
+                },
                 "geometry": { "type": "Point", "coordinates": [0, 2] }
             },
             {
                 "type": "Feature",
-                "properties": { "int64": 2, "list_int64": [2, 3] },
+                "properties": {
+                    "int": 2,
+                    "list_int": [2, 3],
+                    "list_double": [2.0, 3.0],
+                    "list_string": ["string3", "string4", ""],
+                    "list_int_with_null": [2, 3],
+                    "list_string_with_null": ["string3", "string4", ""]
+                },
                 "geometry": { "type": "Point", "coordinates": [1, 2] }
             },
             {
                 "type": "Feature",
-                "properties": { "int64": 3, "list_int64": [4, 5] },
+                "properties": {
+                    "int": 3,
+                    "list_int": [],
+                    "list_double": [],
+                    "list_string": [],
+                    "list_int_with_null": [],
+                    "list_string_with_null": []
+                },
                 "geometry": { "type": "Point", "coordinates": [2, 2] }
             },
             {
                 "type": "Feature",
-                "properties": { "int64": 4, "list_int64": [6, 7] },
-                "geometry": { "type": "Point", "coordinates": [3, 2] }
+                "properties": {
+                    "int": 4,
+                    "list_int": null,
+                    "list_double": null,
+                    "list_string": null,
+                    "list_int_with_null": null,
+                    "list_string_with_null": null
+                },
+                "geometry": { "type": "Point", "coordinates": [2, 2] }
             },
             {
                 "type": "Feature",
-                "properties": { "int64": 5, "list_int64": [8, 9] },
-                "geometry": { "type": "Point", "coordinates": [4, 2] }
+                "properties": {
+                    "int": 5,
+                    "list_int": null,
+                    "list_double": null,
+                    "list_string": [""],
+                    "list_int_with_null": null,
+                    "list_string_with_null": [""]
+                },
+                "geometry": { "type": "Point", "coordinates": [2, 2] }
             }
         ]
     }"""
@@ -242,6 +276,66 @@ def list_field_values_file(tmp_path):
     return filename
 
 
+def list_field_values_parquet_file():
+    """Return the path to a Parquet file with list values in a property.
+
+    Because in the CI environments pyarrow.parquet is typically not available, we save
+    the file in the test data directory instead of always creating it from scratch.
+
+    The code to create it is here though, in case it needs to be recreated later.
+    """
+    # Check if the file already exists in the test data dir
+    fixture_path = _data_dir / "list_field_values_file.parquet"
+    if fixture_path.exists():
+        return fixture_path
+
+    # The file doesn't exist, so create it
+    try:
+        import pyarrow as pa
+        from pyarrow import parquet as pq
+
+        import shapely
+    except ImportError as ex:
+        raise RuntimeError(
+            f"test file {fixture_path} does not exist, but error importing: {ex}."
+        )
+
+    table = pa.table(
+        {
+            "geometry": shapely.to_wkb(shapely.points(np.ones((5, 2)))),
+            "int": [1, 2, 3, 4, 5],
+            "list_int": [[0, 1], [2, 3], [], None, None],
+            "list_double": [[0.0, 1.0], [2.0, 3.0], [], None, None],
+            "list_string": [
+                ["string1", "string2"],
+                ["string3", "string4", ""],
+                [],
+                None,
+                [""],
+            ],
+            "list_int_with_null": [[0, None], [2, 3], [], None, None],
+            "list_string_with_null": [
+                ["string1", None],
+                ["string3", "string4", ""],
+                [],
+                None,
+                [""],
+            ],
+        }
+    )
+    pq.write_table(table, fixture_path)
+
+    return fixture_path
+
+
+@pytest.fixture(scope="function", params=[".geojson", ".parquet"])
+def list_field_values_files(tmp_path, request):
+    if request.param == ".geojson":
+        return list_field_values_geojson_file(tmp_path)
+    elif request.param == ".parquet":
+        return list_field_values_parquet_file()
+
+
 @pytest.fixture(scope="function")
 def nested_geojson_file(tmp_path):
     # create GeoJSON file with nested properties
@@ -271,6 +365,45 @@ def nested_geojson_file(tmp_path):
     return filename
 
 
+@pytest.fixture(scope="function")
+def list_nested_struct_parquet_file(tmp_path):
+    """Create a Parquet file in tmp_path with nested values in a property.
+
+    Because pyarrow.parquet is typically not available in the CI environments, we save
+    the file in the test data directory instead of always creating it from scratch.
+
+    The code to create it is here though, in case it needs to be recreated later.
+    """
+    # Check if the file already exists in the test data dir
+    fixture_path = _data_dir / "list_nested_struct_file.parquet"
+    if fixture_path.exists():
+        return fixture_path
+
+    # The file doesn't exist, so create it
+    try:
+        import pyarrow as pa
+        from pyarrow import parquet as pq
+
+        import shapely
+    except ImportError as ex:
+        raise RuntimeError(
+            f"test file {fixture_path} does not exist, but error importing: {ex}."
+        )
+
+    table = pa.table(
+        {
+            "geometry": shapely.to_wkb(shapely.points(np.ones((3, 2)))),
+            "col_flat": [0, 1, 2],
+            "col_struct": [{"a": 1, "b": 2}] * 3,
+            "col_nested": [[{"a": 1, "b": 2}] * 2] * 3,
+            "col_list": [[1, 2, 3]] * 3,
+        }
+    )
+    pq.write_table(table, fixture_path)
+
+    return fixture_path
+
+
 @pytest.fixture(scope="function")
 def datetime_file(tmp_path):
     # create GeoJSON file with millisecond precision
@@ -299,7 +432,7 @@ def datetime_file(tmp_path):
 
 @pytest.fixture(scope="function")
 def datetime_tz_file(tmp_path):
-    # create GeoJSON file with datetimes with timezone
+    # create GeoJSON file with datetimes with time zone
     datetime_tz_geojson = """{
         "type": "FeatureCollection",
         "features": [
@@ -340,6 +473,27 @@ def geojson_bytes(tmp_path):
     return bytes_buffer
 
 
+@pytest.fixture(scope="function")
+def geojson_datetime_long_ago(tmp_path):
+    # create GeoJSON file with datetimes from long ago
+    datetime_tz_geojson = """{
+        "type": "FeatureCollection",
+        "features": [
+            {
+                "type": "Feature",
+                "properties": { "datetime_col": "1670-01-01T09:00:00" },
+                "geometry": { "type": "Point", "coordinates": [1, 1] }
+            }
+        ]
+    }"""
+
+    filename = tmp_path / "test_datetime_long_ago.geojson"
+    with open(filename, "w") as f:
+        f.write(datetime_long_ago_geojson)
+
+    return filename
+
+
 @pytest.fixture(scope="function")
 def geojson_filelike(tmp_path):
     """Extracts first 3 records from naturalearth_lowres and writes to GeoJSON,
@@ -355,6 +509,34 @@ def geojson_filelike(tmp_path):
         yield f
 
 
+@pytest.fixture(scope="function")
+def kml_file(tmp_path):
+    # create KML file
+    kml_data = """<?xml version="1.0" encoding="utf-8" ?>
+        <kml xmlns="http://www.opengis.net/kml/2.2">
+        <Document id="root_doc">
+            <Schema name="interfaces1" id="interfaces1">
+                <SimpleField name="id" type="float"></SimpleField>
+                <SimpleField name="formation" type="string"></SimpleField>
+            </Schema>
+            <Folder><name>interfaces1</name>
+                <Placemark>
+                    <ExtendedData><SchemaData schemaUrl="#interfaces1">
+                        <SimpleData name="formation">Ton</SimpleData>
+                    </SchemaData></ExtendedData>
+                    <Point><coordinates>19.1501280458077,293.313485355882</coordinates></Point>
+                </Placemark>
+            </Folder>
+        </Document>
+        </kml>
+    """
+    filename = tmp_path / "test.kml"
+    with open(filename, "w") as f:
+        _ = f.write(kml_data)
+
+    return filename
+
+
 @pytest.fixture(scope="function")
 def nonseekable_bytes(tmp_path):
     # mock a non-seekable byte stream, such as a zstandard handle
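
If the stored Parquet fixture below (list_field_values_file.parquet) ever needs to be
regenerated, the plain helper above already contains the creation code; a sketch,
assuming pyarrow and shapely are installed:

    from pyogrio.tests.conftest import list_field_values_parquet_file

    # Delete pyogrio/tests/fixtures/list_field_values_file.parquet first; the helper
    # then rebuilds it from the pyarrow table defined above and returns its path.
    path = list_field_values_parquet_file()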


=====================================
pyogrio/tests/fixtures/list_field_values_file.parquet
=====================================
Binary files /dev/null and b/pyogrio/tests/fixtures/list_field_values_file.parquet differ


=====================================
pyogrio/tests/fixtures/list_nested_struct_file.parquet
=====================================
Binary files /dev/null and b/pyogrio/tests/fixtures/list_nested_struct_file.parquet differ


=====================================
pyogrio/tests/test_arrow.py
=====================================
@@ -1,7 +1,6 @@
 import contextlib
 import json
 import math
-import os
 import sys
 from io import BytesIO
 from packaging.version import Version
@@ -133,13 +132,6 @@ def test_read_arrow_ignore_geometry(naturalearth_lowres):
     assert_frame_equal(result, expected)
 
 
-def test_read_arrow_nested_types(list_field_values_file):
-    # with arrow, list types are supported
-    result = read_dataframe(list_field_values_file, use_arrow=True)
-    assert "list_int64" in result.columns
-    assert result["list_int64"][0].tolist() == [0, 1]
-
-
 def test_read_arrow_to_pandas_kwargs(no_geometry_file):
     # with arrow, list types are supported
     arrow_to_pandas_kwargs = {"strings_to_categorical": True}
@@ -300,29 +292,6 @@ def test_open_arrow_capsule_protocol_without_pyarrow(naturalearth_lowres):
     assert result.equals(expected)
 
 
-@contextlib.contextmanager
-def use_arrow_context():
-    original = os.environ.get("PYOGRIO_USE_ARROW", None)
-    os.environ["PYOGRIO_USE_ARROW"] = "1"
-    yield
-    if original:
-        os.environ["PYOGRIO_USE_ARROW"] = original
-    else:
-        del os.environ["PYOGRIO_USE_ARROW"]
-
-
-def test_enable_with_environment_variable(list_field_values_file):
-    # list types are only supported with arrow, so don't work by default and work
-    # when arrow is enabled through env variable
-    result = read_dataframe(list_field_values_file)
-    assert "list_int64" not in result.columns
-
-    with use_arrow_context():
-        result = read_dataframe(list_field_values_file)
-
-    assert "list_int64" in result.columns
-
-
 @pytest.mark.skipif(
     __gdal_version__ < (3, 8, 3), reason="Arrow bool value bug fixed in GDAL >= 3.8.3"
 )
@@ -507,10 +476,6 @@ def test_write_geojson(tmp_path, naturalearth_lowres):
 
 
 @requires_arrow_write_api
-@pytest.mark.skipif(
-    __gdal_version__ < (3, 6, 0),
-    reason="OpenFileGDB write support only available for GDAL >= 3.6.0",
-)
 @pytest.mark.parametrize(
     "write_int64",
     [
@@ -674,7 +639,7 @@ def test_write_append(request, tmp_path, naturalearth_lowres, ext):
     assert read_info(filename)["features"] == 354
 
 
- at pytest.mark.parametrize("driver,ext", [("GML", ".gml"), ("GeoJSONSeq", ".geojsons")])
+ at pytest.mark.parametrize("driver,ext", [("GML", ".gml")])
 @requires_arrow_write_api
 def test_write_append_unsupported(tmp_path, naturalearth_lowres, driver, ext):
     meta, table = read_arrow(naturalearth_lowres)
@@ -992,9 +957,6 @@ def test_write_memory_driver_required(naturalearth_lowres):
 @requires_arrow_write_api
 @pytest.mark.parametrize("driver", ["ESRI Shapefile", "OpenFileGDB"])
 def test_write_memory_unsupported_driver(naturalearth_lowres, driver):
-    if driver == "OpenFileGDB" and __gdal_version__ < (3, 6, 0):
-        pytest.skip("OpenFileGDB write support only available for GDAL >= 3.6.0")
-
     meta, table = read_arrow(naturalearth_lowres, max_features=1)
 
     buffer = BytesIO()
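
The PYOGRIO_USE_ARROW test removed from this file moves to test_geopandas_io.py
(added further below). For reference, a minimal sketch of the behaviour it exercises;
the file name is illustrative only:

    import os
    from pyogrio import read_dataframe

    os.environ["PYOGRIO_USE_ARROW"] = "1"   # read_dataframe then defaults to the Arrow path
    df = read_dataframe("example.gpkg")     # hypothetical file
    del os.environ["PYOGRIO_USE_ARROW"]     # restore the default behaviour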


=====================================
pyogrio/tests/test_core.py
=====================================
@@ -22,7 +22,12 @@ from pyogrio._compat import GDAL_GE_38
 from pyogrio._env import GDALEnv
 from pyogrio.errors import DataLayerError, DataSourceError
 from pyogrio.raw import read, write
-from pyogrio.tests.conftest import START_FID, prepare_testfile, requires_shapely
+from pyogrio.tests.conftest import (
+    DRIVERS,
+    START_FID,
+    prepare_testfile,
+    requires_shapely,
+)
 
 import pytest
 
@@ -135,11 +140,7 @@ def test_list_drivers():
     # verify that the core drivers are present
     for name in ("ESRI Shapefile", "GeoJSON", "GeoJSONSeq", "GPKG", "OpenFileGDB"):
         assert name in all_drivers
-
         expected_capability = "rw"
-        if name == "OpenFileGDB" and __gdal_version__ < (3, 6, 0):
-            expected_capability = "r"
-
         assert all_drivers[name] == expected_capability
 
     drivers = list_drivers(read=True)
@@ -391,10 +392,6 @@ def test_read_bounds_mask(naturalearth_lowres_all_ext, mask, expected):
     assert array_equal(fids, fids_expected)
 
 
-@pytest.mark.skipif(
-    __gdal_version__ < (3, 4, 0),
-    reason="Cannot determine if GEOS is present or absent for GDAL < 3.4",
-)
 def test_read_bounds_bbox_intersects_vs_envelope_overlaps(naturalearth_lowres_all_ext):
     # If GEOS is present and used by GDAL, bbox filter will be based on intersection
     # of bbox and actual geometries; if GEOS is absent or not used by GDAL, it
@@ -415,7 +412,9 @@ def test_read_bounds_bbox_intersects_vs_envelope_overlaps(naturalearth_lowres_al
         assert array_equal(fids, fids_expected)
 
 
- at pytest.mark.parametrize("naturalearth_lowres", [".shp", ".gpkg"], indirect=True)
+ at pytest.mark.parametrize(
+    "naturalearth_lowres", [".shp", ".shp.zip", ".gpkg", ".gpkg.zip"], indirect=True
+)
 def test_read_info(naturalearth_lowres):
     meta = read_info(naturalearth_lowres)
 
@@ -427,11 +426,12 @@ def test_read_info(naturalearth_lowres):
     assert meta["features"] == 177
     assert allclose(meta["total_bounds"], (-180, -90, 180, 83.64513))
     assert meta["capabilities"]["random_read"] is True
+    # The GPKG test files are created without spatial index
     assert meta["capabilities"]["fast_spatial_filter"] is False
     assert meta["capabilities"]["fast_feature_count"] is True
     assert meta["capabilities"]["fast_total_bounds"] is True
 
-    if naturalearth_lowres.suffix == ".gpkg":
+    if naturalearth_lowres.name.endswith((".gpkg", ".gpkg.zip")):
         assert meta["fid_column"] == "fid"
         assert meta["geometry_name"] == "geom"
         assert meta["geometry_type"] == "MultiPolygon"
@@ -439,7 +439,7 @@ def test_read_info(naturalearth_lowres):
         if GDAL_GE_38:
             # this capability is only True for GPKG if GDAL >= 3.8
             assert meta["capabilities"]["fast_set_next_by_index"] is True
-    elif naturalearth_lowres.suffix == ".shp":
+    elif naturalearth_lowres.name.endswith((".shp", ".shp.zip")):
         # fid_column == "" for formats where fid is not physically stored
         assert meta["fid_column"] == ""
         # geometry_name == "" for formats where geometry column name cannot be
@@ -452,6 +452,14 @@ def test_read_info(naturalearth_lowres):
         raise ValueError(f"test not implemented for ext {naturalearth_lowres.suffix}")
 
 
+@pytest.mark.parametrize(
+    "naturalearth_lowres", [*DRIVERS.keys(), ".sqlite"], indirect=True
+)
+def test_read_info_encoding(naturalearth_lowres):
+    meta = read_info(naturalearth_lowres)
+    assert meta["encoding"].upper() == "UTF-8"
+
+
 @pytest.mark.parametrize(
     "testfile", ["naturalearth_lowres_vsimem", "naturalearth_lowres_vsi"]
 )
@@ -567,12 +575,24 @@ def test_read_info_force_total_bounds(
         assert info["total_bounds"] is None
 
 
+def test_read_info_jsonfield(nested_geojson_file):
+    """Test if JSON fields types are returned correctly."""
+    meta = read_info(nested_geojson_file)
+    assert meta["ogr_types"] == ["OFTString", "OFTString"]
+    assert meta["ogr_subtypes"] == ["OFSTNone", "OFSTJSON"]
+
+
 def test_read_info_unspecified_layer_warning(data_dir):
     """Reading a multi-layer file without specifying a layer gives a warning."""
     with pytest.warns(UserWarning, match="More than one layer found "):
         read_info(data_dir / "sample.osm.pbf")
 
 
+def test_read_info_invalid_layer(naturalearth_lowres):
+    with pytest.raises(ValueError, match="'layer' parameter must be a str or int"):
+        read_bounds(naturalearth_lowres, layer=["list_arg_is_invalid"])
+
+
 def test_read_info_without_geometry(no_geometry_file):
     assert read_info(no_geometry_file)["total_bounds"] is None
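
The new tests above cover two behaviours visible in this upload: read_info also reports
OGR field types and subtypes, and an invalid layer argument raises instead of
segfaulting. A small sketch with a hypothetical file name:

    from pyogrio import read_bounds, read_info

    info = read_info("example.geojson")
    # For a JSON property the pair is e.g. ("OFTString", "OFSTJSON"), as asserted above.
    print(info["ogr_types"], info["ogr_subtypes"])

    # As in the test above, a layer value that is not a str or int raises ValueError:
    # read_bounds("example.geojson", layer=["list_arg_is_invalid"])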
 


=====================================
pyogrio/tests/test_geopandas_io.py
=====================================
@@ -1,5 +1,7 @@
 import contextlib
 import locale
+import os
+import re
 import warnings
 from datetime import datetime
 from io import BytesIO
@@ -19,10 +21,10 @@ from pyogrio import (
 from pyogrio._compat import (
     GDAL_GE_37,
     GDAL_GE_311,
-    GDAL_GE_352,
     HAS_ARROW_WRITE_API,
     HAS_PYPROJ,
     PANDAS_GE_15,
+    PANDAS_GE_23,
     PANDAS_GE_30,
     SHAPELY_GE_21,
 )
@@ -35,6 +37,7 @@ from pyogrio.raw import (
 from pyogrio.tests.conftest import (
     ALL_EXTS,
     DRIVERS,
+    GDAL_HAS_PARQUET_DRIVER,
     START_FID,
     requires_arrow_write_api,
     requires_gdal_geos,
@@ -48,6 +51,7 @@ try:
     import geopandas as gp
     import pandas as pd
     from geopandas.array import from_wkt
+    from pandas.api.types import is_datetime64_dtype, is_object_dtype, is_string_dtype
 
     import shapely  # if geopandas is present, shapely is expected to be present
     from shapely.geometry import Point
@@ -93,14 +97,22 @@ def skip_if_no_arrow_write_api(request):
         pytest.skip("GDAL>=3.8 required for Arrow write API")
 
 
-def spatialite_available(path):
-    try:
-        _ = read_dataframe(
-            path, sql="select spatialite_version();", sql_dialect="SQLITE"
-        )
-        return True
-    except Exception:
-        return False
+@contextlib.contextmanager
+def use_arrow_context():
+    original = os.environ.get("PYOGRIO_USE_ARROW", None)
+    os.environ["PYOGRIO_USE_ARROW"] = "1"
+    yield
+    if original:
+        os.environ["PYOGRIO_USE_ARROW"] = original
+    else:
+        del os.environ["PYOGRIO_USE_ARROW"]
+
+
+def test_spatialite_available(test_gpkg_nulls):
+    """Check if SpatiaLite is available by running a simple SQL query."""
+    _ = read_dataframe(
+        test_gpkg_nulls, sql="select spatialite_version();", sql_dialect="SQLITE"
+    )
 
 
 @pytest.mark.parametrize(
@@ -259,10 +271,6 @@ def test_read_force_2d(tmp_path, use_arrow):
     assert not df.iloc[0].geometry.has_z
 
 
-@pytest.mark.skipif(
-    not GDAL_GE_352,
-    reason="gdal >= 3.5.2 needed to use OGR_GEOJSON_MAX_OBJ_SIZE with a float value",
-)
 def test_read_geojson_error(naturalearth_lowres_geojson, use_arrow):
     try:
         set_gdal_config_options({"OGR_GEOJSON_MAX_OBJ_SIZE": 0.01})
@@ -275,6 +283,22 @@ def test_read_geojson_error(naturalearth_lowres_geojson, use_arrow):
         set_gdal_config_options({"OGR_GEOJSON_MAX_OBJ_SIZE": None})
 
 
+@pytest.mark.skipif(
+    "LIBKML" not in list_drivers(),
+    reason="LIBKML driver is not available and is needed to read simpledata element",
+)
+def test_read_kml_simpledata(kml_file, use_arrow):
+    """Test reading a KML file with a simpledata element.
+
+    Simpledata elements are only read by the LibKML driver, not the KML driver.
+    """
+    gdf = read_dataframe(kml_file, use_arrow=use_arrow)
+
+    # Check if the simpledata column is present.
+    assert "formation" in gdf.columns
+    assert gdf["formation"].iloc[0] == "Ton"
+
+
 def test_read_layer(tmp_path, use_arrow):
     filename = tmp_path / "test.gpkg"
 
@@ -333,29 +357,162 @@ def test_read_datetime(datetime_file, use_arrow):
         assert df.col.dtype.name == "datetime64[ns]"
 
 
-@pytest.mark.filterwarnings("ignore: Non-conformant content for record 1 in column ")
-@pytest.mark.requires_arrow_write_api
-def test_read_datetime_tz(datetime_tz_file, tmp_path, use_arrow):
-    df = read_dataframe(datetime_tz_file)
-    # Make the index non-consecutive to test this case as well. Added for issue
-    # https://github.com/geopandas/pyogrio/issues/324
-    df = df.set_index(np.array([0, 2]))
-    raw_expected = ["2020-01-01T09:00:00.123-05:00", "2020-01-01T10:00:00-05:00"]
+def test_read_list_types(list_field_values_files, use_arrow):
+    """Test reading a geojson file containing fields with lists."""
+    if list_field_values_files.suffix == ".parquet" and not GDAL_HAS_PARQUET_DRIVER:
+        pytest.skip(
+            "Skipping test for parquet as the GDAL Parquet driver is not available"
+        )
 
-    if PANDAS_GE_20:
-        expected = pd.to_datetime(raw_expected, format="ISO8601").as_unit("ms")
+    info = read_info(list_field_values_files)
+    suffix = list_field_values_files.suffix
+
+    result = read_dataframe(list_field_values_files, use_arrow=use_arrow)
+
+    # Check list_int column
+    assert "list_int" in result.columns
+    assert info["fields"][1] == "list_int"
+    assert info["ogr_types"][1] in ("OFTIntegerList", "OFTInteger64List")
+    assert result["list_int"][0].tolist() == [0, 1]
+    assert result["list_int"][1].tolist() == [2, 3]
+    assert result["list_int"][2].tolist() == []
+    assert result["list_int"][3] is None
+    assert result["list_int"][4] is None
+
+    # Check list_double column
+    assert "list_double" in result.columns
+    assert info["fields"][2] == "list_double"
+    assert info["ogr_types"][2] == "OFTRealList"
+    assert result["list_double"][0].tolist() == [0.0, 1.0]
+    assert result["list_double"][1].tolist() == [2.0, 3.0]
+    assert result["list_double"][2].tolist() == []
+    assert result["list_double"][3] is None
+    assert result["list_double"][4] is None
+
+    # Check list_string column
+    assert "list_string" in result.columns
+    assert info["fields"][3] == "list_string"
+    assert info["ogr_types"][3] == "OFTStringList"
+    assert result["list_string"][0].tolist() == ["string1", "string2"]
+    assert result["list_string"][1].tolist() == ["string3", "string4", ""]
+    assert result["list_string"][2].tolist() == []
+    assert result["list_string"][3] is None
+    assert result["list_string"][4] == [""]
+
+    # Check list_int_with_null column
+    if suffix == ".geojson":
+        # Once any row of a column contains a null value in a list, the column isn't
+        # recognized as a list column anymore for .geojson files, but as a JSON column.
+        # Because JSON columns containing JSON Arrays are also parsed to python lists,
+        # the end result is the same...
+        exp_type = "OFTString"
+        exp_subtype = "OFSTJSON"
+        exp_list_int_with_null_value = [0, None]
     else:
-        expected = pd.to_datetime(raw_expected)
-    expected = pd.Series(expected, name="datetime_col")
-    assert_series_equal(df.datetime_col, expected, check_index=False)
-    # test write and read round trips
-    fpath = tmp_path / "test.gpkg"
-    write_dataframe(df, fpath, use_arrow=use_arrow)
-    df_read = read_dataframe(fpath, use_arrow=use_arrow)
-    if use_arrow:
-        # with Arrow, the datetimes are always read as UTC
-        expected = expected.dt.tz_convert("UTC")
-    assert_series_equal(df_read.datetime_col, expected)
+        # For .parquet files, the list column is preserved as a list column.
+        exp_type = "OFTInteger64List"
+        exp_subtype = "OFSTNone"
+        if use_arrow:
+            exp_list_int_with_null_value = [0.0, np.nan]
+        else:
+            exp_list_int_with_null_value = [0, 0]
+            # xfail: when reading a list of int with None values without Arrow from a
+            # .parquet file, the None values become 0, which is wrong.
+            # https://github.com/OSGeo/gdal/issues/13448
+
+    assert "list_int_with_null" in result.columns
+    assert info["fields"][4] == "list_int_with_null"
+    assert info["ogr_types"][4] == exp_type
+    assert info["ogr_subtypes"][4] == exp_subtype
+    assert result["list_int_with_null"][0][0] == 0
+    if exp_list_int_with_null_value[1] == 0:
+        assert result["list_int_with_null"][0][1] == exp_list_int_with_null_value[1]
+    else:
+        assert pd.isna(result["list_int_with_null"][0][1])
+
+    if suffix == ".geojson":
+        # For .geojson, the lists are already python lists
+        assert result["list_int_with_null"][1] == [2, 3]
+        assert result["list_int_with_null"][2] == []
+    else:
+        # For .parquet, the lists are numpy arrays
+        assert result["list_int_with_null"][1].tolist() == [2, 3]
+        assert result["list_int_with_null"][2].tolist() == []
+
+    assert pd.isna(result["list_int_with_null"][3])
+    assert pd.isna(result["list_int_with_null"][4])
+
+    # Check list_string_with_null column
+    if suffix == ".geojson":
+        # Once any row of a column contains a null value in a list, the column isn't
+        # recognized as a list column anymore for .geojson files, but as a JSON column.
+        # Because JSON columns containing JSON Arrays are also parsed to python lists,
+        # the end result is the same...
+        exp_type = "OFTString"
+        exp_subtype = "OFSTJSON"
+    else:
+        # For .parquet files, the list column is preserved as a list column.
+        exp_type = "OFTStringList"
+        exp_subtype = "OFSTNone"
+
+    assert "list_string_with_null" in result.columns
+    assert info["fields"][5] == "list_string_with_null"
+    assert info["ogr_types"][5] == exp_type
+    assert info["ogr_subtypes"][5] == exp_subtype
+
+    if suffix == ".geojson":
+        # For .geojson, the lists are already python lists
+        assert result["list_string_with_null"][0] == ["string1", None]
+        assert result["list_string_with_null"][1] == ["string3", "string4", ""]
+        assert result["list_string_with_null"][2] == []
+    else:
+        # For .parquet, the lists are numpy arrays
+        # When use_arrow=False, the None becomes an empty string, which is wrong.
+        exp_value = ["string1", ""] if not use_arrow else ["string1", None]
+        assert result["list_string_with_null"][0].tolist() == exp_value
+        assert result["list_string_with_null"][1].tolist() == ["string3", "string4", ""]
+        assert result["list_string_with_null"][2].tolist() == []
+
+    assert pd.isna(result["list_string_with_null"][3])
+    assert result["list_string_with_null"][4] == [""]
+
+
+@pytest.mark.requires_arrow_write_api
+@pytest.mark.skipif(
+    not GDAL_HAS_PARQUET_DRIVER, reason="Parquet driver is not available"
+)
+def test_read_list_nested_struct_parquet_file(
+    list_nested_struct_parquet_file, use_arrow
+):
+    """Test reading a Parquet file containing nested struct and list types."""
+    if not use_arrow:
+        pytest.skip(
+            "When use_arrow=False, gdal flattens nested columns to seperate columns. "
+            "Not sure how we want to deal with this case, but for now just skip."
+        )
+
+    result = read_dataframe(list_nested_struct_parquet_file, use_arrow=use_arrow)
+
+    assert "col_flat" in result.columns
+    assert np.array_equal(result["col_flat"].to_numpy(), np.array([0, 1, 2]))
+
+    assert "col_list" in result.columns
+    assert result["col_list"].dtype == object
+    assert result["col_list"][0].tolist() == [1, 2, 3]
+    assert result["col_list"][1].tolist() == [1, 2, 3]
+    assert result["col_list"][2].tolist() == [1, 2, 3]
+
+    assert "col_nested" in result.columns
+    assert result["col_nested"].dtype == object
+    assert result["col_nested"][0].tolist() == [{"a": 1, "b": 2}, {"a": 1, "b": 2}]
+    assert result["col_nested"][1].tolist() == [{"a": 1, "b": 2}, {"a": 1, "b": 2}]
+    assert result["col_nested"][2].tolist() == [{"a": 1, "b": 2}, {"a": 1, "b": 2}]
+
+    assert "col_struct" in result.columns
+    assert result["col_struct"].dtype == object
+    assert result["col_struct"][0] == {"a": 1, "b": 2}
+    assert result["col_struct"][1] == {"a": 1, "b": 2}
+    assert result["col_struct"][2] == {"a": 1, "b": 2}
 
 
 @pytest.mark.filterwarnings(
@@ -371,39 +528,511 @@ def test_write_datetime_mixed_offset(tmp_path, use_arrow):
     if PANDAS_GE_20:
         utc_col = utc_col.dt.as_unit("ms")
 
+
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
+def test_read_datetime_long_ago(
+    geojson_datetime_long_ago, use_arrow, mixed_offsets_as_utc, datetime_as_string
+):
+    """Test writing/reading a column with a datetime far in the past.
+    Dates from before 1678-1-1 aren't parsed correctly by pandas < 3.0, so they
+    stay strings.
+    Reported in https://github.com/geopandas/pyogrio/issues/553.
+    """
+    handler = contextlib.nullcontext()
+    overflow_occurred = False
+    if not datetime_as_string and not PANDAS_GE_30 and (not use_arrow or GDAL_GE_311):
+        # When datetimes should not be returned as string and arrow is not used or
+        # arrow is used with GDAL >= 3.11, `pandas.to_datetime` is used to parse the
+        # datetimes. However, when using pandas < 3.0, this raises an
+        # "Out of bounds nanosecond timestamp" error for very old dates.
+        # As a result, `read_dataframe` gives a warning and the datetimes stay strings.
+        handler = pytest.warns(
+            UserWarning, match="Error parsing datetimes, original strings are returned"
+        )
+        overflow_occurred = True
+        # XFAIL: datetimes before 1678-1-1 give overflow with arrow=False and pandas<3.0
+    elif use_arrow and not PANDAS_GE_20 and not GDAL_GE_311:
+        # When arrow is used with pandas < 2.0 and GDAL < 3.11, an overflow occurs in
+        # pyarrow.to_pandas().
+        handler = pytest.raises(
+            Exception,
+            match=re.escape("Casting from timestamp[ms] to timestamp[ns] would result"),
+        )
+        overflow_occurred = True
+        # XFAIL: datetimes before 1678-1-1 give overflow with arrow=True and pandas<2.0
+
+    with handler:
+        df = read_dataframe(
+            geojson_datetime_long_ago,
+            use_arrow=use_arrow,
+            datetime_as_string=datetime_as_string,
+            mixed_offsets_as_utc=mixed_offsets_as_utc,
+        )
+
+        exp_dates_str = pd.Series(["1670-01-01T09:00:00"], name="datetime_col")
+        if datetime_as_string:
+            assert is_string_dtype(df.datetime_col.dtype)
+            assert_series_equal(df.datetime_col, exp_dates_str)
+        else:
+            # It is a single naive datetime, so regardless of mixed_offsets_as_utc the
+            # expected "ideal" result is the same: a datetime64 without time zone info.
+            if overflow_occurred:
+                # Strings are returned because of an overflow.
+                assert is_string_dtype(df.datetime_col.dtype)
+                assert_series_equal(df.datetime_col, exp_dates_str)
+            else:
+                # With use_arrow or pandas >= 3.0, old datetimes are parsed correctly.
+                assert is_datetime64_dtype(df.datetime_col)
+                assert df.datetime_col.iloc[0] == pd.Timestamp(1670, 1, 1, 9, 0, 0)
+                assert df.datetime_col.iloc[0].unit == "ms"
+
+
+ at pytest.mark.parametrize("ext", [ext for ext in ALL_EXTS if ext != ".shp"])
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
+ at pytest.mark.requires_arrow_write_api
+def test_write_read_datetime_no_tz(
+    tmp_path, ext, datetime_as_string, mixed_offsets_as_utc, use_arrow
+):
+    """Test writing/reading a column with naive datetimes (no time zone information)."""
+    dates_raw = ["2020-01-01T09:00:00.123", "2020-01-01T10:00:00", np.nan]
+    if PANDAS_GE_20:
+        dates = pd.to_datetime(dates_raw, format="ISO8601").as_unit("ms")
+    else:
+        dates = pd.to_datetime(dates_raw)
     df = gp.GeoDataFrame(
-        {"dates": localised_col, "geometry": [Point(1, 1), Point(1, 1)]},
+        {"dates": dates, "geometry": [Point(1, 1)] * 3}, crs="EPSG:4326"
+    )
+
+    fpath = tmp_path / f"test{ext}"
+    write_dataframe(df, fpath, use_arrow=use_arrow)
+    result = read_dataframe(
+        fpath,
+        use_arrow=use_arrow,
+        datetime_as_string=datetime_as_string,
+        mixed_offsets_as_utc=mixed_offsets_as_utc,
+    )
+
+    if use_arrow and ext == ".gpkg" and __gdal_version__ < (3, 11, 0):
+        # With GDAL < 3.11 with arrow, columns with naive datetimes are written
+        # correctly, but when read they are wrongly interpreted as being in UTC.
+        # The reason is complicated, but more info can be found e.g. here:
+        # https://github.com/geopandas/pyogrio/issues/487#issuecomment-2423762807
+        exp_dates = df.dates.dt.tz_localize("UTC")
+        if datetime_as_string:
+            exp_dates = exp_dates.astype("str").str.replace(" ", "T")
+            exp_dates[2] = np.nan
+            assert_series_equal(result.dates, exp_dates)
+        elif not mixed_offsets_as_utc:
+            assert_series_equal(result.dates, exp_dates)
+        # XFAIL: naive datetimes read wrong in GPKG with GDAL < 3.11 via arrow
+
+    elif datetime_as_string:
+        assert is_string_dtype(result.dates.dtype)
+        if use_arrow and __gdal_version__ < (3, 11, 0):
+            dates_str = df.dates.astype("str").str.replace(" ", "T")
+            dates_str[2] = np.nan
+        else:
+            dates_str = pd.Series(dates_raw, name="dates")
+        assert_series_equal(result.dates, dates_str)
+    else:
+        assert is_datetime64_dtype(result.dates.dtype)
+        assert_geodataframe_equal(result, df)
+
+
+ at pytest.mark.parametrize("ext", [ext for ext in ALL_EXTS if ext != ".shp"])
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
+ at pytest.mark.filterwarnings("ignore: Non-conformant content for record 1 in column ")
+ at pytest.mark.requires_arrow_write_api
+def test_write_read_datetime_tz(
+    request, tmp_path, ext, datetime_as_string, mixed_offsets_as_utc, use_arrow
+):
+    """Write and read file with all equal time zones.
+
+    This should result in the result being in pandas datetime64 dtype column.
+    """
+    if use_arrow and __gdal_version__ < (3, 10, 0) and ext in (".geojson", ".geojsonl"):
+        # With GDAL < 3.10 with arrow, the time zone offset was applied to the
+        # datetime value in addition to the time zone being retained.
+        # This was fixed in https://github.com/OSGeo/gdal/pull/11049
+        request.node.add_marker(
+            pytest.mark.xfail(
+                reason="Wrong datetimes read in GeoJSON with GDAL < 3.10 via arrow"
+            )
+        )
+
+    dates_raw = ["2020-01-01T09:00:00.123-05:00", "2020-01-01T10:00:00-05:00", np.nan]
+    if PANDAS_GE_20:
+        dates = pd.to_datetime(dates_raw, format="ISO8601").as_unit("ms")
+    else:
+        dates = pd.to_datetime(dates_raw)
+
+    # Make the index non-consecutive to test this case as well. Added for issue
+    # https://github.com/geopandas/pyogrio/issues/324
+    df = gp.GeoDataFrame(
+        {"dates": dates, "geometry": [Point(1, 1)] * 3},
+        index=[0, 2, 3],
         crs="EPSG:4326",
     )
-    fpath = tmp_path / "test.gpkg"
+    assert isinstance(df.dates.dtype, pd.DatetimeTZDtype)
+
+    fpath = tmp_path / f"test{ext}"
+    write_dataframe(df, fpath, use_arrow=use_arrow)
+    result = read_dataframe(
+        fpath,
+        use_arrow=use_arrow,
+        datetime_as_string=datetime_as_string,
+        mixed_offsets_as_utc=mixed_offsets_as_utc,
+    )
+
+    # With some older versions, the offset is represented slightly differently
+    if result.dates.dtype.name.endswith(", pytz.FixedOffset(-300)]"):
+        result.dates = result.dates.astype(df.dates.dtype)
+
+    if use_arrow and ext in (".fgb", ".gpkg") and __gdal_version__ < (3, 11, 0):
+        # With GDAL < 3.11 with arrow, datetime columns are written as string type
+        df_exp = df.copy()
+        df_exp.dates = df_exp[df_exp.dates.notna()].dates.astype(str)
+        assert_series_equal(result.dates, df_exp.dates, check_index=False)
+        # XFAIL: datetime columns written as string with GDAL < 3.11 via arrow
+    elif datetime_as_string:
+        assert is_string_dtype(result.dates.dtype)
+        if use_arrow and __gdal_version__ < (3, 11, 0):
+            dates_str = df.dates.astype("str").str.replace(" ", "T")
+            dates_str.iloc[2] = np.nan
+        elif __gdal_version__ < (3, 7, 0):
+            # With GDAL < 3.7, time zone minutes aren't included in the string
+            dates_str = [x[:-3] for x in dates_raw if pd.notna(x)] + [np.nan]
+            dates_str = pd.Series(dates_str, name="dates")
+        else:
+            dates_str = pd.Series(dates_raw, name="dates")
+        assert_series_equal(result.dates, dates_str, check_index=False)
+    else:
+        assert_series_equal(result.dates, df.dates, check_index=False)
+
+
+ at pytest.mark.parametrize("ext", [ext for ext in ALL_EXTS if ext != ".shp"])
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
+ at pytest.mark.filterwarnings(
+    "ignore: Non-conformant content for record 1 in column dates"
+)
+ at pytest.mark.requires_arrow_write_api
+def test_write_read_datetime_tz_localized_mixed_offset(
+    tmp_path, ext, datetime_as_string, mixed_offsets_as_utc, use_arrow
+):
+    """Test with localized dates across a different summer/winter time zone offset."""
+    # Australian Summer Time AEDT (GMT+11), Standard Time AEST (GMT+10)
+    dates_raw = ["2023-01-01 11:00:01.111", "2023-06-01 10:00:01.111", np.nan]
+    dates_naive = pd.Series(pd.to_datetime(dates_raw), name="dates")
+    dates_local = dates_naive.dt.tz_localize("Australia/Sydney")
+    dates_local_offsets_str = dates_local.astype(str)
+    if datetime_as_string:
+        exp_dates = dates_local_offsets_str.str.replace(" ", "T")
+        exp_dates = exp_dates.str.replace(".111000", ".111")
+        if __gdal_version__ < (3, 7, 0):
+            # With GDAL < 3.7, time zone minutes aren't included in the string
+            exp_dates = exp_dates.str.slice(0, -3)
+    elif mixed_offsets_as_utc:
+        exp_dates = dates_local.dt.tz_convert("UTC")
+        if PANDAS_GE_20:
+            exp_dates = exp_dates.dt.as_unit("ms")
+    else:
+        exp_dates = dates_local_offsets_str.apply(
+            lambda x: pd.Timestamp(x) if pd.notna(x) else None
+        )
+
+    df = gp.GeoDataFrame(
+        {"dates": dates_local, "geometry": [Point(1, 1)] * 3}, crs="EPSG:4326"
+    )
+    fpath = tmp_path / f"test{ext}"
+    write_dataframe(df, fpath, use_arrow=use_arrow)
+    result = read_dataframe(
+        fpath,
+        use_arrow=use_arrow,
+        datetime_as_string=datetime_as_string,
+        mixed_offsets_as_utc=mixed_offsets_as_utc,
+    )
+
+    if use_arrow and __gdal_version__ < (3, 11, 0):
+        if ext in (".geojson", ".geojsonl"):
+            # With GDAL < 3.11 with arrow, GDAL converts mixed time zone datetimes to
+            # UTC when read, as the arrow datetime column type does not support mixed tz.
+            dates_utc = dates_local.dt.tz_convert("UTC")
+            if PANDAS_GE_20:
+                dates_utc = dates_utc.dt.as_unit("ms")
+            if datetime_as_string:
+                assert is_string_dtype(result.dates.dtype)
+                dates_utc = dates_utc.astype(str).str.replace(" ", "T")
+            assert pd.isna(result.dates[2])
+            assert_series_equal(result.dates.head(2), dates_utc.head(2))
+            # XFAIL: mixed tz datetimes converted to UTC with GDAL < 3.11 + arrow
+            return
+
+        elif ext in (".gpkg", ".fgb"):
+            # With GDAL < 3.11 with arrow, datetime columns written as string type
+            assert pd.isna(result.dates[2])
+            assert_series_equal(result.dates.head(2), dates_local_offsets_str.head(2))
+            # XFAIL: datetime columns written as string with GDAL < 3.11 + arrow
+            return
+
+    # GDAL tz only encodes offsets, not time zones
+    if datetime_as_string:
+        assert is_string_dtype(result.dates.dtype)
+    elif mixed_offsets_as_utc:
+        assert isinstance(result.dates.dtype, pd.DatetimeTZDtype)
+    else:
+        assert is_object_dtype(result.dates.dtype)
+
+    # Check isna for the third value separately, as it differs between versions and
+    # pandas 3.0 assert_series_equal becomes strict about this.
+    assert pd.isna(result.dates[2])
+    assert_series_equal(result.dates.head(2), exp_dates.head(2))
+
+
+ at pytest.mark.parametrize("ext", [ext for ext in ALL_EXTS if ext != ".shp"])
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
+ at pytest.mark.filterwarnings(
+    "ignore: Non-conformant content for record 1 in column dates"
+)
+ at pytest.mark.requires_arrow_write_api
+def test_write_read_datetime_tz_mixed_offsets(
+    tmp_path, ext, datetime_as_string, mixed_offsets_as_utc, use_arrow
+):
+    """Test with dates with mixed time zone offsets."""
+    # The pandas datetime64 dtype doesn't support mixed time zone offsets, so
+    # the column needs to be a list of pandas.Timestamp objects instead.
+    dates = [
+        pd.Timestamp("2023-01-01 11:00:01.111+01:00"),
+        pd.Timestamp("2023-06-01 10:00:01.111+05:00"),
+        np.nan,
+    ]
+
+    df = gp.GeoDataFrame(
+        {"dates": dates, "geometry": [Point(1, 1)] * 3}, crs="EPSG:4326"
+    )
+    fpath = tmp_path / f"test{ext}"
     write_dataframe(df, fpath, use_arrow=use_arrow)
-    result = read_dataframe(fpath, use_arrow=use_arrow)
-    # GDAL tz only encodes offsets, not timezones
-    # check multiple offsets are read as utc datetime instead of string values
-    assert_series_equal(result["dates"], utc_col)
+    result = read_dataframe(
+        fpath,
+        use_arrow=use_arrow,
+        datetime_as_string=datetime_as_string,
+        mixed_offsets_as_utc=mixed_offsets_as_utc,
+    )
+
+    if use_arrow and __gdal_version__ < (3, 11, 0):
+        if ext in (".geojson", ".geojsonl"):
+            # With GDAL < 3.11 with arrow, GDAL converts mixed time zone datetimes to
+            # UTC when read, as the arrow datetime column type does not support mixed tz.
+            df_exp = df.copy()
+            df_exp.dates = pd.to_datetime(dates, utc=True)
+            if PANDAS_GE_20:
+                df_exp.dates = df_exp.dates.dt.as_unit("ms")
+            if datetime_as_string:
+                df_exp.dates = df_exp.dates.astype("str").str.replace(" ", "T")
+            df_exp.loc[2, "dates"] = pd.NA
+            assert_geodataframe_equal(result, df_exp)
+            # XFAIL: mixed tz datetimes converted to UTC with GDAL < 3.11 + arrow
+            return
+
+        elif ext in (".gpkg", ".fgb"):
+            # With arrow and GDAL < 3.11, mixed time zone datetimes are written as
+            # string type columns, so no proper roundtrip possible.
+            df_exp = df.copy()
+            df_exp.dates = df_exp.dates.astype("string").astype("O")
+            assert_geodataframe_equal(result, df_exp)
+            # XFAIL: datetime columns written as string with GDAL < 3.11 + arrow
+            return
+
+    if datetime_as_string:
+        assert is_string_dtype(result.dates.dtype)
+        dates_str = df.dates.map(
+            lambda x: x.isoformat(timespec="milliseconds") if pd.notna(x) else np.nan
+        )
+        if __gdal_version__ < (3, 7, 0):
+            # With GDAL < 3.7, time zone minutes aren't included in the string
+            dates_str = dates_str.str.slice(0, -3)
+        assert_series_equal(result.dates, dates_str)
+    elif mixed_offsets_as_utc:
+        assert isinstance(result.dates.dtype, pd.DatetimeTZDtype)
+        exp_dates = pd.to_datetime(df.dates, utc=True)
+        if PANDAS_GE_20:
+            exp_dates = exp_dates.dt.as_unit("ms")
+        assert_series_equal(result.dates, exp_dates)
+    else:
+        assert is_object_dtype(result.dates.dtype)
+        assert_geodataframe_equal(result, df)
 
 
+ at pytest.mark.parametrize("ext", [ext for ext in ALL_EXTS if ext != ".shp"])
+ at pytest.mark.parametrize(
+    "dates_raw",
+    [
+        (
+            pd.Timestamp("2020-01-01T09:00:00.123-05:00"),
+            pd.Timestamp("2020-01-01T10:00:00-05:00"),
+            np.nan,
+        ),
+        (
+            datetime.fromisoformat("2020-01-01T09:00:00.123-05:00"),
+            datetime.fromisoformat("2020-01-01T10:00:00-05:00"),
+            np.nan,
+        ),
+    ],
+)
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
 @pytest.mark.filterwarnings(
     "ignore: Non-conformant content for record 1 in column dates"
 )
 @pytest.mark.requires_arrow_write_api
-def test_read_write_datetime_tz_with_nulls(tmp_path, use_arrow):
-    dates_raw = ["2020-01-01T09:00:00.123-05:00", "2020-01-01T10:00:00-05:00", pd.NaT]
+def test_write_read_datetime_tz_objects(
+    tmp_path, dates_raw, ext, use_arrow, datetime_as_string, mixed_offsets_as_utc
+):
+    """Datetime objects with equal offsets are read as datetime64."""
+    dates = pd.Series(dates_raw, dtype="O")
+    df = gp.GeoDataFrame(
+        {"dates": dates, "geometry": [Point(1, 1)] * 3}, crs="EPSG:4326"
+    )
+
+    fpath = tmp_path / f"test{ext}"
+    write_dataframe(df, fpath, use_arrow=use_arrow)
+    result = read_dataframe(
+        fpath,
+        use_arrow=use_arrow,
+        datetime_as_string=datetime_as_string,
+        mixed_offsets_as_utc=mixed_offsets_as_utc,
+    )
+
+    # Check result
+    if PANDAS_GE_20:
+        exp_dates = pd.to_datetime(dates_raw, format="ISO8601").as_unit("ms")
+    else:
+        exp_dates = pd.to_datetime(dates_raw)
+    exp_df = df.copy()
+    exp_df["dates"] = pd.Series(exp_dates, name="dates")
+
+    # With some older versions, the offset is represented slightly differently
+    if result.dates.dtype.name.endswith(", pytz.FixedOffset(-300)]"):
+        result["dates"] = result.dates.astype(exp_df.dates.dtype)
+
+    if use_arrow and __gdal_version__ < (3, 10, 0) and ext in (".geojson", ".geojsonl"):
+        # XFAIL: Wrong datetimes read in GeoJSON with GDAL < 3.10 via arrow.
+        # The time zone offset was applied to the datetime value in addition to the
+        # time zone being retained. This was fixed in https://github.com/OSGeo/gdal/pull/11049
+
+        # Subtract 5 hours from the expected datetimes to match the wrong result.
+        if datetime_as_string:
+            exp_df["dates"] = pd.Series(
+                [
+                    "2020-01-01T04:00:00.123000-05:00",
+                    "2020-01-01T05:00:00-05:00",
+                    np.nan,
+                ]
+            )
+        else:
+            exp_df["dates"] = exp_df.dates - pd.Timedelta(hours=5)
+            if PANDAS_GE_20:
+                # The unit needs to be applied again apparently
+                exp_df["dates"] = exp_df.dates.dt.as_unit("ms")
+        assert_geodataframe_equal(result, exp_df)
+        return
+
+    if use_arrow and __gdal_version__ < (3, 11, 0) and ext in (".fgb", ".gpkg"):
+        # XFAIL: datetime columns are written as string with GDAL < 3.11 + arrow
+        # -> custom formatting because the df column is object dtype and thus
+        # astype(str) converted the datetime objects one by one
+        exp_df["dates"] = pd.Series(
+            ["2020-01-01 09:00:00.123000-05:00", "2020-01-01 10:00:00-05:00", np.nan]
+        )
+        assert_geodataframe_equal(result, exp_df)
+        return
+
+    if datetime_as_string:
+        assert is_string_dtype(result.dates.dtype)
+        if use_arrow and __gdal_version__ < (3, 11, 0):
+            # With GDAL < 3.11 with arrow, datetime columns are written as string type
+            exp_df["dates"] = pd.Series(
+                [
+                    "2020-01-01T09:00:00.123000-05:00",
+                    "2020-01-01T10:00:00-05:00",
+                    np.nan,
+                ]
+            )
+        else:
+            exp_df["dates"] = pd.Series(
+                ["2020-01-01T09:00:00.123-05:00", "2020-01-01T10:00:00-05:00", np.nan]
+            )
+            if __gdal_version__ < (3, 7, 0):
+                # With GDAL < 3.7, time zone minutes aren't included in the string
+                exp_df["dates"] = exp_df.dates.str.slice(0, -3)
+    elif mixed_offsets_as_utc:
+        # the offsets are all -05:00, so the result retains the offset and not UTC
+        assert isinstance(result.dates.dtype, pd.DatetimeTZDtype)
+        assert str(result.dates.dtype.tz) in ("UTC-05:00", "pytz.FixedOffset(-300)")
+    else:
+        assert isinstance(result.dates.dtype, pd.DatetimeTZDtype)
+
+    assert_geodataframe_equal(result, exp_df)
+
+
+ at pytest.mark.parametrize("ext", [ext for ext in ALL_EXTS if ext != ".shp"])
+ at pytest.mark.parametrize("datetime_as_string", [False, True])
+ at pytest.mark.parametrize("mixed_offsets_as_utc", [False, True])
+ at pytest.mark.requires_arrow_write_api
+def test_write_read_datetime_utc(
+    tmp_path, ext, use_arrow, datetime_as_string, mixed_offsets_as_utc
+):
+    """Test writing/reading a column with UTC datetimes."""
+    dates_raw = ["2020-01-01T09:00:00.123Z", "2020-01-01T10:00:00Z", np.nan]
     if PANDAS_GE_20:
         dates = pd.to_datetime(dates_raw, format="ISO8601").as_unit("ms")
     else:
         dates = pd.to_datetime(dates_raw)
     df = gp.GeoDataFrame(
-        {"dates": dates, "geometry": [Point(1, 1), Point(1, 1), Point(1, 1)]},
-        crs="EPSG:4326",
+        {"dates": dates, "geometry": [Point(1, 1)] * 3}, crs="EPSG:4326"
     )
-    fpath = tmp_path / "test.gpkg"
+    assert df.dates.dtype.name in ("datetime64[ms, UTC]", "datetime64[ns, UTC]")
+
+    fpath = tmp_path / f"test{ext}"
     write_dataframe(df, fpath, use_arrow=use_arrow)
-    result = read_dataframe(fpath, use_arrow=use_arrow)
-    if use_arrow:
-        # with Arrow, the datetimes are always read as UTC
-        df["dates"] = df["dates"].dt.tz_convert("UTC")
-    assert_geodataframe_equal(df, result)
+    result = read_dataframe(
+        fpath,
+        use_arrow=use_arrow,
+        datetime_as_string=datetime_as_string,
+        mixed_offsets_as_utc=mixed_offsets_as_utc,
+    )
+
+    if use_arrow and ext == ".fgb" and __gdal_version__ < (3, 11, 0):
+        # With GDAL < 3.11 with arrow, time zone information is dropped when reading
+        # .fgb
+        if datetime_as_string:
+            assert is_string_dtype(result.dates.dtype)
+            dates_str = pd.Series(
+                ["2020-01-01T09:00:00.123", "2020-01-01T10:00:00.000", np.nan],
+                name="dates",
+            )
+            assert_series_equal(result.dates, dates_str)
+        else:
+            assert_series_equal(result.dates, df.dates.dt.tz_localize(None))
+        # XFAIL: UTC datetimes read wrong in .fgb with GDAL < 3.11 via arrow
+    elif datetime_as_string:
+        assert is_string_dtype(result.dates.dtype)
+        if use_arrow and __gdal_version__ < (3, 11, 0):
+            dates_str = df.dates.astype("str").str.replace(" ", "T")
+            dates_str[2] = np.nan
+        else:
+            dates_str = pd.Series(dates_raw, name="dates")
+            if __gdal_version__ < (3, 7, 0):
+                # With GDAL < 3.7, datetime ends with +00 for UTC, not Z
+                dates_str = dates_str.str.replace("Z", "+00")
+        assert_series_equal(result.dates, dates_str)
+    else:
+        assert result.dates.dtype.name in ("datetime64[ms, UTC]", "datetime64[ns, UTC]")
+        assert_geodataframe_equal(result, df)
 
 
 def test_read_null_values(tmp_path, use_arrow):
@@ -1291,7 +1920,9 @@ def test_write_None_string_column(tmp_path, use_arrow):
     assert filename.exists()
 
     result_gdf = read_dataframe(filename, use_arrow=use_arrow)
-    if PANDAS_GE_30 and use_arrow:
+    if (
+        PANDAS_GE_30 or (PANDAS_GE_23 and pd.options.future.infer_string)
+    ) and use_arrow:
         assert result_gdf.object_col.dtype == "str"
         gdf["object_col"] = gdf["object_col"].astype("str")
     else:
@@ -1349,12 +1980,6 @@ def test_write_dataframe_gpkg_multiple_layers(tmp_path, naturalearth_lowres, use
 @pytest.mark.parametrize("ext", ALL_EXTS)
 @pytest.mark.requires_arrow_write_api
 def test_write_dataframe_append(request, tmp_path, naturalearth_lowres, ext, use_arrow):
-    if ext == ".fgb" and __gdal_version__ <= (3, 5, 0):
-        pytest.skip("Append to FlatGeobuf fails for GDAL <= 3.5.0")
-
-    if ext in (".geojsonl", ".geojsons") and __gdal_version__ <= (3, 6, 0):
-        pytest.skip("Append to GeoJSONSeq only available for GDAL >= 3.6.0")
-
     if use_arrow and ext.startswith(".geojson"):
         # Bug in GDAL when appending int64 to GeoJSON
         # (https://github.com/OSGeo/gdal/issues/9792)
@@ -2005,7 +2630,7 @@ def test_read_dataset_kwargs(nested_geojson_file, use_arrow):
     expected = gp.GeoDataFrame(
         {
             "top_level": ["A"],
-            "intermediate_level": ['{ "bottom_level": "B" }'],
+            "intermediate_level": [{"bottom_level": "B"}],
         },
         geometry=[shapely.Point(0, 0)],
         crs="EPSG:4326",
@@ -2198,6 +2823,29 @@ def test_arrow_bool_exception(tmp_path, ext):
         _ = read_dataframe(filename, use_arrow=True)
 
 
+@requires_pyarrow_api
+def test_arrow_enable_with_environment_variable(tmp_path):
+    """Test if arrow can be enabled via an environment variable."""
+    # Latin 1 / Western European
+    encoding = "CP1252"
+    text = "ÿ"
+    test_path = tmp_path / "test.gpkg"
+
+    df = gp.GeoDataFrame({text: [text], "geometry": [Point(0, 0)]}, crs="EPSG:4326")
+    write_dataframe(df, test_path, encoding=encoding)
+
+    # Without arrow, specifying the encoding is supported
+    result = read_dataframe(test_path, encoding="cp1252")
+    assert result is not None
+
+    # With arrow enabled, specifying the encoding is not supported
+    with use_arrow_context():
+        with pytest.raises(
+            ValueError, match="non-UTF-8 encoding is not supported for Arrow"
+        ):
+            _ = read_dataframe(test_path, encoding="cp1252")
+
+
 @pytest.mark.filterwarnings("ignore:File /vsimem:RuntimeWarning")
 @pytest.mark.parametrize("driver", ["GeoJSON", "GPKG"])
 def test_write_memory(naturalearth_lowres, driver):
@@ -2242,9 +2890,6 @@ def test_write_memory_driver_required(naturalearth_lowres):
 
 @pytest.mark.parametrize("driver", ["ESRI Shapefile", "OpenFileGDB"])
 def test_write_memory_unsupported_driver(naturalearth_lowres, driver):
-    if driver == "OpenFileGDB" and __gdal_version__ < (3, 6, 0):
-        pytest.skip("OpenFileGDB write support only available for GDAL >= 3.6.0")
-
     df = read_dataframe(naturalearth_lowres)
 
     buffer = BytesIO()
@@ -2479,27 +3124,43 @@ def test_write_kml_file_coordinate_order(tmp_path, use_arrow):
 
     assert np.array_equal(gdf_in.geometry.values, points)
 
-    if "LIBKML" in list_drivers():
-        # test appending to the existing file only if LIBKML is available
-        # as it appears to fall back on LIBKML driver when appending.
-        points_append = [Point(7, 8), Point(9, 10), Point(11, 12)]
-        gdf_append = gp.GeoDataFrame(geometry=points_append, crs="EPSG:4326")
 
-        write_dataframe(
-            gdf_append,
-            output_path,
-            layer="tmp_layer",
-            driver="KML",
-            use_arrow=use_arrow,
-            append=True,
-        )
-        # force_2d used to only compare xy geometry as z-dimension is undesirably
-        # introduced when the kml file is over-written.
-        gdf_in_appended = read_dataframe(
-            output_path, use_arrow=use_arrow, force_2d=True
-        )
+@pytest.mark.requires_arrow_write_api
+@pytest.mark.skipif(
+    "LIBKML" not in list_drivers(),
+    reason="LIBKML driver is not available and is needed to append to .kml",
+)
+def test_write_kml_append(tmp_path, use_arrow):
+    """Append features to an existing KML file.
 
-        assert np.array_equal(gdf_in_appended.geometry.values, points + points_append)
+    Appending is only supported by the LIBKML driver, and the driver isn't
+    included in the GDAL ubuntu-small images, so skip if not available.
+    """
+    points = [Point(10, 20), Point(30, 40), Point(50, 60)]
+    gdf = gp.GeoDataFrame(geometry=points, crs="EPSG:4326")
+    output_path = tmp_path / "test.kml"
+    write_dataframe(
+        gdf, output_path, layer="tmp_layer", driver="KML", use_arrow=use_arrow
+    )
+
+    # test appending to the existing file only if LIBKML is available
+    # as it appears to fall back on LIBKML driver when appending.
+    points_append = [Point(7, 8), Point(9, 10), Point(11, 12)]
+    gdf_append = gp.GeoDataFrame(geometry=points_append, crs="EPSG:4326")
+
+    write_dataframe(
+        gdf_append,
+        output_path,
+        layer="tmp_layer",
+        driver="KML",
+        use_arrow=use_arrow,
+        append=True,
+    )
+    # force_2d is used to only compare the xy dimensions of the geometry, as the LIBKML
+    # driver always adds the z-dimension when the kml file is over-written.
+    gdf_in_appended = read_dataframe(output_path, use_arrow=use_arrow, force_2d=True)
+
+    assert np.array_equal(gdf_in_appended.geometry.values, points + points_append)
 
 
 @pytest.mark.requires_arrow_write_api
@@ -2537,3 +3198,21 @@ def test_write_geojson_rfc7946_coordinates(tmp_path, use_arrow):
 
     gdf_in_appended = read_dataframe(output_path, use_arrow=use_arrow)
     assert np.array_equal(gdf_in_appended.geometry.values, points + points_append)
+
+
+@pytest.mark.requires_arrow_api
+@pytest.mark.skipif(
+    not GDAL_HAS_PARQUET_DRIVER, reason="Parquet driver is not available"
+)
+def test_parquet_driver(tmp_path, use_arrow):
+    """
+    Simple test verifying the Parquet driver works if available
+    """
+    gdf = gp.GeoDataFrame(
+        {"col": [1, 2, 3], "geometry": [Point(0, 0), Point(1, 1), Point(2, 2)]},
+        crs="EPSG:4326",
+    )
+    output_path = tmp_path / "test.parquet"
+    write_dataframe(gdf, output_path, use_arrow=use_arrow)
+    result = read_dataframe(output_path, use_arrow=use_arrow)
+    assert_geodataframe_equal(result, gdf)
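
For context on the new keywords exercised in the datetime tests above, a minimal
usage sketch (the file name "points.gpkg" is illustrative only and not part of
this change set):

    import pyogrio

    # With datetime_as_string=True, datetime columns are returned as strings
    # (ISO8601 on recent GDAL, e.g. "2020-01-01T09:00:00.123-05:00").
    df_str = pyogrio.read_dataframe("points.gpkg", datetime_as_string=True)

    # With mixed_offsets_as_utc=True, columns mixing several UTC offsets are
    # converted to a single UTC-aware datetime dtype.
    df_utc = pyogrio.read_dataframe("points.gpkg", mixed_offsets_as_utc=True)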


=====================================
pyogrio/tests/test_raw_io.py
=====================================
@@ -24,7 +24,6 @@ from pyogrio.tests.conftest import (
     DRIVER_EXT,
     DRIVERS,
     prepare_testfile,
-    requires_arrow_api,
     requires_pyarrow_api,
     requires_shapely,
 )
@@ -616,10 +615,6 @@ def test_write_no_geom_no_fields():
         write("test.gpkg", geometry=None, field_data=None, fields=None)
 
 
-@pytest.mark.skipif(
-    __gdal_version__ < (3, 6, 0),
-    reason="OpenFileGDB write support only available for GDAL >= 3.6.0",
-)
 @pytest.mark.parametrize(
     "write_int64",
     [
@@ -698,12 +693,6 @@ def test_write_openfilegdb(tmp_path, write_int64):
 
 @pytest.mark.parametrize("ext", DRIVERS)
 def test_write_append(tmp_path, naturalearth_lowres, ext):
-    if ext == ".fgb" and __gdal_version__ <= (3, 5, 0):
-        pytest.skip("Append to FlatGeobuf fails for GDAL <= 3.5.0")
-
-    if ext in (".geojsonl", ".geojsons") and __gdal_version__ < (3, 6, 0):
-        pytest.skip("Append to GeoJSONSeq only available for GDAL >= 3.6.0")
-
     if ext == ".gpkg.zip":
         pytest.skip("Append to .gpkg.zip is not supported")
 
@@ -725,11 +714,8 @@ def test_write_append(tmp_path, naturalearth_lowres, ext):
     assert read_info(filename)["features"] == 354
 
 
-@pytest.mark.parametrize("driver,ext", [("GML", ".gml"), ("GeoJSONSeq", ".geojsons")])
+@pytest.mark.parametrize("driver,ext", [("GML", ".gml")])
 def test_write_append_unsupported(tmp_path, naturalearth_lowres, driver, ext):
-    if ext == ".geojsons" and __gdal_version__ >= (3, 6, 0):
-        pytest.skip("Append to GeoJSONSeq supported for GDAL >= 3.6.0")
-
     meta, _, geometry, field_data = read(naturalearth_lowres)
 
     # GML does not support append functionality
@@ -744,27 +730,6 @@ def test_write_append_unsupported(tmp_path, naturalearth_lowres, driver, ext):
         write(filename, geometry, field_data, driver=driver, append=True, **meta)
 
 
-@pytest.mark.skipif(
-    __gdal_version__ > (3, 5, 0),
-    reason="segfaults on FlatGeobuf limited to GDAL <= 3.5.0",
-)
-def test_write_append_prevent_gdal_segfault(tmp_path, naturalearth_lowres):
-    """GDAL <= 3.5.0 segfaults when appending to FlatGeobuf; this test
-    verifies that we catch that before segfault"""
-    meta, _, geometry, field_data = read(naturalearth_lowres)
-    meta["geometry_type"] = "MultiPolygon"
-
-    filename = tmp_path / "test.fgb"
-    write(filename, geometry, field_data, **meta)
-
-    assert filename.exists()
-
-    with pytest.raises(
-        RuntimeError,  # match="append to FlatGeobuf is not supported for GDAL <= 3.5.0"
-    ):
-        write(filename, geometry, field_data, append=True, **meta)
-
-
 @pytest.mark.parametrize(
     "driver",
     {
@@ -794,16 +759,14 @@ def test_write_supported(tmp_path, naturalearth_lowres, driver):
     assert filename.exists()
 
 
-@pytest.mark.skipif(
-    __gdal_version__ >= (3, 6, 0), reason="OpenFileGDB supports write for GDAL >= 3.6.0"
-)
 def test_write_unsupported(tmp_path, naturalearth_lowres):
+    """Test writing using a driver that does not support writing."""
     meta, _, geometry, field_data = read(naturalearth_lowres)
 
-    filename = tmp_path / "test.gdb"
+    filename = tmp_path / "test.topojson"
 
     with pytest.raises(DataSourceError, match="does not support write functionality"):
-        write(filename, geometry, field_data, driver="OpenFileGDB", **meta)
+        write(filename, geometry, field_data, driver="TopoJSON", **meta)
 
 
 def test_write_gdalclose_error(naturalearth_lowres):
@@ -1005,15 +968,6 @@ def test_read_data_types_numeric_with_null(test_gpkg_nulls):
             assert field.dtype == "float64"
 
 
-def test_read_unsupported_types(list_field_values_file):
-    fields = read(list_field_values_file)[3]
-    # list field gets skipped, only integer field is read
-    assert len(fields) == 1
-
-    fields = read(list_field_values_file, columns=["int64"])[3]
-    assert len(fields) == 1
-
-
 def test_read_datetime_millisecond(datetime_file):
     field = read(datetime_file)[3][0]
     assert field.dtype == "datetime64[ms]"
@@ -1047,15 +1001,20 @@ def test_read_unsupported_ext_with_prefix(tmp_path):
 def test_read_datetime_as_string(datetime_tz_file):
     field = read(datetime_tz_file)[3][0]
     assert field.dtype == "datetime64[ms]"
-    # timezone is ignored in numpy layer
+    # time zone is ignored in numpy layer
     assert field[0] == np.datetime64("2020-01-01 09:00:00.123")
     assert field[1] == np.datetime64("2020-01-01 10:00:00.000")
 
     field = read(datetime_tz_file, datetime_as_string=True)[3][0]
     assert field.dtype == "object"
-    # GDAL doesn't return strings in ISO format (yet)
-    assert field[0] == "2020/01/01 09:00:00.123-05"
-    assert field[1] == "2020/01/01 10:00:00-05"
+
+    if __gdal_version__ < (3, 7, 0):
+        # With GDAL < 3.7, datetimes are not returned as ISO8601 strings
+        assert field[0] == "2020/01/01 09:00:00.123-05"
+        assert field[1] == "2020/01/01 10:00:00-05"
+    else:
+        assert field[0] == "2020-01-01T09:00:00.123-05:00"
+        assert field[1] == "2020-01-01T10:00:00-05:00"
 
 
 @pytest.mark.parametrize("ext", ["gpkg", "geojson"])
@@ -1187,9 +1146,6 @@ def test_write_memory_driver_required(naturalearth_lowres):
 
 @pytest.mark.parametrize("driver", ["ESRI Shapefile", "OpenFileGDB"])
 def test_write_memory_unsupported_driver(naturalearth_lowres, driver):
-    if driver == "OpenFileGDB" and __gdal_version__ < (3, 6, 0):
-        pytest.skip("OpenFileGDB write support only available for GDAL >= 3.6.0")
-
     meta, _, geometry, field_data = read(naturalearth_lowres)
 
     buffer = BytesIO()
@@ -1491,7 +1447,6 @@ def test_write_with_mask(tmp_path):
         write(filename, geometry, field_data, fields, field_mask, **meta)
 
 
-@requires_arrow_api
 def test_open_arrow_capsule_protocol_without_pyarrow(naturalearth_lowres):
     # this test is included here instead of test_arrow.py to ensure we also run
     # it when pyarrow is not installed
@@ -1509,7 +1464,6 @@ def test_open_arrow_capsule_protocol_without_pyarrow(naturalearth_lowres):
 
 
 @pytest.mark.skipif(HAS_PYARROW, reason="pyarrow is installed")
-@requires_arrow_api
 def test_open_arrow_error_no_pyarrow(naturalearth_lowres):
     # this test is included here instead of test_arrow.py to ensure we run
     # it when pyarrow is not installed
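
A minimal sketch of the raw-level behavior covered by the datetime_as_string
changes above (the path "datetime_tz.gpkg" is illustrative only):

    from pyogrio.raw import read

    # read() returns (meta, fids, geometry, field_data); with
    # datetime_as_string=True, datetime fields come back as an object array of
    # strings (ISO8601 on GDAL >= 3.7) rather than datetime64[ms] values.
    meta, _, geometry, field_data = read("datetime_tz.gpkg", datetime_as_string=True)
    assert field_data[0].dtype == "object"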


=====================================
pyogrio/util.py
=====================================
@@ -4,13 +4,11 @@ import re
 import sys
 from packaging.version import Version
 from pathlib import Path
-from typing import Union
 from urllib.parse import urlparse
 
+from pyogrio._ogr import MULTI_EXTENSIONS
 from pyogrio._vsi import vsimem_rmtree_toplevel as _vsimem_rmtree_toplevel
 
-MULTI_EXTENSIONS = (".gpkg.zip", ".shp.zip")
-
 
 def get_vsi_path_or_buffer(path_or_buffer):
     """Get VSI-prefixed path or bytes buffer depending on type of path_or_buffer.
@@ -54,7 +52,7 @@ def get_vsi_path_or_buffer(path_or_buffer):
     return vsi_path(str(path_or_buffer))
 
 
-def vsi_path(path: Union[str, Path]) -> str:
+def vsi_path(path: str | Path) -> str:
     """Ensure path is a local path or a GDAL-compatible VSI path."""
     # Convert Path objects to string, but for VSI paths, keep posix style path.
     if isinstance(path, Path):
@@ -237,7 +235,7 @@ def _mask_to_wkb(mask):
     return shapely.to_wkb(mask)
 
 
-def vsimem_rmtree_toplevel(path: Union[str, Path]):
+def vsimem_rmtree_toplevel(path: str | Path):
     """Remove the parent directory of the file path recursively.
 
     This is used for final cleanup of an in-memory dataset, which may have been


=====================================
pyproject.toml
=====================================
@@ -1,7 +1,7 @@
 [build-system]
 requires = [
     "setuptools",
-    "Cython>=0.29",
+    "Cython>=3.1",
     "versioneer[toml]==0.28",
     # tomli is used by versioneer
     "tomli; python_version < '3.11'",
@@ -26,12 +26,13 @@ classifiers = [
     "Operating System :: OS Independent",
     "Programming Language :: Python :: 3",
     "Topic :: Scientific/Engineering :: GIS",
+    "Programming Language :: Python :: Free Threading :: 2 - Beta",
 ]
-requires-python = ">=3.9"
+requires-python = ">=3.10"
 dependencies = ["certifi", "numpy", "packaging"]
 
 [project.optional-dependencies]
-dev = ["cython"]
+dev = ["cython>=3.1"]
 test = ["pytest", "pytest-cov"]
 benchmark = ["pytest-benchmark"]
 geopandas = ["geopandas"]
@@ -41,17 +42,18 @@ Home = "https://pyogrio.readthedocs.io/"
 Repository = "https://github.com/geopandas/pyogrio"
 
 [tool.cibuildwheel]
-skip = ["pp*", "*musllinux*"]
+skip = ["*musllinux*"]
 archs = ["auto64"]
 manylinux-x86_64-image = "manylinux-x86_64-vcpkg-gdal:latest"
 manylinux-aarch64-image = "manylinux-aarch64-vcpkg-gdal:latest"
 build-verbosity = 3
+enable = ["cpython-freethreading"]
 
 [tool.cibuildwheel.linux.environment]
 VCPKG_INSTALL = "$VCPKG_INSTALLATION_ROOT/installed/$VCPKG_DEFAULT_TRIPLET"
 GDAL_INCLUDE_PATH = "$VCPKG_INSTALL/include"
 GDAL_LIBRARY_PATH = "$VCPKG_INSTALL/lib"
-GDAL_VERSION = "3.10.3"
+GDAL_VERSION = "3.11.4"
 PYOGRIO_PACKAGE_DATA = 1
 GDAL_DATA = "$VCPKG_INSTALL/share/gdal"
 PROJ_LIB = "$VCPKG_INSTALL/share/proj"
@@ -66,7 +68,7 @@ repair-wheel-command = [
 VCPKG_INSTALL = "$VCPKG_INSTALLATION_ROOT/installed/$VCPKG_DEFAULT_TRIPLET"
 GDAL_INCLUDE_PATH = "$VCPKG_INSTALL/include"
 GDAL_LIBRARY_PATH = "$VCPKG_INSTALL/lib"
-GDAL_VERSION = "3.10.3"
+GDAL_VERSION = "3.11.4"
 PYOGRIO_PACKAGE_DATA = 1
 GDAL_DATA = "$VCPKG_INSTALL/share/gdal"
 PROJ_LIB = "$VCPKG_INSTALL/share/proj"
@@ -80,7 +82,7 @@ repair-wheel-command = "delvewheel repair --add-path C:/vcpkg/installed/x64-wind
 VCPKG_INSTALL = "$VCPKG_INSTALLATION_ROOT/installed/x64-windows-dynamic-release"
 GDAL_INCLUDE_PATH = "$VCPKG_INSTALL/include"
 GDAL_LIBRARY_PATH = "$VCPKG_INSTALL/lib"
-GDAL_VERSION = "3.10.3"
+GDAL_VERSION = "3.11.4"
 PYOGRIO_PACKAGE_DATA = 1
 GDAL_DATA = "$VCPKG_INSTALL/share/gdal"
 PROJ_LIB = "$VCPKG_INSTALL/share/proj"


=====================================
setup.py
=====================================
@@ -20,12 +20,12 @@ except ImportError:
 logger = logging.getLogger(__name__)
 
 
-MIN_PYTHON_VERSION = (3, 9, 0)
+MIN_PYTHON_VERSION = (3, 10, 0)
 MIN_GDAL_VERSION = (2, 4, 0)
 
 
 if sys.version_info < MIN_PYTHON_VERSION:
-    raise RuntimeError("Python >= 3.9 is required")
+    raise RuntimeError("Python >= 3.10 is required")
 
 
 def copy_data_tree(datadir, destdir):
@@ -169,7 +169,7 @@ else:
             Extension("pyogrio._ogr", ["pyogrio/_ogr.pyx"], **ext_options),
             Extension("pyogrio._vsi", ["pyogrio/_vsi.pyx"], **ext_options),
         ],
-        compiler_directives={"language_level": "3"},
+        compiler_directives={"language_level": "3", "freethreading_compatible": True},
         compile_time_env=compile_time_env,
     )
 



View it on GitLab: https://salsa.debian.org/debian-gis-team/pyogrio/-/compare/f8c2b41acb543c1a4b686e41c5d776fcb9619c5b...d862e40593134a74df9c047247b785cfb9f8fbe1
