[Git][debian-gis-team/pyshp][upstream] New upstream version 2.3.0
Bas Couwenberg (@sebastic)
gitlab at salsa.debian.org
Sat Apr 30 17:43:42 BST 2022
Bas Couwenberg pushed to branch upstream at Debian GIS Project / pyshp
Commits:
2752b0da by Bas Couwenberg at 2022-04-30T18:34:40+02:00
New upstream version 2.3.0
- - - - -
27 changed files:
- .github/workflows/deploy.yml
- README.md
- changelog.txt
- + pyproject.toml
- + pytest.ini
- + setup.cfg
- setup.py
- shapefile.py
- shapefiles/test/balancing.dbf
- shapefiles/test/contextwriter.dbf
- + shapefiles/test/corrupt_too_long.dbf
- + shapefiles/test/corrupt_too_long.shp
- + shapefiles/test/corrupt_too_long.shx
- shapefiles/test/dtype.dbf
- shapefiles/test/edit.dbf
- shapefiles/test/line.dbf
- shapefiles/test/linem.dbf
- shapefiles/test/linez.dbf
- shapefiles/test/merge.dbf
- shapefiles/test/multipatch.dbf
- shapefiles/test/multipoint.dbf
- shapefiles/test/onlydbf.dbf
- shapefiles/test/point.dbf
- shapefiles/test/polygon.dbf
- shapefiles/test/shapetype.dbf
- shapefiles/test/testfile.dbf
- test_shapefile.py
Changes:
=====================================
.github/workflows/deploy.yml
=====================================
@@ -28,7 +28,7 @@ jobs:
python -m pip install --upgrade pip
pip install build
- name: Build package
- run: python -m build --sdist --wheel --outdir dist/
+ run: python -m build
- name: Publish package
uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
with:
=====================================
README.md
=====================================
@@ -8,8 +8,8 @@ The Python Shapefile Library (PyShp) reads and writes ESRI Shapefiles in pure Py
- **Author**: [Joel Lawhead](https://github.com/GeospatialPython)
- **Maintainers**: [Karim Bahgat](https://github.com/karimbahgat)
-- **Version**: 2.2.0
-- **Date**: 2 February, 2022
+- **Version**: 2.3.0
+- **Date**: 30 April, 2022
- **License**: [MIT](https://github.com/GeospatialPython/pyshp/blob/master/LICENSE.TXT)
## Contents
@@ -38,7 +38,9 @@ The Python Shapefile Library (PyShp) reads and writes ESRI Shapefiles in pure Py
- [Adding Geometry](#adding-geometry)
- [Geometry and Record Balancing](#geometry-and-record-balancing)
- [Advanced Use](#advanced-use)
- - [Shapefile Language and Character Encoding](#shapefile-language-and-character-encoding)
+ - [Common Errors and Fixes](#common-errors-and-fixes)
+ - [Warnings and Logging](#warnings-and-logging)
+ - [Shapefile Encoding Errors](#shapefile-encoding-errors)
- [Reading Large Shapefiles](#reading-large-shapefiles)
- [Iterating through a shapefile](#iterating-through-a-shapefile)
- [Limiting which fields to read](#limiting-which-fields-to-read)
@@ -48,6 +50,9 @@ The Python Shapefile Library (PyShp) reads and writes ESRI Shapefiles in pure Py
- [Merging multiple shapefiles](#merging-multiple-shapefiles)
- [Editing shapefiles](#editing-shapefiles)
- [3D and Other Geometry Types](#3d-and-other-geometry-types)
+ - [Shapefiles with measurement (M) values](#shapefiles-with-measurement-m-values)
+ - [Shapefiles with elevation (Z) values](#shapefiles-with-elevation-z-values)
+ - [3D MultiPatch Shapefiles](#3d-multipatch-shapefiles)
- [Testing](#testing)
- [Contributors](#contributors)
@@ -90,6 +95,26 @@ part of your geospatial project.
# Version Changes
+## 2.3.0
+
+### New Features:
+
+- Added support for pathlib and path-like shapefile filepaths (@mwtoews).
+- Allow reading individual file extensions via filepaths.
+
+### Improvements:
+
+- Simplified setup and deployment (@mwtoews)
+- Faster shape access when missing shx file
+- Switch to named logger (see #240)
+
+### Bug fixes:
+
+- More robust handling of corrupt shapefiles (fixes #235)
+- Fix errors when writing to individual file-handles (fixes #237)
+- Revert previous decision to enforce geojson output ring orientation (detailed explanation at https://github.com/SciTools/cartopy/issues/2012)
+- Fix test issues in environments without network access (@sebastic, @musicinmybrain).
+
## 2.2.0
### New Features:
@@ -103,6 +128,7 @@ part of your geospatial project.
- More examples and restructuring of README.
- More informative Shape to geojson warnings (see #219).
+- Add shapefile.VERBOSE flag to control warnings verbosity (default True).
- Shape object information when calling repr().
- Faster ring orientation checks, enforce geojson output ring orientation.
@@ -243,7 +269,11 @@ OR
>>> sf = shapefile.Reader("shapefiles/blockgroups.dbf")
OR any of the other 5+ formats which are potentially part of a shapefile. The
-library does not care about file extensions.
+library does not care about file extensions. You can also restrict reading to
+specific file extensions by passing them as keyword arguments:
+
+
+ >>> sf = shapefile.Reader(dbf="shapefiles/blockgroups.dbf")
#### Reading Shapefiles from Zip Files
@@ -535,6 +565,16 @@ attribute:
... ["UNITS3_9", "N", 8, 0], ["UNITS10_49", "N", 8, 0],
... ["UNITS50_UP", "N", 8, 0], ["MOBILEHOME", "N", 7, 0]]
+The first field of a dbf file is always a 1-byte field called "DeletionFlag",
+which flags records that have been deleted but not yet removed. Because this
+flag is very rarely used, PyShp currently returns all records regardless of
+their deletion flag, and the flag is not included in the list of record
+values. In practice the DeletionFlag field serves no purpose and can in most
+cases be ignored. For instance, to get a list of all fieldnames:
+
+
+ >>> fieldnames = [f[0] for f in sf.fields[1:]]
+
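With a fabricated `fields` list shaped like `Reader.fields`, the slicing shown above works as follows (the field tuples here are illustrative, not read from a real file):

```python
# First entry is the DeletionFlag tuple; real attribute fields follow.
fields = [("DeletionFlag", "C", 1, 0),
          ("AREA", "N", 18, 5),
          ("BKG_KEY", "C", 12, 0)]

# Skip the DeletionFlag entry, exactly as sf.fields[1:] does.
fieldnames = [f[0] for f in fields[1:]]
# fieldnames == ["AREA", "BKG_KEY"]
```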
You can get a list of the shapefile's records by calling the records() method:
@@ -549,7 +589,7 @@ To read a single record call the record() method with the record's index:
>>> rec = sf.record(3)
Each record is a list-like Record object containing the values corresponding to each field in
-the field list. A record's values can be accessed by positional indexing or slicing.
+the field list (except the DeletionFlag). A record's values can be accessed by positional indexing or slicing.
For example in the blockgroups shapefile the 2nd and 3rd fields are the blockgroup id
and the 1990 population count of that San Francisco blockgroup:
@@ -615,18 +655,14 @@ To get the 4th shape record from the blockgroups shapefile use the third index:
>>> shapeRec = sf.shapeRecord(3)
+ >>> shapeRec.record[1:3]
+ ['060750601001', 4715]
-Each individual shape record also supports the _\_geo_interface\_\_ to convert it to a GeoJSON:
+Each individual shape record also supports the _\_geo_interface\_\_ protocol to convert it to a GeoJSON feature:
>>> shapeRec.__geo_interface__['type']
'Feature'
-
-The blockgroup key and population count:
-
-
- >>> shapeRec.record[1:3]
- ['060750601001', 4715]
## Writing Shapefiles
@@ -1028,11 +1064,30 @@ most shapefile software.
# Advanced Use
-## Shapefile Language and Character Encoding
+## Common Errors and Fixes
+
+Below we list some commonly encountered errors and ways to fix them.
+
+### Warnings and Logging
+
+By default, PyShp chooses to be transparent and provide the user with logging information and warnings about non-critical issues when reading or writing shapefiles. This behavior is controlled by the module constant `VERBOSE` (which defaults to True). If you would rather suppress this information, you can simply set this to False:
+
+
+ >>> shapefile.VERBOSE = False
+
+All logging happens under the namespace `shapefile`. So another way to suppress all PyShp warnings would be to alter the logging behavior for that namespace:
+
+
+ >>> import logging
+ >>> logging.getLogger('shapefile').setLevel(logging.ERROR)
+
+### Shapefile Encoding Errors
PyShp supports reading and writing shapefiles in any language or character encoding, and provides several options for decoding and encoding text.
-Most shapefiles are written in UTF-8 encoding, PyShp's default encoding, so in most cases you don't
-have to specify the encoding. For reading shapefiles in any other encoding, such as Latin-1, just
+Most shapefiles are written in UTF-8 encoding, PyShp's default encoding, so in most cases you don't have to specify the encoding.
+If you encounter an encoding error when reading a shapefile, the shapefile was likely written in a non-UTF-8 encoding.
+For instance, when working with English-language shapefiles, a common cause of encoding errors is that the shapefile was written in Latin-1 encoding.
+For reading shapefiles in any non-UTF-8 encoding, such as Latin-1, just
supply the encoding option when creating the Reader class.
@@ -1240,7 +1295,7 @@ file one record at a time, modify or filter the contents, and write it back out.
Most shapefiles store conventional 2D points, lines, or polygons. But the shapefile format is also capable
of storing various other types of geometries as well, including complex 3D surfaces and objects.
-**Shapefiles with measurement (M) values**
+### Shapefiles with measurement (M) values
Measured shape types are shapes that include a measurement value at each vertex, for instance
speed measurements from a GPS device. Shapes with measurement (M) values are added with the following
@@ -1272,7 +1327,7 @@ Shapefiles containing M-values can be examined in several ways:
[0.0, None, 3.0, None, 0.0, None, None]
-**Shapefiles with elevation (Z) values**
+### Shapefiles with elevation (Z) values
Elevation shape types are shapes that include an elevation value at each vertex, for instance elevation from a GPS device.
Shapes with elevation (Z) values are added with the following methods: "pointz", "multipointz", "linez", and "polyz".
@@ -1304,7 +1359,7 @@ To examine a Z-type shapefile you can do:
>>> r.shape(0).z # flat list of Z-values
[18.0, 20.0, 22.0, 0.0, 0.0, 0.0, 0.0, 15.0, 13.0, 14.0]
-**3D MultiPatch Shapefiles**
+### 3D MultiPatch Shapefiles
Multipatch shapes are useful for storing composite 3-Dimensional objects.
A MultiPatch shape represents a 3D object made up of one or more surface parts.
@@ -1350,6 +1405,7 @@ correct line endings in README.md.
```
Atle Frenvik Sveen
Bas Couwenberg
+Ben Beasley
Casey Meisenzahl
Charles Arnold
David A. Riggs
=====================================
changelog.txt
=====================================
@@ -1,4 +1,22 @@
+VERSION 2.3.0
+
+2022-04-30
+ New Features:
+ * Added support for pathlib and path-like shapefile filepaths (@mwtoews).
+ * Allow reading individual file extensions via filepaths.
+
+ Improvements:
+ * Simplified setup and deployment (@mwtoews)
+ * Faster shape access when missing shx file
+ * Switch to named logger (see #240)
+
+ Bug fixes:
+ * More robust handling of corrupt shapefiles (fixes #235)
+ * Fix errors when writing to individual file-handles (fixes #237)
+ * Revert previous decision to enforce geojson output ring orientation (detailed explanation at https://github.com/SciTools/cartopy/issues/2012)
+ * Fix test issues in environments without network access (@sebastic, @musicinmybrain).
+
VERSION 2.2.0
2022-02-02
@@ -11,6 +29,7 @@ VERSION 2.2.0
Improvements:
* More examples and restructuring of README.
* More informative Shape to geojson warnings (see #219).
+ * Add shapefile.VERBOSE flag to control warnings verbosity (default True).
* Shape object information when calling repr().
* Faster ring orientation checks, enforce geojson output ring orientation.
=====================================
pyproject.toml
=====================================
@@ -0,0 +1,3 @@
+[build-system]
+requires = ["setuptools"]
+build-backend = "setuptools.build_meta"
=====================================
pytest.ini
=====================================
@@ -0,0 +1,3 @@
+[pytest]
+markers =
+ network: marks tests requiring network access
=====================================
setup.cfg
=====================================
@@ -0,0 +1,30 @@
+[metadata]
+name = pyshp
+version = attr: shapefile.__version__
+description = Pure Python read/write support for ESRI Shapefile format
+long_description = file: README.md
+long_description_content_type = text/markdown
+author = Joel Lawhead
+author_email = jlawhead@geospatialpython.com
+maintainer = Karim Bahgat
+maintainer_email = karim.bahgat.norway@gmail.com
+url = https://github.com/GeospatialPython/pyshp
+download_url = https://pypi.org/project/pyshp/
+license = MIT
+license_files = LICENSE.TXT
+keywords = gis, geospatial, geographic, shapefile, shapefiles
+classifiers =
+ Development Status :: 5 - Production/Stable
+ Programming Language :: Python
+ Programming Language :: Python :: 2.7
+ Programming Language :: Python :: 3
+ Topic :: Scientific/Engineering :: GIS
+ Topic :: Software Development :: Libraries
+ Topic :: Software Development :: Libraries :: Python Modules
+
+[options]
+py_modules = shapefile
+python_requires = >=2.7
+
+[bdist_wheel]
+universal=1
=====================================
setup.py
=====================================
@@ -1,32 +1,3 @@
from setuptools import setup
-
-def read_file(file):
- with open(file, 'rb') as fh:
- data = fh.read()
- return data.decode('utf-8')
-
-setup(name='pyshp',
- version='2.2.0',
- description='Pure Python read/write support for ESRI Shapefile format',
- long_description=read_file('README.md'),
- long_description_content_type='text/markdown',
- author='Joel Lawhead, Karim Bahgat',
- author_email='jlawhead@geospatialpython.com',
- url='https://github.com/GeospatialPython/pyshp',
- py_modules=['shapefile'],
- license='MIT',
- zip_safe=False,
- keywords='gis geospatial geographic shapefile shapefiles',
- python_requires='>= 2.7',
- classifiers=['Programming Language :: Python',
- 'Programming Language :: Python :: 2.7',
- 'Programming Language :: Python :: 3',
- 'Programming Language :: Python :: 3.5',
- 'Programming Language :: Python :: 3.6',
- 'Programming Language :: Python :: 3.7',
- 'Programming Language :: Python :: 3.8',
- 'Programming Language :: Python :: 3.9',
- 'Topic :: Scientific/Engineering :: GIS',
- 'Topic :: Software Development :: Libraries',
- 'Topic :: Software Development :: Libraries :: Python Modules'])
+setup()
=====================================
shapefile.py
=====================================
@@ -3,11 +3,10 @@ shapefile.py
Provides read and write support for ESRI Shapefiles.
authors: jlawhead<at>geospatialpython.com
maintainer: karim.bahgat.norway<at>gmail.com
-version: 2.2.0
Compatible with Python versions 2.7-3.x
"""
-__version__ = "2.2.0"
+__version__ = "2.3.0"
from struct import pack, unpack, calcsize, error, Struct
import os
@@ -20,6 +19,9 @@ import io
from datetime import date
import zipfile
+# Create named logger
+logger = logging.getLogger(__name__)
+
# Module settings
VERBOSE = True
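The switch from the root logger to a named logger (the `logging.getLogger(__name__)` line above) is what makes per-library filtering possible. A minimal stdlib-only sketch using the same `shapefile` logger name the module registers; the handler and messages are illustrative:

```python
import logging

# Collect messages emitted under the "shapefile" namespace so we can
# observe the effect of per-logger filtering.
records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

logger = logging.getLogger("shapefile")
logger.addHandler(ListHandler())

logger.warning("ring orientation issue")   # passes: effective level is WARNING
logger.setLevel(logging.ERROR)             # a downstream app silences PyShp only
logger.warning("now suppressed")           # filtered out by the logger level

# records == ["ring orientation issue"]
```

Raising the level on the `shapefile` logger leaves every other library's logging untouched, which a root-logger call could not guarantee.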
@@ -161,6 +163,24 @@ else:
def is_string(v):
return isinstance(v, basestring)
+if sys.version_info[0:2] >= (3, 6):
+ def pathlike_obj(path):
+ if isinstance(path, os.PathLike):
+ return os.fsdecode(path)
+ else:
+ return path
+else:
+ def pathlike_obj(path):
+ if is_string(path):
+ return path
+ elif hasattr(path, "__fspath__"):
+ return path.__fspath__()
+ else:
+ try:
+ return str(path)
+ except:
+ return path
+
# Begin
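The `pathlike_obj` helper above normalizes `pathlib`-style inputs before any string handling. A sketch of the Python 3.6+ branch (stdlib only; the sample path is made up):

```python
import os
import pathlib

def pathlike_obj(path):
    # Convert any os.PathLike (e.g. pathlib.Path) to a plain str so the
    # rest of the reader can do ordinary string and extension handling.
    if isinstance(path, os.PathLike):
        return os.fsdecode(path)
    return path

p = pathlike_obj(pathlib.Path("shapefiles") / "blockgroups.shp")
# p is now an ordinary str ending in "blockgroups.shp"
```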
@@ -331,11 +351,9 @@ def organize_polygon_rings(rings, return_errors=None):
# where exterior rings are clockwise, and holes counterclockwise
if is_cw(ring):
# ring is exterior
- ring = rewind(ring) # GeoJSON and Shapefile exteriors have opposite orientation
exteriors.append(ring)
else:
# ring is a hole
- ring = rewind(ring) # GeoJSON and Shapefile holes have opposite orientation
holes.append(ring)
# if only one exterior, then all holes belong to that exterior
@@ -371,8 +389,8 @@ def organize_polygon_rings(rings, return_errors=None):
if len(exterior_candidates) > 1:
# get hole sample point
- # Note: all rings now follow GeoJSON orientation, i.e. holes are clockwise
- hole_sample = ring_sample(holes[hole_i], ccw=False)
+ ccw = not is_cw(holes[hole_i])
+ hole_sample = ring_sample(holes[hole_i], ccw=ccw)
# collect new exterior candidates
new_exterior_candidates = []
for ext_i in exterior_candidates:
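The hole-sample logic above now derives orientation from the ring itself via `is_cw` instead of assuming rewound GeoJSON order. A self-contained sketch of clockwise detection using the shoelace formula; this mirrors the idea, not necessarily PyShp's exact implementation:

```python
def signed_area(ring):
    # Shoelace formula: negative for clockwise vertex order, which the
    # shapefile spec uses for exterior rings (holes are counterclockwise).
    n = len(ring)
    xs, ys = zip(*ring)
    return 0.5 * sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                     for i in range(n))

def is_cw(ring):
    return signed_area(ring) < 0

exterior = [(1, 1), (1, 9), (9, 9), (9, 1), (1, 1)]  # clockwise
hole = list(reversed(exterior))                      # counterclockwise
# is_cw(exterior) is True, is_cw(hole) is False
```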
@@ -416,8 +434,6 @@ def organize_polygon_rings(rings, return_errors=None):
# add orphan holes as exteriors
for hole_i in orphan_holes:
ext = holes[hole_i]
- # since this was previously a clockwise ordered hole, inverse the winding order
- ext = rewind(ext)
# add as single exterior without any holes
poly = [ext]
polys.append(poly)
@@ -432,8 +448,6 @@ def organize_polygon_rings(rings, return_errors=None):
if return_errors is not None:
return_errors['polygon_only_holes'] = len(holes)
exteriors = holes
- # since these were previously clockwise ordered holes, inverse the winding order
- exteriors = [rewind(ext) for ext in exteriors]
# add as single exterior without any holes
polys = [[ext] for ext in exteriors]
return polys
@@ -556,13 +570,13 @@ class Shape(object):
but the Shape contained interior holes (defined by counter-clockwise orientation in the shapefile format) that were \
orphaned, i.e. not contained by any exterior rings. The rings were still included but were \
encoded as GeoJSON exterior rings instead of holes.'
- logging.warning(msg)
+ logger.warning(msg)
only_holes = self._errors.get('polygon_only_holes', None)
if only_holes:
msg = header + 'Shapefile format requires that polygons contain at least one exterior ring, \
but the Shape was entirely made up of interior holes (defined by counter-clockwise orientation in the shapefile format). The rings were \
still included but were encoded as GeoJSON exterior rings instead of holes.'
- logging.warning(msg)
+ logger.warning(msg)
# return as geojson
if len(polys) == 1:
@@ -892,7 +906,7 @@ class ShapefileException(Exception):
# if len(messages) > 1:
# # more than just the "Summary of..." header
# msg = '\n'.join(messages)
-# logging.warning(msg)
+# logger.warning(msg)
class Reader(object):
"""Reads the three files of a shapefile as a unit or
@@ -930,14 +944,14 @@ class Reader(object):
self.encodingErrors = kwargs.pop('encodingErrors', 'strict')
# See if a shapefile name was passed as the first argument
if len(args) > 0:
- if is_string(args[0]):
- path = args[0]
-
+ path = pathlike_obj(args[0])
+ if is_string(path):
+
if '.zip' in path:
# Shapefile is inside a zipfile
if path.count('.zip') > 1:
# Multiple nested zipfiles
- raise ShapefileException('Reading from multiple nested zipfiles is not supported: %s' % args[0])
+ raise ShapefileException('Reading from multiple nested zipfiles is not supported: %s' % path)
# Split into zipfile and shapefile paths
if path.endswith('.zip'):
zpath = path
@@ -1031,7 +1045,7 @@ class Reader(object):
self.load(path)
return
- # Otherwise, load from separate shp/shx/dbf args (must be file-like)
+ # Otherwise, load from separate shp/shx/dbf args (must be path or file-like)
if "shp" in kwargs.keys():
if hasattr(kwargs["shp"], "read"):
self.shp = kwargs["shp"]
@@ -1041,7 +1055,8 @@ class Reader(object):
except (NameError, io.UnsupportedOperation):
self.shp = io.BytesIO(self.shp.read())
else:
- raise ShapefileException('The shp arg must be file-like.')
+ (baseName, ext) = os.path.splitext(kwargs["shp"])
+ self.load_shp(baseName)
if "shx" in kwargs.keys():
if hasattr(kwargs["shx"], "read"):
@@ -1052,7 +1067,8 @@ class Reader(object):
except (NameError, io.UnsupportedOperation):
self.shx = io.BytesIO(self.shx.read())
else:
- raise ShapefileException('The shx arg must be file-like.')
+ (baseName, ext) = os.path.splitext(kwargs["shx"])
+ self.load_shx(baseName)
if "dbf" in kwargs.keys():
if hasattr(kwargs["dbf"], "read"):
@@ -1063,7 +1079,8 @@ class Reader(object):
except (NameError, io.UnsupportedOperation):
self.dbf = io.BytesIO(self.dbf.read())
else:
- raise ShapefileException('The dbf arg must be file-like.')
+ (baseName, ext) = os.path.splitext(kwargs["dbf"])
+ self.load_dbf(baseName)
# Load the files
if self.shp or self.dbf:
@@ -1106,21 +1123,36 @@ class Reader(object):
elif self.shp:
# Otherwise use shape count
if self.shx:
- # Use index file to get total count
if self.numShapes is None:
- # File length (16-bit word * 2 = bytes) - header length
- self.shx.seek(24)
- shxRecordLength = (unpack(">i", self.shx.read(4))[0] * 2) - 100
- self.numShapes = shxRecordLength // 8
-
+ self.__shxHeader()
+
return self.numShapes
else:
# Index file not available, iterate all shapes to get total count
if self.numShapes is None:
- for i,shape in enumerate(self.iterShapes()):
- pass
- self.numShapes = i + 1
+ # Determine length of shp file
+ shp = self.shp
+ checkpoint = shp.tell()
+ shp.seek(0,2)
+ shpLength = shp.tell()
+ shp.seek(100)
+ # Do a fast shape iteration until end of file.
+ unpack = Struct('>2i').unpack
+ offsets = []
+ pos = shp.tell()
+ while pos < shpLength:
+ offsets.append(pos)
+ # Unpack the shape header only
+ (recNum, recLength) = unpack(shp.read(8))
+ # Jump to next shape position
+ pos += 8 + (2 * recLength)
+ shp.seek(pos)
+ # Set numShapes and offset indices
+ self.numShapes = len(offsets)
+ self._offsets = offsets
+ # Return to previous file position
+ shp.seek(checkpoint)
return self.numShapes
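The faster shx-less path above avoids parsing full shape geometries: it reads only each 8-byte record header and jumps ahead by the declared content length. A stdlib-only sketch against a fabricated two-record `.shp` stream (the synthetic bytes and `count_shapes` helper are illustrative):

```python
import io
import struct

def count_shapes(shp):
    # Count records by hopping over 8-byte record headers:
    # (record number, content length in 16-bit words), both big-endian.
    shp.seek(0, 2)
    end = shp.tell()
    shp.seek(100)                # skip the 100-byte file header
    offsets = []
    pos = shp.tell()
    while pos < end:
        offsets.append(pos)
        rec_num, rec_len = struct.unpack(">2i", shp.read(8))
        pos += 8 + 2 * rec_len   # content length is in words, not bytes
        shp.seek(pos)
    return len(offsets), offsets

# Fabricated records: content lengths of 4 and 6 words (8 and 12 bytes).
data = bytearray(100)
data += struct.pack(">2i", 1, 4) + b"\0" * 8
data += struct.pack(">2i", 2, 6) + b"\0" * 12
n, offs = count_shapes(io.BytesIO(bytes(data)))
# n == 2, offs == [100, 116]
```

Caching `offsets` as a side effect is what lets later `shape(i)` calls seek directly instead of re-iterating.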
@@ -1160,6 +1192,8 @@ class Reader(object):
self.__shpHeader()
if self.dbf:
self.__dbfHeader()
+ if self.shx:
+ self.__shxHeader()
def load_shp(self, shapefile_name):
"""
@@ -1239,7 +1273,7 @@ class Reader(object):
return i
def __shpHeader(self):
- """Reads the header information from a .shp or .shx file."""
+ """Reads the header information from a .shp file."""
if not self.shp:
raise ShapefileException("Shapefile Reader requires a shapefile or file-like object. (no shp file found")
shp = self.shp
@@ -1341,27 +1375,40 @@ class Reader(object):
f.seek(next)
return record
+ def __shxHeader(self):
+ """Reads the header information from a .shx file."""
+ shx = self.shx
+ if not shx:
+            raise ShapefileException("Shapefile Reader requires a shapefile or file-like object. (no shx file found)")
+ # File length (16-bit word * 2 = bytes) - header length
+ shx.seek(24)
+ shxRecordLength = (unpack(">i", shx.read(4))[0] * 2) - 100
+ self.numShapes = shxRecordLength // 8
+
+ def __shxOffsets(self):
+ '''Reads the shape offset positions from a .shx file'''
+ shx = self.shx
+ if not shx:
+            raise ShapefileException("Shapefile Reader requires a shapefile or file-like object. (no shx file found)")
+ # Jump to the first record.
+ shx.seek(100)
+ # Each index record consists of two nrs, we only want the first one
+ shxRecords = _Array('i', shx.read(2 * self.numShapes * 4) )
+ if sys.byteorder != 'big':
+ shxRecords.byteswap()
+ self._offsets = [2 * el for el in shxRecords[::2]]
+
def __shapeIndex(self, i=None):
"""Returns the offset in a .shp file for a shape based on information
in the .shx index file."""
shx = self.shx
- if not shx:
+ # Return None if no shx or no index requested
+        if not shx or i is None:
return None
+ # At this point, we know the shx file exists
if not self._offsets:
- if self.numShapes is None:
- # File length (16-bit word * 2 = bytes) - header length
- shx.seek(24)
- shxRecordLength = (unpack(">i", shx.read(4))[0] * 2) - 100
- self.numShapes = shxRecordLength // 8
- # Jump to the first record.
- shx.seek(100)
- # Each index record consists of two nrs, we only want the first one
- shxRecords = _Array('i', shx.read(2 * self.numShapes * 4) )
- if sys.byteorder != 'big':
- shxRecords.byteswap()
- self._offsets = [2 * el for el in shxRecords[::2]]
- if not i == None:
- return self._offsets[i]
+ self.__shxOffsets()
+ return self._offsets[i]
def shape(self, i=0, bbox=None):
"""Returns a shape object for a shape in the geometry
@@ -1373,10 +1420,30 @@ class Reader(object):
i = self.__restrictIndex(i)
offset = self.__shapeIndex(i)
if not offset:
- # Shx index not available so iterate the full list.
- for j,k in enumerate(self.iterShapes()):
- if j == i:
- return k
+ # Shx index not available.
+ # Determine length of shp file
+ shp.seek(0,2)
+ shpLength = shp.tell()
+ shp.seek(100)
+ # Do a fast shape iteration until the requested index or end of file.
+ unpack = Struct('>2i').unpack
+ _i = 0
+ offset = shp.tell()
+ while offset < shpLength:
+ if _i == i:
+ # Reached the requested index, exit loop with the offset value
+ break
+ # Unpack the shape header only
+ (recNum, recLength) = unpack(shp.read(8))
+ # Jump to next shape position
+ offset += 8 + (2 * recLength)
+ shp.seek(offset)
+ _i += 1
+ # If the index was not found, it likely means the .shp file is incomplete
+ if _i != i:
+ raise ShapefileException('Shape index {} is out of bounds; the .shp file only contains {} shapes'.format(i, _i))
+
+ # Seek to the offset and read the shape
shp.seek(offset)
return self.__shape(oid=i, bbox=bbox)
@@ -1385,21 +1452,8 @@ class Reader(object):
To only read shapes within a given spatial region, specify the 'bbox'
arg as a list or tuple of xmin,ymin,xmax,ymax.
"""
- shp = self.__getFileObj(self.shp)
- # Found shapefiles which report incorrect
- # shp file length in the header. Can't trust
- # that so we seek to the end of the file
- # and figure it out.
- shp.seek(0,2)
- self.shpLength = shp.tell()
- shp.seek(100)
shapes = Shapes()
- i = 0
- while shp.tell() < self.shpLength:
- shape = self.__shape(oid=i, bbox=bbox)
- if shape:
- shapes.append(shape)
- i += 1
+ shapes.extend(self.iterShapes(bbox=bbox))
return shapes
def iterShapes(self, bbox=None):
@@ -1409,15 +1463,40 @@ class Reader(object):
arg as a list or tuple of xmin,ymin,xmax,ymax.
"""
shp = self.__getFileObj(self.shp)
+ # Found shapefiles which report incorrect
+ # shp file length in the header. Can't trust
+ # that so we seek to the end of the file
+ # and figure it out.
shp.seek(0,2)
- self.shpLength = shp.tell()
+ shpLength = shp.tell()
shp.seek(100)
- i = 0
- while shp.tell() < self.shpLength:
- shape = self.__shape(oid=i, bbox=bbox)
- if shape:
- yield shape
- i += 1
+
+ if self.numShapes:
+ # Iterate exactly the number of shapes from shx header
+ for i in xrange(self.numShapes):
+ # MAYBE: check if more left of file or exit early?
+ shape = self.__shape(oid=i, bbox=bbox)
+ if shape:
+ yield shape
+ else:
+ # No shx file, unknown nr of shapes
+ # Instead iterate until reach end of file
+ # Collect the offset indices during iteration
+ i = 0
+ offsets = []
+ pos = shp.tell()
+ while pos < shpLength:
+ offsets.append(pos)
+ shape = self.__shape(oid=i, bbox=bbox)
+ pos = shp.tell()
+ if shape:
+ yield shape
+ i += 1
+ # Entire shp file consumed
+ # Update the number of shapes and list of offsets
+ assert i == len(offsets)
+ self.numShapes = i
+ self._offsets = offsets
def __dbfHeader(self):
"""Reads a dbf header. Xbase-related code borrows heavily from ActiveState Python Cookbook Recipe 362715 by Raymond Hettinger"""
@@ -1708,8 +1787,9 @@ class Writer(object):
self.shapeType = shapeType
self.shp = self.shx = self.dbf = None
if target:
+ target = pathlike_obj(target)
if not is_string(target):
- raise Exception('The target filepath {} must be of type str/unicode, not {}.'.format(repr(target), type(target)) )
+ raise Exception('The target filepath {} must be of type str/unicode or path-like, not {}.'.format(repr(target), type(target)) )
self.shp = self.__getFileObj(os.path.splitext(target)[0] + '.shp')
self.shx = self.__getFileObj(os.path.splitext(target)[0] + '.shx')
self.dbf = self.__getFileObj(os.path.splitext(target)[0] + '.dbf')
@@ -1786,14 +1866,13 @@ class Writer(object):
if self.dbf and dbf_open:
self.__dbfHeader()
- # Close files, if target is a filepath
- if self.target:
- for attribute in (self.shp, self.shx, self.dbf):
- if hasattr(attribute, 'close'):
- try:
- attribute.close()
- except IOError:
- pass
+ # Close files
+ for attribute in (self.shp, self.shx, self.dbf):
+ if hasattr(attribute, 'close'):
+ try:
+ attribute.close()
+ except IOError:
+ pass
def __getFileObj(self, f):
"""Safety handler to verify file-like objects"""
@@ -2013,7 +2092,8 @@ class Writer(object):
"not: %r" % s)
# Write to file
offset,length = self.__shpRecord(s)
- self.__shxRecord(offset, length)
+ if self.shx:
+ self.__shxRecord(offset, length)
def __shpRecord(self, s):
f = self.__getFileObj(self.shp)
=====================================
shapefiles/test/balancing.dbf
=====================================
Binary files a/shapefiles/test/balancing.dbf and b/shapefiles/test/balancing.dbf differ
=====================================
shapefiles/test/contextwriter.dbf
=====================================
Binary files a/shapefiles/test/contextwriter.dbf and b/shapefiles/test/contextwriter.dbf differ
=====================================
shapefiles/test/corrupt_too_long.dbf
=====================================
Binary files /dev/null and b/shapefiles/test/corrupt_too_long.dbf differ
=====================================
shapefiles/test/corrupt_too_long.shp
=====================================
Binary files /dev/null and b/shapefiles/test/corrupt_too_long.shp differ
=====================================
shapefiles/test/corrupt_too_long.shx
=====================================
Binary files /dev/null and b/shapefiles/test/corrupt_too_long.shx differ
=====================================
shapefiles/test/dtype.dbf
=====================================
Binary files a/shapefiles/test/dtype.dbf and b/shapefiles/test/dtype.dbf differ
=====================================
shapefiles/test/edit.dbf
=====================================
Binary files a/shapefiles/test/edit.dbf and b/shapefiles/test/edit.dbf differ
=====================================
shapefiles/test/line.dbf
=====================================
Binary files a/shapefiles/test/line.dbf and b/shapefiles/test/line.dbf differ
=====================================
shapefiles/test/linem.dbf
=====================================
Binary files a/shapefiles/test/linem.dbf and b/shapefiles/test/linem.dbf differ
=====================================
shapefiles/test/linez.dbf
=====================================
Binary files a/shapefiles/test/linez.dbf and b/shapefiles/test/linez.dbf differ
=====================================
shapefiles/test/merge.dbf
=====================================
Binary files a/shapefiles/test/merge.dbf and b/shapefiles/test/merge.dbf differ
=====================================
shapefiles/test/multipatch.dbf
=====================================
Binary files a/shapefiles/test/multipatch.dbf and b/shapefiles/test/multipatch.dbf differ
=====================================
shapefiles/test/multipoint.dbf
=====================================
Binary files a/shapefiles/test/multipoint.dbf and b/shapefiles/test/multipoint.dbf differ
=====================================
shapefiles/test/onlydbf.dbf
=====================================
Binary files a/shapefiles/test/onlydbf.dbf and b/shapefiles/test/onlydbf.dbf differ
=====================================
shapefiles/test/point.dbf
=====================================
Binary files a/shapefiles/test/point.dbf and b/shapefiles/test/point.dbf differ
=====================================
shapefiles/test/polygon.dbf
=====================================
Binary files a/shapefiles/test/polygon.dbf and b/shapefiles/test/polygon.dbf differ
=====================================
shapefiles/test/shapetype.dbf
=====================================
Binary files a/shapefiles/test/shapetype.dbf and b/shapefiles/test/shapetype.dbf differ
=====================================
shapefiles/test/testfile.dbf
=====================================
Binary files a/shapefiles/test/testfile.dbf and b/shapefiles/test/testfile.dbf differ
=====================================
test_shapefile.py
=====================================
@@ -3,15 +3,22 @@ This module tests the functionality of shapefile.py.
"""
# std lib imports
import os.path
+import sys
+if sys.version_info.major == 3:
+ from pathlib import Path
# third party imports
import pytest
import json
import datetime
+if sys.version_info.major == 2:
+ # required by pytest for python <36
+ from pathlib2 import Path
# our imports
import shapefile
+
# define various test shape tuples of (type, points, parts indexes, and expected geo interface output)
geo_interface_tests = [ (shapefile.POINT, # point
[(1,1)],
@@ -42,7 +49,7 @@ geo_interface_tests = [ (shapefile.POINT, # point
],
[0],
{'type':'Polygon','coordinates':[
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]),
+ [(1,1),(1,9),(9,9),(9,1),(1,1)],
]}
),
(shapefile.POLYGON, # single polygon, holes (ordered)
@@ -52,9 +59,9 @@ geo_interface_tests = [ (shapefile.POINT, # point
],
[0,5,5+5],
{'type':'Polygon','coordinates':[
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]), # exterior
- shapefile.rewind([(2,2),(4,2),(4,4),(2,4),(2,2)]), # hole 1
- shapefile.rewind([(5,5),(7,5),(7,7),(5,7),(5,5)]), # hole 2
+ [(1,1),(1,9),(9,9),(9,1),(1,1)], # exterior
+ [(2,2),(4,2),(4,4),(2,4),(2,2)], # hole 1
+ [(5,5),(7,5),(7,7),(5,7),(5,5)], # hole 2
]}
),
(shapefile.POLYGON, # single polygon, holes (unordered)
@@ -65,9 +72,9 @@ geo_interface_tests = [ (shapefile.POINT, # point
],
[0,5,5+5],
{'type':'Polygon','coordinates':[
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]), # exterior
- shapefile.rewind([(2,2),(4,2),(4,4),(2,4),(2,2)]), # hole 1
- shapefile.rewind([(5,5),(7,5),(7,7),(5,7),(5,5)]), # hole 2
+ [(1,1),(1,9),(9,9),(9,1),(1,1)], # exterior
+ [(2,2),(4,2),(4,4),(2,4),(2,2)], # hole 1
+ [(5,5),(7,5),(7,7),(5,7),(5,5)], # hole 2
]}
),
(shapefile.POLYGON, # multi polygon, no holes
@@ -77,10 +84,10 @@ geo_interface_tests = [ (shapefile.POINT, # point
[0,5],
{'type':'MultiPolygon','coordinates':[
[ # poly 1
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]),
+ [(1,1),(1,9),(9,9),(9,1),(1,1)],
],
[ # poly 2
- shapefile.rewind([(11,11),(11,19),(19,19),(19,11),(11,11)]),
+ [(11,11),(11,19),(19,19),(19,11),(11,11)],
],
]}
),
@@ -95,14 +102,14 @@ geo_interface_tests = [ (shapefile.POINT, # point
[0,5,10,15,20,25],
{'type':'MultiPolygon','coordinates':[
[ # poly 1
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]), # exterior
- shapefile.rewind([(2,2),(4,2),(4,4),(2,4),(2,2)]), # hole 1
- shapefile.rewind([(5,5),(7,5),(7,7),(5,7),(5,5)]), # hole 2
+ [(1,1),(1,9),(9,9),(9,1),(1,1)], # exterior
+ [(2,2),(4,2),(4,4),(2,4),(2,2)], # hole 1
+ [(5,5),(7,5),(7,7),(5,7),(5,5)], # hole 2
],
[ # poly 2
- shapefile.rewind([(11,11),(11,19),(19,19),(19,11),(11,11)]), # exterior
- shapefile.rewind([(12,12),(14,12),(14,14),(12,14),(12,12)]), # hole 1
- shapefile.rewind([(15,15),(17,15),(17,17),(15,17),(15,15)]), # hole 2
+ [(11,11),(11,19),(19,19),(19,11),(11,11)], # exterior
+ [(12,12),(14,12),(14,14),(12,14),(12,12)], # hole 1
+ [(15,15),(17,15),(17,17),(15,17),(15,15)], # hole 2
],
]}
),
@@ -116,15 +123,15 @@ geo_interface_tests = [ (shapefile.POINT, # point
[0,5,10,15,20],
{'type':'MultiPolygon','coordinates':[
[ # poly 1
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]), # exterior 1
- shapefile.rewind([(2,2),(8,2),(8,8),(2,8),(2,2)]), # hole 1.1
+ [(1,1),(1,9),(9,9),(9,1),(1,1)], # exterior 1
+ [(2,2),(8,2),(8,8),(2,8),(2,2)], # hole 1.1
],
[ # poly 2
- shapefile.rewind([(3,3),(3,7),(7,7),(7,3),(3,3)]), # exterior 2
- shapefile.rewind([(4,4),(6,4),(6,6),(4,6),(4,4)]), # hole 2.1
+ [(3,3),(3,7),(7,7),(7,3),(3,3)], # exterior 2
+ [(4,4),(6,4),(6,6),(4,6),(4,4)], # hole 2.1
],
[ # poly 3
- shapefile.rewind([(4.5,4.5),(4.5,5.5),(5.5,5.5),(5.5,4.5),(4.5,4.5)]), # exterior 3
+ [(4.5,4.5),(4.5,5.5),(5.5,5.5),(5.5,4.5),(4.5,4.5)], # exterior 3
],
]}
),
@@ -138,15 +145,15 @@ geo_interface_tests = [ (shapefile.POINT, # point
[0,5,10,15,20+3],
{'type':'MultiPolygon','coordinates':[
[ # poly 1
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]), # exterior 1
- shapefile.rewind([(2,2),(3,3),(4,2),(8,2),(8,8),(4,8),(2,8),(2,4),(2,2)]), # hole 1.1
+ [(1,1),(1,9),(9,9),(9,1),(1,1)], # exterior 1
+ [(2,2),(3,3),(4,2),(8,2),(8,8),(4,8),(2,8),(2,4),(2,2)], # hole 1.1
],
[ # poly 2
- shapefile.rewind([(3,3),(3,7),(7,7),(7,3),(3,3)]), # exterior 2
- shapefile.rewind([(4,4),(4,4),(6,4),(6,4),(6,4),(6,6),(4,6),(4,4)]), # hole 2.1
+ [(3,3),(3,7),(7,7),(7,3),(3,3)], # exterior 2
+ [(4,4),(4,4),(6,4),(6,4),(6,4),(6,6),(4,6),(4,4)], # hole 2.1
],
[ # poly 3
- shapefile.rewind([(4.5,4.5),(4.5,5.5),(5.5,5.5),(5.5,4.5),(4.5,4.5)]), # exterior 3
+ [(4.5,4.5),(4.5,5.5),(5.5,5.5),(5.5,4.5),(4.5,4.5)], # exterior 3
],
]}
),
@@ -162,17 +169,16 @@ geo_interface_tests = [ (shapefile.POINT, # point
[0,5,10,15,20,25,30],
{'type':'MultiPolygon','coordinates':[
[ # poly 1
- shapefile.rewind([(1,1),(1,9),(9,9),(9,1),(1,1)]), # exterior
- shapefile.rewind([(2,2),(4,2),(4,4),(2,4),(2,2)]), # hole 1
- shapefile.rewind([(5,5),(7,5),(7,7),(5,7),(5,5)]), # hole 2
+ [(1,1),(1,9),(9,9),(9,1),(1,1)], # exterior
+ [(2,2),(4,2),(4,4),(2,4),(2,2)], # hole 1
+ [(5,5),(7,5),(7,7),(5,7),(5,5)], # hole 2
],
[ # poly 2
- shapefile.rewind([(11,11),(11,19),(19,19),(19,11),(11,11)]), # exterior
- shapefile.rewind([(12,12),(14,12),(14,14),(12,14),(12,12)]), # hole 1
- shapefile.rewind([(15,15),(17,15),(17,17),(15,17),(15,15)]), # hole 2
+ [(11,11),(11,19),(19,19),(19,11),(11,11)], # exterior
+ [(12,12),(14,12),(14,14),(12,14),(12,12)], # hole 1
+ [(15,15),(17,15),(17,17),(15,17),(15,15)], # hole 2
],
- [ # poly 3 (orphaned hole)
- # Note: due to the hole-to-exterior conversion, should return the same ring orientation
+ [ # poly 3 (orphaned hole)
[(95,95),(97,95),(97,97),(95,97),(95,95)], # exterior
],
]}
@@ -184,11 +190,9 @@ geo_interface_tests = [ (shapefile.POINT, # point
[0,5],
{'type':'MultiPolygon','coordinates':[
[ # poly 1
- # Note: due to the hole-to-exterior conversion, should return the same ring orientation
[(1,1),(9,1),(9,9),(1,9),(1,1)],
],
[ # poly 2
- # Note: due to the hole-to-exterior conversion, should return the same ring orientation
[(11,11),(19,11),(19,19),(11,19),(11,11)],
],
]}
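The hunks above drop the `shapefile.rewind()` wrappers, so the expected coordinates are now literal rings. For readers checking the winding of those literals: the ESRI shapefile convention is clockwise exteriors and counter-clockwise holes, and the shoelace signed area distinguishes the two (a stdlib-only sketch, not part of pyshp; negative means clockwise with y increasing upward):

```python
# Signed area of a closed ring via the shoelace formula.
def signed_area(ring):
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))

# Rings taken from the test data above.
exterior = [(1, 1), (1, 9), (9, 9), (9, 1), (1, 1)]
hole = [(2, 2), (4, 2), (4, 4), (2, 4), (2, 2)]
assert signed_area(exterior) == -64.0  # clockwise exterior
assert signed_area(hole) == 4.0        # counter-clockwise hole
```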
@@ -260,6 +264,7 @@ def test_reader_context_manager():
assert sf.shx.closed is True
+@pytest.mark.network
def test_reader_url():
"""
Assert that Reader can open shapefiles from a url.
@@ -402,6 +407,65 @@ def test_reader_shapefile_extension_ignored():
assert not os.path.exists(filename)
+def test_reader_pathlike():
+ """
+ Assert that path-like objects can be read.
+ """
+ base = Path("shapefiles")
+ with shapefile.Reader(base / "blockgroups") as sf:
+ assert len(sf) == 663
+
+
+def test_reader_dbf_only():
+ """
+ Assert that specifying just the
+ dbf argument to the shapefile reader
+ reads just the dbf file.
+ """
+ with shapefile.Reader(dbf="shapefiles/blockgroups.dbf") as sf:
+ assert len(sf) == 663
+ record = sf.record(3)
+ assert record[1:3] == ['060750601001', 4715]
+
+
+def test_reader_shp_shx_only():
+ """
+ Assert that specifying just the
+ shp and shx argument to the shapefile reader
+ reads just the shp and shx file.
+ """
+ with shapefile.Reader(shp="shapefiles/blockgroups.shp", shx="shapefiles/blockgroups.shx") as sf:
+ assert len(sf) == 663
+ shape = sf.shape(3)
+ assert len(shape.points) == 173
+
+
+def test_reader_shp_dbf_only():
+ """
+ Assert that specifying just the
+ shp and dbf arguments to the shapefile reader
+ reads just the shp and dbf file.
+ """
+ with shapefile.Reader(shp="shapefiles/blockgroups.shp", dbf="shapefiles/blockgroups.dbf") as sf:
+ assert len(sf) == 663
+ shape = sf.shape(3)
+ assert len(shape.points) == 173
+ record = sf.record(3)
+ assert record[1:3] == ['060750601001', 4715]
+
+
+def test_reader_shp_only():
+ """
+ Assert that specifying just the
+ shp argument to the shapefile reader
+ reads just the shp file (shx optional).
+ """
+ with shapefile.Reader(shp="shapefiles/blockgroups.shp") as sf:
+ assert len(sf) == 663
+ shape = sf.shape(3)
+ assert len(shape.points) == 173
+
+
def test_reader_filelike_dbf_only():
"""
Assert that specifying just the
@@ -426,7 +490,21 @@ def test_reader_filelike_shp_shx_only():
assert len(shape.points) is 173
-def test_reader_filelike_shx_optional():
+def test_reader_filelike_shp_dbf_only():
+ """
+ Assert that specifying just the
+ shp and dbf arguments to the shapefile reader
+ reads just the shp and dbf file.
+ """
+ with shapefile.Reader(shp=open("shapefiles/blockgroups.shp", "rb"), dbf=open("shapefiles/blockgroups.dbf", "rb")) as sf:
+ assert len(sf) == 663
+ shape = sf.shape(3)
+ assert len(shape.points) == 173
+ record = sf.record(3)
+ assert record[1:3] == ['060750601001', 4715]
+
+
+def test_reader_filelike_shp_only():
"""
Assert that specifying just the
shp argument to the shapefile reader
@@ -602,6 +680,194 @@ def test_shape_oid():
assert shaperec.shape.oid == i
+def test_shape_oid_no_shx():
+ """
+ Assert that the shape's oid attribute returns
+ its index in the shapefile, when shx file is missing.
+ """
+ basename = "shapefiles/blockgroups"
+ shp = open(basename + ".shp", 'rb')
+ dbf = open(basename + ".dbf", 'rb')
+ with shapefile.Reader(shp=shp, dbf=dbf) as sf, \
+ shapefile.Reader(basename) as sf_expected:
+ for i in range(len(sf)):
+ shape = sf.shape(i)
+ assert shape.oid == i
+ shape_expected = sf_expected.shape(i)
+ assert shape.__geo_interface__ == shape_expected.__geo_interface__
+
+ for i,shape in enumerate(sf.shapes()):
+ assert shape.oid == i
+ shape_expected = sf_expected.shape(i)
+ assert shape.__geo_interface__ == shape_expected.__geo_interface__
+
+ for i,shape in enumerate(sf.iterShapes()):
+ assert shape.oid == i
+ shape_expected = sf_expected.shape(i)
+ assert shape.__geo_interface__ == shape_expected.__geo_interface__
+
+ for i,shaperec in enumerate(sf.iterShapeRecords()):
+ assert shaperec.shape.oid == i
+ shape_expected = sf_expected.shape(i)
+ assert shaperec.shape.__geo_interface__ == shape_expected.__geo_interface__
+
+
+def test_reader_offsets():
+ """
+ Assert that reader will not read the shx offsets unless necessary,
+ i.e. requesting a shape index.
+ """
+ basename = "shapefiles/blockgroups"
+ with shapefile.Reader(basename) as sf:
+ # shx offsets should not be read during loading
+ assert not sf._offsets
+ # reading a shape index should trigger reading offsets from shx file
+ shape = sf.shape(3)
+ assert len(sf._offsets) == len(sf.shapes())
+
+
+def test_reader_offsets_no_shx():
+ """
+ Assert that reading a shapefile without a shx file will not build
+ the offsets unless necessary, i.e. reading all the shapes.
+ """
+ basename = "shapefiles/blockgroups"
+ shp = open(basename + ".shp", 'rb')
+ dbf = open(basename + ".dbf", 'rb')
+ with shapefile.Reader(shp=shp, dbf=dbf) as sf:
+ # offsets should not be built during loading
+ assert not sf._offsets
+ # reading a shape index should iterate to the shape
+ # but the list of offsets should remain empty
+ shape = sf.shape(3)
+ assert not sf._offsets
+ # reading all the shapes should build the list of offsets
+ shapes = sf.shapes()
+ assert len(sf._offsets) == len(shapes)
+
+
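The two offset tests above describe lazy behaviour: a single indexed lookup should not materialise the offset cache, while a full scan should. A toy model of that contract (a hypothetical class, not pyshp's implementation):

```python
# Minimal sketch of lazy offset building: per-shape record sizes stand in
# for the shp file, and _offsets mirrors the reader's internal cache.
class LazyReader:
    def __init__(self, sizes):
        self._sizes = sizes
        self._offsets = []

    def shape(self, i):
        # A single lookup walks to the shape without caching anything.
        return sum(self._sizes[:i])

    def shapes(self):
        # A full read builds the complete offset list as a side effect.
        pos, self._offsets = 0, []
        for size in self._sizes:
            self._offsets.append(pos)
            pos += size
        return list(self._offsets)

r = LazyReader([10, 20, 30])
r.shape(1)
assert not r._offsets        # single lookup leaves the cache empty
r.shapes()
assert len(r._offsets) == 3  # full scan builds it
```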
+
+def test_reader_numshapes():
+ """
+ Assert that reader reads the numShapes attribute from the
+ shx file header during loading.
+ """
+ basename = "shapefiles/blockgroups"
+ with shapefile.Reader(basename) as sf:
+ # numShapes should be set during loading
+ assert sf.numShapes is not None
+ # numShapes should equal the number of shapes
+ assert sf.numShapes == len(sf.shapes())
+
+
+def test_reader_numshapes_no_shx():
+ """
+ Assert that reading a shapefile without a shx file will have
+ an unknown value for the numShapes attribute (None), and that
+ reading all the shapes will set the numShapes attribute.
+ """
+ basename = "shapefiles/blockgroups"
+ shp = open(basename + ".shp", 'rb')
+ dbf = open(basename + ".dbf", 'rb')
+ with shapefile.Reader(shp=shp, dbf=dbf) as sf:
+ # numShapes should be unknown due to missing shx file
+ assert sf.numShapes is None
+ # numShapes should be set after reading all the shapes
+ shapes = sf.shapes()
+ assert sf.numShapes == len(shapes)
+
+
+def test_reader_len():
+ """
+ Assert that calling len() on reader is equal to length of
+ all shapes and records.
+ """
+ basename = "shapefiles/blockgroups"
+ with shapefile.Reader(basename) as sf:
+ assert len(sf) == len(sf.records()) == len(sf.shapes())
+
+
+def test_reader_len_not_loaded():
+ """
+ Assert that calling len() on reader that hasn't loaded a shapefile
+ yet is equal to 0.
+ """
+ with shapefile.Reader() as sf:
+ assert len(sf) == 0
+
+
+def test_reader_len_dbf_only():
+ """
+ Assert that calling len() on reader when reading a dbf file only,
+ is equal to length of all records.
+ """
+ basename = "shapefiles/blockgroups"
+ dbf = open(basename + ".dbf", 'rb')
+ with shapefile.Reader(dbf=dbf) as sf:
+ assert len(sf) == len(sf.records())
+
+
+def test_reader_len_no_dbf():
+ """
+ Assert that calling len() on reader when dbf file is missing,
+ is equal to length of all shapes.
+ """
+ basename = "shapefiles/blockgroups"
+ shp = open(basename + ".shp", 'rb')
+ shx = open(basename + ".shx", 'rb')
+ with shapefile.Reader(shp=shp, shx=shx) as sf:
+ assert len(sf) == len(sf.shapes())
+
+
+def test_reader_len_no_dbf_shx():
+ """
+ Assert that calling len() on reader when the dbf and shx files are missing,
+ is equal to length of all shapes.
+ """
+ basename = "shapefiles/blockgroups"
+ shp = open(basename + ".shp", 'rb')
+ with shapefile.Reader(shp=shp) as sf:
+ assert len(sf) == len(sf.shapes())
+
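The len() tests above pin down a fallback order: the dbf record count wins when present, the shape count is used otherwise, and an unloaded reader reports 0. A minimal sketch of those semantics (a hypothetical class, not pyshp's actual implementation):

```python
# Toy reader whose __len__ mirrors the behaviour the tests describe.
class MiniReader:
    def __init__(self, num_records=None, num_shapes=None):
        self.numRecords = num_records  # from the dbf header, None if no dbf
        self.numShapes = num_shapes    # from the shx header, None if no shx

    def __len__(self):
        if self.numRecords is not None:
            return self.numRecords
        if self.numShapes is not None:
            return self.numShapes
        return 0  # nothing loaded yet

assert len(MiniReader()) == 0
assert len(MiniReader(num_records=663)) == 663
assert len(MiniReader(num_shapes=663)) == 663
```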
+
+def test_reader_corrupt_files():
+ """
+ Assert that reader is able to handle corrupt files by
+ strictly going off the header information.
+ """
+ basename = "shapefiles/test/corrupt_too_long"
+
+ # write a shapefile with junk byte data at end of files
+ with shapefile.Writer(basename) as w:
+ w.field("test", "C", 50)
+ # add 10 line geoms
+ for _ in range(10):
+ w.record("value")
+ w.line([[(1,1),(1,2),(2,2)]])
+ # add junk byte data to end of dbf and shp files
+ w.dbf.write(b'12345')
+ w.shp.write(b'12345')
+
+ # read the corrupt shapefile and assert that it reads correctly
+ with shapefile.Reader(basename) as sf:
+ # assert correct shapefile length metadata
+ assert len(sf) == sf.numRecords == sf.numShapes == 10
+ # assert that records are read without error
+ assert len(sf.records()) == 10
+ # assert that didn't read the extra junk data
+ stopped = sf.dbf.tell()
+ sf.dbf.seek(0, 2)
+ end = sf.dbf.tell()
+ assert (end - stopped) == 5
+ # assert that shapes are read without error
+ assert len(sf.shapes()) == 10
+ # assert that didn't read the extra junk data
+ stopped = sf.shp.tell()
+ sf.shp.seek(0, 2)
+ end = sf.shp.tell()
+ assert (end - stopped) == 5
+
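The corrupt-file test above relies on the reader "strictly going off the header information": trailing junk bytes are never read because the record count comes from the header, not from scanning to EOF. The idea can be sketched with a hypothetical mini-format (stdlib only, not pyshp's file layout): a 4-byte big-endian record count followed by fixed 8-byte records.

```python
import io
import struct

def read_records(buf):
    # Trust the header count; anything after the declared records is ignored.
    (count,) = struct.unpack(">I", buf.read(4))
    return [buf.read(8) for _ in range(count)]

# Two records plus 5 junk bytes appended, as in the test above.
data = struct.pack(">I", 2) + b"A" * 8 + b"B" * 8 + b"12345"
buf = io.BytesIO(data)
records = read_records(buf)
assert len(records) == 2
assert buf.read() == b"12345"  # the junk tail was left unread
```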
+
def test_bboxfilter_shape():
"""
Assert that applying the bbox filter to shape() correctly ignores the shape
@@ -817,38 +1083,93 @@ def test_write_shp_only(tmpdir):
shp argument to the shapefile writer
creates just a shp file.
"""
- filename = tmpdir.join("test.shp").strpath
- with shapefile.Writer(shp=filename) as writer:
- pass
+ filename = tmpdir.join("test").strpath
+ with shapefile.Writer(shp=open(filename+'.shp','wb')) as writer:
+ writer.point(1, 1)
+ assert writer.shp and not writer.shx and not writer.dbf
+ assert writer.shpNum == 1
+ assert len(writer) == 1
+ assert writer.shp.closed == True
# assert test.shp exists
- assert os.path.exists(filename)
+ assert os.path.exists(filename+'.shp')
+
+ # test that can read shapes
+ with shapefile.Reader(shp=open(filename+'.shp','rb')) as reader:
+ assert reader.shp and not reader.shx and not reader.dbf
+ assert (reader.numRecords, reader.numShapes) == (None, None) # numShapes is unknown in the absence of shx file
+ assert len(reader.shapes()) == 1
# assert test.shx does not exist
- assert not os.path.exists(tmpdir.join("test.shx").strpath)
+ assert not os.path.exists(filename+'.shx')
# assert test.dbf does not exist
- assert not os.path.exists(tmpdir.join("test.dbf").strpath)
+ assert not os.path.exists(filename+'.dbf')
-def test_write_shx_only(tmpdir):
+def test_write_shp_shx_only(tmpdir):
"""
- Assert that specifying just the
+ Assert that specifying just the shp and
shx argument to the shapefile writer
- creates just a shx file.
+ creates just a shp and shx file.
"""
- filename = tmpdir.join("test.shx").strpath
- with shapefile.Writer(shx=filename) as writer:
- pass
+ filename = tmpdir.join("test").strpath
+ with shapefile.Writer(shp=open(filename+'.shp','wb'), shx=open(filename+'.shx','wb')) as writer:
+ writer.point(1, 1)
+ assert writer.shp and writer.shx and not writer.dbf
+ assert writer.shpNum == 1
+ assert len(writer) == 1
+ assert writer.shp.closed == writer.shx.closed == True
+
+ # assert test.shp exists
+ assert os.path.exists(filename+'.shp')
# assert test.shx exists
- assert os.path.exists(filename)
+ assert os.path.exists(filename+'.shx')
- # assert test.shp does not exist
- assert not os.path.exists(tmpdir.join("test.shp").strpath)
+ # test that can read shapes and offsets
+ with shapefile.Reader(shp=open(filename+'.shp','rb'), shx=open(filename+'.shx','rb')) as reader:
+ assert reader.shp and reader.shx and not reader.dbf
+ assert (reader.numRecords, reader.numShapes) == (None, 1)
+ reader.shape(0) # trigger reading of shx offsets
+ assert len(reader._offsets) == 1
+ assert len(reader.shapes()) == 1
# assert test.dbf does not exist
- assert not os.path.exists(tmpdir.join("test.dbf").strpath)
+ assert not os.path.exists(filename+'.dbf')
+
+
+def test_write_shp_dbf_only(tmpdir):
+ """
+ Assert that specifying just the
+ shp and dbf argument to the shapefile writer
+ creates just a shp and dbf file.
+ """
+ filename = tmpdir.join("test").strpath
+ with shapefile.Writer(shp=open(filename+'.shp','wb'), dbf=open(filename+'.dbf','wb')) as writer:
+ writer.field('field1', 'C') # required to create a valid dbf file
+ writer.record('value')
+ writer.point(1, 1)
+ assert writer.shp and not writer.shx and writer.dbf
+ assert writer.shpNum == writer.recNum == 1
+ assert len(writer) == 1
+ assert writer.shp.closed == writer.dbf.closed == True
+
+ # assert test.shp exists
+ assert os.path.exists(filename+'.shp')
+
+ # assert test.dbf exists
+ assert os.path.exists(filename+'.dbf')
+
+ # test that can read records and shapes
+ with shapefile.Reader(shp=open(filename+'.shp','rb'), dbf=open(filename+'.dbf','rb')) as reader:
+ assert reader.shp and not reader.shx and reader.dbf
+ assert (reader.numRecords, reader.numShapes) == (1, None) # numShapes is unknown in the absence of shx file
+ assert len(reader.records()) == 1
+ assert len(reader.shapes()) == 1
+
+ # assert test.shx does not exist
+ assert not os.path.exists(filename+'.shx')
def test_write_dbf_only(tmpdir):
@@ -857,18 +1178,29 @@ def test_write_dbf_only(tmpdir):
dbf argument to the shapefile writer
creates just a dbf file.
"""
- filename = tmpdir.join("test.dbf").strpath
- with shapefile.Writer(dbf=filename) as writer:
+ filename = tmpdir.join("test").strpath
+ with shapefile.Writer(dbf=open(filename+'.dbf','wb')) as writer:
writer.field('field1', 'C') # required to create a valid dbf file
+ writer.record('value')
+ assert not writer.shp and not writer.shx and writer.dbf
+ assert writer.recNum == 1
+ assert len(writer) == 1
+ assert writer.dbf.closed == True
# assert test.dbf exists
- assert os.path.exists(filename)
+ assert os.path.exists(filename+'.dbf')
+
+ # test that can read records
+ with shapefile.Reader(dbf=open(filename+'.dbf','rb')) as reader:
+ assert not reader.shp and not reader.shx and reader.dbf
+ assert (reader.numRecords, reader.numShapes) == (1, None)
+ assert len(reader.records()) == 1
# assert test.shp does not exist
- assert not os.path.exists(tmpdir.join("test.shp").strpath)
+ assert not os.path.exists(filename+'.shp')
# assert test.shx does not exist
- assert not os.path.exists(tmpdir.join("test.shx").strpath)
+ assert not os.path.exists(filename+'.shx')
def test_write_default_shp_shx_dbf(tmpdir):
@@ -887,6 +1219,20 @@ def test_write_default_shp_shx_dbf(tmpdir):
assert os.path.exists(filename + ".dbf")
+def test_write_pathlike(tmpdir):
+ """
+ Assert that path-like objects can be written.
+ Similar to test_write_default_shp_shx_dbf.
+ """
+ filename = tmpdir.join("test")
+ assert not isinstance(filename, str)
+ with shapefile.Writer(filename) as writer:
+ writer.field('field1', 'C')
+ assert (filename + ".shp").ensure()
+ assert (filename + ".shx").ensure()
+ assert (filename + ".dbf").ensure()
+
+
def test_write_shapefile_extension_ignored(tmpdir):
"""
Assert that the filename's extension is
@@ -917,10 +1263,10 @@ def test_write_record(tmpdir):
with shapefile.Writer(filename) as writer:
writer.autoBalance = True
- writer.field('one', 'C') # many under length limit
- writer.field('two', 'C') # 1 under length limit
- writer.field('three', 'C') # at length limit
- writer.field('four', 'C') # 1 over length limit
+ writer.field('one', 'C')
+ writer.field('two', 'C')
+ writer.field('three', 'C')
+ writer.field('four', 'C')
values = ['one','two','three','four']
writer.record(*values)
@@ -944,10 +1290,10 @@ def test_write_partial_record(tmpdir):
with shapefile.Writer(filename) as writer:
writer.autoBalance = True
- writer.field('one', 'C') # many under length limit
- writer.field('two', 'C') # 1 under length limit
- writer.field('three', 'C') # at length limit
- writer.field('four', 'C') # 1 over length limit
+ writer.field('one', 'C')
+ writer.field('two', 'C')
+ writer.field('three', 'C')
+ writer.field('four', 'C')
values = ['one','two']
writer.record(*values)
@@ -1004,4 +1350,11 @@ def test_write_empty_shapefile(tmpdir, shape_type):
w.field('field1', 'C') # required to create a valid dbf file
with shapefile.Reader(filename) as r:
+ # test correct shape type
assert r.shapeType == shape_type
+ # test length 0
+ assert len(r) == r.numRecords == r.numShapes == 0
+ # test records are empty
+ assert len(r.records()) == 0
+ # test shapes are empty
+ assert len(r.shapes()) == 0
View it on GitLab: https://salsa.debian.org/debian-gis-team/pyshp/-/commit/2752b0da284505c6cd91f48857d3144cdf6320a8