[med-svn] [Git][med-team/hdmf][upstream] New upstream version 3.14.4
Étienne Mollier (@emollier)
gitlab at salsa.debian.org
Tue Sep 10 20:35:04 BST 2024
Étienne Mollier pushed to branch upstream at Debian Med / hdmf
Commits:
640c0cb3 by Étienne Mollier at 2024-09-10T21:28:54+02:00
New upstream version 3.14.4
- - - - -
23 changed files:
- CHANGELOG.md
- PKG-INFO
- docs/source/install_developers.rst
- docs/source/install_users.rst
- src/hdmf/_version.py
- src/hdmf/backends/hdf5/h5_utils.py
- src/hdmf/backends/hdf5/h5tools.py
- src/hdmf/build/manager.py
- src/hdmf/build/objectmapper.py
- src/hdmf/common/resources.py
- src/hdmf/common/table.py
- src/hdmf/container.py
- src/hdmf/data_utils.py
- src/hdmf/query.py
- src/hdmf/spec/namespace.py
- src/hdmf/spec/spec.py
- src/hdmf/validate/validator.py
- tests/unit/build_tests/test_classgenerator.py
- tests/unit/build_tests/test_io_manager.py
- tests/unit/build_tests/test_io_map.py
- tests/unit/test_io_hdf5_h5tools.py
- tests/unit/test_multicontainerinterface.py
- tests/unit/validator_tests/test_validate.py
Changes:
=====================================
CHANGELOG.md
=====================================
@@ -1,11 +1,28 @@
# HDMF Changelog
+## HDMF 3.14.4 (September 4, 2024)
+
+### Enhancements
+- Added support to append to a dataset of references for HDMF-Zarr. @mavaylon1 [#1157](https://github.com/hdmf-dev/hdmf/pull/1157)
+- Adjusted stacklevel of warnings to point to user code when possible. @rly [#1166](https://github.com/hdmf-dev/hdmf/pull/1166)
+- Improved "already exists" error message when adding a container to a `MultiContainerInterface`. @rly [#1165](https://github.com/hdmf-dev/hdmf/pull/1165)
+- Added support to write multidimensional string arrays. @stephprince [#1173](https://github.com/hdmf-dev/hdmf/pull/1173)
+- Add support for appending to a dataset of references. @mavaylon1 [#1135](https://github.com/hdmf-dev/hdmf/pull/1135)
+
+### Bug fixes
+- Fixed issue where scalar datasets with a compound data type were being written as non-scalar datasets @stephprince [#1176](https://github.com/hdmf-dev/hdmf/pull/1176)
+- Fixed H5DataIO not exposing `maxshape` on non-dci dsets. @cboulay [#1149](https://github.com/hdmf-dev/hdmf/pull/1149)
+- Fixed generation of classes in an extension that contain attributes or datasets storing references to other types defined in the extension.
+ @rly [#1183](https://github.com/hdmf-dev/hdmf/pull/1183)
+
## HDMF 3.14.3 (July 29, 2024)
### Enhancements
- Added new attribute "dimension_labels" on `DatasetBuilder` which specifies the names of the dimensions used in the
dataset based on the shape of the dataset data and the dimension names in the spec for the data type. This attribute
is available on build (during the write process), but not on read of a dataset from a file. @rly [#1081](https://github.com/hdmf-dev/hdmf/pull/1081)
+- Speed up loading namespaces by skipping register_type when already registered. @magland [#1102](https://github.com/hdmf-dev/hdmf/pull/1102)
+- Speed up namespace loading: return a shallow copy rather than a deep copy in build_const_args. @magland [#1103](https://github.com/hdmf-dev/hdmf/pull/1103)
## HDMF 3.14.2 (July 7, 2024)
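
The headline change in 3.14.4 is appending to an already-written dataset of object references (PRs #1135 and #1157). The new test_append_dataset_of_references further down in this diff exercises the full workflow; a condensed sketch of the same steps follows. It assumes the Baz/BazData/BazBucket containers and get_baz_buildmanager() helper that HDMF's own test suite defines (in tests/unit/helpers/utils.py of the upstream repository; they are not part of the public API) and a throwaway file path.

import numpy as np
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
# test-only helpers, importable when running from the HDMF repository root
from tests.unit.helpers.utils import Baz, BazData, BazBucket, get_baz_buildmanager

baz = Baz(name='baz0')
# maxshape=(None,) makes the underlying HDF5 dataset resizable so it can grow later
baz_data = BazData(name='baz_data1', data=H5DataIO(np.array([baz]), maxshape=(None,)))
bucket = BazBucket(name='bucket1', bazs=[baz], baz_data=baz_data)

with HDF5IO('refs.h5', manager=get_baz_buildmanager(), mode='w') as io:
    io.write(bucket)

# The target of a new reference must already exist in the file; appending a
# container that has not been written raises the ValueError added in h5_utils.py.
with HDF5IO('refs.h5', manager=get_baz_buildmanager(), mode='a') as io:
    bucket = io.read()
    bucket.add_baz(Baz(name='new'))
    io.write(bucket)

with HDF5IO('refs.h5', manager=get_baz_buildmanager(), mode='a') as io:
    bucket = io.read()
    bucket.baz_data.data.append(bucket.bazs['new'])   # new H5Dataset.append
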
=====================================
PKG-INFO
=====================================
@@ -1,6 +1,6 @@
Metadata-Version: 2.3
Name: hdmf
-Version: 3.14.3
+Version: 3.14.4
Summary: A hierarchical data modeling framework for modern science data standards
Project-URL: Homepage, https://github.com/hdmf-dev/hdmf
Project-URL: Bug Tracker, https://github.com/hdmf-dev/hdmf/issues
=====================================
docs/source/install_developers.rst
=====================================
@@ -73,7 +73,7 @@ environment by using the ``conda remove --name hdmf-venv --all`` command.
For advanced users, we recommend using Mambaforge_, a faster version of the conda package manager
that includes conda-forge as a default channel.
-.. _Anaconda: https://www.anaconda.com/products/distribution
+.. _Anaconda: https://www.anaconda.com/download
.. _Mambaforge: https://github.com/conda-forge/miniforge
Install from GitHub
=====================================
docs/source/install_users.rst
=====================================
@@ -29,4 +29,4 @@ You can also install HDMF using ``conda`` by running the following command in a
conda install -c conda-forge hdmf
-.. _Anaconda Distribution: https://www.anaconda.com/products/distribution
+.. _Anaconda Distribution: https://www.anaconda.com/download
=====================================
src/hdmf/_version.py
=====================================
@@ -12,5 +12,5 @@ __version__: str
__version_tuple__: VERSION_TUPLE
version_tuple: VERSION_TUPLE
-__version__ = version = '3.14.3'
-__version_tuple__ = version_tuple = (3, 14, 3)
+__version__ = version = '3.14.4'
+__version_tuple__ = version_tuple = (3, 14, 4)
=====================================
src/hdmf/backends/hdf5/h5_utils.py
=====================================
@@ -17,11 +17,11 @@ import os
import logging
from ...array import Array
-from ...data_utils import DataIO, AbstractDataChunkIterator
+from ...data_utils import DataIO, AbstractDataChunkIterator, append_data
from ...query import HDMFDataset, ReferenceResolver, ContainerResolver, BuilderResolver
from ...region import RegionSlicer
from ...spec import SpecWriter, SpecReader
-from ...utils import docval, getargs, popargs, get_docval
+from ...utils import docval, getargs, popargs, get_docval, get_data_shape
class HDF5IODataChunkIteratorQueue(deque):
@@ -108,6 +108,20 @@ class H5Dataset(HDMFDataset):
def shape(self):
return self.dataset.shape
+ def append(self, arg):
+ # Get Builder
+ builder = self.io.manager.get_builder(arg)
+ if builder is None:
+ raise ValueError(
+ "The container being appended to the dataset has not yet been built. "
+ "Please write the container to the file, then open the modified file, and "
+ "append the read container to the dataset."
+ )
+
+ # Get HDF5 Reference
+ ref = self.io._create_ref(builder)
+ append_data(self.dataset, ref)
+
class DatasetOfReferences(H5Dataset, ReferenceResolver, metaclass=ABCMeta):
"""
@@ -501,7 +515,7 @@ class H5DataIO(DataIO):
# Check for possible collision with other parameters
if not isinstance(getargs('data', kwargs), Dataset) and self.__link_data:
self.__link_data = False
- warnings.warn('link_data parameter in H5DataIO will be ignored', stacklevel=2)
+ warnings.warn('link_data parameter in H5DataIO will be ignored', stacklevel=3)
# Call the super constructor and consume the data parameter
super().__init__(**kwargs)
# Construct the dict with the io args, ignoring all options that were set to None
@@ -525,7 +539,7 @@ class H5DataIO(DataIO):
self.__iosettings.pop('compression', None)
if 'compression_opts' in self.__iosettings:
warnings.warn('Compression disabled by compression=False setting. ' +
- 'compression_opts parameter will, therefore, be ignored.', stacklevel=2)
+ 'compression_opts parameter will, therefore, be ignored.', stacklevel=3)
self.__iosettings.pop('compression_opts', None)
# Validate the compression options used
self._check_compression_options()
@@ -540,7 +554,7 @@ class H5DataIO(DataIO):
if isinstance(self.data, Dataset):
for k in self.__iosettings.keys():
warnings.warn("%s in H5DataIO will be ignored with H5DataIO.data being an HDF5 dataset" % k,
- stacklevel=2)
+ stacklevel=3)
self.__dataset = None
@@ -618,7 +632,7 @@ class H5DataIO(DataIO):
if self.__iosettings['compression'] not in ['gzip', h5py_filters.h5z.FILTER_DEFLATE]:
warnings.warn(str(self.__iosettings['compression']) + " compression may not be available "
"on all installations of HDF5. Use of gzip is recommended to ensure portability of "
- "the generated HDF5 files.", stacklevel=3)
+ "the generated HDF5 files.", stacklevel=4)
@staticmethod
def filter_available(filter, allow_plugin_filters):
@@ -658,3 +672,14 @@ class H5DataIO(DataIO):
if isinstance(self.data, Dataset) and not self.data.id.valid:
return False
return super().valid
+
+ @property
+ def maxshape(self):
+ if 'maxshape' in self.io_settings:
+ return self.io_settings['maxshape']
+ elif hasattr(self.data, 'maxshape'):
+ return self.data.maxshape
+ elif hasattr(self, "shape"):
+ return self.shape
+ else:
+ return get_data_shape(self.data)
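
The two additions above underpin the appendable-references feature: H5Dataset.append builds an HDF5 object reference for an already-written container and pushes it onto the underlying dataset, and the new maxshape property lets H5DataIO report a maximum shape even when it does not wrap a DataChunkIterator. The resolution order is io_settings['maxshape'], then the wrapped data's own maxshape, then the shape, falling back to get_data_shape(). A small sketch mirroring the new unit tests added later in this diff:

import numpy as np
from hdmf.backends.hdf5 import H5DataIO

# maxshape given explicitly in the I/O settings takes precedence
dio = H5DataIO(data=np.arange(10), maxshape=(None,))
assert dio.maxshape == (None,)

# otherwise it falls back to the shape of the wrapped data
dio = H5DataIO(data=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
assert dio.maxshape == (10,)
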
=====================================
src/hdmf/backends/hdf5/h5tools.py
=====================================
@@ -344,7 +344,7 @@ class HDF5IO(HDMFIO):
warnings.warn("The copy_file class method is no longer supported and may be removed in a future version of "
"HDMF. Please use the export method or h5py.File.copy method instead.",
category=DeprecationWarning,
- stacklevel=2)
+ stacklevel=3)
source_filename, dest_filename, expand_external, expand_refs, expand_soft = getargs('source_filename',
'dest_filename',
@@ -698,6 +698,8 @@ class HDF5IO(HDMFIO):
d = ReferenceBuilder(target_builder)
kwargs['data'] = d
kwargs['dtype'] = d.dtype
+ elif h5obj.dtype.kind == 'V': # scalar compound data type
+ kwargs['data'] = np.array(scalar, dtype=h5obj.dtype)
else:
kwargs["data"] = scalar
else:
@@ -1227,6 +1229,8 @@ class HDF5IO(HDMFIO):
return
# If the compound data type contains only regular data (i.e., no references) then we can write it as usual
+ elif len(np.shape(data)) == 0:
+ dset = self.__scalar_fill__(parent, name, data, options)
else:
dset = self.__list_fill__(parent, name, data, options)
# Write a dataset containing references, i.e., a region or object reference.
@@ -1469,7 +1473,7 @@ class HDF5IO(HDMFIO):
data_shape = io_settings.pop('shape')
elif hasattr(data, 'shape'):
data_shape = data.shape
- elif isinstance(dtype, np.dtype):
+ elif isinstance(dtype, np.dtype) and len(dtype) > 1: # check if compound dtype
data_shape = (len(data),)
else:
data_shape = get_data_shape(data)
@@ -1514,6 +1518,7 @@ class HDF5IO(HDMFIO):
self.logger.debug("Getting reference for %s '%s'" % (container.__class__.__name__, container.name))
builder = self.manager.build(container)
path = self.__get_path(builder)
+
self.logger.debug("Getting reference at path '%s'" % path)
if isinstance(container, RegionBuilder):
region = container.region
@@ -1525,6 +1530,14 @@ class HDF5IO(HDMFIO):
else:
return self.__file[path].ref
+ @docval({'name': 'container', 'type': (Builder, Container, ReferenceBuilder), 'doc': 'the object to reference',
+ 'default': None},
+ {'name': 'region', 'type': (slice, list, tuple), 'doc': 'the region reference indexing object',
+ 'default': None},
+ returns='the reference', rtype=Reference)
+ def _create_ref(self, **kwargs):
+ return self.__get_ref(**kwargs)
+
def __is_ref(self, dtype):
if isinstance(dtype, DtypeSpec):
return self.__is_ref(dtype.dtype)
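
Two behavior changes in h5tools.py are worth calling out: values read from a compound-typed scalar HDF5 dataset are now kept as zero-dimensional structured arrays, and write_dataset routes zero-dimensional compound data through __scalar_fill__ rather than __list_fill__, so it lands on disk as a true scalar dataset (PR #1176). The public _create_ref simply exposes the existing __get_ref for use by H5Dataset.append. A short illustration of the on-disk difference, using h5py directly rather than the HDMF call path, just to show the shapes involved:

import numpy as np
import h5py

cmpd_dtype = np.dtype([('x', np.int32), ('y', np.float64)])
value = np.array((1, 0.1), dtype=cmpd_dtype)     # zero-dimensional compound value

with h5py.File('scalar_compound.h5', 'w') as f:
    f.create_dataset('as_scalar', data=value)            # what HDMF now writes
    f.create_dataset('as_1d', data=value.reshape((1,)))  # roughly what it wrote before the fix
    assert f['as_scalar'].shape == ()
    assert f['as_1d'].shape == (1,)
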
=====================================
src/hdmf/build/manager.py
=====================================
@@ -7,7 +7,7 @@ from .builders import DatasetBuilder, GroupBuilder, LinkBuilder, Builder, BaseBu
from .classgenerator import ClassGenerator, CustomClassGenerator, MCIClassGenerator
from ..container import AbstractContainer, Container, Data
from ..term_set import TypeConfigurator
-from ..spec import DatasetSpec, GroupSpec, NamespaceCatalog
+from ..spec import DatasetSpec, GroupSpec, NamespaceCatalog, RefSpec
from ..spec.spec import BaseStorageSpec
from ..utils import docval, getargs, ExtenderMeta, get_docval
@@ -480,6 +480,7 @@ class TypeMap:
load_namespaces here has the advantage of being able to keep track of type dependencies across namespaces.
'''
deps = self.__ns_catalog.load_namespaces(**kwargs)
+ # register container types for each dependent type in each dependent namespace
for new_ns, ns_deps in deps.items():
for src_ns, types in ns_deps.items():
for dt in types:
@@ -529,7 +530,7 @@ class TypeMap:
namespace = ns_key
break
if namespace is None:
- raise ValueError("Namespace could not be resolved.")
+ raise ValueError(f"Namespace could not be resolved for data type '{data_type}'.")
cls = self.__get_container_cls(namespace, data_type)
@@ -549,6 +550,8 @@ class TypeMap:
def __check_dependent_types(self, spec, namespace):
"""Ensure that classes for all types used by this type exist in this namespace and generate them if not.
+
+ `spec` should be a GroupSpec or DatasetSpec in the `namespace`
"""
def __check_dependent_types_helper(spec, namespace):
if isinstance(spec, (GroupSpec, DatasetSpec)):
@@ -564,6 +567,16 @@ class TypeMap:
if spec.data_type_inc is not None:
self.get_dt_container_cls(spec.data_type_inc, namespace)
+
+ # handle attributes that have a reference dtype
+ for attr_spec in spec.attributes:
+ if isinstance(attr_spec.dtype, RefSpec):
+ self.get_dt_container_cls(attr_spec.dtype.target_type, namespace)
+ # handle datasets that have a reference dtype
+ if isinstance(spec, DatasetSpec):
+ if isinstance(spec.dtype, RefSpec):
+ self.get_dt_container_cls(spec.dtype.target_type, namespace)
+ # recurse into nested types
if isinstance(spec, GroupSpec):
for child_spec in (spec.groups + spec.datasets + spec.links):
__check_dependent_types_helper(child_spec, namespace)
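
The RefSpec handling above fixes class generation for extensions whose attributes or datasets store references to other types defined in the same extension (PR #1183): resolving a class now also resolves the target_type of any reference dtype it uses. The new TestGetClassObjectReferences cases later in this diff cover this; the kind of spec that previously failed to resolve looks roughly like the following sketch (type names are illustrative, taken from those tests):

from hdmf.spec import DatasetSpec, RefSpec

qux_spec = DatasetSpec(doc='A test extension', data_type_def='Qux')
# A 1D dataset of object references to Qux. Generating the 'Moo' class via
# TypeMap.get_dt_container_cls now also generates the 'Qux' class it depends on.
moo_spec = DatasetSpec(
    doc='A 1D dataset of object references to Qux',
    data_type_def='Moo',
    shape=(None,),
    dtype=RefSpec(reftype='object', target_type='Qux'),
)
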
=====================================
src/hdmf/build/objectmapper.py
=====================================
@@ -10,8 +10,11 @@ from .builders import DatasetBuilder, GroupBuilder, LinkBuilder, Builder, Refere
from .errors import (BuildError, OrphanContainerBuildError, ReferenceTargetNotBuiltError, ContainerConfigurationError,
ConstructError)
from .manager import Proxy, BuildManager
+
from .warnings import (MissingRequiredBuildWarning, DtypeConversionWarning, IncorrectQuantityBuildWarning,
IncorrectDatasetShapeBuildWarning)
+from hdmf.backends.hdf5.h5_utils import H5DataIO
+
from ..container import AbstractContainer, Data, DataRegion
from ..term_set import TermSetWrapper
from ..data_utils import DataIO, AbstractDataChunkIterator
@@ -598,11 +601,17 @@ class ObjectMapper(metaclass=ExtenderMeta):
def __convert_string(self, value, spec):
"""Convert string types to the specified dtype."""
+ def __apply_string_type(value, string_type):
+ if isinstance(value, (list, tuple, np.ndarray, DataIO)):
+ return [__apply_string_type(item, string_type) for item in value]
+ else:
+ return string_type(value)
+
ret = value
if isinstance(spec, AttributeSpec):
if 'text' in spec.dtype:
if spec.shape is not None or spec.dims is not None:
- ret = list(map(str, value))
+ ret = __apply_string_type(value, str)
else:
ret = str(value)
elif isinstance(spec, DatasetSpec):
@@ -618,7 +627,7 @@ class ObjectMapper(metaclass=ExtenderMeta):
return x.isoformat() # method works for both date and datetime
if string_type is not None:
if spec.shape is not None or spec.dims is not None:
- ret = list(map(string_type, value))
+ ret = __apply_string_type(value, string_type)
else:
ret = string_type(value)
# copy over any I/O parameters if they were specified
@@ -972,6 +981,9 @@ class ObjectMapper(metaclass=ExtenderMeta):
for d in container.data:
target_builder = self.__get_target_builder(d, build_manager, builder)
bldr_data.append(ReferenceBuilder(target_builder))
+ if isinstance(container.data, H5DataIO):
+ # This is here to support appending a dataset of references.
+ bldr_data = H5DataIO(bldr_data, **container.data.get_io_params())
else:
self.logger.debug("Setting %s '%s' data to reference builder"
% (builder.__class__.__name__, builder.name))
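
The nested __apply_string_type helper is what enables writing multidimensional string arrays (PR #1173): instead of a flat list(map(str, value)), string conversion now recurses into nested sequences. The H5DataIO re-wrap above it preserves the I/O settings (e.g. maxshape) on built datasets of references so they remain appendable. A standalone sketch of the recursion, with the helper renamed since the original is a local function inside __convert_string and also accepts DataIO-wrapped data:

import numpy as np

def apply_string_type(value, string_type):
    # Recurse into lists/tuples/ndarrays so 2D and 3D string data is
    # converted element by element rather than failing on nested sequences.
    if isinstance(value, (list, tuple, np.ndarray)):
        return [apply_string_type(item, string_type) for item in value]
    return string_type(value)

assert apply_string_type(np.array([['aa', 'bb'], ['cc', 'dd']]), str) == [['aa', 'bb'], ['cc', 'dd']]
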
=====================================
src/hdmf/common/resources.py
=====================================
@@ -628,7 +628,7 @@ class HERD(Container):
if entity_uri is not None:
entity_uri = entity.entity_uri
msg = 'This entity already exists. Ignoring new entity uri'
- warn(msg, stacklevel=2)
+ warn(msg, stacklevel=3)
#################
# Validate Object
=====================================
src/hdmf/common/table.py
=====================================
@@ -717,7 +717,7 @@ class DynamicTable(Container):
warn(("Data has elements with different lengths and therefore cannot be coerced into an "
"N-dimensional array. Use the 'index' argument when creating a column to add rows "
"with different lengths."),
- stacklevel=2)
+ stacklevel=3)
def __eq__(self, other):
"""Compare if the two DynamicTables contain the same data.
@@ -776,7 +776,7 @@ class DynamicTable(Container):
if isinstance(index, VectorIndex):
warn("Passing a VectorIndex in for index may lead to unexpected behavior. This functionality will be "
- "deprecated in a future version of HDMF.", category=FutureWarning, stacklevel=2)
+ "deprecated in a future version of HDMF.", category=FutureWarning, stacklevel=3)
if name in self.__colids: # column has already been added
msg = "column '%s' already exists in %s '%s'" % (name, self.__class__.__name__, self.name)
@@ -793,7 +793,7 @@ class DynamicTable(Container):
"Please ensure the new column complies with the spec. "
"This will raise an error in a future version of HDMF."
% (name, self.__class__.__name__, spec_table))
- warn(msg, stacklevel=2)
+ warn(msg, stacklevel=3)
index_bool = index or not isinstance(index, bool)
spec_index = self.__uninit_cols[name].get('index', False)
@@ -803,7 +803,7 @@ class DynamicTable(Container):
"Please ensure the new column complies with the spec. "
"This will raise an error in a future version of HDMF."
% (name, self.__class__.__name__, spec_index))
- warn(msg, stacklevel=2)
+ warn(msg, stacklevel=3)
spec_col_cls = self.__uninit_cols[name].get('class', VectorData)
if col_cls != spec_col_cls:
@@ -841,7 +841,7 @@ class DynamicTable(Container):
warn(("Data has elements with different lengths and therefore cannot be coerced into an "
"N-dimensional array. Use the 'index' argument when adding a column of data with "
"different lengths."),
- stacklevel=2)
+ stacklevel=3)
# Check that we are asked to create an index
if (isinstance(index, bool) or isinstance(index, int)) and index > 0 and len(data) > 0:
=====================================
src/hdmf/container.py
=====================================
@@ -629,12 +629,8 @@ class Container(AbstractContainer):
template += "\nFields:\n"
for k in sorted(self.fields): # sorted to enable tests
v = self.fields[k]
- # if isinstance(v, DataIO) or not hasattr(v, '__len__') or len(v) > 0:
if hasattr(v, '__len__'):
- if isinstance(v, (np.ndarray, list, tuple)):
- if len(v) > 0:
- template += " {}: {}\n".format(k, self.__smart_str(v, 1))
- elif v:
+ if isinstance(v, (np.ndarray, list, tuple)) or v:
template += " {}: {}\n".format(k, self.__smart_str(v, 1))
else:
template += " {}: {}\n".format(k, v)
@@ -894,7 +890,7 @@ class Data(AbstractContainer):
warn(
"Data.set_dataio() is deprecated. Please use Data.set_data_io() instead.",
DeprecationWarning,
- stacklevel=2,
+ stacklevel=3,
)
dataio = getargs('dataio', kwargs)
dataio.data = self.__data
@@ -1142,7 +1138,9 @@ class MultiContainerInterface(Container):
# still need to mark self as modified
self.set_modified()
if tmp.name in d:
- msg = "'%s' already exists in %s '%s'" % (tmp.name, cls.__name__, self.name)
+ msg = (f"Cannot add {tmp.__class__} '{tmp.name}' at 0x{id(tmp)} to dict attribute '{attr_name}' in "
+ f"{cls} '{self.name}'. {d[tmp.name].__class__} '{tmp.name}' at 0x{id(d[tmp.name])} "
+ f"already exists in '{attr_name}' and has the same name.")
raise ValueError(msg)
d[tmp.name] = tmp
return container
=====================================
src/hdmf/data_utils.py
=====================================
@@ -18,7 +18,8 @@ import numpy as np
from .utils import docval, getargs, popargs, docval_macro, get_data_shape
def append_data(data, arg):
- if isinstance(data, (list, DataIO)):
+ from hdmf.backends.hdf5.h5_utils import HDMFDataset
+ if isinstance(data, (list, DataIO, HDMFDataset)):
data.append(arg)
return data
elif type(data).__name__ == 'TermSetWrapper': # circular import
=====================================
src/hdmf/query.py
=====================================
@@ -163,6 +163,12 @@ class HDMFDataset(metaclass=ExtenderMeta):
def next(self):
return self.dataset.next()
+ def append(self, arg):
+ """
+ Override this method to support appending to backend-specific datasets
+ """
+ pass # pragma: no cover
+
class ReferenceResolver(metaclass=ABCMeta):
"""
=====================================
src/hdmf/spec/namespace.py
=====================================
@@ -50,13 +50,13 @@ class SpecNamespace(dict):
self['full_name'] = full_name
if version == str(SpecNamespace.UNVERSIONED):
# the unversioned version may be written to file as a string and read from file as a string
- warn("Loaded namespace '%s' is unversioned. Please notify the extension author." % name, stacklevel=2)
+ warn(f"Loaded namespace '{name}' is unversioned. Please notify the extension author.")
version = SpecNamespace.UNVERSIONED
if version is None:
# version is required on write -- see YAMLSpecWriter.write_namespace -- but can be None on read in order to
# be able to read older files with extensions that are missing the version key.
- warn(("Loaded namespace '%s' is missing the required key 'version'. Version will be set to '%s'. "
- "Please notify the extension author.") % (name, SpecNamespace.UNVERSIONED), stacklevel=2)
+ warn(f"Loaded namespace '{name}' is missing the required key 'version'. Version will be set to "
+ f"'{SpecNamespace.UNVERSIONED}'. Please notify the extension author.")
version = SpecNamespace.UNVERSIONED
self['version'] = version
if date is not None:
@@ -466,15 +466,19 @@ class NamespaceCatalog:
return included_types
def __register_type(self, ndt, inc_ns, catalog, registered_types):
- spec = inc_ns.get_spec(ndt)
- spec_file = inc_ns.catalog.get_spec_source_file(ndt)
- self.__register_dependent_types(spec, inc_ns, catalog, registered_types)
- if isinstance(spec, DatasetSpec):
- built_spec = self.dataset_spec_cls.build_spec(spec)
+ if ndt in registered_types:
+ # already registered
+ pass
else:
- built_spec = self.group_spec_cls.build_spec(spec)
- registered_types.add(ndt)
- catalog.register_spec(built_spec, spec_file)
+ spec = inc_ns.get_spec(ndt)
+ spec_file = inc_ns.catalog.get_spec_source_file(ndt)
+ self.__register_dependent_types(spec, inc_ns, catalog, registered_types)
+ if isinstance(spec, DatasetSpec):
+ built_spec = self.dataset_spec_cls.build_spec(spec)
+ else:
+ built_spec = self.group_spec_cls.build_spec(spec)
+ registered_types.add(ndt)
+ catalog.register_spec(built_spec, spec_file)
def __register_dependent_types(self, spec, inc_ns, catalog, registered_types):
"""Ensure that classes for all types used by this type are registered
@@ -529,7 +533,7 @@ class NamespaceCatalog:
if ns['version'] != self.__namespaces.get(ns['name'])['version']:
# warn if the cached namespace differs from the already loaded namespace
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
- % (ns['name'], ns['version'], self.__namespaces.get(ns['name'])['version']), stacklevel=2)
+ % (ns['name'], ns['version'], self.__namespaces.get(ns['name'])['version']))
else:
to_load.append(ns)
# now load specs into namespace
=====================================
src/hdmf/spec/spec.py
=====================================
@@ -1,7 +1,6 @@
import re
from abc import ABCMeta
from collections import OrderedDict
-from copy import deepcopy
from warnings import warn
from ..utils import docval, getargs, popargs, get_docval
@@ -84,7 +83,7 @@ class ConstructableDict(dict, metaclass=ABCMeta):
def build_const_args(cls, spec_dict):
''' Build constructor arguments for this ConstructableDict class from a dictionary '''
# main use cases are when spec_dict is a ConstructableDict or a spec dict read from a file
- return deepcopy(spec_dict)
+ return spec_dict.copy()
@classmethod
def build_spec(cls, spec_dict):
@@ -322,7 +321,7 @@ class BaseStorageSpec(Spec):
default_name = getargs('default_name', kwargs)
if default_name:
if name is not None:
- warn("found 'default_name' with 'name' - ignoring 'default_name'", stacklevel=2)
+ warn("found 'default_name' with 'name' - ignoring 'default_name'")
else:
self['default_name'] = default_name
self.__attributes = dict()
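
Replacing deepcopy(spec_dict) with spec_dict.copy() in build_const_args is one of the namespace-loading speedups (PR #1103). The trade-off is that the returned dict shares its nested values with the input, which the change assumes is safe because callers do not mutate them in place. A one-line illustration of the difference:

d = {'doc': 'a spec', 'attributes': [{'name': 'attr1'}]}
shallow = d.copy()
assert shallow is not d                          # new top-level dict
assert shallow['attributes'] is d['attributes']  # nested values are shared
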
=====================================
src/hdmf/validate/validator.py
=====================================
@@ -134,7 +134,7 @@ def get_type(data, builder_dtype=None):
elif isinstance(data, ReferenceResolver):
return data.dtype, None
# Numpy nd-array data
- elif isinstance(data, np.ndarray):
+ elif isinstance(data, np.ndarray) and len(data.dtype) <= 1:
if data.size > 0:
return get_type(data[0], builder_dtype)
else:
@@ -147,11 +147,14 @@ def get_type(data, builder_dtype=None):
# Case for h5py.Dataset and other I/O specific array types
else:
# Compound dtype
- if builder_dtype and isinstance(builder_dtype, list):
+ if builder_dtype and len(builder_dtype) > 1:
dtypes = []
string_formats = []
for i in range(len(builder_dtype)):
- dtype, string_format = get_type(data[0][i])
+ if len(np.shape(data)) == 0:
+ dtype, string_format = get_type(data[()][i])
+ else:
+ dtype, string_format = get_type(data[0][i])
dtypes.append(dtype)
string_formats.append(string_format)
return dtypes, string_formats
@@ -438,7 +441,9 @@ class DatasetValidator(BaseStorageValidator):
except EmptyArrayError:
# do not validate dtype of empty array. HDMF does not yet set dtype when writing a list/tuple
pass
- if isinstance(builder.dtype, list):
+ if builder.dtype is not None and len(builder.dtype) > 1 and len(np.shape(builder.data)) == 0:
+ shape = () # scalar compound dataset
+ elif isinstance(builder.dtype, list):
shape = (len(builder.data), ) # only 1D datasets with compound types are supported
else:
shape = get_data_shape(data)
=====================================
tests/unit/build_tests/test_classgenerator.py
=====================================
@@ -7,7 +7,9 @@ from warnings import warn
from hdmf.build import TypeMap, CustomClassGenerator
from hdmf.build.classgenerator import ClassGenerator, MCIClassGenerator
from hdmf.container import Container, Data, MultiContainerInterface, AbstractContainer
-from hdmf.spec import GroupSpec, AttributeSpec, DatasetSpec, SpecCatalog, SpecNamespace, NamespaceCatalog, LinkSpec
+from hdmf.spec import (
+ GroupSpec, AttributeSpec, DatasetSpec, SpecCatalog, SpecNamespace, NamespaceCatalog, LinkSpec, RefSpec
+)
from hdmf.testing import TestCase
from hdmf.utils import get_docval, docval
@@ -180,10 +182,11 @@ class TestDynamicContainer(TestCase):
baz_spec = GroupSpec('A test extension with no Container class',
data_type_def='Baz', data_type_inc=self.bar_spec,
attributes=[AttributeSpec('attr3', 'a float attribute', 'float'),
- AttributeSpec('attr4', 'another float attribute', 'float')])
+ AttributeSpec('attr4', 'another float attribute', 'float'),
+ AttributeSpec('attr_array', 'an array attribute', 'text', shape=(None,)),])
self.spec_catalog.register_spec(baz_spec, 'extension.yaml')
cls = self.type_map.get_dt_container_cls('Baz', CORE_NAMESPACE)
- expected_args = {'name', 'data', 'attr1', 'attr2', 'attr3', 'attr4', 'skip_post_init'}
+ expected_args = {'name', 'data', 'attr1', 'attr2', 'attr3', 'attr4', 'attr_array', 'skip_post_init'}
received_args = set()
for x in get_docval(cls.__init__):
@@ -211,7 +214,7 @@ class TestDynamicContainer(TestCase):
AttributeSpec('attr4', 'another float attribute', 'float')])
self.spec_catalog.register_spec(baz_spec, 'extension.yaml')
cls = self.type_map.get_dt_container_cls('Baz', CORE_NAMESPACE)
- expected_args = {'name', 'data', 'attr1', 'attr2', 'attr3', 'attr4', 'foo', 'skip_post_init'}
+ expected_args = {'name', 'data', 'attr1', 'attr2', 'attr3', 'attr4', 'attr_array', 'foo', 'skip_post_init'}
received_args = set(map(lambda x: x['name'], get_docval(cls.__init__)))
self.assertSetEqual(expected_args, received_args)
self.assertEqual(cls.__name__, 'Baz')
@@ -733,9 +736,18 @@ class TestGetClassSeparateNamespace(TestCase):
GroupSpec(data_type_inc='Bar', doc='a bar', quantity='?')
]
)
+ moo_spec = DatasetSpec(
+ doc='A test dataset that is a 1D array of object references of Baz',
+ data_type_def='Moo',
+ shape=(None,),
+ dtype=RefSpec(
+ reftype='object',
+ target_type='Baz'
+ )
+ )
create_load_namespace_yaml(
namespace_name='ndx-test',
- specs=[baz_spec],
+ specs=[baz_spec, moo_spec],
output_dir=self.test_dir,
incl_types={
CORE_NAMESPACE: ['Bar'],
@@ -827,6 +839,171 @@ class TestGetClassSeparateNamespace(TestCase):
self._check_classes(baz_cls, bar_cls, bar_cls2, qux_cls, qux_cls2)
+class TestGetClassObjectReferences(TestCase):
+
+ def setUp(self):
+ self.test_dir = tempfile.mkdtemp()
+ if os.path.exists(self.test_dir): # start clean
+ self.tearDown()
+ os.mkdir(self.test_dir)
+ self.type_map = TypeMap()
+
+ def tearDown(self):
+ shutil.rmtree(self.test_dir)
+
+ def test_get_class_include_dataset_of_references(self):
+ """Test that get_class resolves datasets of object references."""
+ qux_spec = DatasetSpec(
+ doc='A test extension',
+ data_type_def='Qux'
+ )
+ moo_spec = DatasetSpec(
+ doc='A test dataset that is a 1D array of object references of Qux',
+ data_type_def='Moo',
+ shape=(None,),
+ dtype=RefSpec(
+ reftype='object',
+ target_type='Qux'
+ ),
+ )
+
+ create_load_namespace_yaml(
+ namespace_name='ndx-test',
+ specs=[qux_spec, moo_spec],
+ output_dir=self.test_dir,
+ incl_types={},
+ type_map=self.type_map
+ )
+ # no types should be resolved to start
+ assert self.type_map.get_container_classes('ndx-test') == []
+
+ self.type_map.get_dt_container_cls('Moo', 'ndx-test')
+ # now, Moo and Qux should be resolved
+ assert len(self.type_map.get_container_classes('ndx-test')) == 2
+ assert "Moo" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+ assert "Qux" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+
+ def test_get_class_include_attribute_object_reference(self):
+ """Test that get_class resolves data types with an attribute that is an object reference."""
+ qux_spec = DatasetSpec(
+ doc='A test extension',
+ data_type_def='Qux'
+ )
+ woo_spec = DatasetSpec(
+ doc='A test dataset that has a scalar object reference to a Qux',
+ data_type_def='Woo',
+ attributes=[
+ AttributeSpec(
+ name='attr1',
+ doc='a string attribute',
+ dtype=RefSpec(reftype='object', target_type='Qux')
+ ),
+ ]
+ )
+ create_load_namespace_yaml(
+ namespace_name='ndx-test',
+ specs=[qux_spec, woo_spec],
+ output_dir=self.test_dir,
+ incl_types={},
+ type_map=self.type_map
+ )
+ # no types should be resolved to start
+ assert self.type_map.get_container_classes('ndx-test') == []
+
+ self.type_map.get_dt_container_cls('Woo', 'ndx-test')
+ # now, Woo and Qux should be resolved
+ assert len(self.type_map.get_container_classes('ndx-test')) == 2
+ assert "Woo" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+ assert "Qux" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+
+ def test_get_class_include_nested_object_reference(self):
+ """Test that get_class resolves nested datasets that are object references."""
+ qux_spec = DatasetSpec(
+ doc='A test extension',
+ data_type_def='Qux'
+ )
+ spam_spec = DatasetSpec(
+ doc='A test extension',
+ data_type_def='Spam',
+ shape=(None,),
+ dtype=RefSpec(
+ reftype='object',
+ target_type='Qux'
+ ),
+ )
+ goo_spec = GroupSpec(
+ doc='A test dataset that has a nested dataset (Spam) that has a scalar object reference to a Qux',
+ data_type_def='Goo',
+ datasets=[
+ DatasetSpec(
+ doc='a dataset',
+ data_type_inc='Spam',
+ ),
+ ],
+ )
+
+ create_load_namespace_yaml(
+ namespace_name='ndx-test',
+ specs=[qux_spec, spam_spec, goo_spec],
+ output_dir=self.test_dir,
+ incl_types={},
+ type_map=self.type_map
+ )
+ # no types should be resolved to start
+ assert self.type_map.get_container_classes('ndx-test') == []
+
+ self.type_map.get_dt_container_cls('Goo', 'ndx-test')
+ # now, Goo, Spam, and Qux should be resolved
+ assert len(self.type_map.get_container_classes('ndx-test')) == 3
+ assert "Goo" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+ assert "Spam" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+ assert "Qux" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+
+ def test_get_class_include_nested_attribute_object_reference(self):
+ """Test that get_class resolves nested datasets that have an attribute that is an object reference."""
+ qux_spec = DatasetSpec(
+ doc='A test extension',
+ data_type_def='Qux'
+ )
+ bam_spec = DatasetSpec(
+ doc='A test extension',
+ data_type_def='Bam',
+ attributes=[
+ AttributeSpec(
+ name='attr1',
+ doc='a string attribute',
+ dtype=RefSpec(reftype='object', target_type='Qux')
+ ),
+ ],
+ )
+ boo_spec = GroupSpec(
+ doc='A test dataset that has a nested dataset (Spam) that has a scalar object reference to a Qux',
+ data_type_def='Boo',
+ datasets=[
+ DatasetSpec(
+ doc='a dataset',
+ data_type_inc='Bam',
+ ),
+ ],
+ )
+
+ create_load_namespace_yaml(
+ namespace_name='ndx-test',
+ specs=[qux_spec, bam_spec, boo_spec],
+ output_dir=self.test_dir,
+ incl_types={},
+ type_map=self.type_map
+ )
+ # no types should be resolved to start
+ assert self.type_map.get_container_classes('ndx-test') == []
+
+ self.type_map.get_dt_container_cls('Boo', 'ndx-test')
+ # now, Boo, Bam, and Qux should be resolved
+ assert len(self.type_map.get_container_classes('ndx-test')) == 3
+ assert "Boo" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+ assert "Bam" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+ assert "Qux" in [c.__name__ for c in self.type_map.get_container_classes('ndx-test')]
+
class EmptyBar(Container):
pass
=====================================
tests/unit/build_tests/test_io_manager.py
=====================================
@@ -341,7 +341,7 @@ class TestRetrieveContainerClass(TestBase):
self.assertIs(ret, Foo)
def test_get_dt_container_cls_no_namespace(self):
- with self.assertRaisesWith(ValueError, "Namespace could not be resolved."):
+ with self.assertRaisesWith(ValueError, "Namespace could not be resolved for data type 'Unknown'."):
self.type_map.get_dt_container_cls(data_type="Unknown")
=====================================
tests/unit/build_tests/test_io_map.py
=====================================
@@ -9,6 +9,7 @@ from hdmf.spec import (GroupSpec, AttributeSpec, DatasetSpec, SpecCatalog, SpecN
from hdmf.testing import TestCase
from abc import ABCMeta, abstractmethod
import unittest
+import numpy as np
from tests.unit.helpers.utils import CORE_NAMESPACE, create_test_type_map
@@ -20,24 +21,27 @@ class Bar(Container):
{'name': 'attr1', 'type': str, 'doc': 'an attribute'},
{'name': 'attr2', 'type': int, 'doc': 'another attribute'},
{'name': 'attr3', 'type': float, 'doc': 'a third attribute', 'default': 3.14},
+ {'name': 'attr_array', 'type': 'array_data', 'doc': 'another attribute', 'default': (1, 2, 3)},
{'name': 'foo', 'type': 'Foo', 'doc': 'a group', 'default': None})
def __init__(self, **kwargs):
- name, data, attr1, attr2, attr3, foo = getargs('name', 'data', 'attr1', 'attr2', 'attr3', 'foo', kwargs)
+ name, data, attr1, attr2, attr3, attr_array, foo = getargs('name', 'data', 'attr1', 'attr2', 'attr3',
+ 'attr_array', 'foo', kwargs)
super().__init__(name=name)
self.__data = data
self.__attr1 = attr1
self.__attr2 = attr2
self.__attr3 = attr3
+ self.__attr_array = attr_array
self.__foo = foo
if self.__foo is not None and self.__foo.parent is None:
self.__foo.parent = self
def __eq__(self, other):
- attrs = ('name', 'data', 'attr1', 'attr2', 'attr3', 'foo')
+ attrs = ('name', 'data', 'attr1', 'attr2', 'attr3', 'attr_array', 'foo')
return all(getattr(self, a) == getattr(other, a) for a in attrs)
def __str__(self):
- attrs = ('name', 'data', 'attr1', 'attr2', 'attr3', 'foo')
+ attrs = ('name', 'data', 'attr1', 'attr2', 'attr3', 'attr_array', 'foo')
return ','.join('%s=%s' % (a, getattr(self, a)) for a in attrs)
@property
@@ -60,6 +64,10 @@ class Bar(Container):
def attr3(self):
return self.__attr3
+ @property
+ def attr_array(self):
+ return self.__attr_array
+
@property
def foo(self):
return self.__foo
@@ -333,12 +341,15 @@ class TestMapStrings(TestCase):
datasets=[DatasetSpec('an example dataset', 'text', name='data', shape=(None,),
attributes=[AttributeSpec(
'attr2', 'an example integer attribute', 'int')])],
- attributes=[AttributeSpec('attr1', 'an example string attribute', 'text')])
+ attributes=[AttributeSpec('attr1', 'an example string attribute', 'text'),
+ AttributeSpec('attr_array', 'an example array attribute', 'text',
+ shape=(None,))])
type_map = self.customSetUp(bar_spec)
type_map.register_map(Bar, BarMapper)
- bar_inst = Bar('my_bar', ['a', 'b', 'c', 'd'], 'value1', 10)
+ bar_inst = Bar('my_bar', ['a', 'b', 'c', 'd'], 'value1', 10, attr_array=['a', 'b', 'c', 'd'])
builder = type_map.build(bar_inst)
- self.assertEqual(builder.get('data').data, ['a', 'b', 'c', 'd'])
+ np.testing.assert_array_equal(builder.get('data').data, np.array(['a', 'b', 'c', 'd']))
+ np.testing.assert_array_equal(builder.get('attr_array'), np.array(['a', 'b', 'c', 'd']))
def test_build_scalar(self):
bar_spec = GroupSpec('A test group specification with a data type',
@@ -353,6 +364,102 @@ class TestMapStrings(TestCase):
builder = type_map.build(bar_inst)
self.assertEqual(builder.get('data').data, "['a', 'b', 'c', 'd']")
+ def test_build_2d_lol(self):
+ bar_spec = GroupSpec(
+ doc='A test group specification with a data type',
+ data_type_def='Bar',
+ datasets=[
+ DatasetSpec(
+ doc='an example dataset',
+ dtype='text',
+ name='data',
+ shape=(None, None),
+ attributes=[AttributeSpec(name='attr2', doc='an example integer attribute', dtype='int')],
+ )
+ ],
+ attributes=[AttributeSpec(name='attr_array', doc='an example array attribute', dtype='text',
+ shape=(None, None))],
+ )
+ type_map = self.customSetUp(bar_spec)
+ type_map.register_map(Bar, BarMapper)
+ str_lol_2d = [['aa', 'bb'], ['cc', 'dd']]
+ bar_inst = Bar('my_bar', str_lol_2d, 'value1', 10, attr_array=str_lol_2d)
+ builder = type_map.build(bar_inst)
+ self.assertEqual(builder.get('data').data, str_lol_2d)
+ self.assertEqual(builder.get('attr_array'), str_lol_2d)
+
+ def test_build_2d_ndarray(self):
+ bar_spec = GroupSpec(
+ doc='A test group specification with a data type',
+ data_type_def='Bar',
+ datasets=[
+ DatasetSpec(
+ doc='an example dataset',
+ dtype='text',
+ name='data',
+ shape=(None, None),
+ attributes=[AttributeSpec(name='attr2', doc='an example integer attribute', dtype='int')],
+ )
+ ],
+ attributes=[AttributeSpec(name='attr_array', doc='an example array attribute', dtype='text',
+ shape=(None, None))],
+ )
+ type_map = self.customSetUp(bar_spec)
+ type_map.register_map(Bar, BarMapper)
+ str_array_2d = np.array([['aa', 'bb'], ['cc', 'dd']])
+ bar_inst = Bar('my_bar', str_array_2d, 'value1', 10, attr_array=str_array_2d)
+ builder = type_map.build(bar_inst)
+ np.testing.assert_array_equal(builder.get('data').data, str_array_2d)
+ np.testing.assert_array_equal(builder.get('attr_array'), str_array_2d)
+
+ def test_build_3d_lol(self):
+ bar_spec = GroupSpec(
+ doc='A test group specification with a data type',
+ data_type_def='Bar',
+ datasets=[
+ DatasetSpec(
+ doc='an example dataset',
+ dtype='text',
+ name='data',
+ shape=(None, None, None),
+ attributes=[AttributeSpec(name='attr2', doc='an example integer attribute', dtype='int')],
+ )
+ ],
+ attributes=[AttributeSpec(name='attr_array', doc='an example array attribute', dtype='text',
+ shape=(None, None, None))],
+ )
+ type_map = self.customSetUp(bar_spec)
+ type_map.register_map(Bar, BarMapper)
+ str_lol_3d = [[['aa', 'bb'], ['cc', 'dd']], [['ee', 'ff'], ['gg', 'hh']]]
+ bar_inst = Bar('my_bar', str_lol_3d, 'value1', 10, attr_array=str_lol_3d)
+ builder = type_map.build(bar_inst)
+ self.assertEqual(builder.get('data').data, str_lol_3d)
+ self.assertEqual(builder.get('attr_array'), str_lol_3d)
+
+ def test_build_3d_ndarray(self):
+ bar_spec = GroupSpec(
+ doc='A test group specification with a data type',
+ data_type_def='Bar',
+ datasets=[
+ DatasetSpec(
+ doc='an example dataset',
+ dtype='text',
+ name='data',
+ shape=(None, None, None),
+ attributes=[AttributeSpec(name='attr2', doc='an example integer attribute', dtype='int')],
+ )
+ ],
+ attributes=[AttributeSpec(name='attr_array', doc='an example array attribute', dtype='text',
+ shape=(None, None, None))],
+ )
+ type_map = self.customSetUp(bar_spec)
+ type_map.register_map(Bar, BarMapper)
+ str_array_3d = np.array([[['aa', 'bb'], ['cc', 'dd']], [['ee', 'ff'], ['gg', 'hh']]])
+ bar_inst = Bar('my_bar', str_array_3d, 'value1', 10, attr_array=str_array_3d)
+ builder = type_map.build(bar_inst)
+ np.testing.assert_array_equal(builder.get('data').data, str_array_3d)
+ np.testing.assert_array_equal(builder.get('attr_array'), str_array_3d)
+
def test_build_dataio(self):
bar_spec = GroupSpec('A test group specification with a data type',
data_type_def='Bar',
=====================================
tests/unit/test_io_hdf5_h5tools.py
=====================================
@@ -24,7 +24,7 @@ from hdmf import Data, docval
from hdmf.data_utils import DataChunkIterator, GenericDataChunkIterator, InvalidDataIOError
from hdmf.spec.catalog import SpecCatalog
from hdmf.spec.namespace import NamespaceCatalog, SpecNamespace
-from hdmf.spec.spec import GroupSpec
+from hdmf.spec.spec import GroupSpec, DtypeSpec
from hdmf.testing import TestCase, remove_test_file
from hdmf.common.resources import HERD
from hdmf.term_set import TermSet, TermSetWrapper
@@ -144,6 +144,16 @@ class H5IOTest(TestCase):
read_a = read_a.decode('utf-8')
self.assertEqual(read_a, a)
+ def test_write_dataset_scalar_compound(self):
+ cmpd_dtype = np.dtype([('x', np.int32), ('y', np.float64)])
+ a = np.array((1, 0.1), dtype=cmpd_dtype)
+ self.io.write_dataset(self.f, DatasetBuilder('test_dataset', a,
+ dtype=[DtypeSpec('x', doc='x', dtype='int32'),
+ DtypeSpec('y', doc='y', dtype='float64')]))
+ dset = self.f['test_dataset']
+ self.assertTupleEqual(dset.shape, ())
+ self.assertEqual(dset[()].tolist(), a.tolist())
+
##########################################
# write_dataset tests: TermSetWrapper
##########################################
@@ -164,6 +174,31 @@ class H5IOTest(TestCase):
dset = self.f['test_dataset']
self.assertTrue(np.all(dset[:] == a))
+ def test_write_dataset_lol_strings(self):
+ a = [['aa', 'bb'], ['cc', 'dd']]
+ self.io.write_dataset(self.f, DatasetBuilder('test_dataset', a, attributes={}))
+ dset = self.f['test_dataset']
+ decoded_dset = [[item.decode('utf-8') if isinstance(item, bytes) else item for item in sublist]
+ for sublist in dset[:]]
+ self.assertTrue(decoded_dset == a)
+
+ def test_write_dataset_list_compound_datatype(self):
+ a = np.array([(1, 2, 0.5), (3, 4, 0.5)], dtype=[('x', 'int'), ('y', 'int'), ('z', 'float')])
+ dset_builder = DatasetBuilder(
+ name='test_dataset',
+ data=a.tolist(),
+ attributes={},
+ dtype=[
+ DtypeSpec('x', doc='x', dtype='int'),
+ DtypeSpec('y', doc='y', dtype='int'),
+ DtypeSpec('z', doc='z', dtype='float'),
+ ],
+ )
+ self.io.write_dataset(self.f, dset_builder)
+ dset = self.f['test_dataset']
+ for field in a.dtype.names:
+ self.assertTrue(np.all(dset[field][:] == a[field]))
+
def test_write_dataset_list_compress_gzip(self):
a = H5DataIO(np.arange(30).reshape(5, 2, 3),
compression='gzip',
@@ -572,6 +607,12 @@ class H5IOTest(TestCase):
#############################################
# H5DataIO general
#############################################
+ def test_pass_through_of_maxshape_on_h5dataset(self):
+ k = 10
+ self.io.write_dataset(self.f, DatasetBuilder('test_dataset', np.arange(k), attributes={}))
+ dset = H5DataIO(self.f['test_dataset'])
+ self.assertEqual(dset.maxshape, (k,))
+
def test_warning_on_non_gzip_compression(self):
# Make sure no warning is issued when using gzip
with warnings.catch_warnings(record=True) as w:
@@ -762,6 +803,17 @@ class H5IOTest(TestCase):
self.assertEqual(str(bldr['test_dataset'].data),
'<HDF5 dataset "test_dataset": shape (5,), type "|O">')
+ def test_read_scalar_compound(self):
+ cmpd_dtype = np.dtype([('x', np.int32), ('y', np.float64)])
+ a = np.array((1, 0.1), dtype=cmpd_dtype)
+ self.io.write_dataset(self.f, DatasetBuilder('test_dataset', a,
+ dtype=[DtypeSpec('x', doc='x', dtype='int32'),
+ DtypeSpec('y', doc='y', dtype='float64')]))
+ self.io.close()
+ with HDF5IO(self.path, 'r') as io:
+ bldr = io.read_builder()
+ np.testing.assert_array_equal(bldr['test_dataset'].data[()], a)
+
class TestRoundTrip(TestCase):
@@ -2958,6 +3010,57 @@ class TestExport(TestCase):
self.assertEqual(f['foofile_data'].file.filename, self.paths[1])
self.assertIsInstance(f.attrs['foo_ref_attr'], h5py.Reference)
+ def test_append_dataset_of_references(self):
+ """Test that exporting a written container with a dataset of references works."""
+ bazs = []
+ num_bazs = 1
+ for i in range(num_bazs):
+ bazs.append(Baz(name='baz%d' % i))
+ array_bazs=np.array(bazs)
+ wrapped_bazs = H5DataIO(array_bazs, maxshape=(None,))
+ baz_data = BazData(name='baz_data1', data=wrapped_bazs)
+ bucket = BazBucket(name='bucket1', bazs=bazs.copy(), baz_data=baz_data)
+
+ with HDF5IO(self.paths[0], manager=get_baz_buildmanager(), mode='w') as write_io:
+ write_io.write(bucket)
+
+ with HDF5IO(self.paths[0], manager=get_baz_buildmanager(), mode='a') as append_io:
+ read_bucket1 = append_io.read()
+ new_baz = Baz(name='new')
+ read_bucket1.add_baz(new_baz)
+ append_io.write(read_bucket1)
+
+ with HDF5IO(self.paths[0], manager=get_baz_buildmanager(), mode='a') as ref_io:
+ read_bucket1 = ref_io.read()
+ DoR = read_bucket1.baz_data.data
+ DoR.append(read_bucket1.bazs['new'])
+
+ with HDF5IO(self.paths[0], manager=get_baz_buildmanager(), mode='r') as read_io:
+ read_bucket1 = read_io.read()
+ self.assertEqual(len(read_bucket1.baz_data.data), 2)
+ self.assertIs(read_bucket1.baz_data.data[1], read_bucket1.bazs["new"])
+
+ def test_append_dataset_of_references_orphaned_target(self):
+ bazs = []
+ num_bazs = 1
+ for i in range(num_bazs):
+ bazs.append(Baz(name='baz%d' % i))
+ array_bazs=np.array(bazs)
+ wrapped_bazs = H5DataIO(array_bazs, maxshape=(None,))
+ baz_data = BazData(name='baz_data1', data=wrapped_bazs)
+ bucket = BazBucket(name='bucket1', bazs=bazs.copy(), baz_data=baz_data)
+
+ with HDF5IO(self.paths[0], manager=get_baz_buildmanager(), mode='w') as write_io:
+ write_io.write(bucket)
+
+ with HDF5IO(self.paths[0], manager=get_baz_buildmanager(), mode='a') as ref_io:
+ read_bucket1 = ref_io.read()
+ new_baz = Baz(name='new')
+ read_bucket1.add_baz(new_baz)
+ DoR = read_bucket1.baz_data.data
+ with self.assertRaises(ValueError):
+ DoR.append(read_bucket1.bazs['new'])
+
def test_append_external_link_data(self):
"""Test that exporting a written container after adding a link with link_data=True creates external links."""
foo1 = Foo('foo1', [1, 2, 3, 4, 5], "I am foo1", 17, 3.14)
@@ -3666,6 +3769,14 @@ class H5DataIOTests(TestCase):
with self.assertRaisesRegex(ValueError, "Setting data when dtype and shape are not None is not supported"):
dataio.data = list()
+ def test_dataio_maxshape(self):
+ dataio = H5DataIO(data=np.arange(10), maxshape=(None,))
+ self.assertEqual(dataio.maxshape, (None,))
+
+ def test_dataio_maxshape_from_data(self):
+ dataio = H5DataIO(data=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
+ self.assertEqual(dataio.maxshape, (10,))
+
def test_hdf5io_can_read():
assert not HDF5IO.can_read("not_a_file")
=====================================
tests/unit/test_multicontainerinterface.py
=====================================
@@ -198,7 +198,10 @@ class TestBasic(TestCase):
"""Test that adding a container to the attribute dict correctly adds the container."""
obj1 = Container('obj1')
foo = Foo(obj1)
- msg = "'obj1' already exists in Foo 'Foo'"
+ msg = (f"Cannot add <class 'hdmf.container.Container'> 'obj1' at 0x{id(obj1)} to dict attribute "
+ "'containers' in <class 'tests.unit.test_multicontainerinterface.Foo'> 'Foo'. "
+ f"<class 'hdmf.container.Container'> 'obj1' at 0x{id(obj1)} already exists in 'containers' "
+ "and has the same name.")
with self.assertRaisesWith(ValueError, msg):
foo.add_container(obj1)
=====================================
tests/unit/validator_tests/test_validate.py
=====================================
@@ -501,6 +501,28 @@ class TestDtypeValidation(TestCase):
results = self.vmap.validate(bar_builder)
self.assertEqual(len(results), 0)
+ def test_scalar_compound_dtype(self):
+ """Test that validator allows scalar compound dtype data where a compound dtype is specified."""
+ spec_catalog = SpecCatalog()
+ dtype = [DtypeSpec('x', doc='x', dtype='int'), DtypeSpec('y', doc='y', dtype='float')]
+ spec = GroupSpec('A test group specification with a data type',
+ data_type_def='Bar',
+ datasets=[DatasetSpec('an example dataset', dtype, name='data',)],
+ attributes=[AttributeSpec('attr1', 'an example attribute', 'text',)])
+ spec_catalog.register_spec(spec, 'test2.yaml')
+ self.namespace = SpecNamespace(
+ 'a test namespace', CORE_NAMESPACE, [{'source': 'test2.yaml'}], version='0.1.0', catalog=spec_catalog)
+ self.vmap = ValidatorMap(self.namespace)
+
+ value = np.array((1, 2.2), dtype=[('x', 'int'), ('y', 'float')])
+ bar_builder = GroupBuilder('my_bar',
+ attributes={'data_type': 'Bar', 'attr1': 'test'},
+ datasets=[DatasetBuilder(name='data',
+ data=value,
+ dtype=[DtypeSpec('x', doc='x', dtype='int'),
+ DtypeSpec('y', doc='y', dtype='float'),],),])
+ results = self.vmap.validate(bar_builder)
+ self.assertEqual(len(results), 0)
class Test1DArrayValidation(TestCase):
View it on GitLab: https://salsa.debian.org/med-team/hdmf/-/commit/640c0cb353e4a311e6a56d0f895ef351e60a8fee