[med-svn] [Git][med-team/hdmf][master] 5 commits: New upstream version 3.5.1

Andreas Tille (@tille) gitlab at salsa.debian.org
Sun Feb 5 20:44:54 GMT 2023



Andreas Tille pushed to branch master at Debian Med / hdmf


Commits:
98669c45 by Andreas Tille at 2023-02-05T21:39:58+01:00
New upstream version 3.5.1
- - - - -
b8e2e70a by Andreas Tille at 2023-02-05T21:39:58+01:00
routine-update: New upstream version

- - - - -
227e768c by Andreas Tille at 2023-02-05T21:40:00+01:00
Update upstream source from tag 'upstream/3.5.1'

Update to upstream version '3.5.1'
with Debian dir 4d2bb152330675a24165756be9398a0de11a23ed
- - - - -
195be857 by Andreas Tille at 2023-02-05T21:40:00+01:00
routine-update: Standards-Version: 4.6.2

- - - - -
a29f833f by Andreas Tille at 2023-02-05T21:41:55+01:00
routine-update: Ready to upload to unstable

- - - - -


19 changed files:

- Legal.txt
- PKG-INFO
- README.rst
- debian/changelog
- debian/control
- license.txt
- requirements.txt
- setup.cfg
- src/hdmf.egg-info/PKG-INFO
- src/hdmf/_version.py
- src/hdmf/backends/hdf5/h5tools.py
- src/hdmf/backends/io.py
- src/hdmf/common/resources.py
- src/hdmf/data_utils.py
- src/hdmf/testing/testcase.py
- tests/unit/common/test_resources.py
- tests/unit/test_io_hdf5_h5tools.py
- tests/unit/utils.py
- tox.ini


Changes:

=====================================
Legal.txt
=====================================
@@ -1,4 +1,4 @@
-“hdmf” Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+“hdmf” Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 
 If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Innovation & Partnerships Office at IPO at lbl.gov.
 


=====================================
PKG-INFO
=====================================
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: hdmf
-Version: 3.4.7
+Version: 3.5.1
 Summary: A package for standardizing hierarchical object data
 Home-page: https://github.com/hdmf-dev/hdmf
 Author: Andrew Tritt
@@ -117,7 +117,7 @@ Citing HDMF
 LICENSE
 =======
 
-"hdmf" Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+"hdmf" Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
 
 (1) Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
@@ -133,7 +133,7 @@ You are under no obligation whatsoever to provide any bug fixes, patches, or upg
 COPYRIGHT
 =========
 
-"hdmf" Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+"hdmf" Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Innovation & Partnerships Office at IPO at lbl.gov.
 
 NOTICE.  This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit other to do so.


=====================================
README.rst
=====================================
@@ -92,7 +92,7 @@ Citing HDMF
 LICENSE
 =======
 
-"hdmf" Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+"hdmf" Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
 
 (1) Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
@@ -108,7 +108,7 @@ You are under no obligation whatsoever to provide any bug fixes, patches, or upg
 COPYRIGHT
 =========
 
-"hdmf" Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+"hdmf" Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Innovation & Partnerships Office at IPO at lbl.gov.
 
 NOTICE.  This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit other to do so.


=====================================
debian/changelog
=====================================
@@ -1,3 +1,11 @@
+hdmf (3.5.1-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version
+  * Standards-Version: 4.6.2 (routine-update)
+
+ -- Andreas Tille <tille at debian.org>  Sun, 05 Feb 2023 21:40:33 +0100
+
 hdmf (3.4.7-1) unstable; urgency=medium
 
   * Team upload.


=====================================
debian/control
=====================================
@@ -21,7 +21,7 @@ Build-Depends: dh-python,
                python3-urllib3,
                python3-pandas,
                python3-unittest2
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
 Vcs-Browser: https://salsa.debian.org/med-team/hdmf
 Vcs-Git: https://salsa.debian.org/med-team/hdmf.git
 Homepage: https://github.com/hdmf-dev/hdmf


=====================================
license.txt
=====================================
@@ -1,4 +1,4 @@
-“hdmf” Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+“hdmf” Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
 


=====================================
requirements.txt
=====================================
@@ -9,4 +9,4 @@ pandas==1.3.5;python_version<'3.8'  # note that pandas 1.4 dropped python 3.7 su
 ruamel.yaml==0.17.21
 scipy==1.9.3;python_version>='3.8'
 scipy==1.7.3;python_version<'3.8'   # note that scipy 1.8 dropped python 3.7 support
-setuptools==65.4.1
+setuptools==65.5.1


=====================================
setup.cfg
=====================================
@@ -30,6 +30,7 @@ per-file-ignores =
 	src/hdmf/validate/__init__.py:F401
 	setup.py:T201
 	test.py:T201
+	test_gallery.py:T201
 
 [metadata]
 description_file = README.rst


=====================================
src/hdmf.egg-info/PKG-INFO
=====================================
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: hdmf
-Version: 3.4.7
+Version: 3.5.1
 Summary: A package for standardizing hierarchical object data
 Home-page: https://github.com/hdmf-dev/hdmf
 Author: Andrew Tritt
@@ -117,7 +117,7 @@ Citing HDMF
 LICENSE
 =======
 
-"hdmf" Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+"hdmf" Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
 
 (1) Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
@@ -133,7 +133,7 @@ You are under no obligation whatsoever to provide any bug fixes, patches, or upg
 COPYRIGHT
 =========
 
-"hdmf" Copyright (c) 2017-2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
+"hdmf" Copyright (c) 2017-2023, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy).  All rights reserved.
 If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Innovation & Partnerships Office at IPO at lbl.gov.
 
 NOTICE.  This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit other to do so.


=====================================
src/hdmf/_version.py
=====================================
@@ -8,11 +8,11 @@ import json
 
 version_json = '''
 {
- "date": "2022-11-09T21:53:41-0800",
+ "date": "2023-01-26T16:25:19-0800",
  "dirty": false,
  "error": null,
- "full-revisionid": "ccc5c252db697a65439086ff925f729e5eae118d",
- "version": "3.4.7"
+ "full-revisionid": "8222de45f0c251b8c23674457111ba21935d8972",
+ "version": "3.5.1"
 }
 '''  # END VERSION_JSON
 


=====================================
src/hdmf/backends/hdf5/h5tools.py
=====================================
@@ -55,6 +55,9 @@ class HDF5IO(HDMFIO):
         path, manager, mode, comm, file_obj, driver = popargs('path', 'manager', 'mode', 'comm', 'file', 'driver',
                                                               kwargs)
 
+        self.__open_links = []  # keep track of other files opened from links in this file
+        self.__file = None  # This will be set below, but set to None first in case an error occurs and we need to close
+
         if path is None and file_obj is None:
             raise ValueError("You must supply either a path or a file.")
 
@@ -89,7 +92,6 @@ class HDF5IO(HDMFIO):
         self.__dci_queue = HDF5IODataChunkIteratorQueue()  # a queue of DataChunkIterators that need to be exhausted
         ObjectMapper.no_convert(Dataset)
         self._written_builders = WriteStatusTracker()  # track which builders were written (or read) by this IO object
-        self.__open_links = []      # keep track of other files opened from links in this file
 
     @property
     def comm(self):
@@ -736,8 +738,15 @@ class HDF5IO(HDMFIO):
         """
         if close_links:
             self.close_linked_files()
-        if self.__file is not None:
-            self.__file.close()
+        try:
+            if self.__file is not None:
+                self.__file.close()
+        except AttributeError:
+            # Do nothing if self.__file does not exist. This may happen if an error
+            # occurs before HDF5IO has been fully set up in __init__, e.g., if a child
+            # class (such as NWBHDF5IO) raises an error before self.__file has been
+            # created.
+            self.__file = None
 
     def close_linked_files(self):
         """Close all opened, linked-to files.
@@ -746,10 +755,19 @@ class HDF5IO(HDMFIO):
         not, which prevents the linked-to file from being deleted or truncated. Use this method to close all opened,
         linked-to files.
         """
-        for obj in self.__open_links:
-            if obj:
-                obj.file.close()
-        self.__open_links = []
+        # Close any files opened from links; tolerate a partially initialized object
+        try:
+            for obj in self.__open_links:
+                if obj:
+                    obj.file.close()
+        except AttributeError:
+            # Do nothing if self.__open_links does not exist. This may happen if an
+            # error occurs before HDF5IO has been fully set up in __init__, e.g., if a
+            # child class (such as NWBHDF5IO) raises an error before self.__open_links
+            # has been created.
+            pass
+        finally:
+            self.__open_links = []
 
     @docval({'name': 'builder', 'type': GroupBuilder, 'doc': 'the GroupBuilder object representing the HDF5 file'},
             {'name': 'link_data', 'type': bool,
@@ -769,7 +787,7 @@ class HDF5IO(HDMFIO):
         for name, dbldr in f_builder.datasets.items():
             self.write_dataset(self.__file, dbldr, **kwargs)
         for name, lbldr in f_builder.links.items():
-            self.write_link(self.__file, lbldr)
+            self.write_link(self.__file, lbldr, export_source=kwargs.get("export_source"))
         self.set_attributes(self.__file, f_builder.attributes)
         self.__add_refs()
         self.__dci_queue.exhaust_queue()
@@ -957,7 +975,7 @@ class HDF5IO(HDMFIO):
         links = builder.links
         if links:
             for link_name, sub_builder in links.items():
-                self.write_link(group, sub_builder)
+                self.write_link(group, sub_builder, export_source=kwargs.get("export_source"))
         attributes = builder.attributes
         self.set_attributes(group, attributes)
         self.__set_written(builder)
@@ -985,9 +1003,11 @@ class HDF5IO(HDMFIO):
 
     @docval({'name': 'parent', 'type': Group, 'doc': 'the parent HDF5 object'},
             {'name': 'builder', 'type': LinkBuilder, 'doc': 'the LinkBuilder to write'},
+            {'name': 'export_source', 'type': str,
+             'doc': 'The source of the builders when exporting', 'default': None},
             returns='the Link that was created', rtype='Link')
     def write_link(self, **kwargs):
-        parent, builder = getargs('parent', 'builder', kwargs)
+        parent, builder, export_source = getargs('parent', 'builder', 'export_source', kwargs)
         self.logger.debug("Writing LinkBuilder '%s' to parent group '%s'" % (builder.name, parent.name))
         if self.get_written(builder):
             self.logger.debug("    LinkBuilder '%s' is already written" % builder.name)
@@ -996,7 +1016,12 @@ class HDF5IO(HDMFIO):
         target_builder = builder.builder
         path = self.__get_path(target_builder)
         # source will indicate target_builder's location
-        if builder.source == target_builder.source:
+        if export_source is None:
+            write_source = builder.source
+        else:
+            write_source = export_source
+
+        if write_source == target_builder.source:
             link_obj = SoftLink(path)
             self.logger.debug("    Creating SoftLink '%s/%s' to '%s'"
                               % (parent.name, name, link_obj.path))
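
The comments above describe the defensive pattern being introduced: the attributes that close() and close_linked_files() rely on are now assigned at the very top of __init__, and the close paths tolerate a partially constructed object. A minimal sketch of the same pattern, using a hypothetical FragileIO class (not hdmf code):

    class FragileIO:
        def __init__(self, path=None):
            # Assign everything close() touches before any code that may raise,
            # so cleanup is safe even if construction fails part-way.
            self._file = None
            self._open_links = []
            if path is None:
                raise ValueError("You must supply a path.")
            self._file = open(path, 'rb')

        def close(self):
            for obj in self._open_links:
                obj.close()
            self._open_links = []
            if self._file is not None:
                self._file.close()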


=====================================
src/hdmf/backends/io.py
=====================================
@@ -132,3 +132,6 @@ class HDMFIO(metaclass=ABCMeta):
 
     def __exit__(self, type, value, traceback):
         self.close()
+
+    def __del__(self):
+        self.close()


=====================================
src/hdmf/common/resources.py
=====================================
@@ -1,4 +1,5 @@
 import pandas as pd
+import numpy as np
 import re
 from . import register_class, EXP_NAMESPACE
 from . import get_type_map
@@ -165,6 +166,58 @@ class ExternalResources(Container):
         self.object_keys = kwargs['object_keys'] or ObjectKeyTable()
         self.type_map = kwargs['type_map'] or get_type_map()
 
+    @staticmethod
+    def assert_external_resources_equal(left, right, check_dtype=True):
+        """
+        Compare that the keys, resources, entities, objects, and object_keys tables match
+
+        :param left: ExternalResources object to compare with right
+        :param right: ExternalResources object to compare with left
+        :param check_dtype: Enforce strict checking of dtypes. Dtypes may differ, e.g.,
+            for ids, which may change from int64 to int32 depending on how the data
+            was saved. (Default: True)
+        :returns: The function returns True if all values match. If mismatches are found,
+            AssertionError will be raised.
+        :raises AssertionError: Raised if any differences are found. The function collects
+            all differences into a single error so that the assertion will indicate
+            all found differences.
+        """
+        errors = []
+        try:
+            pd.testing.assert_frame_equal(left.keys.to_dataframe(),
+                                          right.keys.to_dataframe(),
+                                          check_dtype=check_dtype)
+        except AssertionError as e:
+            errors.append(e)
+        try:
+            pd.testing.assert_frame_equal(left.objects.to_dataframe(),
+                                          right.objects.to_dataframe(),
+                                          check_dtype=check_dtype)
+        except AssertionError as e:
+            errors.append(e)
+        try:
+            pd.testing.assert_frame_equal(left.resources.to_dataframe(),
+                                          right.resources.to_dataframe(),
+                                          check_dtype=check_dtype)
+        except AssertionError as e:
+            errors.append(e)
+        try:
+            pd.testing.assert_frame_equal(left.entities.to_dataframe(),
+                                          right.entities.to_dataframe(),
+                                          check_dtype=check_dtype)
+        except AssertionError as e:
+            errors.append(e)
+        try:
+            pd.testing.assert_frame_equal(left.object_keys.to_dataframe(),
+                                          right.object_keys.to_dataframe(),
+                                          check_dtype=check_dtype)
+        except AssertionError as e:
+            errors.append(e)
+        if len(errors) > 0:
+            msg = ''.join(str(e)+"\n\n" for e in errors)
+            raise AssertionError(msg)
+        return True
+
     @docval({'name': 'key_name', 'type': str, 'doc': 'The name of the key to be added.'})
     def _add_key(self, **kwargs):
         """
@@ -243,8 +296,9 @@ class ExternalResources(Container):
                      'an external resource reference key. Use an empty string if not applicable.'),
              'default': ''},
             {'name': 'field', 'type': str, 'default': '',
-             'doc': ('The field of the compound data type using an external resource.')})
-    def _check_object_field(self, container, relative_path, field):
+             'doc': ('The field of the compound data type using an external resource.')},
+            {'name': 'create', 'type': bool, 'default': True})
+    def _check_object_field(self, container, relative_path, field, create):
         """
         Check if a container, relative path, and field have been added.
 
@@ -265,8 +319,10 @@ class ExternalResources(Container):
 
         if len(objecttable_idx) == 1:
             return self.objects.row[objecttable_idx[0]]
-        elif len(objecttable_idx) == 0:
+        elif len(objecttable_idx) == 0 and create:
             return self._add_object(container, relative_path, field)
+        elif len(objecttable_idx) == 0 and not create:
+            raise ValueError("Object not in Object Table.")
         else:
             raise ValueError("Found multiple instances of the same object id, relative path, "
                              "and field in objects table.")
@@ -449,14 +505,15 @@ class ExternalResources(Container):
 
         keys = []
         entities = []
-        object_field = self._check_object_field(container, relative_path, field)
+        object_field = self._check_object_field(container=container, relative_path=relative_path,
+                                                field=field, create=False)
         # Find all keys associated with the object
         for row_idx in self.object_keys.which(objects_idx=object_field.idx):
             keys.append(self.object_keys['keys_idx', row_idx])
         # Find all the entities/resources for each key.
         for key_idx in keys:
             entity_idx = self.entities.which(keys_idx=key_idx)
-            entities.append(self.entities.__getitem__(entity_idx[0]))
+            entities.append(list(self.entities.__getitem__(entity_idx[0])))
         df = pd.DataFrame(entities, columns=['keys_idx', 'resource_idx', 'entity_id', 'entity_uri'])
         return df
 
@@ -549,7 +606,8 @@ class ExternalResources(Container):
 
         # Step 4: Clean up the index and sort columns by table type and name
         result_df.reset_index(inplace=True, drop=True)
-        column_labels = [('objects', 'objects_idx'), ('objects', 'object_id'), ('objects', 'field'),
+        column_labels = [('objects', 'objects_idx'), ('objects', 'object_id'),
+                         ('objects', 'relative_path'), ('objects', 'field'),
                          ('keys', 'keys_idx'), ('keys', 'key'),
                          ('resources', 'resources_idx'), ('resources', 'resource'), ('resources', 'resource_uri'),
                          ('entities', 'entities_idx'), ('entities', 'entity_id'), ('entities', 'entity_uri')]
@@ -562,9 +620,8 @@ class ExternalResources(Container):
         # return the result
         return result_df
 
-    @docval({'name': 'db_file', 'type': str, 'doc': 'Name of the SQLite database file'},
-            rtype=pd.DataFrame, returns='A DataFrame with all data merged into a flat, denormalized table.')
-    def export_to_sqlite(self, db_file):
+    @docval({'name': 'db_file', 'type': str, 'doc': 'Name of the SQLite database file'})
+    def to_sqlite(self, db_file):
         """
         Save the keys, resources, entities, objects, and object_keys tables using sqlite3 to the given db_file.
 
@@ -580,9 +637,9 @@ class ExternalResources(Container):
         offset must be applied to the relevant foreign keys.
 
         :raises: The function will raise errors if connection to the database fails. If
-                 the given db_file already exists, then there is also the possibility that
-                 certain updates may result in errors if there are collisions between the
-                 new and existing data.
+            the given db_file already exists, then there is also the possibility that
+            certain updates may result in errors if there are collisions between the
+            new and existing data.
         """
         import sqlite3
         # connect to the database
@@ -650,3 +707,138 @@ class ExternalResources(Container):
             self.entities[:])
         connection.commit()
         connection.close()
+
+    @docval({'name': 'path', 'type': str, 'doc': 'path of the tsv file to write'})
+    def to_tsv(self, **kwargs):
+        """
+        Write ExternalResources as a single, flat table to TSV
+        Internally, the function uses :py:meth:`pandas.DataFrame.to_csv`. Pandas can
+        infer compression based on the filename, i.e., by changing the file extension to
+        ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, or ‘.zst’ we can write compressed files.
+        The TSV is formatted as follows: 1) line one indicates for each column the name of the table
+        the column belongs to, 2) line two is the name of the column within the table, 3) subsequent
+        lines are each a row in the flattened ExternalResources table. The first column is the
+        row id in the flattened table and does not have a label, i.e., the first and second
+        row will start with a tab character, and subsequent rows are numbered sequentially 1, 2, 3, ... .
+        For example:
+
+        .. code-block::
+            :linenos:
+
+            \tobjects\tobjects\tobjects\tobjects\tkeys\tkeys\tresources\tresources\tresources\tentities\tentities\tentities
+            \tobjects_idx\tobject_id\trelative_path\tfield\tkeys_idx\tkey\tresources_idx\tresource\tresource_uri\tentities_idx\tentity_id\tentity_uri
+            0\t0\t1fc87200-e91e-45b3-978c-6d295af144c3\t\tspecies\t0\tMus musculus\t0\tNCBI_Taxonomy\thttps://www.ncbi.nlm.nih.gov/taxonomy\t0\tNCBI:txid10090\thttps://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090
+            1\t0\t9bf0c58e-09dc-4457-a652-94065b112c41\t\tspecies\t1\tHomo sapiens\t0\tNCBI_Taxonomy\thttps://www.ncbi.nlm.nih.gov/taxonomy\t1\tNCBI:txid9606\thttps://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606
+
+        See also :py:meth:`~hdmf.common.resources.ExternalResources.from_tsv`
+        """  # noqa: E501
+        path = popargs('path', kwargs)
+        df = self.to_dataframe(use_categories=True)
+        df.to_csv(path, sep='\t')
+
+    @classmethod
+    @docval({'name': 'path', 'type': str, 'doc': 'path of the tsv file to read'},
+            returns="ExternalResources loaded from TSV", rtype="ExternalResources")
+    def from_tsv(cls, **kwargs):
+        """
+        Read ExternalResources from a flat tsv file
+        Formatting of the TSV file is assumed to be consistent with the format
+        generated by :py:meth:`~hdmf.common.resources.ExternalResources.to_tsv`.
+        The function attempts to validate that the data in the TSV is consistent
+        and parses the data from the denormalized table in the TSV to the
+        normalized linked table structure used by ExternalResources.
+        Currently the checks focus on ensuring that row id links between tables are valid.
+        Inconsistencies in other (non-index) fields (e.g., when two rows with the same resource_idx
+        have different resource_uri values) are not checked and will be ignored. In this case, the value
+        from the first row that contains the corresponding entry will be kept.
+
+        .. note::
+           Since TSV files may be edited by hand or other applications, it is possible that data
+           in the TSV may be inconsistent. E.g., object_idx may be missing if rows were removed
+           and ids not updated. Also since the TSV is flattened into a single denormalized table
+           (i.e., data are stored with duplication, rather than normalized across several tables),
+           it is possible that values may be inconsistent if edited outside. E.g., we may have
+           objects with the same index (object_idx) but different object_id, relative_path, or field
+           values. While flat TSVs are sometimes preferred for ease of sharing, editing
+           the TSV without using the :py:meth:`~hdmf.common.resources.ExternalResources` class
+           should be done with great care!
+        """
+        def check_idx(idx_arr, name):
+            """Check that indices are consecutively numbered without missing values"""
+            idx_diff = np.diff(idx_arr)
+            if np.any(idx_diff != 1):
+                missing_idx = [i for i in range(np.max(idx_arr)) if i not in idx_arr]
+                msg = "Missing %s entries %s" % (name, str(missing_idx))
+                raise ValueError(msg)
+
+        path = popargs('path', kwargs)
+        df = pd.read_csv(path, header=[0, 1], sep='\t').replace(np.nan, '')
+        # Construct the ExternalResources
+        er = ExternalResources(name="external_resources")
+
+        # Retrieve all the objects
+        ob_idx, ob_rows = np.unique(df[('objects', 'objects_idx')], return_index=True)
+        # Sort objects based on their index
+        ob_order = np.argsort(ob_idx)
+        ob_idx = ob_idx[ob_order]
+        ob_rows = ob_rows[ob_order]
+        # Check that objects are consecutively numbered
+        check_idx(idx_arr=ob_idx, name='objects_idx')
+        # Add the objects to the Object table
+        ob_ids = df[('objects', 'object_id')].iloc[ob_rows]
+        ob_relpaths = df[('objects', 'relative_path')].iloc[ob_rows]
+        ob_fields = df[('objects', 'field')].iloc[ob_rows]
+        for ob in zip(ob_ids, ob_relpaths, ob_fields):
+            er._add_object(container=ob[0], relative_path=ob[1], field=ob[2])
+
+        # Retrieve all keys
+        keys_idx, keys_rows = np.unique(df[('keys', 'keys_idx')], return_index=True)
+        # Sort keys based on their index
+        keys_order = np.argsort(keys_idx)
+        keys_idx = keys_idx[keys_order]
+        keys_rows = keys_rows[keys_order]
+        # Check that keys are consecutively numbered
+        check_idx(idx_arr=keys_idx, name='keys_idx')
+        # Add the keys to the Keys table
+        keys_key = df[('keys', 'key')].iloc[keys_rows]
+        all_added_keys = [er._add_key(k) for k in keys_key]
+
+        # Add all the object keys to the ObjectKeys table. A single key may be assigned to multiple
+        # objects. As such it is not sufficient to iterate over the unique ob_rows with the unique
+        # objects, but we need to find all unique (objects_idx, keys_idx) combinations.
+        ob_keys_idx = np.unique(df[[('objects', 'objects_idx'), ('keys', 'keys_idx')]], axis=0)
+        for obk in ob_keys_idx:
+            er._add_object_key(obj=obk[0], key=obk[1])
+
+        # Retrieve all resources
+        resources_idx, resources_rows = np.unique(df[('resources', 'resources_idx')], return_index=True)
+        # Sort resources based on their index
+        resources_order = np.argsort(resources_idx)
+        resources_idx = resources_idx[resources_order]
+        resources_rows = resources_rows[resources_order]
+        # Check that resources are consecutively numbered
+        check_idx(idx_arr=resources_idx, name='resources_idx')
+        # Add the resources to the Resources table
+        resources_resource = df[('resources', 'resource')].iloc[resources_rows]
+        resources_uri = df[('resources', 'resource_uri')].iloc[resources_rows]
+        for r in zip(resources_resource, resources_uri):
+            er._add_resource(resource=r[0], uri=r[1])
+
+        # Retrieve all entities
+        entities_idx, entities_rows = np.unique(df[('entities', 'entities_idx')], return_index=True)
+        # Sort entities based on their index
+        entities_order = np.argsort(entities_idx)
+        entities_idx = entities_idx[entities_order]
+        entities_rows = entities_rows[entities_order]
+        # Check that entities are consecutively numbered
+        check_idx(idx_arr=entities_idx, name='entities_idx')
+        # Add the entities to the Entities table
+        entities_id = df[('entities', 'entity_id')].iloc[entities_rows]
+        entities_uri = df[('entities', 'entity_uri')].iloc[entities_rows]
+        entities_keys = np.array(all_added_keys)[df[('keys', 'keys_idx')].iloc[entities_rows]]
+        entities_resources_idx = df[('resources', 'resources_idx')].iloc[entities_rows]
+        for e in zip(entities_keys, entities_resources_idx, entities_id, entities_uri):
+            er._add_entity(key=e[0], resources_idx=e[1], entity_id=e[2], entity_uri=e[3])
+
+        # Return the reconstructed ExternalResources
+        return er
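
A hedged round-trip sketch of the new to_tsv/from_tsv pair; the add_ref arguments mirror those used in the tests further below, and the file name is illustrative:

    from hdmf.common.resources import ExternalResources

    er = ExternalResources(name='terms')
    er.add_ref(container='uuid1', key='species',
               resource_name='NCBI_Taxonomy',
               resource_uri='https://www.ncbi.nlm.nih.gov/taxonomy',
               entity_id='NCBI:txid9606',
               entity_uri='https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606')

    # Flatten to TSV and rebuild; ids may be read back with a different integer
    # dtype, so dtype checking is relaxed as in the round-trip test below.
    er.to_tsv(path='external_resources.tsv')
    er_roundtrip = ExternalResources.from_tsv(path='external_resources.tsv')
    ExternalResources.assert_external_resources_equal(er, er_roundtrip, check_dtype=False)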


=====================================
src/hdmf/data_utils.py
=====================================
@@ -426,6 +426,16 @@ class DataChunkIterator(AbstractDataChunkIterator):
     i.e., multiple values from the input iterator can be combined to a single chunk. This is
     useful for buffered I/O operations, e.g., to improve performance by accumulating data
     in memory and writing larger blocks at once.
+
+    .. note::
+
+         DataChunkIterator assumes that the iterator that it wraps returns one element along the
+         iteration dimension at a time. I.e., the iterator is expected to return chunks that are
+         one dimension lower than the array itself. For example, when iterating over the first dimension
+         of a dataset with shape (1000, 10, 10), then the iterator would return 1000 chunks of
+         shape (10, 10) one-chunk-at-a-time. If this pattern does not match your use-case then
+         using :py:class:`~hdmf.data_utils.GenericDataChunkIterator` or
+         :py:class:`~hdmf.data_utils.AbstractDataChunkIterator` may be more appropriate.
     """
 
     __docval_init = (
@@ -585,10 +595,13 @@ class DataChunkIterator(AbstractDataChunkIterator):
         return self.__next_chunk
 
     def __next__(self):
-        r"""Return the next data chunk or raise a StopIteration exception if all chunks have been retrieved.
+        """
+        Return the next data chunk or raise a StopIteration exception if all chunks have been retrieved.
 
-        HINT: numpy.s\_ provides a convenient way to generate index tuples using standard array slicing. This
-        is often useful to define the DataChunk.selection of the current chunk
+        .. tip::
+
+            :py:attr:`numpy.s_` provides a convenient way to generate index tuples using standard array slicing. This
+            is often useful to define the DataChunk.selection of the current chunk
 
         :returns: DataChunk object with the data and selection of the current chunk
         :rtype: DataChunk
@@ -639,11 +652,19 @@ class DataChunkIterator(AbstractDataChunkIterator):
     @property
     def maxshape(self):
         """
-        Get a shape tuple describing the maximum shape of the array described by this DataChunkIterator. If an iterator
-        is provided and no data has been read yet, then the first chunk will be read (i.e., next will be called on the
-        iterator) in order to determine the maxshape.
+        Get a shape tuple describing the maximum shape of the array described by this DataChunkIterator.
+
+        .. note::
+
+            If an iterator is provided and no data has been read yet, then the first chunk will be read
+            (i.e., next will be called on the iterator) in order to determine the maxshape. The iterator
+            is expected to return single chunks along the iterator dimension, this means that maxshape will
+            add an additional dimension along the iteration dimension. E.g., if we iterate over
+            the first dimension and the iterator returns chunks of shape (10, 10), then the maxshape would
+            be (None, 10, 10) or (len(self.data), 10, 10), depending on whether size of the
+            iteration dimension is known.
 
-        :return: Shape tuple. None is used for dimenwions where the maximum shape is not known or unlimited.
+        :return: Shape tuple. None is used for dimensions where the maximum shape is not known or unlimited.
         """
         if self.__maxshape is None:
             # If no data has been read from the iterator yet, read the first chunk and use it to determine the maxshape
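
A minimal sketch illustrating the maxshape behavior described in the note above (the generator and its (10, 10) chunk shape are hypothetical):

    import numpy as np
    from hdmf.data_utils import DataChunkIterator

    def slice_gen(n_steps=1000):
        # yields one (10, 10) slice per step along the first (iteration) dimension
        for _ in range(n_steps):
            yield np.random.rand(10, 10)

    dci = DataChunkIterator(data=slice_gen())
    # The generator's length is unknown, so the iteration dimension is reported as None
    print(dci.maxshape)  # (None, 10, 10)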


=====================================
src/hdmf/testing/testcase.py
=====================================
@@ -34,29 +34,39 @@ class TestCase(unittest.TestCase):
 
         return self.assertWarnsRegex(warn_type, '^%s$' % re.escape(exc_msg), *args, **kwargs)
 
-    def assertContainerEqual(self, container1, container2,
-                             ignore_name=False, ignore_hdmf_attrs=False, ignore_string_to_byte=False):
+    def assertContainerEqual(self,
+                             container1,
+                             container2,
+                             ignore_name=False,
+                             ignore_hdmf_attrs=False,
+                             ignore_string_to_byte=False,
+                             message=None):
         """
         Asserts that the two AbstractContainers have equal contents. This applies to both Container and Data types.
 
+        :param container1: First container
+        :type container1: AbstractContainer
+        :param container2: Second container to compare with container 1
+        :type container2: AbstractContainer
         :param ignore_name: whether to ignore testing equality of name of the top-level container
         :param ignore_hdmf_attrs: whether to ignore testing equality of HDMF container attributes, such as
                                   container_source and object_id
         :param ignore_string_to_byte: ignore conversion of str to bytes and compare as unicode instead
+        :param message: custom additional message to include when any assertion within this check fails
         """
-        self.assertTrue(isinstance(container1, AbstractContainer))
-        self.assertTrue(isinstance(container2, AbstractContainer))
+        self.assertTrue(isinstance(container1, AbstractContainer), message)
+        self.assertTrue(isinstance(container2, AbstractContainer), message)
         type1 = type(container1)
         type2 = type(container2)
-        self.assertEqual(type1, type2)
+        self.assertEqual(type1, type2, message)
         if not ignore_name:
-            self.assertEqual(container1.name, container2.name)
+            self.assertEqual(container1.name, container2.name, message)
         if not ignore_hdmf_attrs:
-            self.assertEqual(container1.container_source, container2.container_source)
-            self.assertEqual(container1.object_id, container2.object_id)
+            self.assertEqual(container1.container_source, container2.container_source, message)
+            self.assertEqual(container1.object_id, container2.object_id, message)
         # NOTE: parent is not tested because it can lead to infinite loops
         if isinstance(container1, Container):
-            self.assertEqual(len(container1.children), len(container2.children))
+            self.assertEqual(len(container1.children), len(container2.children), message)
         # do not actually check the children values here. all children *should* also be fields, which is checked below.
         # this is in case non-field children are added to one and not the other
 
@@ -66,47 +76,103 @@ class TestCase(unittest.TestCase):
                 f2 = getattr(container2, field)
                 self._assert_field_equal(f1, f2,
                                          ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                         ignore_string_to_byte=ignore_string_to_byte)
+                                         ignore_string_to_byte=ignore_string_to_byte,
+                                         message=message)
+
+    def _assert_field_equal(self,
+                            f1,
+                            f2,
+                            ignore_hdmf_attrs=False,
+                            ignore_string_to_byte=False,
+                            message=None):
+        """
+        Internal helper function used to compare two fields from Container objects
 
-    def _assert_field_equal(self, f1, f2, ignore_hdmf_attrs=False, ignore_string_to_byte=False):
+        :param f1: The first field
+        :param f2: The second field
+        :param ignore_hdmf_attrs: whether to ignore testing equality of HDMF container attributes, such as
+                                  container_source and object_id
+        :param ignore_string_to_byte: ignore conversion of str to bytes and compare as unicode instead
+        :param message: custom additional message to include when any assertion within this check fails
+        """
         array_data_types = get_docval_macro('array_data')
         if (isinstance(f1, array_data_types) or isinstance(f2, array_data_types)):
             self._assert_array_equal(f1, f2,
                                      ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                     ignore_string_to_byte=ignore_string_to_byte)
+                                     ignore_string_to_byte=ignore_string_to_byte,
+                                     message=message)
         elif isinstance(f1, dict) and len(f1) and isinstance(f1.values()[0], Container):
-            self.assertIsInstance(f2, dict)
+            self.assertIsInstance(f2, dict, message)
             f1_keys = set(f1.keys())
             f2_keys = set(f2.keys())
-            self.assertSetEqual(f1_keys, f2_keys)
+            self.assertSetEqual(f1_keys, f2_keys, message)
             for k in f1_keys:
                 with self.subTest(module_name=k):
                     self.assertContainerEqual(f1[k], f2[k],
                                               ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                              ignore_string_to_byte=ignore_string_to_byte)
+                                              ignore_string_to_byte=ignore_string_to_byte,
+                                              message=message)
         elif isinstance(f1, Container):
             self.assertContainerEqual(f1, f2,
                                       ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                      ignore_string_to_byte=ignore_string_to_byte)
+                                      ignore_string_to_byte=ignore_string_to_byte,
+                                      message=message)
         elif isinstance(f1, Data):
             self._assert_data_equal(f1, f2,
                                     ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                    ignore_string_to_byte=ignore_string_to_byte)
+                                    ignore_string_to_byte=ignore_string_to_byte,
+                                    message=message)
         elif isinstance(f1, (float, np.floating)):
-            np.testing.assert_allclose(f1, f2)
+            np.testing.assert_allclose(f1, f2, err_msg=message)
         else:
-            self.assertEqual(f1, f2)
+            self.assertEqual(f1, f2, message)
+
+    def _assert_data_equal(self,
+                           data1,
+                           data2,
+                           ignore_hdmf_attrs=False,
+                           ignore_string_to_byte=False,
+                           message=None):
+        """
+        Internal helper function used to compare two :py:class:`~hdmf.container.Data` objects
 
-    def _assert_data_equal(self, data1, data2, ignore_hdmf_attrs=False, ignore_string_to_byte=False):
-        self.assertTrue(isinstance(data1, Data))
-        self.assertTrue(isinstance(data2, Data))
-        self.assertEqual(len(data1), len(data2))
+        :param data1: The first :py:class:`~hdmf.container.Data` object
+        :type data1: :py:class:`hdmf.container.Data`
+        :param data2: The second :py:class:`~hdmf.container.Data` object
+        :type data2: :py:class:`~hdmf.container.Data`
+        :param ignore_hdmf_attrs: whether to ignore testing equality of HDMF container attributes, such as
+                                  container_source and object_id
+        :param ignore_string_to_byte: ignore conversion of str to bytes and compare as unicode instead
+        :param message: custom additional message to include when any assertion within this check fails
+        """
+        self.assertTrue(isinstance(data1, Data), message)
+        self.assertTrue(isinstance(data2, Data), message)
+        self.assertEqual(len(data1), len(data2), message)
         self._assert_array_equal(data1.data, data2.data,
                                  ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                 ignore_string_to_byte=ignore_string_to_byte)
-        self.assertContainerEqual(data1, data2, ignore_hdmf_attrs=ignore_hdmf_attrs)
+                                 ignore_string_to_byte=ignore_string_to_byte,
+                                 message=message)
+        self.assertContainerEqual(container1=data1,
+                                  container2=data2,
+                                  ignore_hdmf_attrs=ignore_hdmf_attrs,
+                                  message=message)
+
+    def _assert_array_equal(self,
+                            arr1,
+                            arr2,
+                            ignore_hdmf_attrs=False,
+                            ignore_string_to_byte=False,
+                            message=None):
+        """
+        Internal helper function used to check whether two arrays are equal
 
-    def _assert_array_equal(self, arr1, arr2, ignore_hdmf_attrs=False, ignore_string_to_byte=False):
+        :param arr1: The first array
+        :param arr2: The second array
+        :param ignore_hdmf_attrs: whether to ignore testing equality of HDMF container attributes, such as
+                                  container_source and object_id
+        :param ignore_string_to_byte: ignore conversion of str to bytes and compare as unicode instead
+        :param message: custom additional message to include when any assertion within this check fails
+        """
         array_data_types = tuple([i for i in get_docval_macro('array_data')
                                   if (i != list and i != tuple and i != AbstractDataChunkIterator)])
         # We construct array_data_types this way to avoid explicit dependency on h5py, Zarr and other
@@ -119,52 +185,72 @@ class TestCase(unittest.TestCase):
             arr2 = arr2[()]
         if not isinstance(arr1, (tuple, list, np.ndarray)) and not isinstance(arr2, (tuple, list, np.ndarray)):
             if isinstance(arr1, (float, np.floating)):
-                np.testing.assert_allclose(arr1, arr2)
+                np.testing.assert_allclose(arr1, arr2, err_msg=message)
             else:
                 if ignore_string_to_byte:
                     if isinstance(arr1, bytes):
                         arr1 = arr1.decode('utf-8')
                     if isinstance(arr2, bytes):
                         arr2 = arr2.decode('utf-8')
-                self.assertEqual(arr1, arr2)  # scalar
+                self.assertEqual(arr1, arr2, message)  # scalar
         else:
-            self.assertEqual(len(arr1), len(arr2))
+            self.assertEqual(len(arr1), len(arr2), message)
             if isinstance(arr1, np.ndarray) and len(arr1.dtype) > 1:  # compound type
                 arr1 = arr1.tolist()
             if isinstance(arr2, np.ndarray) and len(arr2.dtype) > 1:  # compound type
                 arr2 = arr2.tolist()
             if isinstance(arr1, np.ndarray) and isinstance(arr2, np.ndarray):
                 if np.issubdtype(arr1.dtype, np.number):
-                    np.testing.assert_allclose(arr1, arr2)
+                    np.testing.assert_allclose(arr1, arr2, err_msg=message)
                 else:
-                    np.testing.assert_array_equal(arr1, arr2)
+                    np.testing.assert_array_equal(arr1, arr2, err_msg=message)
             else:
                 for sub1, sub2 in zip(arr1, arr2):
                     if isinstance(sub1, Container):
                         self.assertContainerEqual(sub1, sub2,
                                                   ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                                  ignore_string_to_byte=ignore_string_to_byte)
+                                                  ignore_string_to_byte=ignore_string_to_byte,
+                                                  message=message)
                     elif isinstance(sub1, Data):
                         self._assert_data_equal(sub1, sub2,
                                                 ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                                ignore_string_to_byte=ignore_string_to_byte)
+                                                ignore_string_to_byte=ignore_string_to_byte,
+                                                message=message)
                     else:
                         self._assert_array_equal(sub1, sub2,
                                                  ignore_hdmf_attrs=ignore_hdmf_attrs,
-                                                 ignore_string_to_byte=ignore_string_to_byte)
-
-    def assertBuilderEqual(self, builder1, builder2, check_path=True, check_source=True):
-        """Test whether two builders are equal. Like assertDictEqual but also checks type, name, path, and source.
+                                                 ignore_string_to_byte=ignore_string_to_byte,
+                                                 message=message)
+
+    def assertBuilderEqual(self,
+                           builder1,
+                           builder2,
+                           check_path=True,
+                           check_source=True,
+                           message=None):
+        """
+        Test whether two builders are equal. Like assertDictEqual but also checks type, name, path, and source.
+
+        :param builder1: The first builder
+        :type builder1: Builder
+        :param builder2: The second builder
+        :type builder2: Builder
+        :param check_path: Check that the builder.path values are equal
+        :type check_path: bool
+        :param check_source: Check that the builder.source values are equal
+        :type check_source: bool
+        :param message: Custom message to add when any assertion within this check fails
+        :type message: str or None (default=None)
         """
-        self.assertTrue(isinstance(builder1, Builder))
-        self.assertTrue(isinstance(builder2, Builder))
-        self.assertEqual(type(builder1), type(builder2))
-        self.assertEqual(builder1.name, builder2.name)
+        self.assertTrue(isinstance(builder1, Builder), message)
+        self.assertTrue(isinstance(builder2, Builder), message)
+        self.assertEqual(type(builder1), type(builder2), message)
+        self.assertEqual(builder1.name, builder2.name, message)
         if check_path:
-            self.assertEqual(builder1.path, builder2.path)
+            self.assertEqual(builder1.path, builder2.path, message)
         if check_source:
-            self.assertEqual(builder1.source, builder2.source)
-        self.assertDictEqual(builder1, builder2)
+            self.assertEqual(builder1.source, builder2.source, message)
+        self.assertDictEqual(builder1, builder2, message)
 
 
 class H5RoundTripMixin(metaclass=ABCMeta):
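
A small sketch of the new message argument in practice (hypothetical test class; the Data containers are built inline):

    from hdmf.container import Data
    from hdmf.testing import TestCase

    class MessageExample(TestCase):
        def test_data_equal(self):
            d1 = Data(name='x', data=[1, 2, 3])
            d2 = Data(name='x', data=[1, 2, 3])
            # object_id and container_source differ between the two instances,
            # so HDMF-specific attributes are ignored in this comparison
            self.assertContainerEqual(d1, d2,
                                      ignore_hdmf_attrs=True,
                                      message='Data containers should match after round-trip')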


=====================================
tests/unit/common/test_resources.py
=====================================
@@ -118,6 +118,7 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
              'object_id': {0: data1.object_id, 1: data1.object_id,
                            2: data2.object_id, 3: data2.object_id, 4: data2.object_id,
                            5: data3.object_id, 6: data3.object_id},
+             'relative_path': {0: '', 1: '', 2: '', 3: '', 4: '', 5: '', 6: ''},
              'field': {0: 'species', 1: 'species', 2: '', 3: '', 4: '', 5: '', 6: ''},
              'keys_idx': {0: 0, 1: 1, 2: 0, 3: 1, 4: 2, 5: 3, 6: 3},
              'key': {0: 'Mus musculus', 1: 'Homo sapiens', 2: 'Mus musculus', 3: 'Homo sapiens',
@@ -145,7 +146,7 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
         # Convert to dataframe with categories and compare against the expected result
         result_df = er.to_dataframe(use_categories=True)
         cols_with_categories = [
-            ('objects', 'objects_idx'), ('objects', 'object_id'), ('objects', 'field'),
+            ('objects', 'objects_idx'), ('objects', 'object_id'), ('objects', 'relative_path'), ('objects', 'field'),
             ('keys', 'keys_idx'), ('keys', 'key'),
             ('resources', 'resources_idx'), ('resources', 'resource'), ('resources', 'resource_uri'),
             ('entities', 'entities_idx'), ('entities', 'entity_id'), ('entities', 'entity_uri')]
@@ -153,6 +154,108 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
         expected_df = pd.DataFrame.from_dict(expected_df_data)
         pd.testing.assert_frame_equal(result_df, expected_df)
 
+    def test_assert_external_resources_equal(self):
+        er_left = ExternalResources('terms')
+        er_left.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        er_right = ExternalResources('terms')
+        er_right.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        self.assertTrue(ExternalResources.assert_external_resources_equal(er_left,
+                                                                          er_right))
+
+    def test_invalid_keys_assert_external_resources_equal(self):
+        er_left = ExternalResources('terms')
+        er_left.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        er_right = ExternalResources('terms')
+        er_right.add_ref(
+            container='invalid', key='invalid',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        with self.assertRaises(AssertionError):
+            ExternalResources.assert_external_resources_equal(er_left,
+                                                              er_right)
+
+    def test_invalid_objects_assert_external_resources_equal(self):
+        er_left = ExternalResources('terms')
+        er_left.add_ref(
+            container='invalid', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        er_right = ExternalResources('terms')
+        er_right.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        with self.assertRaises(AssertionError):
+            ExternalResources.assert_external_resources_equal(er_left,
+                                                              er_right)
+
+    def test_invalid_resources_assert_external_resources_equal(self):
+        er_left = ExternalResources('terms')
+        er_left.add_ref(
+            container='uuid1', key='key1',
+            resource_name='invalid', resource_uri='invalid',
+            entity_id="id11", entity_uri='url11')
+
+        er_right = ExternalResources('terms')
+        er_right.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        with self.assertRaises(AssertionError):
+            ExternalResources.assert_external_resources_equal(er_left,
+                                                              er_right)
+
+    def test_invalid_entity_assert_external_resources_equal(self):
+        er_left = ExternalResources('terms')
+        er_left.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="invalid", entity_uri='invalid')
+
+        er_right = ExternalResources('terms')
+        er_right.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        with self.assertRaises(AssertionError):
+            ExternalResources.assert_external_resources_equal(er_left,
+                                                              er_right)
+
+    def test_invalid_object_keys_assert_external_resources_equal(self):
+        er_left = ExternalResources('terms')
+        er_left.add_ref(
+            container='invalid', key='invalid',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        er_right = ExternalResources('terms')
+        er_right._add_key('key')
+        er_right.add_ref(
+            container='uuid1', key='key1',
+            resource_name='resource11', resource_uri='resource_uri11',
+            entity_id="id11", entity_uri='url11')
+
+        with self.assertRaises(AssertionError):
+            ExternalResources.assert_external_resources_equal(er_left,
+                                                              er_right)
+
     def test_add_ref(self):
         er = ExternalResources(name='terms')
         data = Data(name="species", data=['Homo sapiens', 'Mus musculus'])
@@ -165,6 +268,54 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
         self.assertEqual(er.entities.data, [(0, 0, 'entity_id1', 'entity1')])
         self.assertEqual(er.objects.data, [(data.object_id, '', '')])
 
+    def test_to_tsv_and_from_tsv(self):
+        # write er to file
+        self.container.to_tsv(path=self.export_filename)
+        # read er back from file and compare
+        er_obj = ExternalResources.from_tsv(path=self.export_filename)
+        # Check that the data is correct
+        ExternalResources.assert_external_resources_equal(er_obj, self.container, check_dtype=False)
+
+    def test_to_tsv_and_from_tsv_missing_keyidx(self):
+        # write er to file
+        df = self.container.to_dataframe(use_categories=True)
+        df.at[0, ('keys', 'keys_idx')] = 10  # Change keys_idx 0 to 10
+        df.to_csv(self.export_filename, sep='\t')
+        # read er back from file and compare
+        msg = "Missing keys_idx entries [0, 2, 3, 4, 5, 6, 7, 8, 9]"
+        with self.assertRaisesWith(ValueError, msg):
+            _ = ExternalResources.from_tsv(path=self.export_filename)
+
+    def test_to_tsv_and_from_tsv_missing_objectidx(self):
+        # write er to file
+        df = self.container.to_dataframe(use_categories=True)
+        df.at[0, ('objects', 'objects_idx')] = 10  # Change objects_idx 0 to 10
+        df.to_csv(self.export_filename, sep='\t')
+        # read er back from file and compare
+        msg = "Missing objects_idx entries [0, 2, 3, 4, 5, 6, 7, 8, 9]"
+        with self.assertRaisesWith(ValueError, msg):
+            _ = ExternalResources.from_tsv(path=self.export_filename)
+
+    def test_to_tsv_and_from_tsv_missing_resourcesidx(self):
+        # write er to file
+        df = self.container.to_dataframe(use_categories=True)
+        df.at[0, ('resources', 'resources_idx')] = 10  # Change resources_idx 0 to 10
+        df.to_csv(self.export_filename, sep='\t')
+        # read er back from file and compare
+        msg = "Missing resources_idx entries [0, 2, 3, 4, 5, 6, 7, 8, 9]"
+        with self.assertRaisesWith(ValueError, msg):
+            _ = ExternalResources.from_tsv(path=self.export_filename)
+
+    def test_to_tsv_and_from_tsv_missing_entitiesidx(self):
+        # write er to file
+        df = self.container.to_dataframe(use_categories=True)
+        df.at[0, ('entities', 'entities_idx')] = 10  # Change entities_idx 0 to 10
+        df.to_csv(self.export_filename, sep='\t')
+        # read er back from file and compare
+        msg = "Missing entities_idx entries [0, 2, 3, 4, 5, 6, 7, 8, 9]"
+        with self.assertRaisesWith(ValueError, msg):
+            _ = ExternalResources.from_tsv(path=self.export_filename)
+
     def test_add_ref_duplicate_resource(self):
         er = ExternalResources(name='terms')
         er.add_ref(
@@ -328,14 +479,17 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
 
     def test_get_object_resources(self):
         er = ExternalResources(name='terms')
-        data = Data(name='data_name', data=np.array([('Mus musculus', 9, 81.0), ('Homo sapien', 3, 27.0)],
-                    dtype=[('species', 'U14'), ('age', 'i4'), ('weight', 'f4')]))
+        table = DynamicTable(name='test_table', description='test table description')
+        table.add_column(name='test_col', description='test column description')
+        table.add_row(test_col='Mouse')
 
-        er.add_ref(container=data, key='Mus musculus', resource_name='NCBI_Taxonomy',
-                   resource_uri='https://www.ncbi.nlm.nih.gov/taxonomy',
+        er.add_ref(container=table, attribute='test_col', key='Mouse',
+                   resource_name='NCBI_Taxonomy',
+                   resource_uri='https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi',
                    entity_id='NCBI:txid10090',
-                   entity_uri='https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090')
-        received = er.get_object_resources(data)
+                   entity_uri='https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090',
+                   )
+        received = er.get_object_resources(table['test_col'])
         expected = pd.DataFrame(
             data=[[0, 0, 'NCBI:txid10090', 'https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090']],
             columns=['keys_idx', 'resource_idx', 'entity_id', 'entity_uri'])
@@ -366,7 +520,7 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
 
         self.assertEqual(er.objects.data, [('uuid1', '', ''), (data.object_id, '', '')])
 
-    def test_check_object_field_error(self):
+    def test_check_object_field_multi_error(self):
         er = ExternalResources(name='terms')
         data = Data(name="species", data=['Homo sapiens', 'Mus musculus'])
         er._check_object_field(data, '')
@@ -374,6 +528,12 @@ class TestExternalResources(H5RoundTripMixin, TestCase):
         with self.assertRaises(ValueError):
             er._check_object_field(data, '')
 
+    def test_check_object_field_not_in_obj_table(self):
+        er = ExternalResources(name='terms')
+        data = Data(name="species", data=['Homo sapiens', 'Mus musculus'])
+        with self.assertRaises(ValueError):
+            er._check_object_field(container=data, relative_path='', field='', create=False)
+
     def test_add_ref_attribute(self):
         # Test to make sure the attribute object is being used for the id
         # for the external reference.


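For context, a minimal usage sketch of the comparison and TSV round-trip helpers exercised by the new tests above (the resource/entity values and the 'terms.tsv' path are illustrative, not taken from this commit):

    from hdmf.common.resources import ExternalResources

    # Build two ExternalResources objects holding the same reference.
    er_left = ExternalResources('terms')
    er_left.add_ref(
        container='uuid1', key='key1',
        resource_name='resource11', resource_uri='resource_uri11',
        entity_id='id11', entity_uri='url11')

    er_right = ExternalResources('terms')
    er_right.add_ref(
        container='uuid1', key='key1',
        resource_name='resource11', resource_uri='resource_uri11',
        entity_id='id11', entity_uri='url11')

    # Raises AssertionError if keys, objects, resources, or entities differ.
    ExternalResources.assert_external_resources_equal(er_left, er_right)

    # TSV round trip: write to disk, read back, and compare.
    er_left.to_tsv(path='terms.tsv')
    er_read = ExternalResources.from_tsv(path='terms.tsv')
    ExternalResources.assert_external_resources_equal(er_read, er_left, check_dtype=False)
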
=====================================
tests/unit/test_io_hdf5_h5tools.py
=====================================
@@ -826,6 +826,34 @@ class TestHDF5IO(TestCase):
             self.assertEqual(io.manager, self.manager)
             self.assertEqual(io.source, self.path)
 
+    def test_delete_with_incomplete_construction_missing_file(self):
+        """
+        Here we test what happens when `close` is called before `HDF5IO.__init__` has
+        been completed. In this case, self.__file is missing.
+        """
+        class MyHDF5IO(HDF5IO):
+            def __init__(self):
+                self.__open_links = []
+                raise ValueError("interrupt before HDF5IO.__file is initialized")
+
+        with self.assertRaisesWith(exc_type=ValueError, exc_msg="interrupt before HDF5IO.__file is initialized"):
+            with MyHDF5IO() as _:
+                pass
+
+    def test_delete_with_incomplete_construction_missing_open_files(self):
+        """
+        Here we test what happens when `close` is called before `HDF5IO.__init__` has
+        been completed. In this case, self.__open_files is missing.
+        """
+        class MyHDF5IO(HDF5IO):
+            def __init__(self):
+                self.__file = None
+                raise ValueError("interrupt before HDF5IO.__open_files is initialized")
+
+        with self.assertRaisesWith(exc_type=ValueError, exc_msg="interrupt before HDF5IO.__open_files is initialized"):
+            with MyHDF5IO() as _:
+                pass
+
     def test_set_file_mismatch(self):
         self.file_obj = File(get_temp_filepath(), 'w')
         err_msg = ("You argued %s as this object's path, but supplied a file with filename: %s"
@@ -2489,6 +2517,31 @@ class TestExport(TestCase):
             # make sure the linked group is read from the first file
             self.assertEqual(read_foofile3.foo_link.container_source, self.paths[0])
 
+    def test_new_soft_link(self):
+        """Test that exporting a file with a newly created soft link makes the link internally."""
+        foo1 = Foo('foo1', [1, 2, 3, 4, 5], "I am foo1", 17, 3.14)
+        foobucket = FooBucket('bucket1', [foo1])
+        foofile = FooFile(buckets=[foobucket])
+
+        with HDF5IO(self.paths[0], manager=get_foo_buildmanager(), mode='w') as write_io:
+            write_io.write(foofile)
+
+        manager = get_foo_buildmanager()
+        with HDF5IO(self.paths[0], manager=manager, mode='r') as read_io:
+            read_foofile = read_io.read()
+            # make soft link to existing group
+            read_foofile.foo_link = read_foofile.buckets['bucket1'].foos['foo1']
+
+            with HDF5IO(self.paths[1], mode='w') as export_io:
+                export_io.export(src_io=read_io, container=read_foofile)
+
+        with HDF5IO(self.paths[1], manager=get_foo_buildmanager(), mode='r') as read_io:
+            self.ios.append(read_io)  # track IO objects for tearDown
+            read_foofile2 = read_io.read()
+
+            # make sure the linked group is read from the exported file
+            self.assertEqual(read_foofile2.foo_link.container_source, self.paths[1])
+
     def test_attr_reference(self):
         """Test that exporting a written file with attribute references maintains the references."""
         foo1 = Foo('foo1', [1, 2, 3, 4, 5], "I am foo1", 17, 3.14)


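For context, a sketch of the export pattern covered by test_new_soft_link above; the Foo container classes and get_foo_buildmanager are the test fixtures from tests/unit/utils.py (the import path below is assumed from the repository layout), and the file names are illustrative:

    from hdmf.backends.hdf5 import HDF5IO
    # Test-suite fixtures; import path assumed to match the repository layout.
    from tests.unit.utils import Foo, FooBucket, FooFile, get_foo_buildmanager

    # Write a file containing one Foo group.
    foo1 = Foo('foo1', [1, 2, 3, 4, 5], "I am foo1", 17, 3.14)
    foofile = FooFile(buckets=[FooBucket('bucket1', [foo1])])
    with HDF5IO('written.h5', manager=get_foo_buildmanager(), mode='w') as write_io:
        write_io.write(foofile)

    # Re-open it, add a soft link to that group, and export; the link in the
    # exported file should then resolve within the exported file itself.
    with HDF5IO('written.h5', manager=get_foo_buildmanager(), mode='r') as read_io:
        read_foofile = read_io.read()
        read_foofile.foo_link = read_foofile.buckets['bucket1'].foos['foo1']
        with HDF5IO('exported.h5', mode='w') as export_io:
            export_io.export(src_io=read_io, container=read_foofile)
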
=====================================
tests/unit/utils.py
=====================================
@@ -161,6 +161,7 @@ class FooFile(Container):
     def foo_link(self, value):
         if self.__foo_link is None:
             self.__foo_link = value
+            self.set_modified(True)
         else:
             raise ValueError("can't reset foo_link attribute")
 
@@ -172,6 +173,7 @@ class FooFile(Container):
     def foofile_data(self, value):
         if self.__foofile_data is None:
             self.__foofile_data = value
+            self.set_modified(True)
         else:
             raise ValueError("can't reset foofile_data attribute")
 
@@ -183,6 +185,7 @@ class FooFile(Container):
     def foo_ref_attr(self, value):
         if self.__foo_ref_attr is None:
             self.__foo_ref_attr = value
+            self.set_modified(True)
         else:
             raise ValueError("can't reset foo_ref_attr attribute")
 


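The set_modified(True) calls added above mark the container as modified when one of these write-once attributes is first assigned, so a subsequent write or export picks up the change. A minimal sketch of the same setter pattern (MyContainer and 'target' are illustrative names, not part of this commit):

    from hdmf.container import Container

    class MyContainer(Container):

        def __init__(self, name):
            super().__init__(name=name)
            self.__target = None

        @property
        def target(self):
            return self.__target

        @target.setter
        def target(self, value):
            if self.__target is None:
                self.__target = value
                # flag the container as dirty so the new value is written out
                self.set_modified(True)
            else:
                raise ValueError("can't reset target attribute")
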
=====================================
tox.ini
=====================================
@@ -35,7 +35,7 @@ commands =
 [testenv:py310-optional]
 basepython = python3.10
 install_command =
-    python -m pip install -e . {opts} {packages}
+    python -m pip install {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-opt.txt
@@ -45,7 +45,7 @@ commands = {[testenv]commands}
 [testenv:py310-upgraded]
 basepython = python3.10
 install_command =
-    python -m pip install -U -e . {opts} {packages}
+    python -m pip install -U {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-opt.txt
@@ -55,7 +55,7 @@ commands = {[testenv]commands}
 [testenv:py310-prerelease]
 basepython = python3.10
 install_command =
-    python -m pip install -U --pre -e . {opts} {packages}
+    python -m pip install -U --pre {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-opt.txt
@@ -101,7 +101,7 @@ commands = {[testenv:build]commands}
 [testenv:build-py310-upgraded]
 basepython = python3.10
 install_command =
-    python -m pip install -U -e . {opts} {packages}
+    python -m pip install -U {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-opt.txt
@@ -110,7 +110,7 @@ commands = {[testenv:build]commands}
 [testenv:build-py310-prerelease]
 basepython = python3.10
 install_command =
-    python -m pip install -U --pre -e . {opts} {packages}
+    python -m pip install -U --pre {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-opt.txt
@@ -165,7 +165,7 @@ commands = {[testenv:gallery]commands}
 [testenv:gallery-py310-upgraded]
 basepython = python3.10
 install_command =
-    python -m pip install -U -e . {opts} {packages}
+    python -m pip install -U {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-doc.txt
@@ -176,7 +176,7 @@ commands = {[testenv:gallery]commands}
 [testenv:gallery-py310-prerelease]
 basepython = python3.10
 install_command =
-    python -m pip install -U --pre -e . {opts} {packages}
+    python -m pip install -U --pre {opts} {packages}
 deps =
     -rrequirements-dev.txt
     -rrequirements-doc.txt



View it on GitLab: https://salsa.debian.org/med-team/hdmf/-/compare/664c29a9f1c75c26bf52b4d9093fd85b8103890a...a29f833ff55515faceca75b73740fb845c696cf2

-- 
You're receiving this email because of your account on salsa.debian.org.

