Bug#839538: diffoscope: json: detect order-only differences
Daniel Shahaf
danielsh at apache.org
Sat Oct 1 18:06:38 UTC 2016
Control: tags -1 patch
Daniel Shahaf wrote on Sat, Oct 01, 2016 at 17:23:42 +0000:
> It would be better to report "json files are equal up to order of
> elements in an object (= hash, dictionary, associative array)", and to
> print the difference in a more readable way than a hex dump. (For
> example, a linewise diff of pretty-printed json.)
Proposed patch attached. It behaves as follows:
[[[
% head *.json
==> 1.json <==
{ "hello": 42, "world": 43 }
==> 2.json <==
{ "world": 43, "hello": 42 }
% bin/diffoscope *.json
--- 1.json
+++ 2.json
│ --- 1.json
├── +++ 2.json
│┄ ordering differences only
│ @@ -1,4 +1,4 @@
│  {
│ -    "hello": 42,
│ -    "world": 43
│ +    "world": 43,
│ +    "hello": 42
│  }
╵
]]]
It passes the existing test suite, but I haven't yet tried writing
a unit test for this.
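
The two-pass idea can also be tried outside diffoscope. The following is a minimal standalone sketch, not the patch itself: `pretty` and `classify_json_difference` are hypothetical names, and `difflib` stands in for diffoscope's own `Difference.from_text`. It first compares the key-sorted pretty-printed forms; only if those are identical does it compare the forms that keep the original key order.

```python
import difflib
import json
from collections import OrderedDict


def pretty(text, sort_keys):
    """Parse JSON text, keeping key order, and pretty-print it."""
    parsed = json.loads(text, object_pairs_hook=OrderedDict)
    return json.dumps(parsed, indent=4, sort_keys=sort_keys)


def classify_json_difference(text1, text2):
    """Return (kind, diff_lines); kind is 'content', 'order-only' or 'equal'.

    Hypothetical helper for illustration only; not part of diffoscope.
    """
    # Pass 1 sorts keys, so any difference found is a real content change.
    # Pass 2 keeps the original key order, so what remains is ordering.
    for sort_keys, kind in ((True, "content"), (False, "order-only")):
        a = pretty(text1, sort_keys).splitlines()
        b = pretty(text2, sort_keys).splitlines()
        if a != b:
            diff = list(difflib.unified_diff(a, b, "1.json", "2.json",
                                             lineterm=""))
            return kind, diff
    return "equal", []


kind, diff = classify_json_difference('{ "hello": 42, "world": 43 }',
                                      '{ "world": 43, "hello": 42 }')
print(kind)
print("\n".join(diff))
```

With the two example files above, this should report an order-only difference and print a linewise diff of the pretty-printed objects, much like the output shown earlier.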
Cheers,
Daniel
diff --git a/diffoscope/comparators/json.py b/diffoscope/comparators/json.py
index d16a762..8d0c104 100644
--- a/diffoscope/comparators/json.py
+++ b/diffoscope/comparators/json.py
@@ -17,6 +17,7 @@
 # You should have received a copy of the GNU General Public License
 # along with diffoscope. If not, see <http://www.gnu.org/licenses/>.
 
+from collections import OrderedDict
 import re
 import json
 
@@ -34,18 +35,26 @@ class JSONFile(File):
         with open(file.path) as f:
             try:
-                file.parsed = json.load(f)
+                file.parsed = json.load(f, object_pairs_hook=OrderedDict)
             except json.JSONDecodeError:
                 return False
 
         return True
 
     def compare_details(self, other, source=None):
-        return [Difference.from_text(self.dumps(self), self.dumps(other),
-            self.path, other.path)]
+        difference = Difference.from_text(self.dumps(self), self.dumps(other),
+            self.path, other.path)
+        if difference:
+            return [difference]
+
+        difference = Difference.from_text(self.dumps(self, sort_keys=False),
+                                          self.dumps(other, sort_keys=False),
+                                          self.path, other.path,
+                                          comment="ordering differences only")
+        return [difference]
 
     @staticmethod
-    def dumps(file):
+    def dumps(file, sort_keys=True):
         if not hasattr(file, 'parsed'):
             return ""
 
-        return json.dumps(file.parsed, indent=4, sort_keys=True)
+        return json.dumps(file.parsed, indent=4, sort_keys=sort_keys)
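
A note on the `object_pairs_hook=OrderedDict` change in the patch: on Python versions where a plain `dict` does not guarantee insertion order (before 3.7), `json.load` would otherwise lose the key order of the input, and the second, unsorted comparison would have nothing reliable to report. The short sketch below uses only the standard library to show the effect; the variable names are illustrative and do not come from diffoscope.

```python
import json
from collections import OrderedDict

text = '{ "world": 43, "hello": 42 }'

# OrderedDict receives the (key, value) pairs in the order they appear
# in the input, so the original ordering survives parsing.
parsed = json.loads(text, object_pairs_hook=OrderedDict)
keys_in_file_order = list(parsed)

# sort_keys=False reproduces the input order; sort_keys=True normalises it.
as_written = json.dumps(parsed, sort_keys=False)
normalised = json.dumps(parsed, sort_keys=True)

print(keys_in_file_order)
print(as_written)
print(normalised)
```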