Bug#863879: diffoscope: Optimize the common case of feeders.

Daniel Shahaf danielsh at apache.org
Mon May 29 15:14:54 UTC 2017


Source: diffoscope
Version: 82
Severity: wishlist
Tags: patch

Dear Maintainer,

Please find attached a patch optimising the feeder codepath.  I don't have
performance numbers, but I expect a check against the None default to be faster
than calling an identity lambda function for every line.
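
One rough way to check that expectation is a timeit comparison.  The sketch
below is not part of the patch: feed_with_lambda and feed_with_none are
hypothetical stand-ins for the per-line loop in make_feeder_from_raw_reader,
and the input is synthetic.  It prints timings without assuming which variant
wins on a given interpreter.

```python
import timeit


def feed_with_lambda(lines, filter=lambda buf: buf):
    # Old behaviour: always call the filter, even the identity default.
    return [filter(buf) for buf in lines]


def feed_with_none(lines, filter=None):
    # Patched behaviour: skip the call entirely when no filter is given.
    return [filter(buf) if filter else buf for buf in lines]


# Synthetic input standing in for the lines read from in_file.
LINES = [b"line %d\n" % i for i in range(10000)]

if __name__ == "__main__":
    # Both variants must produce identical output before timing means anything.
    assert feed_with_lambda(LINES) == feed_with_none(LINES)

    for func in (feed_with_lambda, feed_with_none):
        seconds = timeit.timeit(lambda: func(LINES), number=20)
        print("%-18s %.4f s" % (func.__name__, seconds))
```

Passing an explicit filter to either helper exercises the uncommon path, which
the patch leaves functionally unchanged.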

Cheers,

Daniel

[[[
From f29fb71aba5ed79f9f517c794be2f555b762fe12 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <danielsh at apache.org>
Date: Mon, 29 May 2017 15:13:53 +0000
Subject: [PATCH 1/2] diffoscope.difference: Optimize the common case.

Don't call a lambda function object.
---
 diffoscope/difference.py | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/diffoscope/difference.py b/diffoscope/difference.py
index ca45041..c1f0537 100644
--- a/diffoscope/difference.py
+++ b/diffoscope/difference.py
@@ -247,9 +247,13 @@ class Difference(object):
         self._visuals.extend(visuals)
         self._size_cache = None
 
-def make_feeder_from_text_reader(in_file, filter=lambda text_buf: text_buf):
-    def encoding_filter(text_buf):
-        return filter(text_buf).encode('utf-8')
+def make_feeder_from_text_reader(in_file, filter=None):
+    if filter:
+        def encoding_filter(text_buf):
+            return filter(text_buf).encode('utf-8')
+    else:
+        def encoding_filter(text_buf):
+            return text_buf.encode('utf-8')
     return make_feeder_from_raw_reader(in_file, encoding_filter)
 
 def make_feeder_from_command(command):
@@ -264,7 +268,7 @@ def make_feeder_from_command(command):
         return end_nl
     return feeder
 
-def make_feeder_from_raw_reader(in_file, filter=lambda buf: buf):
+def make_feeder_from_raw_reader(in_file, filter=None):
     def feeder(out_file):
         max_lines = Config().max_diff_input_lines
         line_count = 0
@@ -274,7 +278,7 @@ def make_feeder_from_raw_reader(in_file, filter=lambda buf: buf):
             h = hashlib.sha1()
         for buf in in_file:
             line_count += 1
-            out = filter(buf)
+            out = filter(buf) if filter else buf
             if h:
                 h.update(out)
             if line_count < max_lines:
]]]


