[Piuparts-devel] get_files_owned_by_packages

Herbert Fortes terberh at gmail.com
Wed May 15 20:52:34 BST 2019



 [0] - https://salsa.debian.org/debian/piuparts/blob/develop/piuparts.py#L1661

>>>> for basename in vdir.glob("*.list"):
>>>>     for line in basename.read_text().split("\n"):
>>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>

I did not try glob() inside a generator expression. And about this:

>>>
>>> Will be considerably faster if /var/.../info has a non-trivial size
>>> (though I am unsure if the "basename" variable from your example can be
>>> passed to open)

I got it after lunch.
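
A minimal sketch of the point quoted above, for reference. Since Python 3.6, open() accepts a pathlib.Path object, so a file found by glob() can be iterated line by line instead of being read whole with read_text(). The helper name iter_pathnames and the sample file are made up for illustration; they are not taken from piuparts.py.

```python
import tempfile
from pathlib import Path


def iter_pathnames(list_file):
    """Yield stripped, non-empty lines from a .list file, one at a time."""
    # open() accepts Path objects directly (os.PathLike, Python 3.6+).
    with open(list_file) as handle:
        for line in handle:
            pathname = line.strip()
            if pathname:
                yield pathname


# Throwaway sample file standing in for a real dpkg info file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "demo.list"
    sample.write_text("/usr/bin/demo\n/usr/share/doc/demo\n")
    lazy = list(iter_pathnames(sample))
    # Same content read in one go, as in the read_text() variant.
    eager = [entry for entry in sample.read_text().split("\n") if entry]
```

Both variants yield the same pathnames; the lazy one avoids building the intermediate string and list for each file.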

>> There are basically two options.  Go back to (a variant) of the original
>> bulk code or use a defaultdict.  Caveat here; the defaultdict can hide
>> "KeyError"s (and that can lead to high memory consumption as you are now
>> creating a bunch of empty lists that you did not expect).


Like here.
 
> base_name = ((os.path.join(vdir, basename), basename[:-len(".list")])
>              for basename in os.listdir(vdir)
>              if basename.endswith(".list"))
> 

 - path_obj (pathlib variant):
pathf_and_pkg = ((basename, basename.stem) for basename in vdir.glob("*.list"))

for basename, pkg in pathf_and_pkg:
    for line in readlines_file(basename):
        pathname = line.strip()
        if pathname in vdict:
            vdict[pathname].append(pkg)
        else:
            vdict[pathname] = [pkg]
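
For comparison, the defaultdict option mentioned in the quoted text could look roughly like the sketch below. It is only an illustration under assumptions: the function name files_owned_by_packages and the sample directory are hypothetical, and this variant was not part of the timings. Converting the result back to a plain dict is one way to avoid the caveat about hidden KeyErrors and unexpected empty lists.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def files_owned_by_packages(info_dir):
    """Map each pathname to the list of packages whose .list file names it."""
    owners = defaultdict(list)
    for list_file in Path(info_dir).glob("*.list"):
        pkg = list_file.stem  # "foo.list" -> "foo"
        with list_file.open() as handle:
            for line in handle:
                pathname = line.strip()
                if pathname:
                    owners[pathname].append(pkg)
    # Return a plain dict so that a later lookup of an unknown path raises
    # KeyError instead of silently creating an empty list.
    return dict(owners)


# Tiny demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "foo.list").write_text("/usr/bin/foo\n/usr/share/doc\n")
    Path(tmp, "bar.list").write_text("/usr/bin/bar\n/usr/share/doc\n")
    demo = files_owned_by_packages(tmp)
```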

original  - t_get: 286.9121274860008 seconds (1000 loops).
base_name - gen_if_back: 283.37079794999954 seconds (1000 loops).
          - path_obj: 315.0043815299978 seconds (1000 loops).
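
A comparison of this kind could be set up along the following lines. This is a sketch only: it assumes the timeit module, uses a tiny temporary directory instead of the real dpkg info directory, and the function names with_listdir and with_pathlib are hypothetical stand-ins for the os.listdir and pathlib variants. It will not reproduce the numbers above.

```python
import os
import tempfile
import timeit
from pathlib import Path


def with_listdir(info_dir):
    """os.listdir variant: filter on the ".list" suffix by hand."""
    owners = {}
    for basename in os.listdir(info_dir):
        if basename.endswith(".list"):
            pkg = basename[:-len(".list")]
            with open(os.path.join(info_dir, basename)) as handle:
                for line in handle:
                    owners.setdefault(line.strip(), []).append(pkg)
    return owners


def with_pathlib(info_dir):
    """pathlib variant: let glob() select the files."""
    owners = {}
    for list_file in Path(info_dir).glob("*.list"):
        with list_file.open() as handle:
            for line in handle:
                owners.setdefault(line.strip(), []).append(list_file.stem)
    return owners


def normalised(owners):
    """Sort package lists so results compare equal regardless of file order."""
    return {path: sorted(pkgs) for path, pkgs in owners.items()}


with tempfile.TemporaryDirectory() as tmp:
    for name in ("foo", "bar"):
        Path(tmp, name + ".list").write_text(
            "/usr/bin/%s\n/usr/share/doc\n" % name)
    # Both variants should build the same mapping.
    same = normalised(with_listdir(tmp)) == normalised(with_pathlib(tmp))
    # Total seconds for 100 calls of each variant.
    timings = {
        func.__name__: timeit.timeit(lambda: func(tmp), number=100)
        for func in (with_listdir, with_pathlib)
    }
```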



Regards,
Herbert


