[Debian-l10n-devel] Faster importing of packages (ddtp.debian.org)
Stuart Prescott
stuart at debian.org
Sat Jul 28 14:59:57 UTC 2012
Hi Martijn,
> 2. Only import changes. Each day there are Packages.diff files
> produced with just the changes from the previous day. In theory you
> could use this to just import the packages that have changed. Problem
> is I can't find much information about how that actually works. It
> looks like an ed-style diff, but I'm not sure.
You are right that the pdiffs are ed-style diffs, and they are a right pain
in the rear end to work with. Unfortunately, you need the *old* Packages and
the diff to work out what changed; the *new* Packages file and the diff are
insufficient, as an ed-style diff records only line numbers and the new
text, not the lines it removes, so it is not reversible like a normal patch
is.
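To illustrate why the old file is needed, here is a minimal Python sketch of applying an ed-style diff to a list of lines. It is not taken from lspdiff or any existing tool; the name apply_ed_diff is made up for this example, and it handles only the a/c/d commands that "diff -e" normally emits, ignoring the escape used for text lines that consist of a single ".".

```python
import re

# Matches ed commands as emitted by "diff -e": "5a", "3d", "3,7c", ...
_CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def apply_ed_diff(old_lines, diff_lines):
    """Apply an ed-style diff to a list of lines (no trailing newlines).

    Hypothetical helper for illustration only. Commands are applied in
    the order given; "diff -e" lists them from the end of the file
    backwards, so earlier line numbers stay valid.
    """
    result = list(old_lines)
    it = iter(diff_lines)
    for line in it:
        m = _CMD.match(line)
        if m is None:
            raise ValueError("unexpected ed command: %r" % line)
        start = int(m.group(1))
        end = int(m.group(2) or start)
        op = m.group(3)
        new_text = []
        if op in "ac":
            # The text to insert follows, terminated by a lone "."
            for text in it:
                if text == ".":
                    break
                new_text.append(text)
        if op == "a":
            # Append after line `start` (1-based; 0 means top of file)
            result[start:start] = new_text
        else:
            # "d" deletes lines start..end, "c" replaces them
            result[start - 1:end] = new_text
    return result


old = ["a", "b", "c", "d"]
diff = ["4c", "D", ".", "2d", "0a", "top", "."]
print(apply_ed_diff(old, diff))  # ['top', 'a', 'c', 'D']
```

Note that the diff above never mentions the removed lines "b" and "d", only their line numbers, which is why the new file plus the diff cannot be turned back into the old file.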
A while back I wrote two pieces of code that could be helpful here:
* lspdiff:
Run as
    lspdiff --packages Packages.old --pdiff 2012-07-10-2042.36.pdiff
you get a list of package names that have changed in some way (added,
deleted or changed). You need to run it for each pdiff file in the sequence
from the oldest Packages you have through to the current Packages, creating
the intermediate Packages files with patch --ed for each step.
* deb822diff:
Run as
    deb822diff Packages.old Packages
you get a list of package names that have changed in some way (added,
deleted or changed). You only need to run it once for the "old" and "new"
Packages files. [This is actually a wrapper around a Python module, and it's
trivial to have it work on Packages.gz or Packages.bz2 or ... I wonder if
the module would be useful for python-debian one day.]
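For readers without access to either tool, the old-versus-new comparison can be approximated with the standard library alone. The sketch below is not the deb822diff code; open_packages, read_stanzas and changed_packages are hypothetical names, and it assumes stanzas are separated by blank lines and identified by their "Package:" field.

```python
import bz2
import gzip
import itertools


def open_packages(path):
    """Open a Packages file as text, handling .gz and .bz2 by extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_stanzas(lines):
    """Map package name -> set of raw stanzas (blank-line separated)."""
    stanzas = {}
    current = []
    # The extra "" makes sure the last stanza is flushed
    for line in itertools.chain(lines, [""]):
        line = line.rstrip("\n")
        if line.strip():
            current.append(line)
            continue
        if current:
            name = next((l.split(":", 1)[1].strip() for l in current
                         if l.startswith("Package:")), None)
            if name is not None:
                stanzas.setdefault(name, set()).add("\n".join(current))
            current = []
    return stanzas


def changed_packages(old, new):
    """Return sorted names that were added, deleted or changed."""
    names = set(old) | set(new)
    return sorted(n for n in names if old.get(n) != new.get(n))


old = read_stanzas(["Package: a", "Version: 1", "",
                    "Package: b", "Version: 1", "",
                    "Package: d", "Version: 1"])
new = read_stanzas(["Package: a", "Version: 2", "",
                    "Package: c", "Version: 1", "",
                    "Package: d", "Version: 1"])
print(changed_packages(old, new))  # ['a', 'b', 'c']
```

With real files the input would come from something like "with open_packages('Packages.old') as f: old = read_stanzas(f)". Comparing raw stanza text is the crudest possible test; a field-aware comparison would be needed if only some fields matter to the import.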
Of the two, I suspect that the latter is easier to work with. There is
clearly a little scripting work to do to make sure that you keep the old
Packages from the last full or incremental import around somewhere. Given
the list of changed packages, deleting those rows from the db and then
importing the updated rows would be the best approach. You could update
timestamps on non-changed entries at the same time if you wanted.
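The delete-then-reimport step might look roughly like the following. This is only a sketch: it uses sqlite3 as a stand-in for whatever database the import actually targets, and the table name, column names and the incremental_import function are all invented for the example.

```python
import sqlite3

# Hypothetical schema, for illustration only
SCHEMA = """
CREATE TABLE packages (
    package     TEXT PRIMARY KEY,
    description TEXT,
    last_seen   TEXT
)
"""


def incremental_import(conn, changed, new_rows, timestamp):
    """Replace rows for changed packages and refresh all timestamps.

    changed   -- package names reported as added, deleted or changed
    new_rows  -- dict of package name -> description for packages that
                 exist in the new Packages file
    timestamp -- value to store in last_seen
    """
    changed = list(changed)
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany("DELETE FROM packages WHERE package = ?",
                         [(name,) for name in changed])
        # Deleted packages are absent from new_rows and stay deleted
        conn.executemany(
            "INSERT INTO packages (package, description, last_seen) "
            "VALUES (?, ?, ?)",
            [(name, new_rows[name], timestamp)
             for name in changed if name in new_rows])
        # Rows left untouched are unchanged entries; bump their timestamp
        conn.execute("UPDATE packages SET last_seen = ?", (timestamp,))


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany("INSERT INTO packages VALUES (?, ?, ?)",
                 [("a", "old a", "2012-07-27"),
                  ("b", "old b", "2012-07-27"),
                  ("d", "d", "2012-07-27")])
incremental_import(conn, ["a", "b", "c"],
                   {"a": "new a", "c": "new c"}, "2012-07-28")
```

The blanket timestamp update is only safe if the table was in sync with the old Packages file before the run, which is the same assumption the incremental approach itself relies on.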
Both of these bits of code need some polish to make them useful to you
(and to UDD which is what I originally wrote them for) and I'm quite happy
to help with that -- feel free to contact me directly so we don't fill the
list with noise about this.
cheers
Stuart
--
Stuart Prescott http://www.nanonanonano.net/ stuart at nanonanonano.net
Debian Developer http://www.debian.org/ stuart at debian.org
GPG fingerprint BE65 FD1E F4EA 08F3 23D4 3C6D 9FE8 B8CD 71C5 D1A8