[Qa-debsources] patch-tracker DB schema

Stefano Zacchiroli zack at debian.org
Sun Mar 6 15:08:48 UTC 2016


On Tue, Mar 01, 2016 at 09:51:33AM +0100, Orestis Ioannou wrote:
> New file attached with modifications

Thanks.

I've noticed a typo "patch_format" vs "patches_format" in it.

> Ok i didn't know about all the other checksums. Thanks for pointing
> this to me. AFAIU we only need to store the checksum of the
> orig.tar.gz so that upstream can verify if the maintainer changed
> it. Do you think we need to store other checksums?

If it's just for that use case, I agree that the orig checksum would be
enough. However, there is still a multiplicity problem. AFAIR in
dpkg-source (3.0) you can have *multiple* orig tarballs, so we need a
way to store all those checksums, and therefore to associate the tarball
name to each checksum (otherwise we will not know which-is-which).

So it looks like we need either an extra table here, or maybe a jsonb
field containing an associative list mapping file names to checksums.  I
like the latter solution better, because it will allow to store easily
other checksums in the future if we need too.  But I've never checked
out sqlalchemy deals with postgres jsonb fields...

> CREATE INDEX ix_source_package_id ON source_package (package_id);

OK.

> For the moment the only query I am thinking is getting the checksum and
> patch format based on a package_id.
> Maybe another query i can imagine is grouping by format and then
> counting to make some statistics or graphs? But I am not really sure
> that we would need an index just for that.

Agreed, that should be trivial enough even without a index.

> There is a small problem with the schema though. I was hoping to be able
> to store in the DB whether a package _has_ patches. Right now we only
> store the format so we will always need to check the series file to
> verify if there are any patches.
[...]
> The thing is that I don't see a clean way to do this (i.e not storing a
> boolean :D) so if you have any idea on how to do this cleanly and you
> think it is worth it then shout :D

How about a separate patches table, that cross-reference the table
you've proposed here with individual patches?

That way we can store arbitrary metadata associated to each patch if we
need to, and nothing needs to be done at runtime by the webapp other
than querying the DB.  If we just want to count the number of patches,
it will be a trivial SELECT COUNT(), after having filtered the patches
table by source package id (that we can easily index).

Cheers.
-- 
Stefano Zacchiroli  . . . . . . .  zack at upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . . . . @zacchiro . . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/qa-debsources/attachments/20160306/7ba1af35/attachment.sig>


More information about the Qa-debsources mailing list