[Qa-debsources] patch-tracker DB schema

Orestis Ioannou orestis at oioannou.com
Tue Mar 1 08:51:33 UTC 2016


Heya,

New file attached with modifications

> 
> Just a nit: I think this enum should support all the formats listed in
> the DPKG-SOURCE(1) manpage. It will make the data type more complete,
> and it still will not force us to support anything more than the above 2
> entries in Debsources.
> 

Agreed.

>> CREATE TABLE source_package (
>>   id SERIAL NOT NULL,
>>   package_id BIGINT NOT NULL,
>>   format patches_format NOT NULL,
>>   sha256 VARCHAR(64) NOT NULL,
> 
> So, what checksum is that?
> 
> In an actual .dsc, there are the checksums of all the (other) components
> of the source package, usually an .orig.tar.gz and a .diff.gz. In
> addition to that, .changes files also have the checksum of the .dsc.
> 
> Calling this field just sha256 looks like too ambiguous to me. And it
> opens the question of whether we also want to store the other checksums.
> 

Ok i didn't know about all the other checksums. Thanks for pointing this
to me. AFAIU we only need to store the checksum of the orig.tar.gz so
that upstream can verify if the maintainer changed it. Do you think we
need to store other checksums?


> 
>> CREATE INDEX ix_source_package_id ON package (package_id);
> 
> We certainly also want an index on package_id here, because it will
> optimize the ON DELETE CASCADE.

Ok actually i had a mistake here.. I meant to write:

CREATE INDEX ix_source_package_id ON source_package (package_id);

What i wrote before doesn't make sense i guess.

> 
> Do we want other indexes? To answer that we'll need to know what will be
> the typical queries we're going to do on this query. Can you elaborate
> on that?
> 

For the moment the only query I am thinking is getting the checksum and
patch format based on a package_id.
Maybe another query i can imagine is grouping by format and then
counting to make some statistics or graphs? But I am not really sure
that we would need an index just for that.


There is a small problem with the schema though. I was hoping to be able
to store in the DB whether a package _has_ patches. Right now we only
store the format so we will always need to check the series file to
verify if there are any patches.
While this is not complicated or time consuming I was hoping to use the
DB for the versions view https://sources.debian.net/patches/flask/
(So we can put the asterisk for packages that contain no patches but the
format is supported).
The thing is that I don't see a clean way to do this (i.e not storing a
boolean :D) so if you have any idea on how to do this cleanly and you
think it is worth it then shout :D

> Thanks for your work! It's much appreciated and I'll try to be quicker
> in keeping up with this thread from now on :)
> 

Thanks for reviewing once again :) And there is absolutely no problem
with the delays :)

Cheers,

Orestis




-------------- next part --------------
A non-text attachment was scrubbed...
Name: 011-to-012.sql
Type: application/sql
Size: 499 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/qa-debsources/attachments/20160301/a8dffe06/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL: <http://lists.alioth.debian.org/pipermail/qa-debsources/attachments/20160301/a8dffe06/attachment.sig>


More information about the Qa-debsources mailing list