[Qa-debsources] patch-tracker DB schema

Stefano Zacchiroli zack at debian.org
Wed Apr 13 15:10:09 UTC 2016


On Sat, Mar 19, 2016 at 09:54:41PM +0100, Orestis Ioannou wrote:
> Looking at the policy I understood that the checksums in the .dsc are
> included in the .changes. So I guess we will only parse that one,
> right?

We don't have the actual .changes files; they're by-products of the
archive processing software and are not stored as-is on mirrors. (BTW,
you can look at what we have in /srv/debian-mirror on
sources.debian.net.) But we do have Sources files (e.g.,
/srv/debian-mirror/dists/stable/main/source/Sources.bz2), and they keep
all the checksums associated with each component of a source package, in
pretty much the same format as .changes files. Sources files can be
parsed easily with python-debian (deb822 module).
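
To make the layout concrete, here is a rough stdlib-only sketch of how
the per-file checksums could be collected from one Sources stanza. In
practice python-debian's deb822 module would do the parsing; the helper
name collect_checksums, the package name, and the file sizes below are
all made up for illustration.

```python
# Map each checksum-bearing field of a Sources stanza to an algorithm name.
CHECKSUM_FIELDS = {
    "Files": "md5",
    "Checksums-Sha1": "sha1",
    "Checksums-Sha256": "sha256",
}

# Hypothetical stanza: package name and sizes are invented for the example.
SAMPLE_STANZA = """\
Package: example
Format: 3.0 (quilt)
Version: 1.0-1
Files:
 4c31ab7bfc40d3cf49d7811987390357 1200 example_1.0-1.dsc
 c6f698f19f2a2aa07dbb9bbda90a2754 50000 example_1.0.orig.tar.gz
 938512f08422f3509ff36f125f5873ba 3000 example_1.0-1.debian.tar.gz
Checksums-Sha1:
 1f418afaa01464e63cc1ee8a66a05f0848bd155c 1200 example_1.0-1.dsc
 a0ed1456fad61116f868b1855530dbe948e20f06 50000 example_1.0.orig.tar.gz
 5e86ecf0671e113b63388dac81dd8d00e00ef298 3000 example_1.0-1.debian.tar.gz
"""


def collect_checksums(stanza):
    """Return {file name: {algorithm: digest, 'size': int}} for one stanza."""
    result = {}
    algo = None  # algorithm of the multi-line field currently being read
    for line in stanza.splitlines():
        if line.startswith((" ", "\t")):
            # Continuation line: only meaningful inside a checksum field.
            parts = line.split()
            if algo is None or len(parts) != 3:
                continue
            digest, size, name = parts
            entry = result.setdefault(name, {})
            entry[algo] = digest
            entry["size"] = int(size)
        else:
            # New field: remember its algorithm, or None if it is not a checksum field.
            field = line.split(":", 1)[0]
            algo = CHECKSUM_FIELDS.get(field)
    return result


checksums = collect_checksums(SAMPLE_STANZA)
```

Keying the result by file name means that a package with more than
three files needs no special handling.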

> Another thing i understood is that for source 3.0 packages there are
> always 3 checksums:
> 
> - orig.tar.gz
> - debian.tar.gz
> - dsc
> 
> plus all the checksums of the debs. I covered in the spec the ability to
> add the checksums of the debs as well, although I am not entirely sure
> they are needed. Well, I am not sure how anyone could use them.

Nitpick here: there might be more than 3 files composing a source
package in source 3.0 format. That makes it possible to "merge" multiple
upstream tarballs into a single source package, for instance.

> {
>     orig.tar.gz: {
>         sha1: 'a0ed1456fad61116f868b1855530dbe948e20f06',
>         sha256: '0d123be7f51e61c4bf15e5c492b484054be7e90f3081608a5517007bfb1fd128',
>         md5: 'c6f698f19f2a2aa07dbb9bbda90a2754',
>     },
>     debian.tar.gz: {
>         sha1: '5e86ecf0671e113b63388dac81dd8d00e00ef298',
>         sha256: 'f54ae966a5f580571ae7d9ef5e1df0bd42d63e27cb505b27957351a495bc6288',
>         md5: '938512f08422f3509ff36f125f5873ba',
>     },
>     dsc: {
>         sha1: '1f418afaa01464e63cc1ee8a66a05f0848bd155c',
>         sha256: 'ac9d57254f7e835bed299926fd51bf6f534597cc3fcc52db01c4bffedae81272',
>         md5: '4c31ab7bfc40d3cf49d7811987390357',
>     },

As per my comment above, we need to keep open the possibility of having
more top-level entries here, but that shouldn't be a problem, as it's a
dictionary and the file names are not supposed to collide. OTOH we might
want to add an extra field to the inner dictionaries to easily find out
which file is the dsc, which is the debian tarball, and which ones form
the (possibly multiple) orig tarballs.
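
One way such an extra field could be filled in is sketched below. The
field name "role", its values, and the helper names are only
suggestions, and the file names are hypothetical. The patterns assume
the usual naming of source 3.0 components (.dsc, .debian.tar.*,
.orig.tar.* and .orig-COMPONENT.tar.*); anything else is tagged
"other".

```python
import re

# (role, pattern) pairs, tried in order; the role names are a suggestion only.
ROLE_PATTERNS = [
    ("dsc", re.compile(r"\.dsc$")),
    ("debian", re.compile(r"\.debian\.tar\.[a-z0-9]+$")),
    # Matches both the main orig tarball and additional component tarballs.
    ("orig", re.compile(r"\.orig(-[A-Za-z0-9-]+)?\.tar\.[a-z0-9]+$")),
]


def file_role(name):
    """Guess the role of a source package file from its name."""
    for role, pattern in ROLE_PATTERNS:
        if pattern.search(name):
            return role
    return "other"


def annotate_roles(checksums):
    """Return a copy of the checksum dictionary with a 'role' field added."""
    return {
        name: dict(info, role=file_role(name))
        for name, info in checksums.items()
    }


# Hypothetical package with two upstream tarballs; digests shortened on purpose.
example = {
    "example_1.0-1.dsc": {"md5": "aaaa"},
    "example_1.0.orig.tar.gz": {"md5": "bbbb"},
    "example_1.0.orig-docs.tar.xz": {"md5": "cccc"},
    "example_1.0-1.debian.tar.gz": {"md5": "dddd"},
}
annotated = annotate_roles(example)
```

With this, finding all the orig tarballs of a package is a simple
filter on role == "orig", however many of them there are.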

> CREATE TYPE patches_format as ENUM(
>     '1.0',
>     '2.0',
>     '3.0 (native)',
>     '3.0 (quilt)',
>     '3.0 (custom)',
>     '3.0 (git)'
> );

> CREATE TABLE source_package (
>   id SERIAL NOT NULL,
>   package_id BIGINT NOT NULL,
>   format patches_format NOT NULL,

I guess we should s/patches_format/source_format/ (both here and in the
data type), because that's really what it is. (We will *use* it to
decide how to parse patches, but that's a consequence, not the native
information.)

Other than that, looks good to me.

> I think an index on checksums would be useful. What do you think?

Might be useful, yes.

Thanks!
-- 
Stefano Zacchiroli  . . . . . . .  zack at upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . . . . @zacchiro . . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »