FWD: Clarification regarding FTP resource constraints for buildinfo files

Holger Levsen holger at layer-acht.org
Thu Nov 10 19:13:53 UTC 2016


Hi,

actually forwarding this to the bug.

And adding a small note: since August we have
buildinfo.debian.net, so maybe for a start it would be sufficient if dak
submitted these .buildinfo files via curl/https to buildinfo.d.n?

----- Forwarded message from Ximin Luo <infinity0 at debian.org> -----

Date: Wed, 24 Aug 2016 13:16:00 +0000
From: Ximin Luo <infinity0 at debian.org>
To: ftpmaster at debian.org
Cc: Reproducible Builds discussion list <reproducible-builds at lists.alioth.debian.org>
Subject: [Reproducible-builds] Clarification regarding FTP resource constraints for buildinfo files
Reply-To: Reproducible Builds discussion list <reproducible-builds at lists.alioth.debian.org>

Hi, I'm emailing to follow up regarding #763822. I know we have not yet come up
with a concrete proposal on that, and that is largely because we were waiting
for comments regarding the resource constraints of ftp-master and mirrors.

There is broad understanding across the R-B team that you'd prefer a design
that does not involve "lots of small files", but there's a lot of breadth in
this statement, and none of us know the precise details involved. Originally
Lunar proposed a design with 1 large file, but there are issues with this as
well, such as the performance of updates.

Here are our current main requirements as stated by dkg in message #10, and I
confirm they're still accurate as of today:

1. We want an archive user to be able to find and fetch all .buildinfo files that produced a given binary package
2. We want the eventual possibility of multiple .buildinfo files per <srcpkg,version,arch>
3. We understand that mirror operators don't like small files because rsync gets fussy with them.
4. We want both buildds and Debian developers to be able to upload .buildinfo files.

(4) by itself is easy; people have already written code to allow dak to accept
such files and discard them.

So we need to figure out how to reconcile (1,2,3). For this, it would be good
if you could tell me in more detail what the restriction (3) consists of.

We would never be uploading 10,000k buildinfo files at once, but Mattia tells
me that 1k might happen during medium binNMU transitions, growing up to 4k for
large transitions (but this would be over several days, i.e. split across
multiple runs of dinstall). Each buildinfo file is about 5.4k (median), with
7.7k as the 75th percentile, though the largest is 148k. [1]
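
To give a rough sense of the volumes involved, here is a back-of-the-envelope
sketch. It assumes the "k" sizes above are kilobytes and treats the median or
75th-percentile size as if it were a typical per-file size, which is only an
approximation (the true total depends on the full size distribution). The
function name is made up for illustration.

```python
# Figures quoted above; the unit (kB) is an assumption.
MEDIAN_KB = 5.4
P75_KB = 7.7


def estimate_total_mb(n_files, per_file_kb):
    """Rough total size in MB if every file had the given size."""
    return n_files * per_file_kb / 1000.0


# Medium transition (about 1k files) and large transition (about 4k files).
for n_files in (1000, 4000):
    low = estimate_total_mb(n_files, MEDIAN_KB)
    high = estimate_total_mb(n_files, P75_KB)
    print(f"{n_files} files: roughly {low:.1f} to {high:.1f} MB uncompressed")
```

So under these assumptions even a large transition is in the tens of megabytes
before compression; the open question is the number of files, not their bulk.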

There is also the distinction between uploading and mirroring. Just because we
might upload 1k files over a short time does not mean that we have to transfer
these to mirrors as 1k files. We could tar some of them up and compress them.
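
As a minimal sketch of the "tar some of them up" idea, the following packs many
small files into one compressed tarball using only the Python standard library.
This is an illustration, not a proposal for how dak would do it: the function
names, the choice of gzip, and the example file names are all assumptions.

```python
import io
import tarfile


def bundle_buildinfo(files):
    """Pack a {filename: bytes} mapping into one gzip-compressed tarball.

    Returns the tarball as bytes. Members are added in sorted order with a
    fixed mtime so that the member metadata does not depend on upload order.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_bundle(blob):
    """Return the member names stored in a tarball produced above."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
        return tar.getnames()


# Hypothetical example: two tiny buildinfo files become one mirrored file.
bundle = bundle_buildinfo({
    "b_1.0-1_amd64.buildinfo": b"Source: b\n",
    "a_1.0-1_amd64.buildinfo": b"Source: a\n",
})
print(list_bundle(bundle))
```

The mirrors would then see one file per bundle rather than one per upload; how
to group uploads into bundles is exactly what depends on the constraints asked
about here.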

So could you clarify some details regarding upload resource limits, as well as
mirroring resource limits?

For example, is one extra file per source-package OK or "too much"? Or one
extra file per binary upload? How about one extra file-update, of the same
file, per binary upload? (I assume that rsync means we are free to update any
files that we store in pool/, if we need to?)

More clarifications to the above, regarding what we *don't* need:

N1. It's not essential to store 1 uploaded-buildinfo-file per file-in-the-archive, as long as we can still do (1).
N2. We don't care particularly about being able to get *a specific buildinfo-file*, as long as we can still do (1).
N3. It's OK to over-satisfy (1) with extra irrelevant data, then the user can just filter this out locally.
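
To illustrate N3, here is a rough sketch of the local filtering a user could do
after fetching more buildinfo data than they need. It assumes unsigned,
plain-text .buildinfo files in the usual "Field: value" control-file layout
with Binary, Architecture and Version fields; the helper names are hypothetical
and the parser is deliberately minimal (it does not handle OpenPGP armor).

```python
def parse_fields(text):
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t":
            # Continuation line: append to the most recent field, if any.
            if key is not None:
                fields[key] += "\n" + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


def relevant_buildinfos(buildinfos, binary, version, arch):
    """Keep only the buildinfo texts that list the given binary package."""
    matches = []
    for text in buildinfos:
        fields = parse_fields(text)
        binaries = fields.get("Binary", "").split()
        arches = fields.get("Architecture", "").split()
        if (binary in binaries
                and fields.get("Version") == version
                and arch in arches):
            matches.append(text)
    return matches
```

A user who has downloaded a whole bundle could then call, for example,
relevant_buildinfos(all_texts, "libfoo1", "1.0-1", "amd64") and discard the
rest, which is why over-satisfying (1) is acceptable to us.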

We have more ideas, but I think it's best to keep this email short for now.
Also I don't know what is feasible until I hear more details about the
constraints, and it would be pointless to skip further ahead to potentially
unfeasible things.

X

[1] (use wget, too big for browser) https://tests.reproducible-builds.org/debian/buildinfo/unstable/amd64/?C=S;O=A

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

_______________________________________________
Reproducible-builds mailing list
Reproducible-builds at lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

----- End forwarded message -----


-- 
cheers,
	Holger