[Debian-salsa-ci] dumat + salsa-ci
Luca Boccassi
bluca at debian.org
Mon Sep 18 14:25:35 BST 2023
On Sun, 17 Sept 2023 at 20:43, Helmut Grohne <helmut at subdivi.de> wrote:
>
> Hi Luca and Santiago,
>
> I'm writing to you, because
> * Luca expressed interest in helping plug dumat into salsa-ci
> * Santiago is doing work for Freexian and has experience with salsa-ci
>
> dumat is https://salsa.debian.org/helmutg/dumat.
>
> One of the results from the /usr-merge DebConf23 BoF was the wish to add
> dumat to the salsa CI pipeline. I have now put the groundwork for that
> in place. Can I ask you to figure out a reasonable way of integrating
> this into salsa-ci?
>
> Let us assume that you have a foo.changes where the Distribution field
> is "unstable", "experimental" or a Debian codename. Then you can perform
> the following steps.
>
> Download the current database. This will download about 50MB and create
> a 1.5GB database file.
>
> curl -s https://subdivi.de/~helmut/dumat.sql.zst | zstd -d | sqlite3 dumat.db
>
> This file changes every 6h. Would it be possible to have some form of
> caching such that this is not downloaded for every single pipeline run?
> It would also be possible to have salsa-ci maintain its own version of
> this database by regularly updating it locally. The steps to create this
> database are as follows.
>
> sqlite3 dumat.db < schema.sql
> ./import_mirror.py -d dumat.db
>
> That latter step will download very many packages from deb.debian.org on
> the first invocation. In later invocations, it'll do incremental
> updates.
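
On the caching question, a minimal sketch of a freshness check a job could run before downloading. It assumes the cache is a plain file whose modification time reflects when it was fetched; the function name and the default path handling are illustrative, not part of dumat or salsa-ci.

```python
import os
import time

# Assumption: the published dump is regenerated every 6 hours, as stated above.
REFRESH_SECONDS = 6 * 60 * 60


def cache_is_fresh(path, now=None, max_age=REFRESH_SECONDS):
    """Return True if `path` exists and was modified less than `max_age` seconds ago."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable cache file is never fresh.
        return False
    return (now - mtime) < max_age
```

A job would download only when `cache_is_fresh("dumat.db")` is false, and would then work on a private copy, since importing a .changes file makes the database unsuitable for reuse by other jobs.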
>
> Then add the built packages to the database.
>
> ./import_mirror.py -d dumat.db --changes path/to/foo.changes
>
> This last command changes dumat.db in a way that makes it unsuitable for
> reuse by other jobs. Then perform the analysis, which takes about 30
> seconds of CPU time.
>
> ./analyze.py -d dumat.db > dumat.yaml
>
> I think the most reasonable gating here is looking up all the built
> .debs in the dumat.yaml by their package name and then binary version.
> Alternatively, filter by source and source version, though those fields
> are optional and only present when they differ from the binary package
> name and binary version respectively. If there are any reported issues,
> consider the job failed.
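
The gating described here could be sketched roughly as follows. The layout of dumat.yaml is not spelled out in this thread, so the sketch takes already-parsed issue records with the hypothetical keys "package" and "version"; the real keys may well differ. The file-name parsing relies only on the usual package_version_architecture.deb naming.

```python
def built_binaries(file_names):
    """Map package name -> version for every .deb/.udeb file name given.

    Binary file names normally omit the epoch, so the version obtained
    here may lack an "N:" prefix that the real binary version carries.
    """
    built = {}
    for name in file_names:
        base = name.rsplit("/", 1)[-1]
        if not base.endswith((".deb", ".udeb")):
            continue  # skip .dsc, .buildinfo, tarballs and so on
        stem = base.rsplit(".", 1)[0]
        parts = stem.split("_")
        if len(parts) != 3:
            continue  # not in package_version_architecture form
        package, version, _arch = parts
        built[package] = version
    return built


def strip_epoch(version):
    """Drop a leading "N:" epoch so versions compare against file names."""
    return version.split(":", 1)[1] if ":" in version else version


def relevant_issues(built, issues):
    """Select issue records that concern one of the built binaries.

    `issues` is an iterable of dicts with the HYPOTHETICAL keys
    "package" and "version"; the real dumat.yaml layout may differ.
    """
    return [
        issue
        for issue in issues
        if built.get(issue["package"]) == strip_epoch(issue["version"])
    ]
```

The job would then be considered failed whenever `relevant_issues(...)` returns a non-empty list for the files named in foo.changes.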
>
> There is definitely one false positive, python3-notebook: it has a file
> conflict that is mitigated in the maintainer script, but dumat fails to
> understand that the mitigation is correct. I expect more false
> positives in future, so we need to think about that eventually.
>
> I do not recommend adding this to the default pipeline due to the
> resources consumed by this job. It only makes sense for a tiny fraction
> of the archive. Maybe we can hard code a list of potentially affected
> source packages and skip it in all other cases?
>
> Is this something either of you would like to look into?
This sounds like a great idea, I can add it to the TODO list. The
execution time won't be a problem, _maybe_ the temporary disk space?
But I'm pretty sure many packages need way more than that
(libreoffice, kernel, browsers), so even that might be fine.
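
For the disk space concern, a rough pre-flight check could look like the sketch below. The 1.5 GB figure comes from the quoted message; the safety margin and the function name are assumptions made for illustration.

```python
import shutil

# About 1.5 GB for the unpacked database, per the figure quoted above.
DB_BYTES = 1_500_000_000
# HYPOTHETICAL safety margin for the download and the report.
MARGIN_BYTES = 500_000_000


def has_room(path=".", needed=DB_BYTES + MARGIN_BYTES):
    """Return True if the filesystem holding `path` has at least `needed` bytes free."""
    usage = shutil.disk_usage(path)
    return usage.free >= needed
```

A job could call `has_room()` in its working directory and skip or fail early when it returns False, rather than running out of space midway through the import.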