[Debian-salsa-ci] dumat + salsa-ci

Helmut Grohne helmut at subdivi.de
Sun Sep 17 20:42:38 BST 2023


Hi Luca and Santiago,

I'm writing to you because
 * Luca expressed interest in helping plug dumat into salsa-ci
 * Santiago is doing work for Freexian and has experience with salsa-ci

dumat is https://salsa.debian.org/helmutg/dumat.

One of the results from the /usr-merge DebConf23 BoF was the wish to add
dumat to the salsa CI pipeline. I have now done the groundwork for that.
Can I ask you to figure out a reasonable way of integrating this into
salsa-ci?

Let us assume that you have a foo.changes where the Distribution field
is "unstable", "experimental" or a Debian codename. Then you can perform
the following steps.

Download the current database. This will download about 50MB and create
a 1.5GB database file.

curl -s https://subdivi.de/~helmut/dumat.sql.zst | zstd -d | sqlite3 dumat.db

This file changes every 6h. Would it be possible to have some form of
caching such that this is not downloaded for every single pipeline run?
It would also be possible to have salsa-ci maintain its own version of
this database by regularly updating it locally. The steps to create this
database are as follows.

sqlite3 dumat.db < schema.sql
./import_mirror.py -d dumat.db

The latter step will download a large number of packages from
deb.debian.org on the first invocation. Later invocations perform
incremental updates.
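
On the caching question above, here is a minimal sketch in Python,
under assumptions: since the published dump changes every 6h, a job
could reuse a cached dumat.db as long as it is younger than that. The
helper name and the idea of judging freshness by file modification time
are mine, and how salsa-ci would persist the file between jobs is left
open.

```python
import time
from pathlib import Path

# The published dump is refreshed every 6 hours.
MAX_AGE_SECONDS = 6 * 60 * 60


def cache_is_fresh(db_path, now=None, max_age=MAX_AGE_SECONDS):
    """Return True if db_path exists and is younger than max_age seconds."""
    path = Path(db_path)
    if not path.is_file():
        return False
    if now is None:
        now = time.time()
    # Compare the file's modification time against the allowed age.
    return now - path.stat().st_mtime < max_age
```

A job would only run the curl | zstd | sqlite3 download when
cache_is_fresh("dumat.db") returns False.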

Then add the built packages to the database.

./import_mirror.py -d dumat.db --changes path/to/foo.changes

This last command modifies dumat.db in a way that makes it unsuitable
for reuse by other jobs. Then perform the analysis, which takes about
30 seconds of CPU time.

./analyze.py -d dumat.db > dumat.yaml
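
Because the --changes import leaves dumat.db unusable for other jobs,
one option is to run the last two steps on a throwaway copy of a shared
database. The following is only a sketch: the function name and the
dumat_dir parameter are hypothetical, it uses only the two invocations
shown above, and copying a 1.5GB file for every job has a cost of its
own.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def analyze_changes(cached_db, changes_file, dumat_dir="."):
    """Import changes_file into a scratch copy of cached_db and analyze it.

    Returns the YAML text printed by analyze.py. cached_db is not modified.
    """
    # Resolve to an absolute path so the scripts are found without PATH lookup.
    dumat_dir = Path(dumat_dir).resolve()
    with tempfile.TemporaryDirectory() as scratch:
        work_db = Path(scratch) / "dumat.db"
        shutil.copyfile(cached_db, work_db)
        # Add the built packages; this changes work_db only.
        subprocess.run(
            [str(dumat_dir / "import_mirror.py"), "-d", str(work_db),
             "--changes", str(changes_file)],
            check=True,
        )
        # Run the analysis and capture its standard output.
        result = subprocess.run(
            [str(dumat_dir / "analyze.py"), "-d", str(work_db)],
            check=True, capture_output=True, text=True,
        )
    return result.stdout
```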

I think the most reasonable gating here is looking up all the built
.debs in the dumat.yaml by their package name and then binary version.
Alternatively, filter by source and source version, though those fields
are optional and only present when they differ from the binary package
name and binary version respectively. If there are any reported issues,
consider the job failed.
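
The lookup described above could be sketched as follows. Note the
assumptions: I take the parsed dumat.yaml to be a nested mapping of
package name to binary version to an entry with an "issues" list. That
layout and the function name are illustrative guesses and should be
checked against real analyze.py output. Parsing the YAML itself (for
instance with PyYAML) is left out.

```python
def failing_packages(report, built):
    """Return {(package, version): issues} for built .debs with issues.

    report: parsed dumat.yaml, ASSUMED to look like
            {package: {version: {"issues": [...]}}}.
    built:  iterable of (package, binary version) pairs from foo.changes.
    """
    failures = {}
    for package, version in built:
        # Look up by package name first, then by binary version.
        versions = report.get(package) or {}
        entry = versions.get(version) or {}
        issues = entry.get("issues") or []
        if issues:
            failures[(package, version)] = issues
    return failures
```

The job would then be considered failed whenever the returned mapping
is non-empty.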

There is definitely one false positive, python3-notebook: it has a file
conflict that is mitigated in the maintainer script, but dumat fails to
recognize that mitigation. I expect more false positives in the future,
so we will need to think about how to handle them eventually.

I do not recommend adding this to the default pipeline due to the
resources consumed by this job. It only makes sense for a tiny fraction
of the archive. Maybe we can hard-code a list of potentially affected
source packages and skip it in all other cases?
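
To illustrate the skip idea, here is a small sketch that reads the
Source field from the text of a .changes file and checks it against a
list. The list content is a placeholder, not a real set of affected
packages, and both function names are hypothetical.

```python
# Placeholder only; the real list of potentially affected source
# packages would have to be maintained somewhere.
AFFECTED_SOURCES = frozenset({"example-source"})


def source_from_changes(text):
    """Return the source package name from .changes text, or None."""
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.lower() == "source":
            # The name may be followed by a version in parentheses.
            parts = value.split()
            return parts[0] if parts else None
    return None


def should_run_dumat(changes_text, affected=AFFECTED_SOURCES):
    """Return True if the upload's source package is on the list."""
    return source_from_changes(changes_text) in affected
```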

Is this something either of you would like to look into?

Helmut



