[Git][security-tracker-team/security-tracker][master] 10 commits: security_db: document foreign key constraints

Emilio Pozuelo Monfort (@pochu) pochu at debian.org
Tue Apr 28 09:46:07 BST 2026



Emilio Pozuelo Monfort pushed to branch master at Debian Security Tracker / security-tracker


Commits:
c12f0b3c by Helmut Grohne at 2026-02-22T13:33:36+01:00
security_db: document foreign key constraints

SQLite does not enforce foreign key constraints unless they are
explicitly turned on (PRAGMA foreign_keys=on). Hence, this change has no
semantic effect and therefore does not require a schema version bump.
However, it provides useful insight to readers of the code, as it
conveys how these tables are meant to be joined.
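
As a minimal sketch of the behaviour described above: the snippet below
uses Python's standard sqlite3 module (the tracker itself uses apsw) and
a cut-down, hypothetical copy of the two tables. It assumes an SQLite
build with foreign key support compiled in, which is the usual default.

```python
import sqlite3


def insert_orphan(enforce):
    """Insert a debian_bugs row whose note does not exist.

    Returns True if SQLite accepted the row, False if it was rejected.
    """
    conn = sqlite3.connect(":memory:")
    if enforce:
        # Must be issued outside a transaction, hence right after connecting.
        conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE package_notes (id INTEGER NOT NULL PRIMARY KEY)")
    conn.execute(
        """CREATE TABLE debian_bugs
        (bug INTEGER NOT NULL,
         note INTEGER NOT NULL REFERENCES package_notes(id),
         PRIMARY KEY (bug, note))"""
    )
    try:
        # package_notes is empty, so note 42 is a dangling reference.
        conn.execute("INSERT INTO debian_bugs (bug, note) VALUES (1, 42)")
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
    return True
```

With enforcement left off, the REFERENCES clause is accepted and stored
but acts as documentation only, which is the situation the commit relies
on.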

- - - - -
d691a8ee by Helmut Grohne at 2026-02-22T13:34:16+01:00
security_db: index debian_bugs by note

Several queries such as BugFromDB.getDebianBugs look up debian_bugs by
note. This presently incurs a table scan where this index could be used.
Among other things, this index speeds up the CVE view of the web
tracker.

The schema version is not bumped, as this change impacts performance
only. To benefit from it, please delete your security.db and regenerate
it.
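
One way to check the effect of such an index is EXPLAIN QUERY PLAN. The
following is a sketch only: it uses the standard sqlite3 module instead
of apsw, an empty in-memory copy of debian_bugs, and a hypothetical
lookup query that merely resembles the ones mentioned above. The exact
wording of the plan text differs between SQLite versions.

```python
import sqlite3


def plan_for_note_lookup(with_index):
    """Return the query plan text for looking up debian_bugs by note."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE debian_bugs
        (bug INTEGER NOT NULL,
         note INTEGER NOT NULL,
         PRIMARY KEY (bug, note))"""
    )
    if with_index:
        conn.execute(
            "CREATE INDEX debian_bugs_note_index ON debian_bugs(note)"
        )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT bug FROM debian_bugs WHERE note = ?", (1,)
    ).fetchall()
    conn.close()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_note_lookup(False))
    print("with index:   ", plan_for_note_lookup(True))
```

The primary key (bug, note) cannot serve a lookup by note alone because
note is not its leading column, so the plan without the extra index is
a scan.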

- - - - -
aa570cd0 by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db: index bugs_notes by bug_name

This speeds up BugFromDB significantly as it avoids a full table scan.
It is used for the web interface displaying a CVE.

- - - - -
63d939e1 by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db: speed up getBugsForSourcePackage

This function is called for "/source-package/*" views. For
"/source-package/linux", it contributes the bulk of time. The big SQL
query at the start reports each CVE for each release and uses collation
via Python to sort that table of 300000 rows.

While we want all_bugs to be sorted in the end, the intermediate rows
are multiplied by the release count and the property that really matters
there is sorting equal bugs together for groupby. Thus relax the sorting
on the sqlite3 side and move it into Python.

Observed speedup on /source-package/linux is almost 50%.
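
The idea can be sketched as follows. natural_key is a hypothetical
stand-in for the version collation (it only compares digit runs
numerically and is not the tracker's version_compare), and the rows are
made-up examples. Any ordering that keeps equal bug names adjacent is
enough for groupby; the expensive ordering is then applied once per bug
and not once per (bug, release) row.

```python
import itertools
import re


def natural_key(name):
    """Hypothetical stand-in for version collation of bug names."""
    # Splitting on a capturing group alternates text and digit runs, so
    # strings are only ever compared with strings and ints with ints.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


def group_then_sort(rows):
    """Group (bug_name, release) rows cheaply, then sort the groups."""
    # Cheap plain string ordering, standing in for ORDER BY bugs.name DESC.
    rows = sorted(rows, key=lambda row: row[0], reverse=True)
    grouped = [
        (name, [release for _, release in group])
        for name, group in itertools.groupby(rows, key=lambda row: row[0])
    ]
    # The expensive key now runs once per bug, not once per row.
    grouped.sort(key=lambda item: natural_key(item[0]), reverse=True)
    return grouped


EXAMPLE_ROWS = [
    ("CVE-2024-9", "bookworm"),
    ("CVE-2024-10", "bookworm"),
    ("CVE-2024-9", "trixie"),
    ("CVE-2023-5", "bookworm"),
    ("CVE-2024-10", "trixie"),
]
```

Calling group_then_sort(EXAMPLE_ROWS) yields CVE-2024-10 before
CVE-2024-9, although plain string ordering would place them the other
way round.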

- - - - -
3682a4b7 by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db: reduce query set of getBugsForSourcePackage

The biggest cost in getBugsForSourcePackage is generating the huge join
at the very start. Reducing the number of rows is key to improving
speed. Note that special releases such as backports are trimmed after
generating them. For the linux kernel, this amounts to about 20% of rows
all of which could be skipped. Skipping them also saves us from sorting
them. So push the release filter into SQL.
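
A sketch of pushing such a filter into SQL with one bound parameter per
release name. It uses the standard sqlite3 module instead of apsw; the
table contents, QUERY and packages_in_releases are illustrative
assumptions, while the SUPPORTED_RELEASES marker mirrors the one in the
diff below.

```python
import sqlite3

QUERY = """SELECT name, release FROM source_packages
  WHERE name = ?
  AND release IN (SUPPORTED_RELEASES)
  ORDER BY release"""


def packages_in_releases(conn, pkg, release_names):
    """Select rows for pkg, restricted to release_names inside SQL."""
    # One "?" per release, so the names are bound as parameters and
    # never interpolated into the SQL text.
    placeholders = ", ".join("?" * len(release_names))
    sql = QUERY.replace("SUPPORTED_RELEASES", placeholders)
    return conn.execute(sql, (pkg, *release_names)).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_packages (name TEXT NOT NULL, release TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO source_packages VALUES (?, ?)",
        [
            ("linux", "bookworm"),
            ("linux", "bookworm-backports"),
            ("linux", "trixie"),
            ("glibc", "trixie"),
        ],
    )
    result = packages_in_releases(conn, "linux", ("bookworm", "trixie"))
    conn.close()
    return result
```

The backports row is never returned by the query, so it is neither
converted into a Python object nor sorted.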

- - - - -
19fa09a9 by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db.py: getBugsForSourcePackage: replace list with tuple

The values of no_dsas are not used as homogeneous lists. Rather, they
are used as fixed (reason, comment) pairs, i.e. tuples. As it turns out,
tuple construction is also marginally faster and uses less memory. The
gain in performance is minor compared to the gain in clarity.

- - - - -
cacfcbbd by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db.py: getBugsForSourcePackage: use a dict comprehension

While the dict comprehension is marginally faster due to batching the
size extensions of the dict, it also establishes more clearly that
no_dsas is constructed once and from then on queried.
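
A small sketch of the resulting shape, covering this commit and the
previous one: the mapping is built in a single expression with tuple
values and only queried afterwards. build_no_dsas and the sample rows
are hypothetical; in the tracker the rows come from a query on
package_notes_nodsa.

```python
def build_no_dsas(rows):
    """Map (bug_name, release) to a (reason, comment) pair."""
    return {
        (bug_name, release): (reason, comment)
        for bug_name, release, reason, comment in rows
    }


SAMPLE_ROWS = [
    ("CVE-2024-0001", "bookworm", "", "Minor issue"),
    ("CVE-2024-0002", "trixie", "postponed", None),
]

no_dsas = build_no_dsas(SAMPLE_ROWS)

# Lookups unpack a tuple; a missing key falls back to a tuple default.
reason, comment = no_dsas.get(("CVE-2024-0001", "bookworm"), (None, None))
missing = no_dsas.get(("CVE-2024-9999", "bookworm"), (None, None))
```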

- - - - -
548e3434 by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db.py: getBugsForSourcePackage: avoid constructing large sequences

There are several large sequences being constructed as temporary objects
that can be avoided. The initial query constructs a huge list, but it is
consumed by itertools.groupby which can deal with a generator. For doing
so, we must ensure that no other query (e.g. construction of no_dsas)
interferes with the cursor state. We also have to fuse the computation
of all_releases into the existing loop.

Then, the inner data constructs several tuples. They are used both for
capturing the description and another itertools.groupby. By relocating
the description initialization, this tuple can be avoided as well.

The innermost data1 can likewise avoid a tuple by using max, which also
accepts a generator.

Ultimately, avoiding all of these sequence constructions and
destructions provides a measurable speed increase of a few percent.
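
The max-based selection can be sketched like this. toy_version_compare
is a hypothetical comparator for dotted integers only and does not
implement Debian version ordering; Row is a cut-down stand-in for the
tracker's row type. The groups produced by groupby are consumed lazily
by max, so no intermediate tuple is built.

```python
import functools
import itertools
from collections import namedtuple

Row = namedtuple("Row", "release subrelease version")


def toy_version_compare(a, b):
    """Hypothetical comparator: dotted integers, not Debian versions."""
    pa = tuple(int(x) for x in a.split("."))
    pb = tuple(int(x) for x in b.split("."))
    return (pa > pb) - (pa < pb)


version_key = functools.cmp_to_key(toy_version_compare)


def best_rows(rows):
    """Pick the best row per release from rows grouped by release."""
    best = {}
    for release, group in itertools.groupby(rows, key=lambda row: row.release):
        # Highest version wins; on a tie the empty subrelease wins,
        # because "not ''" is True and True sorts above False.
        best[release] = max(
            group,
            key=lambda row: (version_key(row.version), not row.subrelease),
        )
    return best


EXAMPLE = [
    Row("bookworm", "security", "1.2"),
    Row("bookworm", "", "1.2"),
    Row("bookworm", "", "1.1"),
    Row("trixie", "", "1.9"),
    Row("trixie", "", "1.10"),
]
```

Note that groupby only yields a usable group until the outer iterator
advances, which is why each group is reduced with max immediately.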

- - - - -
7eadc3d4 by Helmut Grohne at 2026-02-22T13:34:20+01:00
security_db.py: getBugsForSourcePackage: use itertools.starmap

This provides a subtle performance boost over the generator expression.
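
For illustration, the two spellings compared here look roughly as
follows. BugRow and the raw tuples are hypothetical; both variants are
lazy and produce the same objects.

```python
import itertools
from collections import namedtuple

BugRow = namedtuple("BugRow", "bug_name release")

RAW = [("CVE-2024-0001", "bookworm"), ("CVE-2024-0001", "trixie")]


def rows_genexp(raw):
    """Generator expression: unpacks each tuple in Python bytecode."""
    return (BugRow(*row) for row in raw)


def rows_starmap(raw):
    """starmap: unpacks each tuple and calls BugRow from C."""
    return itertools.starmap(BugRow, raw)
```

Any speed difference is small and depends on the interpreter version,
so it is worth measuring (for example with timeit) before relying on
it.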

- - - - -
b482d443 by Emilio Pozuelo Monfort at 2026-04-28T08:46:03+00:00
Merge branch 'performance' into 'master'

security_db: speed up rendering the web tracker

See merge request security-tracker-team/security-tracker!269
- - - - -


1 changed file:

- lib/python/security_db.py


Changes:

=====================================
lib/python/security_db.py
=====================================
@@ -31,6 +31,7 @@ import apsw
 import bugs
 from collections import defaultdict, namedtuple
 import email.utils
+import functools
 import json
 import pickle
 import glob
@@ -136,46 +137,54 @@ BugsForSourcePackage_query = \
   JOIN source_package_status st ON (bugs.name = st.bug_name)
   JOIN source_packages sp ON (st.package = sp.rowid)
   WHERE sp.name = ?
+  AND sp.release IN (SUPPORTED_RELEASES)
   AND (bugs.name LIKE 'CVE-%' OR bugs.name LIKE 'TEMP-%')
-  ORDER BY bugs.name COLLATE version DESC, sp.release"""
-# Sort order is important for the groupby operation below.
+  ORDER BY bugs.name DESC, sp.release"""
+# We want to order all_bugs below by version-collated bug name, but collation
+# calls from sqlite to Python and operates on release-multiplied data. Thus
+# we sort arbitrarily for the groupby operation and later sort by version.
 
 def getBugsForSourcePackage(cursor, pkg):
-    data = [BugsForSourcePackage_internal(*row) for row in
-            cursor.execute(BugsForSourcePackage_query, (pkg,))]
-    # Filter out special releases such as backports.
-    data = [row for row in data
-            if debian_support.internRelease(row.release) is not None]
-    # Obtain the set of releases actually in used, by canonical order.
-    all_releases = tuple(sorted(set(row.release for row in data),
-                                   key = debian_support.internRelease))
     # dict from (bug_name, release) to the no-dsa reason/comment string.
-    no_dsas = {}
-    for bug_name, release, reason, comment in cursor.execute(
+    no_dsas = {
+        (bug_name, release): (reason, comment)
+        for bug_name, release, reason, comment in cursor.execute(
             """SELECT bug_name, release, reason, comment FROM package_notes_nodsa
-            WHERE package = ?""", (pkg,)):
-        no_dsas[(bug_name, release)] = [reason, comment]
+            WHERE package = ?""", (pkg,))
+    }
 
+    # Restrict to regular releases excluding e.g. backports.
+    release_names = tuple(debian_support.Release.releases)
+    data = itertools.starmap(
+        BugsForSourcePackage_internal,
+        cursor.execute(
+            BugsForSourcePackage_query.replace(
+                "SUPPORTED_RELEASES", ", ".join("?" * len(release_names))
+            ),
+            (pkg, *release_names),
+        ),
+    )
+    all_releases = set()  # Actually used release names
     all_bugs = []
+    version_key = functools.cmp_to_key(version_compare)
     # Group by bug name.
     for bug_name, data in itertools.groupby(data,
                                             lambda row: row.bug_name):
-        data = tuple(data)
-        description = data[0].description
+        description = None
         open_seen = False
         unimportant_seen = False
         releases = {}
         # Group by release.
         for release, data1 in itertools.groupby(data, lambda row: row.release):
-            data1 = tuple(data1)
+            all_releases.add(release)
             # The best row is the row with the highest version number.
             # If there is a tie, the empty subrelease row wins.
-            best_row = data1[0]
-            for row in data1[1:]:
-                cmpresult = version_compare(row.version, best_row.version)
-                if cmpresult > 0 \
-                   or (cmpresult == 0 and row.subrelease == ''):
-                    best_row = row
+            best_row = max(
+                data1,
+                key=lambda row: (version_key(row.version), not row.subrelease),
+            )
+            if description is None:
+                description = best_row.description
             reason = None
             comment = None
 
@@ -187,7 +196,8 @@ def getBugsForSourcePackage(cursor, pkg):
                     unimportant_seen = True
                 else:
                     open_seen = True
-                    reason, comment = no_dsas.get((bug_name, best_row.release), [None, None])
+                    reason, comment = no_dsas.get((bug_name, best_row.release),
+                                                  (None, None))
                     if comment is not None:
                         state = 'no-dsa'
                     else:
@@ -211,9 +221,14 @@ def getBugsForSourcePackage(cursor, pkg):
         all_bugs.append(BugForSourcePackage(bug_name, description,
                                             global_state, releases))
 
+    # all_bugs is sorted lexicographically as that was faster.
+    all_bugs.sort(key=lambda bfsp: version_key(bfsp.bug), reverse=True)
+
     # Split all_bugs into per-state sequences.
-    per_state = {'all_releases': all_releases,
-                 'all': all_bugs}
+    per_state = {
+        'all_releases': tuple(sorted(all_releases, key=debian_support.internRelease)),
+        'all': all_bugs,
+    }
     for state in ("open", "unimportant", "resolved"):
         per_state[state] = tuple(bug for bug in all_bugs
                                  if bug.global_state == state)
@@ -367,7 +382,7 @@ class DB:
 
         cursor.execute("""CREATE TABLE package_notes
         (id INTEGER NOT NULL PRIMARY KEY,
-         bug_name TEXT NOT NULL,
+         bug_name TEXT NOT NULL REFERENCES bugs(name),
          package TEXT NOT NULL,
          fixed_version TEXT
              CHECK (fixed_version IS NULL OR fixed_version <> ''),
@@ -385,8 +400,10 @@ class DB:
 
         cursor.execute("""CREATE TABLE debian_bugs
         (bug INTEGER NOT NULL,
-         note INTEGER NOT NULL,
+         note INTEGER NOT NULL REFERENCES package_notes(id),
          PRIMARY KEY (bug, note))""")
+        cursor.execute("""CREATE INDEX debian_bugs_note_index
+        ON debian_bugs(note);""")
 
         cursor.execute("""CREATE TABLE bugs
         (name TEXT NOT NULL PRIMARY KEY,
@@ -404,6 +421,8 @@ class DB:
          typ TEXT NOT NULL CHECK (typ IN ('TODO', 'NOTE')),
          release TEXT NOT NULL DEFAULT '',
          comment TEXT NOT NULL CHECK (comment <> ''))""")
+        cursor.execute(
+            """CREATE INDEX bugs_notes_bug_name ON bugs_notes(bug_name)""")
 
         cursor.execute("""CREATE TABLE bugs_xref
         (source TEXT NOT NULL,
@@ -422,7 +441,7 @@ class DB:
 
         cursor.execute("""CREATE TABLE source_package_status
         (bug_name TEXT NOT NULL,
-         package INTEGER NOT NULL,
+         package INTEGER NOT NULL REFERENCES source_packages(rowid),
          vulnerable INTEGER NOT NULL,
          urgency TEXT NOT NULL,
          PRIMARY KEY (bug_name, package))""")



View it on GitLab: https://salsa.debian.org/security-tracker-team/security-tracker/-/compare/127625abd03a29b83a33b09d9ef1cfa3b7084fce...b482d4432f57b21c8de05a5eda26bf57ddcc60de


