xsltproc and dc:identifier
Chris Lamb
lamby at debian.org
Mon Mar 3 13:54:25 GMT 2025
Hi Carles,
> But I thought that thanks to:
> https://salsa.debian.org/xml-sgml-team/libxslt/-/blob/master/debian/patches/0002-Make-generate-id-deterministic.patch?ref_type=heads
I would have thought that, too. :) But it could simply be that this
patch doesn't work as intended, or it doesn't cater for this
particular case. I mean, just by glancing at:
    tctxt = xsltXPathGetTransformContext(ctxt);
    if (tctxt == NULL) {
        val = (long)((char *)cur - (char *)&base_address);
    } else {
        tctxt->nextid++;
        val = tctxt->nextid;
        cur->content = (void *) (val);
    }
… we can see that if there is no "transform context", then we will
revert to an identifier derived from a memory address (the node
pointer minus a static base address), which is nondeterministic.
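
To make those two branches concrete, here is a minimal standalone
sketch. It is not libxslt's code: "struct demo_ctx" and
"demo_generate_id" are hypothetical names I made up for illustration,
and I have left out the caching in cur->content. It only mirrors the
shape of the excerpt above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the transform context: just the counter. */
struct demo_ctx {
    long nextid;
};

/* Static object whose address is the reference point, as in the excerpt. */
static char base_address;

/*
 * Hypothetical helper mirroring the two branches quoted above.
 * With a context, the id is a per-transform counter.
 * Without one, the id is an offset between two addresses.
 */
long demo_generate_id(struct demo_ctx *tctxt, const void *cur)
{
    assert(cur != NULL);

    if (tctxt == NULL) {
        /* Depends on where the node and the static object sit in memory. */
        return (long)((intptr_t)cur - (intptr_t)&base_address);
    }

    tctxt->nextid++;
    return tctxt->nextid;
}
```

With a context, successive calls return 1, 2, 3 and so on in every
run. Without one, the value depends on where the node happens to live
in memory, which typically differs between runs when address-space
layout randomisation is enabled.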
I think that the generate-id() function is *basically* working:
$ cat test.xml
<?xml version="1.0" encoding="UTF-8"?>
<test/>
$ cat test.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:for-each select="test">
<test id="{generate-id(.)}"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
$ xsltproc test.xsl test.xml
<?xml version="1.0"?>
<test id="idm1"/>
… but when it is called outside of a for-each loop, then it falls back
to a nondeterministic identifier:
$ cat test2.xml
<?xml version="1.0" encoding="UTF-8"?>
<test/>
$ cat test2.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<test id="{generate-id(.)}"/>
</xsl:template>
</xsl:stylesheet>
$ xsltproc test2.xsl test.xml
<?xml version="1.0"?>
<test id="idm45824385479600"/>
             ^^^^^^^^^^^^^^
This happens in the debian-history package first via:
https://sources.debian.org/src/docbook-xsl/1.79.2%2Bdfsg-7/docbook-xsl/epub/docbook.xsl/#L327
… which calls out to this XSL template, still within docbook-xsl:
https://sources.debian.org/src/docbook-xsl/1.79.2%2Bdfsg-7/docbook-xsl/epub/docbook.xsl/#L240-L300
As these are "dc:identifier" heads, there is no for-each loop, hence no
"transform context" (which I think we can infer is set up at least
inside for-each loops).
I'm not sure whether this is fundamentally an issue with docbook-xsl
or xsltproc. Should xsltproc generate a deterministic identifier
anyway, or should docbook-xsl do something different to ensure
xsltproc's xsltGenerateIdFunction(...) has enough context to generate
one?
Oh, one solution for your package might be to specify a phony ISBN,
DOI or "biblioid" etc:
https://sources.debian.org/src/docbook-xsl/1.79.2%2Bdfsg-7/docbook-xsl/epub/docbook.xsl/#L244-L279
Looking at the code, I think this will mean that it will avoid calling
generate-id...?
(This was much more XSLT than I was expecting to look at this Monday
morning.)
Best wishes,
--
,''`.
: :' : Chris Lamb
`. `'` lamby at debian.org 🍥 chris-lamb.co.uk
`-