Bug#1050866: poppler: missing ToUnicode support for similarequal
Vincent Lefevre
vincent at vinc17.net
Wed Aug 30 14:20:02 BST 2023
Source: poppler
Version: 22.12.0-2
Severity: normal
Tags: patch upstream
Forwarded: https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1444
For \simeq, TeX generates the glyph name /similarequal instead of Adobe's
/asymptoticallyequal, so similarequal needs to be supported too.
In TeX Live 2023:
texmf-dist/fonts/map/glyphlist/glyphlist.txt (Adobe Glyph List) contains
asymptoticallyequal;2243
but texmf-dist/fonts/map/glyphlist/texglyphlist.txt (Extensions to the
Adobe Glyph List for TeX fonts and encodings) contains
similarequal;2243
As a consequence, texmf-dist/tex/generic/pdftex/glyphtounicode.tex
contains both
\pdfglyphtounicode{asymptoticallyequal}{2243}
\pdfglyphtounicode{similarequal}{2243}
NameToUnicodeTable.h already has
{ 0x2243, "asymptoticallyequal" }
so one just needs to add the missing
{ 0x2243, "similarequal" }
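To illustrate the point, here is a minimal sketch (not poppler code) that parses the two glyph-list entries quoted above and shows that both names resolve to U+2243 once the TeX extensions are merged in. The function name parse_glyphlist and the two string constants are hypothetical; only the "name;HEX" entries come from the files mentioned in this report.

```python
# Entries copied from the report; each line has the form "glyphname;HEX".
ADOBE_GLYPHLIST = "asymptoticallyequal;2243\n"   # from glyphlist.txt
TEX_GLYPHLIST = "similarequal;2243\n"            # from texglyphlist.txt


def parse_glyphlist(text):
    """Return {glyph name: Unicode string} for 'name;HEX[ HEX...]' lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, codes = line.partition(";")
        table[name] = "".join(chr(int(c, 16)) for c in codes.split())
    return table


adobe = parse_glyphlist(ADOBE_GLYPHLIST)
merged = {**adobe, **parse_glyphlist(TEX_GLYPHLIST)}

print("similarequal" in adobe)   # the Adobe list alone does not know the TeX name
print(merged["similarequal"] == merged["asymptoticallyequal"] == "\u2243")
```

A table built only from the Adobe list misses the TeX name, which is the situation NameToUnicodeTable.h is in before the patch.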
To reproduce the issue, consider the following simeq.tex file:
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
\thispagestyle{empty}
$\simeq\approx$
\end{document}
In the PDF file generated by pdflatex, after uncompressing it with
"qpdf --stream-data=uncompress", one finds:
/F32 9.9626 Tf 148.712 707.125 Td [('\031)]TJ
and
dup 25 /approxequal put
dup 39 /similarequal put
i.e. /similarequal is generated for \simeq, and pdftotext gives
'≈
(The apostrophe ' has code 39, which the font encoding assigns to
/similarequal; since /similarequal is not supported, it is output as a
plain apostrophe. \031 is octal for 25, which the encoding assigns to
/approxequal; it appears correctly because /approxequal is supported.)
With the attached patch, pdftotext gives
≃≈
as expected.
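The behaviour described above can be modelled with a short sketch. It is not how poppler is implemented: the names DIFFERENCES, TABLE_BEFORE, TABLE_AFTER, string_codes and extract are hypothetical, and the fallback (output the character whose code point equals the character code when the glyph name is unknown) is an assumption chosen only because it matches the output observed in this report. The U+2248 value for approxequal is the one given in the Adobe Glyph List.

```python
import re

# Encoding entries quoted in the report ("dup <code> /<glyph> put").
DIFFERENCES = {25: "approxequal", 39: "similarequal"}

# Hypothetical stand-ins for the name-to-Unicode table before and after the patch.
TABLE_BEFORE = {"approxequal": "\u2248", "asymptoticallyequal": "\u2243"}
TABLE_AFTER = {**TABLE_BEFORE, "similarequal": "\u2243"}


def string_codes(literal):
    """Decode the body of a PDF literal string into character codes.

    Only octal escapes (\\ddd) are handled; this is enough for the example.
    """
    codes = []
    for m in re.finditer(r"\\([0-7]{1,3})|(.)", literal, re.S):
        codes.append(int(m.group(1), 8) if m.group(1) else ord(m.group(2)))
    return codes


def extract(literal, table):
    """Map each character code to text via the glyph name."""
    out = []
    for code in string_codes(literal):
        name = DIFFERENCES.get(code)
        # Assumed fallback: an unknown glyph name yields the raw character code.
        out.append(table.get(name, chr(code)))
    return "".join(out)


print(string_codes(r"'\031"))            # codes of the string shown in the TJ operator
print(extract(r"'\031", TABLE_BEFORE))   # without similarequal in the table
print(extract(r"'\031", TABLE_AFTER))    # with similarequal in the table
```

With the extra table entry, code 39 resolves through /similarequal to U+2243 instead of falling through to the apostrophe.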
I've created a merge request upstream.
-- System Information:
Debian Release: trixie/sid
APT prefers unstable-debug
APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'stable-security'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
merged-usr: no
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 6.4.0-3-amd64 (SMP w/12 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=POSIX, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
--
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
-------------- next part --------------
Description: add ToUnicode support for similarequal.
For \simeq, TeX generates the glyph name /similarequal instead of Adobe's
/asymptoticallyequal, so similarequal needs to be supported too.
In TeX Live 2023:
texmf-dist/fonts/map/glyphlist/glyphlist.txt (Adobe Glyph List) contains
asymptoticallyequal;2243
but texmf-dist/fonts/map/glyphlist/texglyphlist.txt (Extensions to the
Adobe Glyph List for TeX fonts and encodings) contains
similarequal;2243
As a consequence, texmf-dist/tex/generic/pdftex/glyphtounicode.tex
contains both
\pdfglyphtounicode{asymptoticallyequal}{2243}
\pdfglyphtounicode{similarequal}{2243}
NameToUnicodeTable.h already has
{ 0x2243, "asymptoticallyequal" }
so one just needs to add the missing
{ 0x2243, "similarequal" }
Merge-Request: https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1444
Author: Vincent Lefevre <vincent at vinc17.net>
Last-Update: 2023-08-30
diff --git a/poppler/NameToUnicodeTable.h b/poppler/NameToUnicodeTable.h
index c7749f00..36bb5bb7 100644
--- a/poppler/NameToUnicodeTable.h
+++ b/poppler/NameToUnicodeTable.h
@@ -3518,6 +3518,7 @@ static const struct NameToUnicodeTab nameToUnicodeTextTab[] = { { 0x0021, "!" },
{ 0x05bd, "siluqhebrew" },
{ 0x05bd, "siluqlefthebrew" },
{ 0x223c, "similar" },
+ { 0x2243, "similarequal" },
{ 0x05c2, "sindothebrew" },
{ 0x3274, "siosacirclekorean" },
{ 0x3214, "siosaparenkorean" },