Bug#1091782: pdfimages.1: Some remarks and a patch with editorial changes for this man page

Bjarni Ingi Gislason bjarniig at simnet.is
Tue Dec 31 10:35:54 GMT 2024


Package: poppler-utils
Version: 24.08.0-3
Severity: minor
Tags: patch

   * What led up to the situation?

     Checking for defects with a new version

test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z < "man page"

  [Use "grep -e ' $' <file>" to find trailing spaces.]

  ["test-groff" is a script in the repository for "groff"; it is not shipped]
(local copy and "troff" slightly changed by me).

  [The fate of "test-nroff" was decided in groff bug #55941.]

   * What was the outcome of this action?

an.tmac:<stdin>:2: style: .TH missing fourth argument; consider package/project name and version (e.g., "groff 1.23.0")

   * What outcome did you expect instead?

     No output (no warnings).

-.-

  General remarks and further material, if a diff file exists, are in the
attachments.


-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.6-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages poppler-utils depends on:
ii  libc6          2.40-4
ii  libcairo2      1.18.2-2
ii  libfreetype6   2.13.3+dfsg-1
ii  liblcms2-2     2.16-2
ii  libpoppler140  24.08.0-3
ii  libstdc++6     14.2.0-8

poppler-utils recommends no packages.

poppler-utils suggests no packages.

-- no debconf information
-------------- next part --------------
Input file is pdfimages.1

  Any program (or person) that produces man pages should check the output
for defects by using (both groff and nroff)

[gn]roff -mandoc -t -ww -b -z -K utf8  <man page>

  The same goes for man pages that are used as an input.

  For a style guide use

  mandoc -T lint

-.-

  So any 'generator' should check its products with the above mentioned
'groff', 'mandoc',  and additionally with 'nroff ...'.

  This is just a simple quality control measure.

  To get a better man page, the 'generator' may have to be corrected,
and so may the source file and any additional file.

  Common defects:

  Input text line longer than 80 bytes.

  Not removing trailing spaces (in input and output).
  The reason for these trailing spaces should be found and eliminated.

  Not beginning each input sentence on a new line.
Lines should thus be shorter.

  See man-pages(7), item 'semantic newline'.
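
  The check for trailing spaces named among the common defects can be
sketched with grep.
The function name "trailing_blanks" is only an illustration.
It prints each offending line with its line number,
and returns non-zero if there is none.

```sh
# List lines that end in a space or a tab, prefixed with their line number.
# Reads the named files, or standard input if none is given.
trailing_blanks() {
    grep -n -e '[[:blank:]]$' "$@"
}
```

  Possible use: trailing_blanks pdfimages.1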

-.-

The difference between the formatted output of the original and patched file
can be seen with:

  nroff -mandoc <file1> > <out1>
  nroff -mandoc <file2> > <out2>
  diff -u <out1> <out2>

and for groff, using

"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

instead of 'nroff -mandoc'

  Add the option '-t', if the file contains a table.

  Read the output of 'diff -u' with 'less -R' or similar.
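
  The comparison described above could be wrapped in a small shell
function.
This is only a sketch: the names "mandiff" and "MAN_FORMATTER" are made up
for this example,
and "mktemp" is assumed to be available.
The formatter defaults to 'nroff -mandoc' as in the text.

```sh
# Formatter command; MAN_FORMATTER is a hypothetical name used in this sketch.
: "${MAN_FORMATTER:=nroff -mandoc}"

# mandiff OLD NEW: unified diff of the formatted output of two man pages.
# Returns the exit status of diff (0 = same, 1 = different).
mandiff() {
    out1=$(mktemp) || return 2
    out2=$(mktemp) || { rm -f "$out1"; return 2; }
    # Unquoted on purpose, so that the command and its options are split.
    $MAN_FORMATTER "$1" > "$out1"
    $MAN_FORMATTER "$2" > "$out2"
    diff -u "$out1" "$out2"
    status=$?
    rm -f "$out1" "$out2"
    return "$status"
}
```

  Possible use: mandiff pdfimages.1 pdfimages.1.new | less -R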

-.-.

  If 'man' (man-db) is used to check the manual for warnings,
the following must be set:

  The option "--warnings=w"

  The environment variable:

export MAN_KEEP_STDERR=yes (or any non-empty value)

  or

  (produce only warnings):

export MANROFFOPT="-ww -b -z"

export MAN_KEEP_STDERR=yes (or any non-empty value)


-.-.

Output from "mandoc -T lint  pdfimages.1": (shortened list)

     13 input text line longer than 80 bytes

-.-.

Output from "test-groff -mandoc -t -ww -z pdfimages.1": (shortened list)

      1 	Use macro '.I' for one argument or split argument.
      1 .IR is for at least 2 arguments, got 1

-.-.

Change '-' (\-) to '\(en' (en-dash) for a numeric range.
GNU gnulib has recently (2023-06-18) updated its
"build_aux/update-copyright" to recognize "\(en" in man pages.

pdfimages.1:255:The pdfimages software and documentation are copyright 1998-2011 Glyph
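
  A possible way to make this change mechanically is a sed substitution
on year ranges.
This is a sketch only; the function name "year_range_endash" is invented
here,
and the pattern assumes that a range is written as two four-digit
numbers joined by a hyphen,
so the result should be reviewed by hand.

```sh
# Replace the hyphen in a four-digit year range with the en-dash escape \(en.
# Reads the named files, or standard input if none is given.
year_range_endash() {
    sed -e 's/\([0-9]\{4\}\)-\([0-9]\{4\}\)/\1\\(en\2/g' "$@"
}
```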

-.-.

Use the correct macro for the font change of a single argument or
split the argument into two.

101:.IR image-root
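
  Such lines can probably be found without groff by a simple awk pattern.
This is a sketch under the assumption that arguments are separated by
blanks and are not quoted;
a quoted argument that contains a space is not detected.
The function name "single_arg_alternating" is invented for this example.

```sh
# Report font-alternating macros (.IR, .BR, ...) that got only one argument,
# in the form "line-number:text".
single_arg_alternating() {
    awk '/^\.(BI|BR|IB|IR|RB|RI)[ \t]+[^ \t]+[ \t]*$/ { print NR ":" $0 }' "$@"
}
```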

-.-.

Change a HYPHEN-MINUS (code 0x2D) to a minus(-dash) (\-),
if it
is in front of a name for an option,
is a symbol for standard input,
is a single character used to indicate an option,
or is in the NAME section (man-pages(7)).
N.B. - (0x2D), processed as a UTF-8 file, is changed to a hyphen
(0x2010, groff \[u2010] or \[hy]) in the output.

19:.IR image-root - nnn . xxx ,
26:is \'-', it reads the PDF file from stdin.
117:image - an opaque image
120:mask - a monochrome mask image
123:smask - a soft-mask image
126:stencil - a monochrome mask image used for painting a color or pattern
144:gray - Gray
147:rgb - RGB
150:cmyk - CMYK
153:lab - L*a*b
156:icc - ICC Based
159:index - Indexed Color
162:sep - Separation
165:devn - DeviceN
178:image - raster image (may be Flate or LZW compressed but does not use an image encoding)
181:jpeg - Joint Photographic Experts Group
184:jp2 - JPEG2000
187:jbig2 - Joint Bi-Level Image Experts Group
190:ccitt - CCITT Group 3 or Group 4 Fax

-.-.

Wrong distance between sentences in the input file.

  Separate the sentences and subordinate clauses; each begins on a new
line.  See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").

  The best procedure is to always start a new sentence on a new line,
at least, if you are typing on a computer.

Remember coding: Only one command ("sentence") on each (logical) line.

E-mail: Easier to quote exactly the relevant lines.

Generally: Easier to edit the sentence.

Patches: Less unaffected text.

Searching for two adjacent words is easier when they are on the same line
and in the same phrase.

  The amount of space between sentences in the output can then be
controlled with the ".ss" request.

Mark a final abbreviation point as such by suffixing it with "\&".

29:non-monochrome. The \-png or \-tiff options change to default output
30:to PNG or TIFF respectively. If both \-png and \-tiff are specified,
32:written as PNG. In addition the \-j, \-jp2, and \-jbig2 options will
50:Write images in JPEG format as JPEG files instead of the default format. The JPEG file is identical to the JPEG data stored in the PDF.
53:Write images in JPEG2000 format as JP2 files instead of the default format. The JP2 file is identical to the JPEG2000 data stored in the PDF.
56:Write images in JBIG2 format as JBIG2 files instead of the default format. JBIG2 data in PDF is of the embedded type. The embedded type of JBIG2 has an optional separate file containing global data. The embedded data is written with the extension .jb2e and the global data (if available) will be written to the same image number with the extension .jb2g. The content of both these files is identical to the JBIG2 data in the PDF.
60:format. The CCITT file is identical to the CCITT data stored in the
61:PDF. PDF files contain additional parameters specifying
62:how to decode the CCITT data. These parameters are translated to
64:number. The parameters are:
96:Write JPEG, JPEG2000, JBIG2, and CCITT images in their native format. CMYK files are written as TIFF files. All other images are written as PNG files.
100:Instead of writing the images, list the images along with various information for each image. Do not specify an
206:The size of the embedded image in the pdf file. The following suffixes are used: 'B' bytes, 'K' kilobytes, 'M' megabytes, and 'G' gigabytes.
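
  A first pass over such lines could be done with sed,
as sketched below.
The function name "split_sentences" is made up for this example.
It assumes that a sentence ends with '.', '!' or '?' followed by
space(s) and a capital letter,
so an abbreviation in front of a capitalized word is split wrongly and
the result must be checked by hand.

```sh
# Start each sentence of a text line on a new line.
# Control lines (starting with a dot) are left alone.
split_sentences() {
    sed -e '/^\./!s/\([.!?]\)  *\([A-Z]\)/\1\
\2/g' "$@"
}
```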

-.-.

Split lines longer than 80 characters into two or more lines.
Appropriate break points are the end of a sentence and a subordinate
clause; after punctuation marks.


Line 50, length 135

Write images in JPEG format as JPEG files instead of the default format. The JPEG file is identical to the JPEG data stored in the PDF.

Line 53, length 141

Write images in JPEG2000 format as JP2 files instead of the default format. The JP2 file is identical to the JPEG2000 data stored in the PDF.

Line 56, length 429

Write images in JBIG2 format as JBIG2 files instead of the default format. JBIG2 data in PDF is of the embedded type. The embedded type of JBIG2 has an optional separate file containing global data. The embedded data is written with the extension .jb2e and the global data (if available) will be written to the same image number with the extension .jb2g. The content of both these files is identical to the JBIG2 data in the PDF.

Line 96, length 150

Write JPEG, JPEG2000, JBIG2, and CCITT images in their native format. CMYK files are written as TIFF files. All other images are written as PNG files.

Line 97, length 84

This is equivalent to specifying the options \-png \-tiff \-j \-jp2 \-jbig2 \-ccitt.

Line 100, length 111

Instead of writing the images, list the images along with various information for each image. Do not specify an

Line 129, length 106

Note: Tranparency in images is represented in PDF using a separate image for the image and the mask/smask.

Line 130, length 106

The mask/smask used as part of a transparent image always immediately follows the image in the image list.

Line 138, length 107

Note: the image width/height is the size of the embedded image, not the size the image will be rendered at.

Line 178, length 88

image - raster image (may be Flate or LZW compressed but does not use an image encoding)

Line 200, length 90

The horizontal resolution of the image (in pixels per inch) when rendered on the pdf page.

Line 203, length 88

The vertical resolution of the image (in pixels per inch) when rendered on the pdf page.

Line 206, length 140

The size of the embedded image in the pdf file. The following suffixes are used: 'B' bytes, 'K' kilobytes, 'M' megabytes, and 'G' gigabytes.
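
  A list like the one above can be produced with awk.
This is a minimal sketch; the function name "long_lines" is invented here,
and the limit of 80 is taken from the mandoc message.

```sh
# Print "Line N, length M" for every input line longer than 80 bytes.
# LC_ALL=C makes awk count bytes rather than characters.
long_lines() {
    LC_ALL=C awk 'length($0) > 80 { printf "Line %d, length %d\n", NR, length($0) }' "$@"
}
```

  Possible use: long_lines pdfimages.1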


-.-.

Use \(en (en-dash) for a dash at the beginning of a line,
or between space characters,
not a minus (\-) or a hyphen (-), except in the NAME section.

pdfimages.1:117:image - an opaque image
pdfimages.1:120:mask - a monochrome mask image
pdfimages.1:123:smask - a soft-mask image
pdfimages.1:126:stencil - a monochrome mask image used for painting a color or pattern
pdfimages.1:144:gray - Gray
pdfimages.1:147:rgb - RGB
pdfimages.1:150:cmyk - CMYK
pdfimages.1:153:lab - L*a*b
pdfimages.1:156:icc - ICC Based
pdfimages.1:159:index - Indexed Color
pdfimages.1:162:sep - Separation
pdfimages.1:165:devn - DeviceN
pdfimages.1:178:image - raster image (may be Flate or LZW compressed but does not use an image encoding)
pdfimages.1:181:jpeg - Joint Photographic Experts Group
pdfimages.1:184:jp2 - JPEG2000
pdfimages.1:187:jbig2 - Joint Bi-Level Image Experts Group
pdfimages.1:190:ccitt - CCITT Group 3 or Group 4 Fax
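
  The lines listed above could be changed with a sed script like the
following sketch.
The function name "endash_between_spaces" is invented for this example.
It assumes that the NAME section starts with a line ".SH NAME" and
skips it,
as well as all control lines;
only a hyphen between two space characters is changed.

```sh
# Change " - " to " \(en " in text lines.
# The NAME section and control lines (starting with a dot) are not touched.
endash_between_spaces() {
    sed -e '/^\.SH NAME/,/^\.SH /b' \
        -e '/^\./b' \
        -e 's/ - / \\(en /g' "$@"
}
```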

-.-.

Put a parenthetical sentence or phrase on a separate line,
if it is not part of code.
See man-pages(7), item "semantic newline".
Not considered in a patch: too many lines.


pdfimages.1:24:is the image type (.ppm, .pbm, .png, .tif, .jpg, jp2, jb2e, or jb2g).  If
pdfimages.1:28:The default output format is PBM (for monochrome images) or PPM for
pdfimages.1:56:Write images in JBIG2 format as JBIG2 files instead of the default format. JBIG2 data in PDF is of the embedded type. The embedded type of JBIG2 has an optional separate file containing global data. The embedded data is written with the extension .jb2e and the global data (if available) will be written to the same image number with the extension .jb2g. The content of both these files is identical to the JBIG2 data in the PDF.
pdfimages.1:178:image - raster image (may be Flate or LZW compressed but does not use an image encoding)
pdfimages.1:197:the image dictionary object ID (number and generation)
pdfimages.1:200:The horizontal resolution of the image (in pixels per inch) when rendered on the pdf page.
pdfimages.1:203:The vertical resolution of the image (in pixels per inch) when rendered on the pdf page.

-.-.

Output from "test-groff  -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z ":

an.tmac:<stdin>:2: style: .TH missing fourth argument; consider package/project name and version (e.g., "groff 1.23.0")
an.tmac:<stdin>:101: misuse, warning: .IR is for at least 2 arguments, got 1
	Use macro '.I' for one argument or split argument.

-.-.

  Additionally (general):

  Abbreviations get a '\&' added after their final full stop (.) to mark them
as such and not as an end of a sentence.
-------------- next part --------------
--- pdfimages.1	2024-12-31 09:56:03.245683345 +0000
+++ pdfimages.1.new	2024-12-31 10:28:59.609530660 +0000
@@ -15,23 +15,32 @@ Tagged Image File Format (TIFF), JPEG, J
 .PP
 Pdfimages reads the PDF file
 .IR PDF-file ,
-scans one or more pages, and writes one file for each image,
+scans one or more pages,
+and writes one file for each image,
 .IR image-root - nnn . xxx ,
 where
 .I nnn
 is the image number and
 .I xxx
-is the image type (.ppm, .pbm, .png, .tif, .jpg, jp2, jb2e, or jb2g).  If
+is the image type
+(.ppm, .pbm, .png, .tif, .jpg, jp2, jb2e, or jb2g).
+If
 .I PDF-file
-is \'-', it reads the PDF file from stdin.
+is \'\-',
+it reads the PDF file from stdin.
 .PP
-The default output format is PBM (for monochrome images) or PPM for
-non-monochrome. The \-png or \-tiff options change to default output
-to PNG or TIFF respectively. If both \-png and \-tiff are specified,
-CMYK images will be written as TIFF and all other images will be
-written as PNG. In addition the \-j, \-jp2, and \-jbig2 options will
-cause JPEG, JPEG2000, and JBIG2, respectively, images in the PDF file
-to be written in their native format.
+The default output format is PBM
+(for monochrome images)
+or PPM for non-monochrome.
+The \-png or \-tiff options change to default output to PNG
+or TIFF respectively.
+If both \-png and \-tiff are specified,
+CMYK images will be written as TIFF
+and all other images will be written as PNG.
+In addition the \-j, \-jp2, and \-jbig2 options will
+cause JPEG, JPEG2000, and JBIG2,
+respectively,
+images in the PDF file to be written in their native format.
 .SH OPTIONS
 .TP
 .BI \-f " number"
@@ -47,21 +56,33 @@ Change the default output format to PNG.
 Change the default output format to TIFF.
 .TP
 .B \-j
-Write images in JPEG format as JPEG files instead of the default format. The JPEG file is identical to the JPEG data stored in the PDF.
+Write images in JPEG format as JPEG files instead of the default format.
+The JPEG file is identical to the JPEG data stored in the PDF.
 .TP
 .B \-jp2
-Write images in JPEG2000 format as JP2 files instead of the default format. The JP2 file is identical to the JPEG2000 data stored in the PDF.
+Write images in JPEG2000 format as JP2 files instead of the default format.
+The JP2 file is identical to the JPEG2000 data stored in the PDF.
 .TP
 .B \-jbig2
-Write images in JBIG2 format as JBIG2 files instead of the default format. JBIG2 data in PDF is of the embedded type. The embedded type of JBIG2 has an optional separate file containing global data. The embedded data is written with the extension .jb2e and the global data (if available) will be written to the same image number with the extension .jb2g. The content of both these files is identical to the JBIG2 data in the PDF.
+Write images in JBIG2 format as JBIG2 files instead of the default format.
+JBIG2 data in PDF is of the embedded type.
+The embedded type of JBIG2 has an optional separate file containing global
+data.
+The embedded data is written with the extension .jb2e
+and the global data
+(if available)
+will be written to the same image number with the extension .jb2g.
+The content of both these files is identical to the JBIG2 data in the PDF.
 .TP
 .B \-ccitt
-Write images in CCITT format as CCITT files instead of the default
-format. The CCITT file is identical to the CCITT data stored in the
-PDF. PDF files contain additional parameters specifying
-how to decode the CCITT data. These parameters are translated to
-fax2tiff input options and written to a .params file with the same image
-number. The parameters are:
+Write images in CCITT format as CCITT files instead of the default format.
+The CCITT file is identical to the CCITT data stored in the
+PDF.
+PDF files contain additional parameters specifying
+how to decode the CCITT data.
+These parameters are translated to fax2tiff input options
+and written to a .params file with the same image number.
+The parameters are:
 .RS
 .TP
 .B \-1
@@ -79,7 +100,7 @@ Beginning of line is aligned on a byte b
 .B \-P
 Beginning of line is not aligned on a byte boundary
 .TP
-.B \-X n
+.BI \-X " n"
 The image width in pixels
 .TP
 .B \-W
@@ -93,12 +114,18 @@ Input data fills from most significant b
 .RE
 .TP
 .B \-all
-Write JPEG, JPEG2000, JBIG2, and CCITT images in their native format. CMYK files are written as TIFF files. All other images are written as PNG files.
-This is equivalent to specifying the options \-png \-tiff \-j \-jp2 \-jbig2 \-ccitt.
+Write JPEG, JPEG2000, JBIG2,
+and CCITT images in their native format.
+CMYK files are written as TIFF files.
+All other images are written as PNG files.
+This is equivalent to specifying the options
+\-png \-tiff \-j \-jp2 \-jbig2 \-ccitt.
 .TP
 .B \-list
-Instead of writing the images, list the images along with various information for each image. Do not specify an
-.IR image-root
+Instead of writing the images,
+list the images along with various information for each image.
+Do not specify an
+.I image-root
 with this option.
 .IP
 The following information is listed for each image:
@@ -114,20 +141,22 @@ the image number
 the image type:
 .PP
 .RS
-image - an opaque image
+image \(en an opaque image
 .RE
 .RS
-mask - a monochrome mask image
+mask \(en a monochrome mask image
 .RE
 .RS
-smask - a soft-mask image
+smask \(en a soft-mask image
 .RE
 .RS
-stencil - a monochrome mask image used for painting a color or pattern
+stencil \(en a monochrome mask image used for painting a color or pattern
 .RE
 .PP
-Note: Tranparency in images is represented in PDF using a separate image for the image and the mask/smask.
-The mask/smask used as part of a transparent image always immediately follows the image in the image list.
+Note: Tranparency in images is represented in PDF
+using a separate image for the image and the mask/smask.
+The mask/smask used as part of a transparent image
+always immediately follows the image in the image list.
 .TP
 .B width
 image width (in pixels)
@@ -135,34 +164,35 @@ image width (in pixels)
 .B height
 image height (in pixels)
 .PP
-Note: the image width/height is the size of the embedded image, not the size the image will be rendered at.
+Note: the image width/height is the size of the embedded image,
+not the size the image will be rendered at.
 .TP
 .B color
 image color space:
 .PP
 .RS
-gray - Gray
+gray \(en Gray
 .RE
 .RS
-rgb - RGB
+rgb \(en RGB
 .RE
 .RS
-cmyk - CMYK
+cmyk \(en CMYK
 .RE
 .RS
-lab - L*a*b
+lab \(en L*a*b
 .RE
 .RS
-icc - ICC Based
+icc \(en ICC Based
 .RE
 .RS
-index - Indexed Color
+index \(en Indexed Color
 .RE
 .RS
-sep - Separation
+sep \(en Separation
 .RE
 .RS
-devn - DeviceN
+devn \(en DeviceN
 .RE
 .TP
 .B comp
@@ -175,35 +205,43 @@ bits per component
 encoding:
 .PP
 .RS
-image - raster image (may be Flate or LZW compressed but does not use an image encoding)
+image \(en raster image
+(may be Flate or LZW compressed but does not use an image encoding)
 .RE
 .RS
-jpeg - Joint Photographic Experts Group
+jpeg \(en Joint Photographic Experts Group
 .RE
 .RS
-jp2 - JPEG2000
+jp2 \(en JPEG2000
 .RE
 .RS
-jbig2 - Joint Bi-Level Image Experts Group
+jbig2 \(en Joint Bi-Level Image Experts Group
 .RE
 .RS
-ccitt - CCITT Group 3 or Group 4 Fax
+ccitt \(en CCITT Group 3 or Group 4 Fax
 .RE
 .TP
 .B interp
 "yes" if the interpolation is to be performed when scaling up the image
 .TP
 .B object ID
-the image dictionary object ID (number and generation)
+the image dictionary object ID
+(number and generation)
 .TP
 .B x\-ppi
-The horizontal resolution of the image (in pixels per inch) when rendered on the pdf page.
+The horizontal resolution of the image
+(in pixels per inch)
+when rendered on the pdf page.
 .TP
 .B y\-ppi
-The vertical resolution of the image (in pixels per inch) when rendered on the pdf page.
+The vertical resolution of the image
+(in pixels per inch)
+when rendered on the pdf page.
 .TP
 .B size
-The size of the embedded image in the pdf file. The following suffixes are used: 'B' bytes, 'K' kilobytes, 'M' megabytes, and 'G' gigabytes.
+The size of the embedded image in the pdf file.
+The following suffixes are used:
+\&'B' bytes, 'K' kilobytes, 'M' megabytes, and 'G' gigabytes.
 .TP
 .B ratio
 The compression ratio of the embedded image.
@@ -252,7 +290,7 @@ Error related to PDF permissions.
 99
 Other error.
 .SH AUTHOR
-The pdfimages software and documentation are copyright 1998-2011 Glyph
+The pdfimages software and documentation are copyright 1998\(en2011 Glyph
 & Cog, LLC.
 .SH "SEE ALSO"
 .BR pdfdetach (1),

