Bug#1077497: encguess.1: Some remarks and editorial changes for this man page

Bjarni Ingi Gislason bjarniig at simnet.is
Mon Jul 29 13:51:11 BST 2024


Package: perl
Version: 5.38.2-5
Severity: minor
Tags: patch

   * What led up to the situation?

     Checking for defects with

[test-]groff -mandoc -t -K utf8 -rF0 -rHY=0 -ww -b -z < "man page"

  [test-groff is a script in the repository for "groff"] (local copy and
"troff" slightly changed by me).

   * What was the outcome of this action?

troff: backtrace: file '<stdin>':80
troff:<stdin>:80: warning: trailing space in the line

   * What outcome did you expect instead?

     No output (no warnings).

-.-

  General remarks and further material (declared as a "diff" file) are in the
attachments.



-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.9.10-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages perl depends on:
ii  dpkg               1.22.9
ii  libperl5.38t64     5.38.2-5
ii  perl-base          5.38.2-5
ii  perl-modules-5.38  5.38.2-5

Versions of packages perl recommends:
ii  netbase  6.4

Versions of packages perl suggests:
pn  libtap-harness-archive-perl                             <none>
pn  libterm-readline-gnu-perl | libterm-readline-perl-perl  <none>
ii  make                                                    4.3-4.1
ii  perl-doc                                                5.38.2-5

-- no debconf information
-------------- next part --------------
  Any program (or person) that produces man pages should check the output for
defects by using

groff -mandoc -t -ww -b -z [ -K utf8 | k ] <man page>

  The same goes for man pages that are used as an input.

  For a style guide use

  mandoc -T lint

-.-

  So any generator should check its products with the above-mentioned
'groff' and 'mandoc' commands, and additionally with 'nroff ...'.

  This is just a simple quality control measure.

  To get a better man page, the generator may have to be corrected,
and so may the source file and any additional files.

  Common errors:

  Input text line longer than 80 bytes.

  Not removing trailing spaces (in both input and output).
  The reason for these trailing spaces should be found and eliminated.

  Not beginning each input sentence
(that is not confined to a markup)
in the first column.
Line length should thus be reduced.
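
  The first two of these errors can be found without a formatter.
What follows is only a minimal sketch: "check_source_lines" is a
hypothetical helper name, not part of groff or mandoc, and it assumes
a POSIX "awk".

```shell
#!/bin/sh
# Hypothetical helper (not part of groff or mandoc): report trailing
# whitespace and over-long input lines in one man page source file.
check_source_lines() {
    # LC_ALL=C makes awk's length() count bytes rather than characters.
    LC_ALL=C awk '
        /[ \t]+$/       { printf "%d: trailing space\n", NR }
        length($0) > 80 { printf "%d: line longer than 80 bytes (%d)\n", NR, length($0) }
    ' "$1"
}

# Usage: check_source_lines encguess.1
```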


-.-

The difference between the formatted outputs can be seen with:

  nroff -mandoc <file1> > <out1>
  nroff -mandoc <file2> > <out2>
  diff -u <out1> <out2>

and for groff, using

"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

instead of "nroff -mandoc"

  Add the option "-t", if the file contains a table.

  Read the output of "diff -u" with "less -R" or similar.

-.-.

  If "man" (man-db) is used to check the manual for warnings,
the following must be set:

  The option "--warnings=w"

  The environment variable:

export MAN_KEEP_STDERR=yes (or any non-empty value)

  or

  (produce only warnings):

export MANROFFOPT="-ww -b -z"

export MAN_KEEP_STDERR=yes (or any non-empty value)

-.-.

Output from "mandoc -T lint encguess.1": (possibly shortened list)

mandoc: encguess.1:80:52: STYLE: whitespace at end of input line
mandoc: encguess.1:92:81: STYLE: input text line longer than 80 bytes: Guess encoding of a ...
mandoc: encguess.1:99:85: STYLE: input text line longer than 80 bytes: Guess the encoding t...

-.-.

Remove space characters at the end of lines.

Use "git apply ... --whitespace=fix" to fix extra space issues, or use
global configuration "core.whitespace".

80:specify a list of "suspect encoding types" to test, 

-.-.

Find a repeated word

! 130 --> the
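
  A repeated word on a single input line can be found with a short
"awk" script.
This is only a sketch: "find_repeated_words" is a hypothetical name,
the output format merely imitates the line above, and a word repeated
across a line break is not detected.

```shell
#!/bin/sh
# Hypothetical helper: print "<line number> --> <word>" for each purely
# alphabetic word that occurs twice in a row on the same input line.
find_repeated_words() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i == $(i - 1) && $i ~ /^[A-Za-z]+$/)
                printf "%d --> %s\n", NR, $i
    }' "$1"
}

# Usage: find_repeated_words encguess.1
```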

-.-.

Wrong distance between sentences.

  Separate the sentences and subordinate clauses; each begins on a new
line.  See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").

  The best procedure is to always start a new sentence on a new line,
at least if you are typing on a computer.

Remember coding: Only one command ("sentence") on each (logical) line.

E-mail: Easier to quote exactly the relevant lines.

Generally: Easier to edit the sentence.

Patches: Less unaffected text.

Searching for two adjacent words is easier when they belong to the same line
and the same phrase.

  The amount of space between sentences in the output can then be
controlled with the ".ss" request.

N.B.

  The number of lines affected can be too large to be in the patch.

116:time until all but the right type are eliminated. The set of encoding
118:utf8 and UTF\-16/32 with BOM. This can be overridden by passing one or
119:more encoding types via the \-s parameter. If you need to pass in
130:under the terms of the the Artistic License (2.0). You may obtain a
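
  A list like the one above can be approximated with "grep".
This is a heuristic sketch only ("find_midline_sentences" is a
hypothetical name): it skips lines that start with a control character
("." or "'") and reports text lines where sentence-ending punctuation is
followed by spaces and an uppercase letter, so abbreviations may show up
as false positives.

```shell
#!/bin/sh
# Hypothetical helper: list text lines in which a sentence appears to
# end in the middle of the line (".", "!" or "?" followed by one or
# more spaces and an uppercase letter).
find_midline_sentences() {
    grep -n "^[^.'].*[.!?]  *[A-Z]" "$1"
}

# Usage: find_midline_sentences encguess.1
```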

-.-.

Split lines longer than 80 characters into two or more lines.
Appropriate break points are the end of a sentence or of a subordinate
clause, that is, after punctuation marks.

Line 92, length 81

Guess encoding of a file named \f(CW\*(C`test.txt\*(C'\fR, using only the default

Line 99, length 85

Guess the encoding type of a file named \f(CW\*(C`test.txt\*(C'\fR, using the suspect


-.-.

Add a zero (0) in front of a decimal fraction that begins with a period
(.)

7:.if t .sp .5v
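
  Such fractions can be searched for with "grep".
This is a rough sketch under the assumption that a period followed by a
digit, and not preceded by a digit, letter or another period, is a bare
decimal fraction; "find_bare_fractions" is a hypothetical name, and a
fraction at the very start of a line is not matched.

```shell
#!/bin/sh
# Hypothetical helper: list lines containing a decimal fraction that
# starts with a period, such as ".5v" (but not "0.5v" or "2.0").
find_bare_fractions() {
    grep -n '[^0-9A-Za-z.]\.[0-9]' "$1"
}

# Usage: find_bare_fractions encguess.1
```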

-.-.

Write "<number>-bit", not "<number>bit", see man-pages(7)

100:types \f(CW\*(C`euc\-jp,shiftjis,7bit\-jis\*(C'\fR.
103:\&   encguess \-s euc\-jp,shiftjis,7bit\-jis test.txt
104:\&   encguess \-s euc\-jp:shiftjis:7bit\-jis test.txt
111:\&   encguess \-us euc\-jp,shiftjis,7bit\-jis test*.txt

-.-.

Output from "test-groff -b -mandoc -rF0 -rHY=0 -K utf8 -t -ww -z ":

troff: backtrace: file '<stdin>':80
troff:<stdin>:80: warning: trailing space in the line

-------------- next part --------------
--- encguess.1	2024-07-29 11:23:26.697734034 +0000
+++ encguess.1.new	2024-07-29 11:39:00.552372803 +0000
@@ -4,7 +4,7 @@
 .\" Standard preamble:
 .\" ========================================================================
 .de Sp \" Vertical space (when we can't use .PP)
-.if t .sp .5v
+.if t .sp 0.5v
 .if n .sp
 ..
 .de Vb \" Begin verbatim text
@@ -77,7 +77,7 @@ encguess \- guess character encodings of
 show this message and exit.
 .IP \-s 2
 .IX Item "-s"
-specify a list of "suspect encoding types" to test, 
+specify a list of "suspect encoding types" to test,
 separated by either \f(CW\*(C`:\*(C'\fR or \f(CW\*(C`,\*(C'\fR
 .IP \-S 2
 .IX Item "-S"
@@ -89,15 +89,15 @@ suppress display of unidentified types
 .SS EXAMPLES:
 .IX Subsection "EXAMPLES:"
 .IP \(bu 2
-Guess encoding of a file named \f(CW\*(C`test.txt\*(C'\fR, using only the default
-suspect types.
+Guess encoding of a file named \f(CW\*(C`test.txt\*(C'\fR,
+using only the default suspect types.
 .Sp
 .Vb 1
 \&   encguess test.txt
 .Ve
 .IP \(bu 2
-Guess the encoding type of a file named \f(CW\*(C`test.txt\*(C'\fR, using the suspect
-types \f(CW\*(C`euc\-jp,shiftjis,7bit\-jis\*(C'\fR.
+Guess the encoding type of a file named \f(CW\*(C`test.txt\*(C'\fR,
+using the suspect types \f(CW\*(C`euc\-jp,shiftjis,7bit\-jis\*(C'\fR.
 .Sp
 .Vb 2
 \&   encguess \-s euc\-jp,shiftjis,7bit\-jis test.txt
@@ -113,12 +113,14 @@ unidentified files.
 .SH DESCRIPTION
 .IX Header "DESCRIPTION"
 The encoding identification is done by checking one encoding type at a
-time until all but the right type are eliminated. The set of encoding
+time until all but the right type are eliminated.
+The set of encoding
 types to try is defined by the \-s parameter and defaults to ascii,
-utf8 and UTF\-16/32 with BOM. This can be overridden by passing one or
-more encoding types via the \-s parameter. If you need to pass in
-multiple suspect encoding types, use a quoted string with the a space
-separating each value.
+utf8 and UTF\-16/32 with BOM.
+This can be overridden by passing one or
+more encoding types via the \-s parameter.
+If you need to pass in multiple suspect encoding types,
+use a quoted string with the a space separating each value.
 .SH "SEE ALSO"
 .IX Header "SEE ALSO"
 Encode::Guess, Encode::Detect
@@ -127,7 +129,7 @@ Encode::Guess, Encode::Detect
 Copyright 2015 Michael LaGrasta and Dan Kogai.
 .PP
 This program is free software; you can redistribute it and/or modify it
-under the terms of the the Artistic License (2.0). You may obtain a
-copy of the full license at:
+under the terms of the Artistic License (2.0).
+You may obtain a copy of the full license at:
 .PP
 <http://www.perlfoundation.org/artistic_license_2_0>

