Bug#1120096: URI::Heuristic.3pm: Some remarks and a patch with editorial changes for this man page

Bjarni Ingi Gislason bjarniig at simnet.is
Wed Nov 5 08:11:42 GMT 2025


Package: liburi-perl
Version: 5.34-2
Severity: minor
Tags: patch

From "/usr/share/doc/debian/bug-reporting.txt.gz":

  Don't file bugs upstream

   If you file a bug in Debian, don't send a copy to the upstream software
   maintainers yourself, as it is possible that the bug exists only in
   Debian. If necessary, the maintainer of the package will forward the
   bug upstream.

-.-

  I do not send reports upstream if I have to get an account there.
The Debian maintainers have one already.

  If I get a negative (or no) response from upstream, I henceforth send
bugs to Debian.

-.-

   * What led up to the situation?

     Checking for defects with a new version

test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=0 -ww -z < "man page"

  [Use 

grep -n -e ' $' -e '\\~$' -e ' \\f.$' -e ' \\"' <file>

  to find (most) trailing spaces.]

  ["test-groff" is a script in the repository for "groff"; it is not shipped
(local copy and "troff" slightly changed by me).]

  [The fate of "test-nroff" was decided in groff bug #55941.]

   * What was the outcome of this action?

Output from "test-groff  -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=0 -ww -z ":

troff:<stdin>:88: warning: trailing space in the line
troff:<stdin>:89: warning: trailing space in the line
troff:<stdin>:90: warning: trailing space in the line
troff:<stdin>:91: warning: trailing space in the line

   * What outcome did you expect instead?

     No output (no warnings).

-.-

  General remarks and further material, if a diff-file exists, are in the
attachments.


-- System Information:
Debian Release: forky/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.16.12+deb14+1-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages liburi-perl depends on:
ii  perl [libencode-perl]                  5.40.1-6
ii  perl-base [libscalar-list-utils-perl]  5.40.1-6

liburi-perl recommends no packages.

Versions of packages liburi-perl suggests:
pn  libbusiness-isbn-perl  <none>
pn  libmime-base32-perl    <none>
ii  libregexp-ipv6-perl    0.03-3
ii  libwww-perl            6.81-1

-- no debconf information
-------------- next part --------------
Input file is URI::Heuristic.3pm

Output from "mandoc -T lint  URI::Heuristic.3pm": (shortened list)

      1 STYLE: input text line longer than 80 bytes: 
      4 STYLE: whitespace at end of input line


Find most trailing spaces with:
grep -n -e ' $' -e ' \\f.$' -e ' \\"' <man page>

-.-.

Output from
test-nroff -mandoc -t -ww -z URI::Heuristic.3pm: (shortened list)

      4 line(s) with a trailing space


Find most trailing spaces with:
grep -n -e ' $' -e ' \\f.$' -e ' \\"' <man page>

-.-.

Show if Pod::Man generated this.

2:.\" Automatically generated by Pod::Man 5.0102 (Pod::Simple 3.45)

Latest version in Debian testing:

This is perl 5, version 40, subversion 1 (v5.40.1) built for x86_64-linux-gnu-thread-multi
(with 48 registered patches, see perl -V for more detail)

-.-.

Remove space characters (whitespace) at the end of lines.
Use "git apply ... --whitespace=fix" to fix extra space issues, or use
global configuration "core.whitespace".

Number of lines affected is

4
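
A minimal sketch of removing these spaces directly from the man page
source, for cases where no git repository is at hand.
The function names are hypothetical and not part of any package.
The count uses only the first pattern of the grep command shown earlier.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with groff or liburi-perl.

# Read roff source on stdin and remove whitespace at the end of
# every line.
strip_trailing_space() {
    sed -e 's/[[:space:]]*$//'
}

# Read roff source on stdin and print the number of lines that end
# with a space.  "grep -c" exits with status 1 when the count is 0,
# so the status is neutralized.
count_trailing_space() {
    grep -c -e ' $' || true
}
```

Usage, assuming the man page is in the current directory:
"strip_trailing_space < URI::Heuristic.3pm > URI::Heuristic.3pm.new".
Compare the formatted output of both files afterwards, as a trailing
space can be intended in rare cases.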

-.-.

Change '-' (\-) to '\(en' (en-dash) for a (numeric) range.

GNU gnulib has recently (2023-06-18) updated its
"build_aux/update-copyright" to recognize "\(en" in man pages.

URI::Heuristic.3pm:128:Copyright 1997\-1998, Gisle Aas
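
A sketch of this change with sed, under the assumption that a range is
written as two four-digit years joined by "\-".
The function name is hypothetical.
Other uses of "\-" (for example in the NAME section) are left alone.

```shell
#!/bin/sh
# Hypothetical helper: read roff source on stdin and replace "\-"
# between two four-digit numbers with the en-dash escape "\(en".
year_range_to_en_dash() {
    sed -e 's/\([0-9]\{4\}\)\\-\([0-9]\{4\}\)/\1\\(en\2/g'
}
```

Applied to line 128 of the man page, this gives the same result as the
corresponding hunk of the attached patch.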

-.-.

Add a "\&" (or a comma (Oxford comma)) after an abbreviation,
or use English words instead
(man-pages(7)).
A period that ends an abbreviation, but not a sentence, should be marked as such,
so that it is not interpreted as the end of a sentence,
regardless of its position on the line.

79:absolute URIs (i.e. that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified
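
A sketch that covers only the two cases found in this man page,
"i.e." followed by a space and "etc." followed by a closing parenthesis.
The function name is hypothetical, and the list of abbreviations is an
assumption that would have to be extended for other pages.
Control lines (starting with a period) are not changed.

```shell
#!/bin/sh
# Hypothetical helper: read roff source on stdin and append "\&" to
# "i.e." (before a space) and to "etc." (before a closing parenthesis)
# on text lines.  In the replacement, "\\" is a literal backslash and
# "\&" is a literal ampersand.
protect_abbreviations() {
    sed -e '/^\./!s/\(i\.e\.\) /\1\\\& /g' \
        -e '/^\./!s/\(etc\.\))/\1\\\&)/g'
}
```

A second run changes nothing, because the period is then followed
by "\&" and the patterns no longer match.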

-.-.

Wrong distance (not two spaces) between sentences in the input file.

  Separate the sentences and subordinate clauses; each begins on a new
line.  See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").

  The best procedure is to always start a new sentence on a new line,
at least, if you are typing on a computer.

Remember coding: Only one command ("sentence") on each (logical) line.

E-mail: Easier to quote exactly the relevant lines.

Generally: Easier to edit the sentence.

Patches: Less unaffected text.

Searching for two adjacent words is easier when they belong to the same line
and the same phrase.

  The amount of space between sentences in the output can then be
controlled with the ".ss" request.

Mark a final abbreviation point as such by suffixing it with "\&".

Some sentences (etc.) do not begin on a new line.

Split (sometimes) lines after a punctuation mark; before a conjunction.

  Lines with only one (or two) space(s) between sentences could be split,
so that the later sentences begin on a new line.

Use

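#!/usr/bin/sh

# Print a control line (starting with ".") unchanged and read the next line;
# break lines after a period that follows a letter ("\n" needs GNU sed).
sed -e '/^\./n' \
-e 's/\([[:alpha:]]\)\.  */\1.\n/g' "$1"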

to split lines after a sentence period.
Check result with the difference between the formatted outputs.
See also the attachment "general.bugs"

79:absolute URIs (i.e. that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified
90:scheme (http, ftp, etc.) is a URL rather than a local path.  So don't name 
106:to be the default country. See also Locale::Country.

-.-.

Split lines longer than 80 characters (which completely fill
an A4-sized page line on a terminal)
into two or more lines.
Appropriate break points are the end of a sentence and a subordinate
clause; after punctuation marks.
Add "\:" to split the string for the output, "\<newline>" in the source.

Line 58, length 86

.TH URI::Heuristic 3pm 2025-10-11 "perl v5.40.1" "User Contributed Perl Documentation"

Line 79, length 88

absolute URIs (i.e. that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified

Longest line is number 79 with 88 characters
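
A list like the one above can be produced with awk.
This is only a sketch with a hypothetical function name;
it is not the script that produced the list in this report.
The limit of 80 is the default and can be changed with an argument.

```shell
#!/bin/sh
# Hypothetical helper: read a file on stdin and report every line
# that is longer than the limit (first argument, default 80).
long_lines() {
    awk -v max="${1:-80}" \
        'length($0) > max + 0 { printf "Line %d, length %d\n", NR, length($0) }'
}
```

Usage: "long_lines < URI::Heuristic.3pm".
Note that awk may count characters instead of bytes in a multibyte locale,
while "mandoc -T lint" counts bytes.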

-.-.

Add a zero (0) in front of a decimal fraction that begins with a period
(.)

7:.if t .sp .5v
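
A sketch of this change with sed, restricted to control lines
(lines starting with a period) and to a fraction that follows a space.
The function name is hypothetical.
Both restrictions are assumptions, made to avoid touching running text.

```shell
#!/bin/sh
# Hypothetical helper: read roff source on stdin and, on control
# lines only, put a zero in front of a fraction such as " .5v".
add_leading_zero() {
    sed -e '/^[.]/s/ \.\([0-9]\)/ 0.\1/g'
}
```

Applied to line 7 of the man page, this gives ".if t .sp 0.5v",
as in the first hunk of the attached patch.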

-.-.

Put a parenthetical sentence or phrase on a separate line,
unless it is part of code.
See man-pages(7), item "semantic newline".

URI::Heuristic.3pm:79:absolute URIs (i.e. that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified
URI::Heuristic.3pm:90:scheme (http, ftp, etc.) is a URL rather than a local path.  So don't name 
URI::Heuristic.3pm:104:The two-letter country code (ISO 3166) for your location.  If
URI::Heuristic.3pm:110:examined and country (not language) information possibly found in them

-.-.

Add lines to use the CR font for groff instead of CW.

.if t \{\
.  ie \\n(.g .ft CR
.  el .ft CW
.\}


11:.ft CW

-.-.

.\" Define a fallback for font CW with

.if \n(.g \{\
.  ie t .ftr CW CR
.  el .ftr CW R
.\}

79:absolute URIs (i.e. that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified
96:returns a \f(CW\*(C`URI\*(C'\fR object.

-.-.


Generally:

Split (sometimes) lines after a punctuation mark; before a conjunction.

-.-
-------------- next part --------------
--- URI::Heuristic.3pm	2025-11-05 07:55:41.985845558 +0000
+++ URI::Heuristic.3pm.new	2025-11-05 08:06:17.087350127 +0000
@@ -4,7 +4,7 @@
 .\" Standard preamble:
 .\" ========================================================================
 .de Sp \" Vertical space (when we can't use .PP)
-.if t .sp .5v
+.if t .sp 0.5v
 .if n .sp
 ..
 .de Vb \" Begin verbatim text
@@ -76,7 +76,7 @@ URI::Heuristic \- Expand URI using heuri
 .IX Header "DESCRIPTION"
 This module provides functions that expand strings into real absolute
 URIs using some built-in heuristics.  Strings that already represent
-absolute URIs (i.e. that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified
+absolute URIs (i.e.\& that start with a \f(CW\*(C`scheme:\*(C'\fR part) are never modified
 and are returned unchanged.  The main use of these functions is to
 allow abbreviated URIs similar to what many web browsers allow for URIs
 typed in by the user.
@@ -85,10 +85,10 @@ The following functions are provided:
 .IP uf_uristr($str) 4
 .IX Item "uf_uristr($str)"
 Tries to make the argument string
-into a proper absolute URI string.  The "uf_" prefix stands for "User 
-Friendly".  Under MacOS, it assumes that any string with a common URL 
-scheme (http, ftp, etc.) is a URL rather than a local path.  So don't name 
-your volumes after common URL schemes and expect \fBuf_uristr()\fR to construct 
+into a proper absolute URI string.  The "uf_" prefix stands for "User
+Friendly".  Under MacOS, it assumes that any string with a common URL
+scheme (http, ftp, etc.\&) is a URL rather than a local path.  So don't name
+your volumes after common URL schemes and expect \fBuf_uristr()\fR to construct
 valid file: URL's on those volumes for you, because it won't.
 .IP uf_uri($str) 4
 .IX Item "uf_uri($str)"
@@ -103,7 +103,7 @@ the following environment variables:
 .IX Item "COUNTRY"
 The two-letter country code (ISO 3166) for your location.  If
 the domain name of your host ends with two letters, then it is taken
-to be the default country. See also Locale::Country.
+to be the default country.  See also Locale::Country.
 .IP "HTTP_ACCEPT_LANGUAGE, LC_ALL, LANG" 10
 .IX Item "HTTP_ACCEPT_LANGUAGE, LC_ALL, LANG"
 If COUNTRY is not set, these standard environment variables are
@@ -125,7 +125,7 @@ country.  An empty URL_GUESS_PATTERN dis
 involves host name lookups.
 .SH COPYRIGHT
 .IX Header "COPYRIGHT"
-Copyright 1997\-1998, Gisle Aas
+Copyright 1997\(en1998, Gisle Aas
 .PP
 This library is free software; you can redistribute it and/or
 modify it under the same terms as Perl itself.
-------------- next part --------------
  Any program (or person) that produces man pages should check the output
for defects by using (both groff and nroff)

[gn]roff -mandoc -t -ww -b -z -K utf8 <man page>

  To find trailing space use

grep -n -e ' $' -e ' \\f.$' -e ' \\"' <man page>

  The same goes for man pages that are used as an input.

-.-

  For a style guide use

  mandoc -T lint

-.-

  For general input conventions consult the man page "roff(7)" (item
"Input conventions") or the Texinfo manual about the same item.

-.-

  Any "autogenerator" should check its products with the above mentioned
'groff', 'mandoc', and additionally with 'nroff ...'.

  It should also check its input files for too long (> 80) lines.

  This is just a simple quality control measure.

  The "autogenerator" may have to be corrected to get a better man page,
and so may the source file and any additional file.

  Common defects:

  Not removing trailing spaces (in input and output).
  The reason for these trailing spaces should be found and eliminated.

  "git" has a "tool" to point out whitespace
(see for example "git-apply(1)" and "git-config(1)").

  Not beginning each input sentence on a new line.
Line length and patch size should thus be reduced.

  The script "reportbug" uses 'quoted-printable' encoding when a line is
longer than 1024 characters in an 'ascii' file.

  See man-pages(7), item "semantic newline".

-.-

The difference between the formatted output of the original
and patched file can be seen with:

  nroff -mandoc <file1> > <out1>
  nroff -mandoc <file2> > <out2>
  diff -d -u <out1> <out2>

and for groff, using

"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

instead of 'nroff -mandoc'

  Add the option '-t', if the file contains a table.

  Read the output from 'diff -d -u ...' with 'less -R' or similar.

-.-.

  If 'man' (man-db) is used to check the manual for warnings,
the following must be set:

  The option "--warnings=w"

  The environment variable:

export MAN_KEEP_STDERR=yes (or any non-empty value)

  or

  (produce only warnings):

export MANROFFOPT="-ww -b -z"

export MAN_KEEP_STDERR=yes (or any non-empty value)

-.-

