Bug#1100736: llvm-exegesis-18.1: Some remarks and a patch with editorial changes for this man page
Bjarni Ingi Gislason
bjarniig at simnet.is
Tue Mar 18 01:28:52 GMT 2025
Package: llvm-18
Version: 1:18.1.8-16
Severity: minor
Tags: patch
* What led up to the situation?
Checking for defects with a new version
test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z < "man page"
[Use "grep -n -e ' $' -e '\\~$' <file>" to find obvious trailing spaces.]
["test-groff" is a script in the repository for "groff"; is not shipped]
(local copy and "troff" slightly changed by me).
[The fate of "test-nroff" was decided in groff bug #55941.]
* What was the outcome of this action?
troff:<stdin>:334: warning: [page 4, 4.7i (diversion 'an*paragraph-tag', 0.0i)]: cannot break line
* What outcome did you expect instead?
No output (no warnings).
-.-
General remarks and further material, if a diff file exists, are in the
attachments.
-- System Information:
Debian Release: trixie/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 6.12.17-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)
Versions of packages llvm-18 depends on:
ii libc6 2.41-4
ii libcurl4t64 8.12.1-3
ii libgcc-s1 14.2.0-17
ii libllvm18 1:18.1.8-16
ii libpfm4 4.13.0+git83-g91970fe-1
ii libstdc++6 14.2.0-17
ii libtinfo6 6.5+20250216-2
ii libzstd1 1.5.6+dfsg-2
ii llvm-18-linker-tools 1:18.1.8-16
ii llvm-18-runtime 1:18.1.8-16
ii zlib1g 1:1.3.dfsg+really1.3.1-1+b1
Versions of packages llvm-18 recommends:
pn llvm-18-dev <none>
Versions of packages llvm-18 suggests:
pn llvm-18-doc <none>
-- no debconf information
-------------- next part --------------
Input file is llvm-exegesis-18.1
Output from "mandoc -T lint llvm-exegesis-18.1": (shortened list)
1 input text line longer than 80 bytes: * \fBassemble\-measu...
1 input text line longer than 80 bytes: * \fBmeasure\fP: Sam...
1 input text line longer than 80 bytes: * \fBprepare\-and\-a...
1 input text line longer than 80 bytes: By default, when \fI...
3 input text line longer than 80 bytes: Either \fIopcode\-in...
1 input text line longer than 80 bytes: File to read (\fIana...
1 input text line longer than 80 bytes: If non\-empty, write...
1 input text line longer than 80 bytes: If provided, write t...
1 input text line longer than 80 bytes: Measuring the uop de...
1 input text line longer than 80 bytes: On choosing the \(dq...
1 input text line longer than 80 bytes: Specify the run mode...
1 input text line longer than 80 bytes: The main goal of thi...
1 input text line longer than 80 bytes: We need to have at l...
1 input text line longer than 80 bytes: When a positive valu...
1 input text line longer than 80 bytes: and \fILLVM\-EXEGESI...
1 input text line longer than 80 bytes: characteristics. The...
1 input text line longer than 80 bytes: crashing. Setting up...
1 input text line longer than 80 bytes: example, \fI/tmp/inc...
1 input text line longer than 80 bytes: is passed in. This i...
1 input text line longer than 80 bytes: possible so that we ...
1 input text line longer than 80 bytes: repetition count of ...
1 input text line longer than 80 bytes: snippet until the lo...
1 input text line longer than 80 bytes: use has a correspond...
1 input text line longer than 80 bytes: value is named using...
9 skipping paragraph macro: sp after SH
-.-.
Output from "test-nroff -mandoc -t -ww -z llvm-exegesis-18.1": (shortened list)
1 cannot break line
-.-.
Show if generated from reStructuredText
Who is actually generating this man page? Debian or upstream?
Is the generating software out of date?
1:.\" Man page generated from reStructuredText.
-.-.
Change (or include a "FIXME" paragraph about) misused SI (metric)
numeric prefixes (or names) to the binary ones, like Ki (kibi), Mi
(mebi), Gi (gibi), or Ti (tebi), if indicated.
If the metric prefixes are correct, add the definitions or an
explanation to avoid misunderstanding.
89:this register as a live in ensures that a pointer to a block of memory (1MB)
-.-.
Add a (no-break, "\ " or "\~") space between a number and a unit,
as these are not one entity.
89:this register as a live in ensures that a pointer to a block of memory (1MB)
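A rough way to locate such cases is sketched below. The function name "units_without_space" is hypothetical, and the pattern only covers the byte units KB, MB, GB, TB and their binary forms, so it is a heuristic and not a complete check.

```shell
#!/bin/sh
# Hypothetical helper: report lines where a digit is directly followed by a
# byte unit such as KB, MB or GiB (no space and no "\~" in between).
# Reads the man page source on standard input.
units_without_space() {
    # -n prefixes each hit with its line number in the input.
    grep -n -e '[0-9][KMGT]i\{0,1\}B'
}
```

Usage: units_without_space < llvm-exegesis-18.1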
-.-.
Strings longer than 3/4 of a standard line length (80).
Use "\:" to split the string at the end of an output line, for example a
long URL (web address)
254 \-\-analysis\-inconsistencies\-output\-file=/tmp/inconsistencies.html
276 2,VPADDQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.02
277 2,VPSUBQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.01
334 .B \-\-benchmark\-phase=[prepare\-snippet|prepare\-and\-assemble\-snippet|assemble\-measured\-code|measure]
-.-.
Add a "\&" (or a comma (Oxford comma)) after "e.g." and "i.e.",
or use English words
(man-pages(7)).
Abbreviation points that do not end a sentence should be marked as such
and protected against being interpreted as an end of sentence,
regardless of their position on the line.
193:\fBllvm\-exegesis\fP checks the liveliness of registers (i.e. any register
-.-.
Wrong distance (not two spaces) between sentences in the input file.
Separate the sentences and subordinate clauses; each begins on a new
line. See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").
The best procedure is to always start a new sentence on a new line,
at least, if you are typing on a computer.
Remember coding: Only one command ("sentence") on each (logical) line.
E-mail: Easier to quote exactly the relevant lines.
Generally: Easier to edit the sentence.
Patches: Less unaffected text.
Searching for two adjacent words is easier when they belong to the same line
and the same phrase.
The amount of space between sentences in the output can then be
controlled with the ".ss" request.
Mark a final abbreviation point as such by suffixing it with "\&".
Some sentences (etc.) do not begin on a new line.
Split (sometimes) lines after a punctuation mark; before a conjunction.
Lines with only one (or two) space(s) between sentences could be split,
so latter sentences begin on a new line.
Use
#!/usr/bin/sh
sed -e '/^\./n' \
-e 's/\([[:alpha:]]\)\. */\1.\n/g' "$1"
to split lines after a sentence period.
Check result with the difference between the formatted outputs.
See also the attachment "general.bugs"
[List of affected lines removed.]
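Before splitting, it may help to list the candidate lines first. The following is a minimal sketch; the function name "mid_line_sentences" is hypothetical, and the pattern (letter, period, spaces, capital letter) is only a heuristic, so an abbreviation followed by a capitalized word is reported as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of groff or mandoc): list text lines in which
# a sentence appears to end before the end of the line.
# Reads the man page source on standard input.
mid_line_sentences() {
    # 1. Number every line, so the reported numbers refer to the input.
    # 2. Drop roff control lines (text starting with a period).
    # 3. Keep lines containing "letter, period, space(s), capital letter".
    grep -n -e '' |
        grep -v -e '^[0-9]*:\.' |
        grep -e '[[:alpha:]]\. \{1,\}[[:upper:]]'
}
```

Usage: mid_line_sentences < llvm-exegesis-18.1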
-.-.
Split lines longer than 80 characters into two or more lines.
Appropriate break points are the end of a sentence and a subordinate
clause; after punctuation marks.
Add "\:" to split the string for the output, "\<newline>" in the source.
[List of affected lines removed.]
[...]
Line 341, length 188
* \fBassemble\-measured\-code\fP: Same as \fBprepare\-and\-assemble\-snippet\fP\&. but also creates the full sequence that can be dumped to a file using \fB\-\-dump\-object\-to\-disk\fP\&.
[...]
Longest line is: 188 characters.
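The long input lines can also be listed without mandoc. This is a sketch with a hypothetical function name, "long_lines"; note that "mandoc -T lint" counts bytes, while awk may count characters depending on the implementation and locale, so the numbers can differ for non-ASCII input.

```shell
#!/bin/sh
# Hypothetical helper: print "line number: length" for every input line
# longer than the given limit (default 80).
# Reads the man page source on standard input.
long_lines() {
    awk -v max="${1:-80}" 'length($0) > max { printf "%d: %d\n", NR, length($0) }'
}
```

Usage: long_lines < llvm-exegesis-18.1 (or "long_lines 72" for a stricter limit).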
-.-.
Put a parenthetical sentence or phrase on a separate line,
if it is not part of code.
See man-pages(7), item "semantic newline".
[List of affected lines removed.]
-.-.
No need for '\&' to be in front of a period (.),
if there is a character in front of it.
Remove with "sed -e 's/\\&\././g'".
227:We need to have at least eight bytes of memory allocated starting \fI0x2000\fP\&.
327:for \fIx86\-lbr\-sample\-period\fP and \fI\-\-repetition\-mode=loop\fP\&.
330:\fI\-analysis\-clusters\-output\-file=\fP and \fI\-analysis\-inconsistencies\-output\-file=\fP\&.
341:* \fBassemble\-measured\-code\fP: Same as \fBprepare\-and\-assemble\-snippet\fP\&. but also creates the full sequence that can be dumped to a file using \fB\-\-dump\-object\-to\-disk\fP\&.
384:repetition count of the snippet will be \fInum\-repetitions\fP/\fIsnippet size\fP\&.
390:Only effective for \fI\-repetition\-mode=[loop|min]\fP\&.
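The sed command above removes "\&" in front of every period, including one at the very start of a line, where the escape may be needed to keep the period from starting a control line. A more conservative variant, sketched here under the hypothetical name "strip_redundant_escape", only removes the escape when some character precedes it on the same line.

```shell
#!/bin/sh
# Hypothetical helper: remove "\&" before a period only when another
# character comes before the escape on the same line.
# A leading "\&." is left alone, because there it can protect the period
# from being read as the start of a roff control line.
# Reads the man page source on standard input.
strip_redundant_escape() {
    sed -e 's/\(.\)\\&\./\1./g'
}
```

Usage: strip_redundant_escape < llvm-exegesis-18.1 > llvm-exegesis-18.1.new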
-.-.
Only one space character after a possible end of sentence
(after a punctuation, that can end a sentence).
[List of affected lines removed.]
-.-.
Put a subordinate sentence (after a comma) on a new line.
[List of affected lines removed.]
-.-.
Remove quotes when there is a printable
but no space character between them
and the quotes are not for emphasis (markup),
for example as an argument to a macro.
llvm-exegesis-18.1:30:.TH "LLVM-EXEGESIS" "1" "2025-01-27" "15" "LLVM"
-.-.
Output from "test-groff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z ":
troff:<stdin>:334: warning: [page 4, 4.7i (diversion 'an*paragraph-tag', 0.0i)]: cannot break line
-.-.
Generally:
Split (sometimes) lines after a punctuation mark; before a conjunction.
-------------- next part --------------
--- llvm-exegesis-18.1 2025-03-17 15:21:26.832839347 +0000
+++ llvm-exegesis-18.1.new 2025-03-18 01:20:00.565519322 +0000
@@ -27,21 +27,19 @@ level margin: \\n[rst2man-indent\\n[rst2
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
-.TH "LLVM-EXEGESIS" "1" "2025-01-27" "15" "LLVM"
+.TH LLVM-EXEGESIS 1 2025-01-27 15 LLVM
.SH NAME
llvm-exegesis \- LLVM Machine Instruction Benchmark
.SH SYNOPSIS
-.sp
\fBllvm\-exegesis\fP [\fIoptions\fP]
.SH DESCRIPTION
-.sp
\fBllvm\-exegesis\fP is a benchmarking tool that uses information available
in LLVM to measure host machine instruction characteristics like latency,
throughput, or port decomposition.
.sp
Given an LLVM opcode name and a benchmarking mode, \fBllvm\-exegesis\fP
-generates a code snippet that makes execution as serial (resp. as parallel) as
-possible so that we can measure the latency (resp. inverse throughput/uop decomposition)
+generates a code snippet that makes execution as serial (resp.\& as parallel) as
+possible so that we can measure the latency (resp.\& inverse throughput/uop decomposition)
of the instruction.
The code snippet is jitted and, unless requested not to, executed on the
host subtarget. The time taken (resp. resource usage) is measured using
@@ -54,14 +52,12 @@ scheduling models. To that end, we also
\fBllvm\-exegesis\fP can also benchmark arbitrary user\-provided code
snippets.
.SH SUPPORTED PLATFORMS
-.sp
\fBllvm\-exegesis\fP currently only supports X86 (64\-bit only), ARM (AArch64
only), MIPS, and PowerPC (PowerPC64LE only) on Linux for benchmarking. Not all
benchmarking functionality is guaranteed to work on every platform.
\fBllvm\-exegesis\fP also has a separate analysis mode that is supported
on every platform that LLVM is.
.SH SNIPPET ANNOTATIONS
-.sp
\fBllvm\-exegesis\fP supports benchmarking arbitrary snippets of assembly.
However, benchmarking these snippets often requires some setup so that they
can execute properly. \fBllvm\-exegesis\fP has five annotations and some
@@ -86,7 +82,7 @@ benchmarking script with a \fILLVM\-EXEG
.IP \(bu 2
Scratch memory register \- The specific register that this value is put in
is platform dependent (e.g., it is the RDI register on X86 Linux). Setting
-this register as a live in ensures that a pointer to a block of memory (1MB)
+this register as a live in ensures that a pointer to a block of memory (1\~MiB)
is placed within this register that can be used by the snippet.
.UNINDENT
.IP \(bu 2
@@ -118,7 +114,6 @@ cases where the memory accessed by the s
of the snippet, like RIP\-relative addressing.
.UNINDENT
.SH EXAMPLE 1: BENCHMARKING INSTRUCTIONS
-.sp
Assume you have an X86\-64 machine. To measure the latency of a single
instruction, run:
.INDENT 0.0
@@ -177,7 +172,6 @@ $ llvm\-exegesis \-\-mode=latency \-\-op
.UNINDENT
.UNINDENT
.SH EXAMPLE 2: BENCHMARKING A CUSTOM CODE SNIPPET
-.sp
To measure the latency/uops of a custom piece of code, you can specify the
\fIsnippets\-file\fP option (\fI\-\fP reads from standard input).
.INDENT 0.0
@@ -190,7 +184,7 @@ $ echo \(dqvzeroupper\(dq | llvm\-exeges
.UNINDENT
.sp
Real\-life code snippets typically depend on registers or memory.
-\fBllvm\-exegesis\fP checks the liveliness of registers (i.e. any register
+\fBllvm\-exegesis\fP checks the liveliness of registers (i.e.\& any register
use has a corresponding def or is a \(dqlive in\(dq). If your code depends on the
value of some registers, you need to use snippet annotations to ensure setup
is performed properly.
@@ -210,7 +204,6 @@ addq $0x10, %rdi
.UNINDENT
.UNINDENT
.SH EXAMPLE 3: BENCHMARKING WITH MEMORY ANNOTATIONS
-.sp
Some snippets require memory setup in specific places to execute without
crashing. Setting up memory can be accomplished with the \fILLVM\-EXEGESIS\-MEM\-DEF\fP
and \fILLVM\-EXEGESIS\-MEM\-MAP\fP annotations. To execute the following snippet:
@@ -224,7 +217,7 @@ movq (%rax), %rdi
.UNINDENT
.UNINDENT
.sp
-We need to have at least eight bytes of memory allocated starting \fI0x2000\fP\&.
+We need to have at least eight bytes of memory allocated starting \fI0x2000\fP.
We can create the necessary execution environment with the following
annotations added to the snippet:
.INDENT 0.0
@@ -240,7 +233,6 @@ movq (%rax), %rdi
.UNINDENT
.UNINDENT
.SH EXAMPLE 4: ANALYSIS
-.sp
Assuming you have a set of benchmarked instructions (either latency or uops) as
YAML in file \fI/tmp/benchmarks.yaml\fP, you can analyze the results using the
following command:
@@ -324,21 +316,21 @@ Specify the run mode. Note that some mod
\fIlatency\fP mode can be make use of either RDTSC or LBR.
\fIlatency[LBR]\fP is only available on X86 (at least \fISkylake\fP).
To run in \fIlatency\fP mode, a positive value must be specified
-for \fIx86\-lbr\-sample\-period\fP and \fI\-\-repetition\-mode=loop\fP\&.
+for \fIx86\-lbr\-sample\-period\fP and \fI\-\-repetition\-mode=loop\fP.
.sp
In \fIanalysis\fP mode, you also need to specify at least one of the
-\fI\-analysis\-clusters\-output\-file=\fP and \fI\-analysis\-inconsistencies\-output\-file=\fP\&.
+\fI\-analysis\-clusters\-output\-file=\fP and \fI\-analysis\-inconsistencies\-output\-file=\fP.
.UNINDENT
.INDENT 0.0
.TP
-.B \-\-benchmark\-phase=[prepare\-snippet|prepare\-and\-assemble\-snippet|assemble\-measured\-code|measure]
+.B \-\-benchmark\-phase=[prepare\-snippet|\:prepare\-and\-assemble\-snippet|\:assemble\-measured\-code|\:measure]
By default, when \fI\-mode=\fP is specified, the generated snippet will be executed
and measured, and that requires that we are running on the hardware for which
the snippet was generated, and that supports performance measurements.
However, it is possible to stop at some stage before measuring. Choices are:
* \fBprepare\-snippet\fP: Only generate the minimal instruction sequence.
* \fBprepare\-and\-assemble\-snippet\fP: Same as \fBprepare\-snippet\fP, but also dumps an excerpt of the sequence (hex encoded).
-* \fBassemble\-measured\-code\fP: Same as \fBprepare\-and\-assemble\-snippet\fP\&. but also creates the full sequence that can be dumped to a file using \fB\-\-dump\-object\-to\-disk\fP\&.
+* \fBassemble\-measured\-code\fP: Same as \fBprepare\-and\-assemble\-snippet\fP. but also creates the full sequence that can be dumped to a file using \fB\-\-dump\-object\-to\-disk\fP.
* \fBmeasure\fP: Same as \fBassemble\-measured\-code\fP, but also runs the measurement.
.UNINDENT
.INDENT 0.0
@@ -381,13 +373,13 @@ and produce the minimal measured result.
.TP
.B \-\-num\-repetitions=<Number of repetitions>
Specify the target number of executed instructions. Note that the actual
-repetition count of the snippet will be \fInum\-repetitions\fP/\fIsnippet size\fP\&.
+repetition count of the snippet will be \fInum\-repetitions\fP/\fIsnippet size\fP.
Higher values lead to more accurate measurements but lengthen the benchmark.
.UNINDENT
.INDENT 0.0
.TP
.B \-\-loop\-body\-size=<Preferred loop body size>
-Only effective for \fI\-repetition\-mode=[loop|min]\fP\&.
+Only effective for \fI\-repetition\-mode=[loop|min]\fP.
Instead of looping over the snippet directly, first duplicate it so that the
loop body contains at least this many instructions. This potentially results
in loop body being cached in the CPU Op Cache / Loop Cache, which allows to
@@ -520,7 +512,6 @@ when performing latency measurements. By
a latency measurement enough times to balance run\-time and noise reduction.
.UNINDENT
.SH EXIT STATUS
-.sp
\fBllvm\-exegesis\fP returns 0 on success. Otherwise, an error message is
printed to standard error, and the tool returns a non 0 value.
.SH AUTHOR
-------------- next part --------------
Any program (person), that produces man pages, should check the output
for defects by using (both groff and nroff)
[gn]roff -mandoc -t -ww -b -z -K utf8 <man page>
The same goes for man pages that are used as an input.
For a style guide use
mandoc -T lint
-.-
Any "autogenerator" should check its products with the above mentioned
'groff', 'mandoc', and additionally with 'nroff ...'.
It should also check its input files for too long (> 80) lines.
This is just a simple quality control measure.
To get a better man page, the "autogenerator", the source file,
or any additional file may have to be corrected.
Common defects:
Not removing trailing spaces (in input and output).
The reason for these trailing spaces should be found and eliminated.
"git" has a "tool" to point out whitespace,
see for example "git-apply(1)" and "git-config(1)").
Not beginning each input sentence on a new line.
Line length and patch size should thus be reduced.
The script "reportbug" uses 'quoted-printable' encoding when a line is
longer than 1024 characters in an 'ascii' file.
See man-pages(7), item "semantic newline".
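The trailing-space defect listed above can be checked with grep alone. This is a small sketch; the function name "trailing_spaces" is hypothetical, and it only looks for a plain space or a "\~" escape at the end of a line.

```shell
#!/bin/sh
# Hypothetical helper: report lines that end in a space character or in the
# roff no-break space escape "\~".
# Reads the file to check on standard input.
trailing_spaces() {
    # -n prefixes each hit with its line number in the input.
    grep -n -e ' $' -e '\\~$'
}
```

Usage: trailing_spaces < llvm-exegesis-18.1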
-.-
The difference between the formatted output of the original and patched file
can be seen with:
nroff -mandoc <file1> > <out1>
nroff -mandoc <file2> > <out2>
diff -d -u <out1> <out2>
and for groff, using
"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "
instead of 'nroff -mandoc'
Add the option '-t', if the file contains a table.
Read the output from 'diff -d -u ...' with 'less -R' or similar.
-.-.
If 'man' (man-db) is used to check the manual for warnings,
the following must be set:
The option "--warnings=w"
The environment variable:
export MAN_KEEP_STDERR=yes (or any non-empty value)
or
(produce only warnings):
export MANROFFOPT="-ww -b -z"
export MAN_KEEP_STDERR=yes (or any non-empty value)
-.-