Bug#1053668: diffoscope: Consider using `file -i` as fallback for unknown file output
Niels Thykier
niels at thykier.net
Sun Oct 8 12:50:20 BST 2023
Package: diffoscope
Version: 250
Severity: wishlist
X-Debbugs-Cc: niels at thykier.net
Hi,
I noticed that `diffoscope` used `hexdump -C` based diffs for the
debian/changelog in the `mscgen` package.
My first bet was that `file` would produce incorrect output and indeed,
`file` classifies that changelog as a `Message Sequence Chart` rather
than text. This is now filed as #1053666.
Digging a bit deeper, it turns out that `file -i` correctly classifies
the changelog as `text/plain; charset=utf-8`. That is, `file` knows it
is text and I suspect `diffoscope` should try `file -i` as well when it
gets an unknown result from `file`.
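To make the suggestion concrete, here is a minimal sketch of such a fallback. It is not how `diffoscope` is implemented: the helper names and the `description_is_known` callback are hypothetical, and it assumes the `file` binary is on the PATH and is invoked with `-b` (brief) and `-i` (MIME output).

```python
import subprocess


def is_text_mime(mime_output: str) -> bool:
    """Return True if a `file -i` style string names a text/* MIME type."""
    # Example input: "text/plain; charset=utf-8"
    mime_type = mime_output.split(";", 1)[0].strip().lower()
    return mime_type.startswith("text/")


def file_output(path: str, *flags: str) -> str:
    """Run `file -b [flags] path` and return its stripped output."""
    # -b omits the leading "filename: " prefix from the output.
    result = subprocess.run(
        ["file", "-b", *flags, path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def should_diff_as_text(path: str, description_is_known) -> bool:
    """Hypothetical fallback: consult `file -i` only for unknown types.

    `description_is_known` is an assumed callback that reports whether
    the caller has a dedicated handler for the plain `file` description.
    """
    description = file_output(path)
    if description_is_known(description):
        return False  # a dedicated handler takes over
    # Unknown description: ask for the MIME type before falling back
    # to a binary (hexdump-style) comparison.
    return is_text_mime(file_output(path, "-i"))
```

With the `mscgen` changelog, the plain description would be unknown to the caller, while the `-i` output of `text/plain; charset=utf-8` would select a text diff.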
This bug report obviously assumes that the `hexdump -C`-like output is
triggered because `diffoscope` uses `file` for determining how to
analyze the changelog. If it uses something else, then there is some
other bug in play that makes `diffoscope` treat the `mscgen` changelog
as a binary file.
Here are two sample files that `file` considers to be `Message Sequence
Chart (chart)`, and `text/plain; charset=us-ascii` with `-i`, in case they
are useful for a test:
```
msc {
a, b;
}
```
```
msc {
c, d;
}
```
Best regards,
Niels