Bug#870627: shared-mime-info: text/html magics yield many false positives
Vincent Lefevre
vincent at vinc17.net
Thu Aug 3 15:15:57 UTC 2017
Package: shared-mime-info
Version: 1.8-1
Severity: important
The text/html magic rules in /usr/share/mime/packages/freedesktop.org.xml
yield many false positives, which breaks "xdg-mime query filetype"
(according to strace, this program calls /usr/bin/mimetype, which
uses /usr/share/mime/magic).
I can see:
<magic priority="50">
<match value="<!DOCTYPE HTML" type="string" offset="0:256"/>
<match value="<!doctype html" type="string" offset="0:256"/>
<match value="<HEAD" type="string" offset="0:256"/>
<match value="<head" type="string" offset="0:256"/>
<match value="<TITLE" type="string" offset="0:256"/>
<match value="<title" type="string" offset="0:256"/>
<match value="<HTML" type="string" offset="0:256"/>
<match value="<html" type="string" offset="0:256"/>
<match value="<SCRIPT" type="string" offset="0:256"/>
<match value="<script" type="string" offset="0:256"/>
<match value="<BODY" type="string" offset="0"/>
<match value="<body" type="string" offset="0"/>
<match value="<!--" type="string" offset="0"/>
<match value="<h1" type="string" offset="0"/>
<match value="<H1" type="string" offset="0"/>
<match value="<!doctype HTML" type="string" offset="0"/>
<match value="<!DOCTYPE html" type="string" offset="0"/>
</magic>
but the fact that a text file contains one of these strings
doesn't mean that it is an HTML file!
I've attached a file (which is just a diff file) as an example.
I get:
zira:~> xdg-mime query filetype file
text/html
zira:~> /usr/bin/mimetype file
file: text/html
though this file doesn't even contain a line of HTML.
This file starts with:
----------------------------------------------------------------------
diff -r 940e528ef852 doc/manual.xml.head
--- a/doc/manual.xml.head Tue Dec 18 20:46:33 2012 -0800
+++ b/doc/manual.xml.head Wed Dec 19 12:22:14 2012 +0100
@@ -4621,6 +4621,37 @@
</sect1>
+<sect1 id="mailto-allow">
+<title>Control allowed header fields in a mailto: URL</title>
+
+<para>Usage:</para>
+
+<cmdsynopsis>
----------------------------------------------------------------------
The cause is the string "<title". Without it, I get text/x-patch as
expected.
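The false positive can be illustrated with a minimal sketch in Python. It assumes that an offset such as "0:256" means the string may start at any byte offset in that inclusive range, and that a single offset such as "0" means the string must start exactly there. The helper names (match_at, looks_like_html) and the rule subset are hypothetical, and this is not the code that /usr/bin/mimetype actually runs; it only mimics the text/html rules quoted above.

```python
# A subset of the text/html rules quoted above: (value, offset).
HTML_RULES = [
    (b"<!DOCTYPE HTML", "0:256"),
    (b"<!doctype html", "0:256"),
    (b"<head", "0:256"),
    (b"<title", "0:256"),
    (b"<html", "0:256"),
    (b"<script", "0:256"),
    (b"<body", "0"),
    (b"<!--", "0"),
]

# The first lines of the attached diff (whitespace simplified).
SAMPLE = (
    b"diff -r 940e528ef852 doc/manual.xml.head\n"
    b"--- a/doc/manual.xml.head Tue Dec 18 20:46:33 2012 -0800\n"
    b"+++ b/doc/manual.xml.head Wed Dec 19 12:22:14 2012 +0100\n"
    b"@@ -4621,6 +4621,37 @@\n"
    b" </sect1>\n"
    b'+<sect1 id="mailto-allow">\n'
    b"+<title>Control allowed header fields in a mailto: URL</title>\n"
)


def match_at(data: bytes, value: bytes, offset: str) -> bool:
    """Return True if value starts at some offset in the inclusive range.

    Assumption: "a:b" is an inclusive range of start offsets, and a
    single number "a" is the range "a:a".
    """
    start, _, end = offset.partition(":")
    first = int(start)
    last = int(end) if end else first
    # The earliest occurrence at or after `first` decides the result.
    pos = data.find(value, first)
    return pos != -1 and pos <= last


def looks_like_html(data: bytes) -> bool:
    """Return True if any of the quoted text/html rules matches."""
    return any(match_at(data, value, offset) for value, offset in HTML_RULES)


if __name__ == "__main__":
    print(looks_like_html(SAMPLE))
    print(looks_like_html(SAMPLE.replace(b"<title", b"<t1tle")))
```

Under these assumptions, the sample matches only because "<title" starts within the first 256 bytes; once that string is altered, none of the listed rules match, which is consistent with the text/x-patch result reported above.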
-- System Information:
Debian Release: buster/sid
APT prefers unstable-debug
APT policy: (500, 'unstable-debug'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 4.11.0-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=POSIX, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=POSIX (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages shared-mime-info depends on:
ii libc6 2.24-14
ii libglib2.0-0 2.52.3-1
ii libxml2 2.9.4+dfsg1-3
shared-mime-info recommends no packages.
shared-mime-info suggests no packages.
-- no debconf information
-------------- next part --------------
diff -r 940e528ef852 doc/manual.xml.head
--- a/doc/manual.xml.head Tue Dec 18 20:46:33 2012 -0800
+++ b/doc/manual.xml.head Wed Dec 19 12:22:14 2012 +0100
@@ -4621,6 +4621,37 @@
</sect1>
+<sect1 id="mailto-allow">
+<title>Control allowed header fields in a mailto: URL</title>
+
+<para>Usage:</para>
+
+<cmdsynopsis>
+<command>mailto_allow</command>
+<group choice="req">
+<arg choice="plain">
+<replaceable class="parameter">*</replaceable>
+</arg>
+<arg choice="plain" rep="repeat">
+<replaceable class="parameter">header-field</replaceable>
+</arg>
+</group>
+</cmdsynopsis>
+
+<para>
+As a security measure, Mutt will only add user-approved header fields from a
+<literal>mailto:</literal> URL. This is necessary since Mutt will handle
+certain header fields, such as <literal>Attach:</literal>, in a special way.
+The <literal>mailto_allow</literal> and <literal>unmailto_allow</literal>
+commands allow the user to modify the list of approved headers.
+</para>
+<para>
+Mutt initializes the default list to contain only the <literal>Subject</literal>
+and <literal>body</literal> header fields, which are the only requirement specified
+by the <literal>mailto:</literal> specification in RFC2368.
+</para>
+</sect1>
+
</chapter>
<chapter id="advancedusage">
diff -r 940e528ef852 doc/muttrc.man.head
--- a/doc/muttrc.man.head Tue Dec 18 20:46:33 2012 -0800
+++ b/doc/muttrc.man.head Wed Dec 19 12:22:14 2012 +0100
@@ -399,6 +399,16 @@
This command will remove all hooks of a given type, or all hooks
when \(lq\fB*\fP\(rq is used as an argument. \fIhook-type\fP
can be any of the \fB-hook\fP commands documented above.
+.PP
+.nf
+\fBmailto_allow\fP \fIheader-field\fP [ ... ]
+\fBunmailto_allow\fP [ \fB*\fP | \fIheader-field\fP ... ]
+.fi
+.IP
+These commands allow the user to modify the list of allowed header
+fields in a \fImailto:\fP URL that Mutt will include in the
+the generated message. By default the list contains only
+\fBsubject\fP and \fBbody\fP, as specified by RFC2368.
.SH PATTERNS
.PP
In various places with mutt, including some of the above mentioned
diff -r 940e528ef852 globals.h
--- a/globals.h Tue Dec 18 20:46:33 2012 -0800
+++ b/globals.h Wed Dec 19 12:22:14 2012 +0100
@@ -159,6 +159,7 @@
WHERE LIST *InlineExclude INITVAL(0);
WHERE LIST *HeaderOrderList INITVAL(0);
WHERE LIST *Ignore INITVAL(0);
+WHERE LIST *MailtoAllow INITVAL(0);
WHERE LIST *MimeLookupList INITVAL(0);
WHERE LIST *UnIgnore INITVAL(0);
diff -r 940e528ef852 init.c
--- a/init.c Tue Dec 18 20:46:33 2012 -0800
+++ b/init.c Wed Dec 19 12:22:14 2012 +0100
@@ -3063,6 +3063,15 @@
mutt_init_history ();
+ /* RFC2368, "4. Unsafe headers"
+ * The creator of a mailto URL cannot expect the resolver of a URL to
+ * understand more than the "subject" and "body" headers. Clients that
+ * resolve mailto URLs into mail messages should be able to correctly
+ * create RFC 822-compliant mail messages using the "subject" and "body"
+ * headers.
+ */
+ add_to_list(&MailtoAllow, "body");
+ add_to_list(&MailtoAllow, "subject");
diff -r 940e528ef852 init.h
--- a/init.h Tue Dec 18 20:46:33 2012 -0800
+++ b/init.h Wed Dec 19 12:22:14 2012 +0100
@@ -3544,6 +3544,8 @@
{ "macro", mutt_parse_macro, 0 },
{ "mailboxes", mutt_parse_mailboxes, M_MAILBOXES },
{ "unmailboxes", mutt_parse_mailboxes, M_UNMAILBOXES },
+ { "mailto_allow", parse_list, UL &MailtoAllow },
+ { "unmailto_allow", parse_unlist, UL &MailtoAllow },
{ "message-hook", mutt_parse_hook, M_MESSAGEHOOK },
{ "mbox-hook", mutt_parse_hook, M_MBOXHOOK },
{ "mime_lookup", parse_list, UL &MimeLookupList },
diff -r 940e528ef852 url.c
--- a/url.c Tue Dec 18 20:46:33 2012 -0800
+++ b/url.c Wed Dec 19 12:22:14 2012 +0100
@@ -283,21 +283,35 @@
if (url_pct_decode (value) < 0)
goto out;
- if (!ascii_strcasecmp (tag, "body"))
+ /* Determine if this header field is on the allowed list. Since Mutt
+ * interprets some header fields specially (such as
+ * "Attach: ~/.gnupg/secring.gpg"), care must be taken to ensure that
+ * only safe fields are allowed.
+ *
+ * RFC2368, "4. Unsafe headers"
+ * The user agent interpreting a mailto URL SHOULD choose not to create
+ * a message if any of the headers are considered dangerous; it may also
+ * choose to create a message with only a subset of the headers given in
+ * the URL.
+ */
+ if (mutt_matches_ignore(tag, MailtoAllow))
{
- if (body)
- mutt_str_replace (body, value);
- }
- else
- {
- char *scratch;
- size_t taglen = mutt_strlen (tag);
-
- safe_asprintf (&scratch, "%s: %s", tag, value);
- scratch[taglen] = 0; /* overwrite the colon as mutt_parse_rfc822_line expects */
- value = skip_email_wsp(&scratch[taglen + 1]);
- mutt_parse_rfc822_line (e, NULL, scratch, value, 1, 0, 0, &last);
- FREE (&scratch);
+ if (!ascii_strcasecmp (tag, "body"))
+ {
+ if (body)
+ mutt_str_replace (body, value);
+ }
+ else
+ {
+ char *scratch;
+ size_t taglen = mutt_strlen (tag);
+
+ safe_asprintf (&scratch, "%s: %s", tag, value);
+ scratch[taglen] = 0; /* overwrite the colon as mutt_parse_rfc822_line expects */
+ value = skip_email_wsp(&scratch[taglen + 1]);
+ mutt_parse_rfc822_line (e, NULL, scratch, value, 1, 0, 0, &last);
+ FREE (&scratch);
+ }
}
}