Bug#870627: shared-mime-info: text/html magics yield many false positives

Vincent Lefevre vincent at vinc17.net
Thu Aug 3 15:15:57 UTC 2017


Package: shared-mime-info
Version: 1.8-1
Severity: important

text/html magics in /usr/share/mime/packages/freedesktop.org.xml
yield many false positives, which breaks "xdg-mime query filetype"
(according to strace, this program calls /usr/bin/mimetype, which
uses /usr/share/mime/magic).

The text/html entry in that file contains:

    <magic priority="50">
      <match value="<!DOCTYPE HTML" type="string" offset="0:256"/>
      <match value="<!doctype html" type="string" offset="0:256"/>
      <match value="<HEAD" type="string" offset="0:256"/>
      <match value="<head" type="string" offset="0:256"/>
      <match value="<TITLE" type="string" offset="0:256"/>
      <match value="<title" type="string" offset="0:256"/>
      <match value="<HTML" type="string" offset="0:256"/>
      <match value="<html" type="string" offset="0:256"/>
      <match value="<SCRIPT" type="string" offset="0:256"/>
      <match value="<script" type="string" offset="0:256"/>
      <match value="<BODY" type="string" offset="0"/>
      <match value="<body" type="string" offset="0"/>
      <match value="<!--" type="string" offset="0"/>
      <match value="<h1" type="string" offset="0"/>
      <match value="<H1" type="string" offset="0"/>
      <match value="<!doctype HTML" type="string" offset="0"/>
      <match value="<!DOCTYPE html" type="string" offset="0"/>
    </magic>

but the fact that a text file contains one of these strings
doesn't mean that it is an HTML file!

I've attached a file (which is just a diff file) as an example.
I get:

zira:~> xdg-mime query filetype file
text/html

zira:~> /usr/bin/mimetype file
file:  text/html

though this file doesn't even contain a line of HTML.

This file starts with:

----------------------------------------------------------------------
diff -r 940e528ef852 doc/manual.xml.head
--- a/doc/manual.xml.head	Tue Dec 18 20:46:33 2012 -0800
+++ b/doc/manual.xml.head	Wed Dec 19 12:22:14 2012 +0100
@@ -4621,6 +4621,37 @@
 
 </sect1>
 
+<sect1 id="mailto-allow">
+<title>Control allowed header fields in a mailto: URL</title>
+
+<para>Usage:</para>
+
+<cmdsynopsis>
----------------------------------------------------------------------

The cause is the string "<title". Without it, I get text/x-patch as
expected.

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.11.0-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=POSIX, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=POSIX (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages shared-mime-info depends on:
ii  libc6         2.24-14
ii  libglib2.0-0  2.52.3-1
ii  libxml2       2.9.4+dfsg1-3

shared-mime-info recommends no packages.

shared-mime-info suggests no packages.

-- no debconf information
-------------- next part --------------
diff -r 940e528ef852 doc/manual.xml.head
--- a/doc/manual.xml.head	Tue Dec 18 20:46:33 2012 -0800
+++ b/doc/manual.xml.head	Wed Dec 19 12:22:14 2012 +0100
@@ -4621,6 +4621,37 @@
 
 </sect1>
 
+<sect1 id="mailto-allow">
+<title>Control allowed header fields in a mailto: URL</title>
+
+<para>Usage:</para>
+
+<cmdsynopsis>
+<command>mailto_allow</command>
+<group choice="req">
+<arg choice="plain">
+<replaceable class="parameter">*</replaceable>
+</arg>
+<arg choice="plain" rep="repeat">
+<replaceable class="parameter">header-field</replaceable>
+</arg>
+</group>
+</cmdsynopsis>
+
+<para>
+As a security measure, Mutt will only add user-approved header fields from a
+<literal>mailto:</literal> URL.  This is necessary since Mutt will handle
+certain header fields, such as <literal>Attach:</literal>, in a special way.
+The <literal>mailto_allow</literal> and <literal>unmailto_allow</literal>
+commands allow the user to modify the list of approved headers.
+</para>
+<para>
+Mutt initializes the default list to contain only the <literal>Subject</literal>
+and <literal>body</literal> header fields, which are the only requirement specified
+by the <literal>mailto:</literal> specification in RFC2368.
+</para>
+</sect1>
+
 </chapter>
 
 <chapter id="advancedusage">
diff -r 940e528ef852 doc/muttrc.man.head
--- a/doc/muttrc.man.head	Tue Dec 18 20:46:33 2012 -0800
+++ b/doc/muttrc.man.head	Wed Dec 19 12:22:14 2012 +0100
@@ -399,6 +399,16 @@
 This command will remove all hooks of a given type, or all hooks
 when \(lq\fB*\fP\(rq is used as an argument.  \fIhook-type\fP
 can be any of the \fB-hook\fP commands documented above.
+.PP
+.nf
+\fBmailto_allow\fP \fIheader-field\fP [ ... ]
+\fBunmailto_allow\fP [ \fB*\fP | \fIheader-field\fP ... ]
+.fi
+.IP
+These commands allow the user to modify the list of allowed header
+fields in a \fImailto:\fP URL that Mutt will include in the
+the generated message.  By default the list contains only
+\fBsubject\fP and \fBbody\fP, as specified by RFC2368.
 .SH PATTERNS
 .PP
 In various places with mutt, including some of the above mentioned
diff -r 940e528ef852 globals.h
--- a/globals.h	Tue Dec 18 20:46:33 2012 -0800
+++ b/globals.h	Wed Dec 19 12:22:14 2012 +0100
@@ -159,6 +159,7 @@
 WHERE LIST *InlineExclude INITVAL(0);
 WHERE LIST *HeaderOrderList INITVAL(0);
 WHERE LIST *Ignore INITVAL(0);
+WHERE LIST *MailtoAllow INITVAL(0);
 WHERE LIST *MimeLookupList INITVAL(0);
 WHERE LIST *UnIgnore INITVAL(0);
 
diff -r 940e528ef852 init.c
--- a/init.c	Tue Dec 18 20:46:33 2012 -0800
+++ b/init.c	Wed Dec 19 12:22:14 2012 +0100
@@ -3063,6 +3063,15 @@
 
   mutt_init_history ();
 
+  /* RFC2368, "4. Unsafe headers"
+   * The creator of a mailto URL cannot expect the resolver of a URL to
+   * understand more than the "subject" and "body" headers. Clients that
+   * resolve mailto URLs into mail messages should be able to correctly
+   * create RFC 822-compliant mail messages using the "subject" and "body"
+   * headers.
+   */
+  add_to_list(&MailtoAllow, "body");
+  add_to_list(&MailtoAllow, "subject");
   
   
   
diff -r 940e528ef852 init.h
--- a/init.h	Tue Dec 18 20:46:33 2012 -0800
+++ b/init.h	Wed Dec 19 12:22:14 2012 +0100
@@ -3544,6 +3544,8 @@
   { "macro",		mutt_parse_macro,	0 },
   { "mailboxes",	mutt_parse_mailboxes,	M_MAILBOXES },
   { "unmailboxes",	mutt_parse_mailboxes,	M_UNMAILBOXES },
+  { "mailto_allow",	parse_list,		UL &MailtoAllow },
+  { "unmailto_allow",	parse_unlist,		UL &MailtoAllow },
   { "message-hook",	mutt_parse_hook,	M_MESSAGEHOOK },
   { "mbox-hook",	mutt_parse_hook,	M_MBOXHOOK },
   { "mime_lookup",	parse_list,	UL &MimeLookupList },
diff -r 940e528ef852 url.c
--- a/url.c	Tue Dec 18 20:46:33 2012 -0800
+++ b/url.c	Wed Dec 19 12:22:14 2012 +0100
@@ -283,21 +283,35 @@
     if (url_pct_decode (value) < 0)
       goto out;
 
-    if (!ascii_strcasecmp (tag, "body"))
+    /* Determine if this header field is on the allowed list.  Since Mutt
+     * interprets some header fields specially (such as
+     * "Attach: ~/.gnupg/secring.gpg"), care must be taken to ensure that
+     * only safe fields are allowed.
+     *
+     * RFC2368, "4. Unsafe headers"
+     * The user agent interpreting a mailto URL SHOULD choose not to create
+     * a message if any of the headers are considered dangerous; it may also
+     * choose to create a message with only a subset of the headers given in
+     * the URL.
+     */
+    if (mutt_matches_ignore(tag, MailtoAllow))
     {
-      if (body)
-	mutt_str_replace (body, value);
-    }
-    else
-    {
-      char *scratch;
-      size_t taglen = mutt_strlen (tag);
-     
-      safe_asprintf (&scratch, "%s: %s", tag, value);
-      scratch[taglen] = 0; /* overwrite the colon as mutt_parse_rfc822_line expects */
-      value = skip_email_wsp(&scratch[taglen + 1]);
-      mutt_parse_rfc822_line (e, NULL, scratch, value, 1, 0, 0, &last);
-      FREE (&scratch);
+      if (!ascii_strcasecmp (tag, "body"))
+      {
+	if (body)
+	  mutt_str_replace (body, value);
+      }
+      else
+      {
+	char *scratch;
+	size_t taglen = mutt_strlen (tag);
+
+	safe_asprintf (&scratch, "%s: %s", tag, value);
+	scratch[taglen] = 0; /* overwrite the colon as mutt_parse_rfc822_line expects */
+	value = skip_email_wsp(&scratch[taglen + 1]);
+	mutt_parse_rfc822_line (e, NULL, scratch, value, 1, 0, 0, &last);
+	FREE (&scratch);
+      }
     }
   }
 

