[DRE-maint] Bug#963808: ruby-sanitize: CVE-2020-4054: HTML sanitization bypass in Sanitize

Salvatore Bonaccorso carnil at debian.org
Sun Jul 12 14:11:30 BST 2020


On Sat, Jun 27, 2020 at 09:10:01PM +0200, Salvatore Bonaccorso wrote:
> Source: ruby-sanitize
> Version: 4.6.6-2
> Severity: grave
> Tags: security upstream
> Justification: user security hole
> 
> Hi,
> 
> The following vulnerability was published for ruby-sanitize.
> 
> CVE-2020-4054[0]:
> | In Sanitize (RubyGem sanitize) greater than or equal to 3.0.0 and less
> | than 5.2.1, there is a cross-site scripting vulnerability. When HTML
> | is sanitized using Sanitize's "relaxed" config, or a custom config
> | that allows certain elements, some content in a math or svg element
> | may not be sanitized correctly even if math and svg are not in the
> | allowlist. You are likely to be vulnerable to this issue if you use
> | Sanitize's relaxed config or a custom config that allows one or more
> | of the following HTML elements: iframe, math, noembed, noframes,
> | noscript, plaintext, script, style, svg, xmp. Using carefully crafted
> | input, an attacker may be able to sneak arbitrary HTML through
> | Sanitize, potentially resulting in XSS (cross-site scripting) or other
> | undesired behavior when that HTML is rendered in a browser. This has
> | been fixed in 5.2.1.
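
As a quick aid for checking an upstream version against the range quoted
above, here is a minimal Ruby sketch. It only encodes the range stated in
the CVE text (>= 3.0.0 and < 5.2.1). The constant and helper names are
made up for illustration and are not part of Sanitize.

```ruby
require "rubygems"

# Affected range as stated in the CVE text: >= 3.0.0 and < 5.2.1.
# AFFECTED_RANGE is a hypothetical name used only in this sketch.
AFFECTED_RANGE = Gem::Requirement.new(">= 3.0.0", "< 5.2.1")

# Returns true when the given upstream gem version falls inside the range.
# This looks at the upstream version only: a distribution package that
# carries a backported fix keeps its old upstream version number.
def affected_upstream_version?(version)
  AFFECTED_RANGE.satisfied_by?(Gem::Version.new(version))
end

puts affected_upstream_version?("4.6.6") # upstream version of the Debian package
puts affected_upstream_version?("5.2.1") # upstream release containing the fix
```

A check like this cannot tell whether a patched 4.6.6 package is still
vulnerable; that depends on the patches applied, not the version number.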

Attached is a preliminary debdiff with the fix, along with two
prerequisite commits that are applied before it: "fix: Don't treat
:remove_contents as `true` when it's an Array" and "feat: Remove
useless filtered element content by default".
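
To illustrate why the first prerequisite matters, here is a simplified,
self-contained Ruby sketch of the option handling touched by that commit.
It is not the Sanitize code itself: the helper name and its `check_class`
parameter are hypothetical, and only the `is_a?` test and the `!!` fallback
mirror what the patch changes in clean_element.rb.

```ruby
require "set"

# Simplified stand-in for how :remove_contents is interpreted.
# check_class is a hypothetical parameter so the old test (Set) and the
# patched test (Enumerable) can be compared side by side.
def parse_remove_contents(value, check_class)
  names = Set.new
  remove_all = false

  if value.is_a?(check_class)
    # A collection of element names: only their contents are removed.
    names.merge(value.map(&:to_s))
  else
    # Anything else is coerced to a boolean "remove all contents" flag.
    remove_all = !!value
  end

  { remove_all: remove_all, names: names.to_a }
end

p parse_remove_contents(%w[script span], Set)        # old check
p parse_remove_contents(%w[script span], Enumerable) # patched check
```

With the old `Set` check an Array falls through to the boolean branch, and
since a non-nil Array is truthy the result is "remove the contents of every
filtered element". With the `Enumerable` check the same Array is read as a
list of element names. The CVE fix relies on this, since it extends the
default :remove_contents Array.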

Antonio, would it be possible for you to give it a second pair of
eyes, bearing in mind that I'm not familiar with the package and am
only trying to address CVE-2020-4054?

If the patches look correct, the plan would be to prepare
4.6.6-2.1~deb10u1 based on them for buster-security.

Regards,
Salvatore
-------------- next part --------------
diff -Nru ruby-sanitize-4.6.6/debian/changelog ruby-sanitize-4.6.6/debian/changelog
--- ruby-sanitize-4.6.6/debian/changelog	2019-02-07 21:15:34.000000000 +0100
+++ ruby-sanitize-4.6.6/debian/changelog	2020-07-12 15:02:54.000000000 +0200
@@ -1,3 +1,13 @@
+ruby-sanitize (4.6.6-2.1) unstable; urgency=medium
+
+  * Non-maintainer upload.
+  * fix: Don't treat :remove_contents as `true` when it's an Array
+  * feat: Remove useless filtered element content by default
+  * Fix sanitization bypass in HTML foreign content (CVE-2020-4054)
+    (Closes: #963808)
+
+ -- Salvatore Bonaccorso <carnil at debian.org>  Sun, 12 Jul 2020 15:02:54 +0200
+
 ruby-sanitize (4.6.6-2) unstable; urgency=medium
 
   * Team upload.
diff -Nru ruby-sanitize-4.6.6/debian/patches/Fix-sanitization-bypass-in-HTML-foreign-content.patch ruby-sanitize-4.6.6/debian/patches/Fix-sanitization-bypass-in-HTML-foreign-content.patch
--- ruby-sanitize-4.6.6/debian/patches/Fix-sanitization-bypass-in-HTML-foreign-content.patch	1970-01-01 01:00:00.000000000 +0100
+++ ruby-sanitize-4.6.6/debian/patches/Fix-sanitization-bypass-in-HTML-foreign-content.patch	2020-07-12 15:02:54.000000000 +0200
@@ -0,0 +1,134 @@
+From: Ryan Grove <ryan at wonko.com>
+Date: Mon, 15 Jun 2020 14:27:07 -0700
+Subject: Fix sanitization bypass in HTML foreign content
+Origin: https://github.com/rgrove/sanitize/commit/a11498de9e283cd457b35ee252983662f7452aa9
+Bug: https://github.com/rgrove/sanitize/security/advisories/GHSA-p4x4-rw2p-8j8m
+Bug-Debian: https://bugs.debian.org/963808
+Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2020-4054
+
+https://github.com/rgrove/sanitize/security/advisories/GHSA-p4x4-rw2p-8j8m
+
+[Salvatore Bonaccorso: Backport to 4.6.6 for context changes]
+---
+ README.md                      | 11 +++++++++++
+ lib/sanitize/config/default.rb |  2 +-
+ test/test_clean_element.rb     | 30 ++++++++++++++++++++----------
+ test/test_malicious_html.rb    | 13 +++++++++++++
+ 4 files changed, 45 insertions(+), 11 deletions(-)
+
+--- a/README.md
++++ b/README.md
+@@ -73,6 +73,11 @@ Sanitize can sanitize the following type
+ * Standalone CSS stylesheets
+ * Standalone CSS properties
+ 
++However, please note that Sanitize _cannot_ fully sanitize the contents of
++`<math>` or `<svg>` elements, since these elements don't follow the same parsing
++rules as the rest of HTML. If this is something you need, you may want to look
++for another solution.
++
+ ### HTML Fragments
+ 
+ A fragment is a snippet of HTML that doesn't contain a root-level `<html>`
+@@ -417,6 +422,12 @@ elements not in this array will be remov
+ ]
+ ```
+ 
++**Warning:** Sanitize cannot fully sanitize the contents of `<math>` or `<svg>`
++elements, since these elements don't follow the same parsing rules as the rest
++of HTML. If you add `math` or `svg` to the allowlist, you must assume that any
++content inside them will be allowed, even if that content would otherwise be
++removed by Sanitize.
++
+ #### :protocols (Hash)
+ 
+ URL protocols to allow in specific attributes. If an attribute is listed here
+--- a/lib/sanitize/config/default.rb
++++ b/lib/sanitize/config/default.rb
+@@ -70,7 +70,7 @@ class Sanitize
+       # the specified elements (when filtered) will be removed, and the contents
+       # of all other filtered elements will be left behind.
+       :remove_contents => %w[
+-        iframe noembed noframes noscript script style
++        iframe math noembed noframes noscript plaintext script style svg xmp
+       ],
+ 
+       # Transformers allow you to filter or alter nodes using custom logic. See
+--- a/test/test_clean_element.rb
++++ b/test/test_clean_element.rb
+@@ -192,21 +192,16 @@ describe 'Sanitize::Transformers::CleanE
+         .must_equal ''
+     end
+ 
+-    it 'should escape the content of removed `plaintext` elements' do
+-      Sanitize.fragment('<plaintext>hello! <script>alert(0)</script>')
+-        .must_equal 'hello! <script>alert(0)</script>'
+-    end
+-
+-    it 'should escape the content of removed `xmp` elements' do
+-      Sanitize.fragment('<xmp>hello! <script>alert(0)</script></xmp>')
+-        .must_equal 'hello! <script>alert(0)</script>'
+-    end
+-
+     it 'should not preserve the content of removed `iframe` elements' do
+       Sanitize.fragment('<iframe>hello! <script>alert(0)</script></iframe>')
+         .must_equal ''
+     end
+ 
++    it 'should not preserve the content of removed `math` elements' do
++      Sanitize.fragment('<math>hello! <script>alert(0)</script></math>')
++        .must_equal ''
++    end
++
+     it 'should not preserve the content of removed `noembed` elements' do
+       Sanitize.fragment('<noembed>hello! <script>alert(0)</script></noembed>')
+         .must_equal ''
+@@ -222,6 +217,11 @@ describe 'Sanitize::Transformers::CleanE
+         .must_equal ''
+     end
+ 
++    it 'should not preserve the content of removed `plaintext` elements' do
++      Sanitize.fragment('<plaintext>hello! <script>alert(0)</script>')
++        .must_equal ''
++    end
++
+     it 'should not preserve the content of removed `script` elements' do
+       Sanitize.fragment('<script>hello! <script>alert(0)</script></script>')
+         .must_equal ''
+@@ -232,6 +232,16 @@ describe 'Sanitize::Transformers::CleanE
+         .must_equal ''
+     end
+ 
++    it 'should not preserve the content of removed `svg` elements' do
++      Sanitize.fragment('<svg>hello! <script>alert(0)</script></svg>')
++        .must_equal ''
++    end
++
++    it 'should not preserve the content of removed `xmp` elements' do
++      Sanitize.fragment('<xmp>hello! <script>alert(0)</script></xmp>')
++        .must_equal ''
++    end
++
+     strings.each do |name, data|
+       it "should clean #{name} HTML" do
+         Sanitize.fragment(data[:html]).must_equal(data[:default])
+--- a/test/test_malicious_html.rb
++++ b/test/test_malicious_html.rb
+@@ -189,4 +189,17 @@ describe 'Malicious HTML' do
+       end
+     end
+   end
++
++  # https://github.com/rgrove/sanitize/security/advisories/GHSA-p4x4-rw2p-8j8m
++  describe 'foreign content bypass in relaxed config' do
++    it 'prevents a sanitization bypass via carefully crafted foreign content' do
++      %w[iframe noembed noframes noscript plaintext script style xmp].each do |tag_name|
++        @s.fragment(%[<math><#{tag_name}>/*</#{tag_name}><img src onerror=alert(1)>*/]).
++          must_equal ''
++
++        @s.fragment(%[<svg><#{tag_name}>/*</#{tag_name}><img src onerror=alert(1)>*/]).
++          must_equal ''
++      end
++    end
++  end
+ end
diff -Nru ruby-sanitize-4.6.6/debian/patches/feat-Remove-useless-filtered-element-content-by-defa.patch ruby-sanitize-4.6.6/debian/patches/feat-Remove-useless-filtered-element-content-by-defa.patch
--- ruby-sanitize-4.6.6/debian/patches/feat-Remove-useless-filtered-element-content-by-defa.patch	1970-01-01 01:00:00.000000000 +0100
+++ ruby-sanitize-4.6.6/debian/patches/feat-Remove-useless-filtered-element-content-by-defa.patch	2020-07-12 15:02:54.000000000 +0200
@@ -0,0 +1,238 @@
+From: Ryan Grove <ryan at wonko.com>
+Date: Sat, 13 Oct 2018 17:47:21 -0700
+Subject: feat: Remove useless filtered element content by default
+Origin: https://github.com/rgrove/sanitize/commit/faf9a0f432fda3cef29f0f8aad99d4dedf079d67
+
+[Salvatore Bonaccorso: Backport to 4.6.6 for context changes]
+---
+ lib/sanitize/config/default.rb             |  4 +-
+ lib/sanitize/transformers/clean_element.rb |  6 +-
+ test/test_clean_comment.rb                 |  6 +-
+ test/test_clean_element.rb                 | 64 +++++++++++++++++-----
+ test/test_malicious_html.rb                |  6 +-
+ test/test_parser.rb                        |  4 +-
+ test/test_sanitize.rb                      |  6 +-
+ 7 files changed, 66 insertions(+), 30 deletions(-)
+
+--- a/lib/sanitize/config/default.rb
++++ b/lib/sanitize/config/default.rb
+@@ -69,7 +69,9 @@ class Sanitize
+       # If this is an Array or Set of element names, then only the contents of
+       # the specified elements (when filtered) will be removed, and the contents
+       # of all other filtered elements will be left behind.
+-      :remove_contents => false,
++      :remove_contents => %w[
++        iframe noembed noframes noscript script style
++      ],
+ 
+       # Transformers allow you to filter or alter nodes using custom logic. See
+       # README.md for details and examples.
+--- a/lib/sanitize/transformers/clean_element.rb
++++ b/lib/sanitize/transformers/clean_element.rb
+@@ -97,8 +97,10 @@ class Sanitize; module Transformers; cla
+         end
+       end
+ 
+-      unless @remove_all_contents || @remove_element_contents.include?(name)
+-        node.add_previous_sibling(node.children)
++      unless node.children.empty?
++        unless @remove_all_contents || @remove_element_contents.include?(name)
++          node.add_previous_sibling(node.children)
++        end
+       end
+ 
+       node.unlink
+--- a/test/test_clean_comment.rb
++++ b/test/test_clean_comment.rb
+@@ -20,7 +20,7 @@ describe 'Sanitize::Transformers::CleanC
+ 
+       # Special case: the comment markup is inside a <script>, which makes it
+       # text content and not an actual HTML comment.
+-      @s.fragment("<script><!-- comment --></script>").must_equal '<!-- comment -->'
++      @s.fragment("<script><!-- comment --></script>").must_equal ''
+ 
+       Sanitize.fragment("<script><!-- comment --></script>", :allow_comments => false, :elements => ['script'])
+         .must_equal '<script><!-- comment --></script>'
+@@ -40,10 +40,6 @@ describe 'Sanitize::Transformers::CleanC
+       @s.fragment("foo <!-- <!-- <!-- --> --> -->bar").must_equal 'foo <!-- <!-- <!-- --> --> -->bar'
+       @s.fragment("foo <div <!-- comment -->>bar</div>").must_equal 'foo <div>>bar</div>'
+ 
+-      # Special case: the comment markup is inside a <script>, which makes it
+-      # text content and not an actual HTML comment.
+-      @s.fragment("<script><!-- comment --></script>").must_equal '<!-- comment -->'
+-
+       Sanitize.fragment("<script><!-- comment --></script>", :allow_comments => true, :elements => ['script'])
+         .must_equal '<script><!-- comment --></script>'
+     end
+--- a/test/test_clean_element.rb
++++ b/test/test_clean_element.rb
+@@ -8,25 +8,22 @@ describe 'Sanitize::Transformers::CleanE
+   strings = {
+     :basic => {
+       :html       => '<b>Lo<!-- comment -->rem</b> <a href="pants" title="foo" style="text-decoration: underline;">ipsum</a> <a href="http://foo.com/"><strong>dolor</strong></a> sit<br/>amet <style>.foo { color: #fff; }</style> <script>alert("hello world");</script>',
+-
+-      :default    => 'Lorem ipsum dolor sit amet .foo { color: #fff; } alert("hello world");',
+-      :restricted => '<b>Lorem</b> ipsum <strong>dolor</strong> sit amet .foo { color: #fff; } alert("hello world");',
+-      :basic      => '<b>Lorem</b> <a href="pants" rel="nofollow">ipsum</a> <a href="http://foo.com/" rel="nofollow"><strong>dolor</strong></a> sit<br>amet .foo { color: #fff; } alert("hello world");',
+-      :relaxed    => '<b>Lorem</b> <a href="pants" title="foo" style="text-decoration: underline;">ipsum</a> <a href="http://foo.com/"><strong>dolor</strong></a> sit<br>amet <style>.foo { color: #fff; }</style> alert("hello world");'
++      :default    => 'Lorem ipsum dolor sit amet  ',
++      :restricted => '<b>Lorem</b> ipsum <strong>dolor</strong> sit amet  ',
++      :basic      => '<b>Lorem</b> <a href="pants" rel="nofollow">ipsum</a> <a href="http://foo.com/" rel="nofollow"><strong>dolor</strong></a> sit<br>amet  ',
++      :relaxed    => '<b>Lorem</b> <a href="pants" title="foo" style="text-decoration: underline;">ipsum</a> <a href="http://foo.com/"><strong>dolor</strong></a> sit<br>amet <style>.foo { color: #fff; }</style> '
+     },
+ 
+     :malformed => {
+       :html       => 'Lo<!-- comment -->rem</b> <a href=pants title="foo>ipsum <a href="http://foo.com/"><strong>dolor</a></strong> sit<br/>amet <script>alert("hello world");',
+-
+-      :default    => 'Lorem dolor sit amet alert("hello world");',
+-      :restricted => 'Lorem <strong>dolor</strong> sit amet alert("hello world");',
+-      :basic      => 'Lorem <a href="pants" rel="nofollow"><strong>dolor</strong></a> sit<br>amet alert("hello world");',
+-      :relaxed    => 'Lorem <a href="pants" title="foo>ipsum <a href="><strong>dolor</strong></a> sit<br>amet alert("hello world");',
++      :default    => 'Lorem dolor sit amet ',
++      :restricted => 'Lorem <strong>dolor</strong> sit amet ',
++      :basic      => 'Lorem <a href="pants" rel="nofollow"><strong>dolor</strong></a> sit<br>amet ',
++      :relaxed    => 'Lorem <a href="pants" title="foo>ipsum <a href="><strong>dolor</strong></a> sit<br>amet ',
+     },
+ 
+     :unclosed => {
+       :html       => '<p>a</p><blockquote>b',
+-
+       :default    => ' a  b ',
+       :restricted => ' a  b ',
+       :basic      => '<p>a</p><blockquote>b</blockquote>',
+@@ -35,7 +32,6 @@ describe 'Sanitize::Transformers::CleanE
+ 
+     :malicious => {
+       :html       => '<b>Lo<!-- comment -->rem</b> <a href="javascript:pants" title="foo">ipsum</a> <a href="http://foo.com/"><strong>dolor</strong></a> sit<br/>amet <<foo>script>alert("hello world");</script>',
+-
+       :default    => 'Lorem ipsum dolor sit amet <script>alert("hello world");',
+       :restricted => '<b>Lorem</b> ipsum <strong>dolor</strong> sit amet <script>alert("hello world");',
+       :basic      => '<b>Lorem</b> <a rel="nofollow">ipsum</a> <a href="http://foo.com/" rel="nofollow"><strong>dolor</strong></a> sit<br>amet <script>alert("hello world");',
+@@ -171,10 +167,10 @@ describe 'Sanitize::Transformers::CleanE
+         .must_equal 'foo bar baz quux'
+ 
+       Sanitize.fragment('<script>alert("<xss>");</script>')
+-        .must_equal 'alert("<xss>");'
++        .must_equal ''
+ 
+       Sanitize.fragment('<<script>script>alert("<xss>");</<script>>')
+-        .must_equal '<script>alert("<xss>");</<script>>'
++        .must_equal '<'
+ 
+       Sanitize.fragment('< script <>> alert("<xss>");</script>')
+         .must_equal '< script <>> alert("");'
+@@ -196,6 +192,46 @@ describe 'Sanitize::Transformers::CleanE
+         .must_equal ''
+     end
+ 
++    it 'should escape the content of removed `plaintext` elements' do
++      Sanitize.fragment('<plaintext>hello! <script>alert(0)</script>')
++        .must_equal 'hello! <script>alert(0)</script>'
++    end
++
++    it 'should escape the content of removed `xmp` elements' do
++      Sanitize.fragment('<xmp>hello! <script>alert(0)</script></xmp>')
++        .must_equal 'hello! <script>alert(0)</script>'
++    end
++
++    it 'should not preserve the content of removed `iframe` elements' do
++      Sanitize.fragment('<iframe>hello! <script>alert(0)</script></iframe>')
++        .must_equal ''
++    end
++
++    it 'should not preserve the content of removed `noembed` elements' do
++      Sanitize.fragment('<noembed>hello! <script>alert(0)</script></noembed>')
++        .must_equal ''
++    end
++
++    it 'should not preserve the content of removed `noframes` elements' do
++      Sanitize.fragment('<noframes>hello! <script>alert(0)</script></noframes>')
++        .must_equal ''
++    end
++
++    it 'should not preserve the content of removed `noscript` elements' do
++      Sanitize.fragment('<noscript>hello! <script>alert(0)</script></noscript>')
++        .must_equal ''
++    end
++
++    it 'should not preserve the content of removed `script` elements' do
++      Sanitize.fragment('<script>hello! <script>alert(0)</script></script>')
++        .must_equal ''
++    end
++
++    it 'should not preserve the content of removed `style` elements' do
++      Sanitize.fragment('<style>hello! <script>alert(0)</script></style>')
++        .must_equal ''
++    end
++
+     strings.each do |name, data|
+       it "should clean #{name} HTML" do
+         Sanitize.fragment(data[:html]).must_equal(data[:default])
+--- a/test/test_malicious_html.rb
++++ b/test/test_malicious_html.rb
+@@ -65,7 +65,7 @@ describe 'Malicious HTML' do
+ 
+     it 'should not be possible to inject <script> via a malformed <img> tag' do
+       @s.fragment('<img """><script>alert("XSS")</script>">').
+-        must_equal '<img>alert("XSS")">'
++        must_equal '<img>">'
+     end
+ 
+     it 'should not be possible to inject protocol-based JS' do
+@@ -117,12 +117,12 @@ describe 'Malicious HTML' do
+   describe '<script>' do
+     it 'should not be possible to inject <script> using a malformed non-alphanumeric tag name' do
+       @s.fragment(%[<script/xss src="http://ha.ckers.org/xss.js">alert(1)</script>]).
+-        must_equal 'alert(1)'
++        must_equal ''
+     end
+ 
+     it 'should not be possible to inject <script> via extraneous open brackets' do
+       @s.fragment(%[<<script>alert("XSS");//<</script>]).
+-        must_equal '<alert("XSS");//<'
++        must_equal '<'
+     end
+   end
+ 
+--- a/test/test_parser.rb
++++ b/test/test_parser.rb
+@@ -19,8 +19,8 @@ describe 'Parser' do
+   end
+ 
+   it 'should not have the Nokogiri 1.4.2+ unterminated script/style element bug' do
+-    Sanitize.fragment('foo <script>bar').must_equal 'foo bar'
+-    Sanitize.fragment('foo <style>bar').must_equal 'foo bar'
++    Sanitize.fragment('foo <script>bar').must_equal 'foo '
++    Sanitize.fragment('foo <style>bar').must_equal 'foo '
+   end
+ 
+   it 'ambiguous non-tag brackets like "1 > 2 and 2 < 1" should be parsed correctly' do
+--- a/test/test_sanitize.rb
++++ b/test/test_sanitize.rb
+@@ -25,7 +25,7 @@ describe 'Sanitize' do
+ 
+       it 'should sanitize an HTML document' do
+         @s.document('<!doctype html><html><b>Lo<!-- comment -->rem</b> <a href="pants" title="foo">ipsum</a> <a href="http://foo.com/"><strong>dolor</strong></a> sit<br/>amet <script>alert("hello world");</script></html>')
+-          .must_equal "<html>Lorem ipsum dolor sit amet alert(\"hello world\");</html>\n"
++          .must_equal "<html>Lorem ipsum dolor sit amet </html>\n"
+       end
+ 
+       it 'should not modify the input string' do
+@@ -42,7 +42,7 @@ describe 'Sanitize' do
+     describe '#fragment' do
+       it 'should sanitize an HTML fragment' do
+         @s.fragment('<b>Lo<!-- comment -->rem</b> <a href="pants" title="foo">ipsum</a> <a href="http://foo.com/"><strong>dolor</strong></a> sit<br/>amet <script>alert("hello world");</script>')
+-          .must_equal 'Lorem ipsum dolor sit amet alert("hello world");'
++          .must_equal 'Lorem ipsum dolor sit amet '
+       end
+ 
+       it 'should not modify the input string' do
+@@ -71,7 +71,7 @@ describe 'Sanitize' do
+         doc.xpath('/html/body/node()').each {|node| frag << node }
+ 
+         @s.node!(frag)
+-        frag.to_html.must_equal 'Lorem ipsum dolor sit amet alert("hello world");'
++        frag.to_html.must_equal 'Lorem ipsum dolor sit amet '
+       end
+ 
+       describe "when the given node is a document and <html> isn't whitelisted" do
diff -Nru ruby-sanitize-4.6.6/debian/patches/fix-Don-t-treat-remove_contents-as-true-when-it-s-an.patch ruby-sanitize-4.6.6/debian/patches/fix-Don-t-treat-remove_contents-as-true-when-it-s-an.patch
--- ruby-sanitize-4.6.6/debian/patches/fix-Don-t-treat-remove_contents-as-true-when-it-s-an.patch	1970-01-01 01:00:00.000000000 +0100
+++ ruby-sanitize-4.6.6/debian/patches/fix-Don-t-treat-remove_contents-as-true-when-it-s-an.patch	2020-07-12 15:02:54.000000000 +0200
@@ -0,0 +1,102 @@
+From: Ryan Grove <ryan at wonko.com>
+Date: Sat, 13 Oct 2018 17:38:30 -0700
+Subject: fix: Don't treat :remove_contents as `true` when it's an Array
+Origin: https://github.com/rgrove/sanitize/commit/54dcf57ff1c16a861621ccf0089d2da9b6c3e5d7
+
+---
+ README.md                                  |  8 ++++----
+ lib/sanitize/config/default.rb             |  6 +++---
+ lib/sanitize/transformers/clean_element.rb |  2 +-
+ test/test_clean_element.rb                 | 20 ++++++++++++++------
+ 4 files changed, 22 insertions(+), 14 deletions(-)
+
+diff --git a/README.md b/README.md
+index 4532a668188d..5cc3c5a19010 100644
+--- a/README.md
++++ b/README.md
+@@ -441,13 +441,13 @@ include the symbol `:relative` in the protocol array:
+ 
+ #### :remove_contents (boolean or Array or Set)
+ 
+-If set to `true`, Sanitize will remove the contents of any non-whitelisted
++If this is `true`, Sanitize will remove the contents of any non-whitelisted
+ elements in addition to the elements themselves. By default, Sanitize leaves the
+ safe parts of an element's contents behind when the element is removed.
+ 
+-If set to an array of element names, then only the contents of the specified
+-elements (when filtered) will be removed, and the contents of all other filtered
+-elements will be left behind.
++If this is an Array or Set of element names, then only the contents of the
++specified elements (when filtered) will be removed, and the contents of all
++other filtered elements will be left behind.
+ 
+ The default value is `false`.
+ 
+diff --git a/lib/sanitize/config/default.rb b/lib/sanitize/config/default.rb
+index 09e1348afae4..72c6557081c6 100644
+--- a/lib/sanitize/config/default.rb
++++ b/lib/sanitize/config/default.rb
+@@ -66,9 +66,9 @@ class Sanitize
+       # leaves the safe parts of an element's contents behind when the element
+       # is removed.
+       #
+-      # If this is an Array of element names, then only the contents of the
+-      # specified elements (when filtered) will be removed, and the contents of
+-      # all other filtered elements will be left behind.
++      # If this is an Array or Set of element names, then only the contents of
++      # the specified elements (when filtered) will be removed, and the contents
++      # of all other filtered elements will be left behind.
+       :remove_contents => false,
+ 
+       # Transformers allow you to filter or alter nodes using custom logic. See
+diff --git a/lib/sanitize/transformers/clean_element.rb b/lib/sanitize/transformers/clean_element.rb
+index fbbe9c8e8785..124384a9605c 100644
+--- a/lib/sanitize/transformers/clean_element.rb
++++ b/lib/sanitize/transformers/clean_element.rb
+@@ -67,7 +67,7 @@ class Sanitize; module Transformers; class CleanElement
+       @whitespace_elements = config[:whitespace_elements]
+     end
+ 
+-    if config[:remove_contents].is_a?(Set)
++    if config[:remove_contents].is_a?(Enumerable)
+       @remove_element_contents.merge(config[:remove_contents].map(&:to_s))
+     else
+       @remove_all_contents = !!config[:remove_contents]
+diff --git a/test/test_clean_element.rb b/test/test_clean_element.rb
+index 8113cb67bcab..cf99764e604f 100644
+--- a/test/test_clean_element.rb
++++ b/test/test_clean_element.rb
+@@ -344,16 +344,24 @@ describe 'Sanitize::Transformers::CleanElement' do
+       ).must_equal 'foo bar   '
+     end
+ 
+-    it 'should remove the contents of specified nodes when :remove_contents is an Array of element names as strings' do
+-      Sanitize.fragment('foo bar <div>baz<span>quux</span><script>alert("hello!");</script></div>',
++    it 'should remove the contents of specified nodes when :remove_contents is an Array or Set of element names as strings' do
++      Sanitize.fragment('foo bar <div>baz<span>quux</span> <b>hi</b><script>alert("hello!");</script></div>',
+         :remove_contents => ['script', 'span']
+-      ).must_equal 'foo bar  baz '
++      ).must_equal 'foo bar  baz hi '
++
++      Sanitize.fragment('foo bar <div>baz<span>quux</span> <b>hi</b><script>alert("hello!");</script></div>',
++        :remove_contents => Set.new(['script', 'span'])
++      ).must_equal 'foo bar  baz hi '
+     end
+ 
+-    it 'should remove the contents of specified nodes when :remove_contents is an Array of element names as symbols' do
+-      Sanitize.fragment('foo bar <div>baz<span>quux</span><script>alert("hello!");</script></div>',
++    it 'should remove the contents of specified nodes when :remove_contents is an Array or Set of element names as symbols' do
++      Sanitize.fragment('foo bar <div>baz<span>quux</span> <b>hi</b><script>alert("hello!");</script></div>',
+         :remove_contents => [:script, :span]
+-      ).must_equal 'foo bar  baz '
++      ).must_equal 'foo bar  baz hi '
++
++      Sanitize.fragment('foo bar <div>baz<span>quux</span> <b>hi</b><script>alert("hello!");</script></div>',
++        :remove_contents => Set.new([:script, :span])
++      ).must_equal 'foo bar  baz hi '
+     end
+ 
+     it 'should not allow arbitrary HTML5 data attributes by default' do
+-- 
+2.27.0
+
diff -Nru ruby-sanitize-4.6.6/debian/patches/series ruby-sanitize-4.6.6/debian/patches/series
--- ruby-sanitize-4.6.6/debian/patches/series	2019-02-07 21:15:34.000000000 +0100
+++ ruby-sanitize-4.6.6/debian/patches/series	2020-07-12 15:02:54.000000000 +0200
@@ -1,2 +1,5 @@
 no-relative-path.patch
 Fix-test-against-Nokogiri-1.10.patch
+fix-Don-t-treat-remove_contents-as-true-when-it-s-an.patch
+feat-Remove-useless-filtered-element-content-by-defa.patch
+Fix-sanitization-bypass-in-HTML-foreign-content.patch

