Bug#880368: YAML::XS::Load expects utf8 octets, not perl's encoding; use slurp_raw
Andrej Shadura
andrew.shadura at collabora.co.uk
Fri Dec 13 13:23:46 GMT 2019
On Sun, 05 Nov 2017 18:32:48 +0100 Dominique Dumont <dod at debian.org> wrote:
> On Monday, 30 October 2017 15:27:32 CET you wrote:
> > YAML::XS::Load (and *hopefully* the other implementations of
> > YAML::Any::Load?) expect utf8 octets on input, not perl's internal
> > encoding.
>
> Uh? I thought I had gotten rid of YAML::Any... Well, after checking, it turns
> out that I've updated Config::Model::Backend::Yaml, but I forgot to update
> Dpkg::Scanner.
>
> Anyway, using YAML::Any has several problems:
> - it's deprecated
> - it may load YAML or YAML::XS which have some security issues [1]
>
> > Thus, slurp_raw should be used instead of slurp_utf8. [Though really,
> > YAML::XS::Load should probably do the right thing if is_utf8 is on,
> > anyway.]
>
> Unfortunately, the strings returned by YAML::XS are not tagged as utf-8, which
> leads to writing mojibake when cme is used to update debian/copyright.
>
> Given the security issues of YAML and YAML::XS, I'm not going to tweak the
> structure returned by YAML::XS to fix the utf8 flag of each scalar contained
> in the structure (and maybe all hash keys...)
>
> Instead, I'm going to replace YAML::Any with YAML::Tiny (which is more than
> enough in this case).
Unfortunately, YAML::Tiny disallows some valid YAML markup, in
particular what pyyaml generates by default and which is very difficult
to change without in-depth hacking of it:
".*":
  "license": |-
    GPL-2
"debian/":
  "copyright": "A B <a at a>\n B C <b at b>\n C\
    \ D <c at c>\n D E <d at d>\n E F\
    \ <e at e>\n F G <f at f>\n G H <g at g>"
  "license": |-
    GPL-2+
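
To illustrate the construct involved: the "copyright" value above is a double-quoted scalar continued across lines with trailing backslashes. Below is only a rough Python sketch of how I understand the YAML rules for that one case. The name unfold_double_quoted is made up for illustration, it handles just the escapes seen in this example, and it is not how pyyaml or any of the Perl YAML modules is implemented.

```python
import re

# Assumption: only the escapes present in the example are supported.
ESCAPES = {"n": "\n", " ": " ", '"': '"', "\\": "\\"}


def unfold_double_quoted(lines):
    """Join the physical lines of ONE double-quoted YAML scalar.

    Simplified: ignores empty lines and a line ending in an escaped
    backslash; this is a sketch, not a YAML parser.
    """
    parts = []
    last = len(lines) - 1
    for i, line in enumerate(lines):
        # Indentation of continuation lines is not part of the content.
        text = line.lstrip(" \t") if i > 0 else line
        if i < last:
            if text.endswith("\\"):
                # A trailing backslash escapes the line break: drop both.
                text = text[:-1]
            else:
                # An unescaped line break folds into a single space.
                text = text.rstrip(" \t") + " "
        parts.append(text)
    body = "".join(parts)
    if len(body) < 2 or body[0] != '"' or body[-1] != '"':
        raise ValueError("expected a double-quoted scalar")

    def replace(match):
        char = match.group(1)
        if char not in ESCAPES:
            raise ValueError("unsupported escape: \\" + char)
        return ESCAPES[char]

    # Resolve "\n", "\ " and friends inside the quotes.
    return re.sub(r"\\(.)", replace, body[1:-1])


copyright_lines = [
    '"A B <a at a>\\n B C <b at b>\\n C\\',
    '    \\ D <c at c>\\n D E <d at d>\\n E F\\',
    '    \\ <e at e>\\n F G <f at f>\\n G H <g at g>"',
]
value = unfold_double_quoted(copyright_lines)
print(value)
```

So the three physical lines are meant to yield a single string with seven logical lines, one per copyright holder.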
As a temporary workaround, I patched the version we use locally to use
YAML::XS, but as far as I can see you won’t accept this patch upstream.
Is there a solution that would satisfy both requirements: having no
security issues and supporting proper YAML? By the way, what are those
security issues, and how serious and relevant to scan-copyrights are they?
> Thanks for the report. This helps me improve dpkg model for cme (and led to
> the release of Config::Model::Tester 3.003 which did not handle utf-8
> correctly while checking file content).
--
Cheers,
Andrej