[xml/sgml-pkgs] Bug#676717: dh_installcatalogs transition and w3c-dtd-xhtml removal bugs
Jakub Wilk
jwilk at debian.org
Tue Jun 26 12:59:52 UTC 2012
* Jakub Wilk <jwilk at debian.org>, 2012-06-26, 08:31:
>We should implement here a real TR9401 parser. This shouldn't be very
>difficult. I'll try to write such a parser today.
As promised, attached.
I'll leave integrating this with update-catalog as an exercise to the
reader^WHelmut. :)
--
Jakub Wilk
-------------- next part --------------
#!/usr/bin/perl
use strict;
use warnings;

# Reference: https://www.oasis-open.org/specs/a401.htm

# A catalog is a sequence of tokens separated by whitespace and comments.
my $catalog_tokens = qr{
    ( (?: \s+ | -- .*? -- )+  # whitespace and comments
    | ' .*? ' | " .*? "       # literal
    | \S+                     # other tokens
    )
}sx;

# Print the argument of every CATALOG entry in the given catalog file.
sub parse_catalog {
    my ($filename) = @_;
    open my $fh, '<', $filename
        or die "$filename: $!\n";
    local $/;  # slurp the whole file
    my $contents = <$fh>;
    close $fh;
    my $in_catalog = 0;
    while ($contents =~ m/$catalog_tokens/g) {
        my $token = $1;
        if ($in_catalog) {
            next if $token =~ m/^\s|^--/;  # skip whitespace and comments
            $token =~ s/^(['"])(.*)\1$/$2/s;  # strip the quotes around a literal
            print "$token\n";
            $in_catalog = 0;
        } elsif ("\L$token" eq 'catalog') {
            $in_catalog = 1;
        }
    }
}

parse_catalog($_) for @ARGV;

# vim:ts=4 sw=4 et