Bug#381359: very hard to handle relative links in feeds

Joey Hess joeyh at debian.org
Thu Aug 3 21:15:44 UTC 2006


Package: libxml-feed-perl
Version: 0.10-1
Severity: normal

If I'm parsing a feed with XML::Feed and it happens to contain a
relative link, I'm somewhat out of luck if I want to correctly turn that
into an absolute link:

* Maybe it's an Atom feed that uses xml:base attributes to set the base
  URL to derelativise links. But if it does, there seems to be no way to
  get at that info once XML::Feed has parsed the feed.
* Maybe a Content-Location HTTP header is used. But there's no way to
  tell once XML::Feed has downloaded the feed.
* Maybe neither of the above is true, and so I have to fall back to
  poorly defined heuristics like using the URL of the feed itself as the
  base URL. And munge the content HTML myself, as well as checking for
  relative links in the feed's own link attribute, as well as the link
  attributes of individual entries in the feed, etc.
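
The three cases above amount to choosing a base URL and then joining each
link against it. A minimal sketch in Python of one plausible precedence
(feed URL, then Content-Location, then xml:base), using only the standard
library. The function names pick_base and absolutize are hypothetical, and
this is not how XML::Feed or any other parser is known to do it:

```python
from urllib.parse import urljoin


def pick_base(feed_url, content_location=None, xml_base=None):
    """Choose a base URL; later (more specific) sources override earlier ones.

    Assumed precedence: feed URL < Content-Location header < xml:base.
    Each value may itself be relative, so it is joined against the
    base established so far.
    """
    base = feed_url
    if content_location:
        base = urljoin(base, content_location)
    if xml_base:
        base = urljoin(base, xml_base)
    return base


def absolutize(link, base):
    """Resolve a possibly relative link; absolute links pass through unchanged."""
    return urljoin(base, link)
```

With a feed fetched from http://example.com/feeds/atom.xml and an
xml:base of "/blog/", pick_base gives http://example.com/blog/, and a
link of "post/1" resolves against that. A real implementation would also
have to honour xml:base on nested elements, which this sketch ignores.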

So 2/3 of the time it's impossible and 1/3 of the time it's enormously
painful and probably not possible to do right anyway. Ugh.

XML::Feed should hide all this insane complexity and ugliness from the
user by fixing up all relative URLs in feeds.
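
To give an idea of what "fixing up" entry content involves, here is a
rough Python sketch that rewrites href and src attributes in an HTML
fragment against a known base URL. The name fix_links is hypothetical,
and the regular expression is an assumption made for brevity: it only
handles quoted attribute values and will also match attributes such as
data-href, so a real implementation would use a proper HTML parser.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src='...' with either quote style (quoted values only).
_ATTR = re.compile(r'''\b(href|src)=(["'])(.*?)\2''', re.IGNORECASE)


def fix_links(html, base):
    """Return html with relative href/src values resolved against base."""
    def repl(match):
        name, quote, value = match.group(1), match.group(2), match.group(3)
        return f"{name}={quote}{urljoin(base, value)}{quote}"
    return _ATTR.sub(repl, html)
```

The same urljoin call would be applied to the feed's own link and to
each entry's link, so that callers never see a relative URL.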

Here's how the python feed parser does it:
http://feedparser.org/docs/resolving-relative-links.html

-- 
see shy jo