[MKDoc-dev] Setting the User-Agent field for RSS GET requests
Chris Croome
chris at webarchitects.co.uk
Thu Feb 2 14:38:18 GMT 2006
Hi
Google News blocks wget and LWP from making RSS requests -- try getting
this feed in a browser and then with wget:
http://news.google.com/news?q=sheffield&ie=UTF-8&output=rss
The User-Agent field can be set for LWP::Simple [1] and I guess having an
env var for this would make sense. It could be set to something
sensible by default, e.g. "MKDoc 1.6 RSS fetcher", and then it would also
be easy to change it if this doesn't work for some sites...
For now, adding a line like this to tools/cron/031..rss_routine.pl and
tools/cron/030..rss_troubleshooter.pl does the trick:
$ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
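A rough sketch of how the env var idea might look, not tested against MKDoc itself. The variable name MKDOC_RSS_USER_AGENT and the helper rss_user_agent() are made up for illustration; nothing in MKDoc reads them at the moment. The default string is the one suggested above.

```perl
use strict;
use warnings;

# Hypothetical env var name; MKDoc does not define this (assumption).
my $ENV_NAME = 'MKDOC_RSS_USER_AGENT';

# Default suggested in this message.
my $DEFAULT_AGENT = 'MKDoc 1.6 RSS fetcher';

# Return the User-Agent string to use for RSS GET requests:
# the env var if it is set and non-empty, otherwise the default.
sub rss_user_agent {
    my $agent = $ENV{$ENV_NAME};
    return $DEFAULT_AGENT unless defined $agent && length $agent;
    return $agent;
}

print rss_user_agent(), "\n";
```

The two cron scripts could then call $ua->agent(rss_user_agent()) instead of hard-coding the browser string, and a site that rejects the default could be handled by setting the variable in the cron environment.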
Chris
[1] http://search.cpan.org/~gaas/libwww-perl-5.805/lib/LWP/UserAgent.pm
--
Chris Croome <chris at webarchitects.co.uk>
web design http://www.webarchitects.co.uk/
web content management http://mkdoc.com/