Tuesday, April 17, 2007

Parsing Digg's RSS Feeds

I thought someone might find this useful: when you try to pull one of Digg's RSS feeds via something like PHP, it simply won't work on its own. For instance, if you try to fetch the feed straight from a script, it will give you an error, yet when you go to your browser and pull up the feed everything is gravy. wtf!?

So after an hour or so of googling and finding nothing because I had nowhere to start, I decided to pull up telnet and give it a shot the old-fashioned way :)

telnet digg.com 80
GET / HTTP/1.1


so... ok...

telnet digg.com 80
GET / HTTP/1.1
Host: digg.com

Still nothing!

Finally, I try

telnet digg.com 80
GET / HTTP/1.1
Host: whatever.com
Referer: something.net
User-Agent: some browser
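
If you would rather script this experiment than type it into telnet, here is a rough sketch that builds the same kind of request as a string. The helper name build_request and the header values are made up for illustration. The parts I am sure of are that a well-formed request puts the request line first, ends every line with CRLF, and finishes the headers with a blank line.

```php
<?php
// Hypothetical helper (not from any library): assemble a raw HTTP/1.1 GET request.
function build_request(string $host, string $path, array $headers = []): string
{
    // The request line always comes first, followed by the Host header.
    $lines = ["GET $path HTTP/1.1", "Host: $host"];
    foreach ($headers as $name => $value) {
        $lines[] = "$name: $value";
    }
    // Ask the server to close the connection once it has answered.
    $lines[] = 'Connection: close';
    // Every line ends in CRLF; an empty line marks the end of the headers.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

$request = build_request('digg.com', '/', ['User-Agent' => 'some browser']);

// To actually send it (needs network access, so left commented out):
// $fp = fsockopen('digg.com', 80);
// fwrite($fp, $request);
// echo fgets($fp); // first line of the response, i.e. the status line
// fclose($fp);
```

Dropping one header at a time from the array passed in is an easy way to repeat the process of elimination.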


Voilà, one of these is the magic one. Through process of elimination, I find they are requiring a User-Agent header. Weird... so... how do you fix this in PHP, you ask? Simple...

ini_set('user_agent', 'Anything here');

Well, there you have it, your one line fix for parsing Digg.
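
If you don't want to change the setting for the whole script, the same idea can be applied to a single request with a stream context. This is only a sketch: feed_context is a made-up helper name, the 10 second timeout is an arbitrary choice, and $feedUrl stands in for whatever feed address you are using. I am also assuming you fetch the feed with file_get_contents.

```php
<?php
// Hypothetical helper: build a stream context that sends a User-Agent header.
function feed_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent' => $userAgent, // sent as the User-Agent request header
            'timeout'    => 10,         // seconds; arbitrary value for this sketch
        ],
    ]);
}

$context = feed_context('Anything here');

// Usage (needs network access, so left commented out; $feedUrl is a placeholder):
// $xml = file_get_contents($feedUrl, false, $context);
```

The ini_set line is still the quickest fix; the context version just keeps the change local to one call.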


Jonathan said...

Thanks for posting this. I've been trying to parse my digg history for about 15 minutes now. I finally emailed digg reporting a problem, but then stumbled across your blog post. Again, thanks.

pawel said...

thanks a lot

curtis said...

Dude!!! Mad props to that. I've been trying to figure that out for about a week now. It's so simple, it's scary. Thanks a million!