dive into mark


Sunday, June 2, 2002

Google recommends

blogrollfinder.py has been updated again; there are two major changes.

First, it now looks for the published subscription file for Radio-controlled weblogs, and uses this as the blogroll if found. (For example, here are Sam Ruby’s subscriptions.) If not found, it falls back to the old logic of scraping the HTML to find lists of links separated only by tags and whitespace.
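The fallback heuristic is easy to picture in code. What follows is a minimal sketch, not the actual blogrollfinder.py logic: it assumes a "list of links" means a run of anchor tags with no visible text between them, and it takes the longest such run as the blogroll. The names LinkRunParser and guess_blogroll, and the min_links threshold, are made up for illustration.

```python
from html.parser import HTMLParser


class LinkRunParser(HTMLParser):
    """Collect runs of links separated only by tags and whitespace."""

    def __init__(self):
        super().__init__()
        self.runs = []          # completed runs, each a list of hrefs
        self.current = []       # the run currently being built
        self.in_anchor = False  # True while inside <a href=...>...</a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.in_anchor = True
                self.current.append(href)

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        # Visible text outside a link ends the current run.
        if not self.in_anchor and data.strip():
            self._flush()

    def _flush(self):
        if self.current:
            self.runs.append(self.current)
        self.current = []

    def close(self):
        super().close()
        self._flush()


def guess_blogroll(html, min_links=3):
    """Return the longest run of adjacent links, or [] if it is too short.

    min_links is an arbitrary threshold chosen for this sketch.
    """
    parser = LinkRunParser()
    parser.feed(html)
    parser.close()
    longest = max(parser.runs, key=len, default=[])
    return longest if len(longest) >= min_links else []
```

A link buried in a sentence forms a run of one and gets ignored, while a sidebar of consecutive links survives as a single long run.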

Second, I added a findNewBlogsByGoogleRelated function, which takes your URL and uses the Google API to find sites related to yours, then checks each of those sites’ blogrolls, aggregates the results, and diffs them against your blogroll. This answers the question, What people are the people related to you reading that you’re not reading?
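The aggregate-and-diff step can be sketched in a few lines. This is an illustration under assumptions, not the body of findNewBlogsByGoogleRelated: it takes the related sites’ blogrolls as already-fetched lists of URLs (the Google API call is left out entirely), and the function name recommend is hypothetical.

```python
from collections import Counter


def recommend(my_blogroll, related_blogrolls):
    """Count how many related blogrolls link to each site I don't link to."""
    mine = set(my_blogroll)
    counts = Counter()
    for roll in related_blogrolls:
        # Use a set so a site listed twice in one blogroll counts once,
        # and subtract my own links to keep only the new ones.
        counts.update(set(roll) - mine)
    # Most frequently linked sites first.
    return counts.most_common()


if __name__ == "__main__":
    mine = ["a.example", "b.example"]
    related = [
        ["a.example", "c.example", "d.example"],
        ["c.example", "b.example"],
        ["c.example", "d.example", "d.example"],
    ]
    for site, links in recommend(mine, related):
        print(site, links)
```

Real URLs would probably need normalizing first (trailing slashes, a leading www.) so that the same site is not counted under two spellings.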

Here’s my list:

Google recommends

Name                                        Links
scripting.com                               5
chris.pirillo.com                           5
Dan Gillmor                                 4
kottke.org                                  4
evhead.com                                  4
doc.weblogs.com                             4
zeldman.com                                 3
www.codingtheweb.com                        3
wmf.editthispage.com                        3
unchartedshores.com/blogger/blogger3.html   3
tomalak.org                                 3
rageboy.com/blogger.html                    3
radio.weblogs.com/0100243/                  3
plasticbag.org                              3
oreillynet.com/~rael/                       3
nickdenton.org                              3
mightygirl.net                              3
megnut.com                                  3
harrumph.com                                3

This would make an interesting web service. The table above was output verbatim by the script, except for one particularly long URL, which I hand-edited later. The whole process is not horrendously slow (15 seconds max; all the HTTP requests run in parallel threads and time out rapidly, so the script never spends too much time waiting for a single slow site), but then again, the results should only change slowly over time, so they could be computed once and then cached. Unfortunately, it takes 3 queries to Google to compile the related list, and I have a limit of 1000 queries a day. (And, due to early runaway bugs, I’m almost tapped for today.) Maybe tomorrow.
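For the curious, the parallel-fetch-with-timeout idea looks roughly like this. It is a sketch using the standard library’s thread pool, not the script’s own threading code; the names fetch and fetch_all, the 5-second timeout, and the pool size of 20 are all assumptions picked for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=5):
    """Return the page body as bytes, or None if the site is slow or broken."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read()
    except Exception:
        # Timeouts, DNS failures, bad URLs, HTTP errors: skip the site.
        return None


def fetch_all(urls, fetcher=fetch, max_workers=20):
    """Fetch every URL in parallel threads; map each URL to its result."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetcher, urls)))
```

Note that the timeout passed to urlopen applies to each blocking socket operation rather than to the request as a whole, so a site that trickles data can still take longer than the nominal limit. The fetcher parameter is there so the function can be exercised without touching the network.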

Update: I wrote the web service, so you can try it yourself and find your own neighborhood.




© 2001-8 Mark Pilgrim