Yahoo China
From AJS.COM
Yahoo! Slurp China appears to be hitting my Web site (log entry broken up for line length):
202.160.178.230 - - [18/Oct/2007:15:15:22 -0400] "GET /gallery2/main.php?g2_itemId=4585 HTTP/1.0" 200 9740 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp China;
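If you want to check your own logs for the same crawler, something like the following might do it. This is only a rough sketch: it assumes the log is in Apache's combined format (which is what the entry above looks like), and the names `LOG_PATTERN` and `find_crawler_hits` are made up for illustration. The closing quote on the user agent is treated as optional so that a truncated entry like the one quoted above still matches.

```python
import re

# Assumed layout (Apache combined log format):
# host ident user [time] "request" status size "referer" "user agent"
# The final quote is optional so truncated entries still match.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"?'
)


def find_crawler_hits(lines, marker="Yahoo! Slurp China"):
    """Yield (host, request) for each log line whose user agent contains marker."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and marker in match.group("agent"):
            yield match.group("host"), match.group("request")
```

Feeding it the lines of an access log (for example, an open file object) yields one `(host, request)` pair per hit from that user agent; lines that don't parse are simply skipped.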
I'm familiar with Yahoo!'s Slurp Web crawler, but had never heard of one that was specifically China-bound. I was wondering why they would have anything different in China, and then it hit me: the Great Firewall of China is the reason. If they run a separate instance of their crawler from inside China, then the huge chunk of the World Wide Web that they can't hit from China because of government blocking won't end up in their index, thus ensuring that their database complies with Chinese government restrictions on content (if the government is unhappy with something in a Yahoo! China search result, all they have to do is start blocking it, and Yahoo! Slurp China won't see it anymore). Smart, if scary.
In case you're curious, this is the image on my gallery that they were hitting:
http://mush.ajs.com/gallery2/main.php?g2_itemId=4585
Not that it's one of my better photographs or anything; it's just the access that I happened to notice.
