RSS feeds on Google March 21, 2007 11:04 PM

Can the RSS feeds be hidden from Google? They're pretty much useless in search results (like the first hit on this one).
posted by bonaldi to Bugs at 11:04 PM (16 comments total)

I know Safari will render that feed, but in Firefox I just get the "add to RSS reader" page, which is moy frustrating.
posted by bonaldi at 11:05 PM on March 21, 2007


Unless I'm not understanding you, that's a preference you can change in Firefox.

But having said that, I think this would probably be a good change.
posted by roll truck roll at 11:29 PM on March 21, 2007


I can't hide them from Google; people using Google Reader need to see them. It might be an HTTP header/MIME-type thing on my end. I see the Google results don't even call it out as RSS and offer HTML conversion like they normally do for real XML files.
posted by mathowie (staff) at 11:46 PM on March 21, 2007


I think you need to contact Google on this, but yeah, it's been frustrating me too.
posted by IndigoRain at 11:47 PM on March 21, 2007


You could just add REL="nofollow" on the links to feeds.
posted by Rhomboid at 11:55 PM on March 21, 2007


mathowie: you nailed it with the mime-type: the feeds are set to "application/xhtml+xml". Try using "application/rss+xml".
posted by boo_radley at 6:23 AM on March 22, 2007
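
To make the MIME-type point concrete, here is a minimal sketch in Python of checking whether a Content-Type header value names a feed type. The helper names and the set of feed types are assumptions for illustration, not anything MetaFilter actually runs.

```python
from email.message import Message

# Assumed set of media types that identify a feed (illustrative, not exhaustive).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


def media_type(content_type_header):
    """Return the bare media type, lowercased, with parameters such as charset dropped."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def looks_like_feed(content_type_header):
    """True if the header value names one of the assumed feed media types."""
    return media_type(content_type_header) in FEED_TYPES
```

Under this sketch, the "application/xhtml+xml" value mentioned above is not recognised as a feed, while "application/rss+xml" is.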


or you could try searching like this

I know Safari will render that feed, but in Firefox I just get the "add to RSS reader" page, which is moy frustrating.

Are you using Firefox 2.0? They show up for me.
posted by delmoi at 8:17 AM on March 22, 2007


delmoi: that's okay, unless you are looking for an answer on rss!

So then maybe we can all add this search term to Google: -inurl:rss. Except now that we have verbose URLs, any question with "RSS" in the title (which presumably is highly relevant to a question about, say, RSS) will be excluded from these search results too.

hmmm.
posted by misterbrandt at 8:40 AM on March 22, 2007
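
A rough sketch of the problem with -inurl:rss, in Python. It approximates the operator as a case-insensitive whole-word match on the URL, which is an assumption about how Google behaves, and the URLs are made-up placeholders rather than real threads.

```python
import re


def url_words(url):
    """Split a URL into lowercase alphanumeric words."""
    return {word for word in re.split(r"[^a-z0-9]+", url.lower()) if word}


def survives_minus_inurl(url, term):
    """True if the URL would stay in the results under the assumed -inurl:term rule."""
    return term.lower() not in url_words(url)


# Hypothetical URLs: a feed, a verbose question URL about RSS, an unrelated question.
urls = [
    "http://www.metafilter.com/tags/example/rss",
    "http://ask.metafilter.com/00000/how-do-i-read-rss-feeds",
    "http://ask.metafilter.com/00000/how-do-i-bake-bread",
]

kept = [u for u in urls if survives_minus_inurl(u, "rss")]
```

With these placeholders, the feed is filtered out as intended, but so is the question that has "rss" in its verbose URL; only the unrelated question is kept.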


How about -filetype:xml?
posted by Aloysius Bear at 9:22 AM on March 22, 2007


AB, I just tried that on the original search. Doesn't work. But maybe once the mime-type is tweaked it will?
posted by misterbrandt at 9:53 AM on March 22, 2007


I can't hide them from Google; people using Google Reader need to see them. It might be an HTTP header/MIME-type thing on my end. I see the Google results don't even call it out as RSS and offer HTML conversion like they normally do for real XML files.

Matt, that's actually cool: "When users add your feed to their Google homepage, Google's Feedfetcher attempts to obtain the content of the feed in order to display it. Since Feedfetcher requests come from explicit action by human users, Feedfetcher has been designed to ignore robots.txt guidelines." (link)

So you could do a:

User-agent: Googlebot
Disallow: /*rss$


in MeFi's robots.txt file, and it would, presumably, not interfere with Google Reader.
posted by WCityMike at 10:55 AM on March 22, 2007


Sorry, but no wildcards are allowed in the URL patterns in robots.txt.

Look, this is really trivial to fix. Just change <a href="http://xml.metafilter.com/rss.xml">foo</a> into <a rel="nofollow" href="http://xml.metafilter.com/rss.xml">foo</a> everywhere a link to a feed is generated.
posted by Rhomboid at 11:49 AM on March 22, 2007
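
As a small sketch of the rel="nofollow" suggestion, assuming the feed links come out of one template helper: the function below is hypothetical (MetaFilter's real templating is not shown in this thread) and simply builds the anchor with the attribute included.

```python
from html import escape


def feed_link(href, text):
    """Build an anchor tag for a feed URL with rel="nofollow" added."""
    return '<a rel="nofollow" href="{}">{}</a>'.format(
        escape(href, quote=True),  # escape quotes and ampersands inside the attribute
        escape(text),
    )
```

Calling feed_link("http://xml.metafilter.com/rss.xml", "foo") gives the same markup as the second tag quoted above.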


Rhomboid: that's true for the spec, but Googlebot allows more flexibility:

Additionally, Google has introduced increased flexibility to the robots.txt file standard through the use of asterisks. Disallow patterns may include "*" to match any sequence of characters, and patterns may end in "$" to indicate the end of a name. To remove all files of a specific file type (for example, to include .jpg but not .gif images), you'd use the following robots.txt entry:
User-agent: Googlebot-Image
Disallow: /*.gif$

posted by rsanheim at 4:29 PM on March 22, 2007
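
The "*" and "$" rules quoted above can be sketched in Python by translating a Disallow pattern into a regular expression. This is only an illustration of the described matching, not Googlebot's actual implementation, and the function names are made up.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Disallow pattern using '*' and a trailing '$' into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any sequence of characters; everything else is matched literally.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without a trailing '$' the pattern is treated as a prefix match.
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path, pattern):
    """True if the URL path matches the Disallow pattern under this sketch."""
    return pattern_to_regex(pattern).match(path) is not None
```

Under this sketch, /*rss$ matches any path ending in "rss", such as a hypothetical /tags/example/rss, but not /rss.xml, so the pattern would need to suit whatever the feed URLs actually end in.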


Not sure what you're referring to in the linked-to document, Rhomboid. And Google's webmaster info on robots.txt has a Disallow: /*.gif$ sample line, which is what I borrowed that from.
posted by WCityMike at 5:21 PM on March 22, 2007


Oops. Now why didn't that show up until I just posted?
posted by WCityMike at 5:22 PM on March 22, 2007


What I was referring to was that the official robots.txt spec doesn't allow for this. But Google does, apparently, which I didn't know until now. I was just fooling around with [-intitle:"posts tagged with"], and that seems to do a pretty good job, since it gets rid of tag pages and tag page feeds.
posted by Rhomboid at 5:29 PM on March 22, 2007

