RSS feeds on Google March 21, 2007 11:04 PM
Can the RSS feeds be hidden from Google? They're pretty much useless in search results (like the first hit on this one).
Unless I'm not understanding you, that's a preference you can change in Firefox.
But having said that, I think this would probably be a good change.
posted by roll truck roll at 11:29 PM on March 21, 2007
I can't hide them from Google; people using Google Reader need to see them. It might be an HTTP header/MIME-type thing on my end. I see the Google results don't even call it out as RSS and offer HTML conversion like they normally do for real XML files.
posted by mathowie (staff) at 11:46 PM on March 21, 2007
I think you need to contact Google on this, but yeah, it's been frustrating me too.
posted by IndigoRain at 11:47 PM on March 21, 2007
You could just add REL="nofollow" on the links to feeds.
posted by Rhomboid at 11:55 PM on March 21, 2007
mathowie: you nailed it with the mime-type: the feeds are set to "application/xhtml+xml". Try using "application/rss+xml".
posted by boo_radley at 6:23 AM on March 22, 2007
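A rough Python sketch of the check boo_radley is describing: does a given Content-Type header value name a feed type or not? The function name and the set of feed types are illustrative assumptions, not anything MetaFilter or Google actually uses, and it says nothing about how Google will treat the result.

```python
# Media types commonly used for feeds (an assumed, non-exhaustive set).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


def is_feed_content_type(header_value: str) -> bool:
    """Return True if a Content-Type header value names a feed media type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in FEED_TYPES


# The type the feeds are reportedly served with, and the suggested one.
print(is_feed_content_type("application/xhtml+xml"))
print(is_feed_content_type("application/rss+xml; charset=utf-8"))
```

The first call checks the current header and the second checks the suggested replacement, so the two results should differ.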
or you could try searching like this
I know Safari will render that feed, but in Firefox I just get the "add to RSS reader page", which is moy frustrating.
Are you using Firefox 2.0? They show up for me.
posted by delmoi at 8:17 AM on March 22, 2007
delmoi: that's okay, unless you are looking for an answer on rss!
So then maybe we can all add this search term to Google: -inurl:rss. Except now that we have verbose URLs, any question with "RSS" in the title (which presumably is highly relevant to a question about, say, RSS) will be excluded from these search results too.
hmmm.
posted by misterbrandt at 8:40 AM on March 22, 2007
AB, I just tried that on the original search. Doesn't work. But maybe once the mime-type is tweaked it will?
posted by misterbrandt at 9:53 AM on March 22, 2007
Sorry, but no wildcards are allowed in the URLs in robots.txt.
Look, this is really trivial to fix. Just change <a href="http://xml.metafilter.com/rss.xml">foo</a> into <a rel="nofollow" href="http://xml.metafilter.com/rss.xml">foo</a> everywhere a link to a feed is generated.
posted by Rhomboid at 11:49 AM on March 22, 2007
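Rhomboid's change could be made in one place if feed links come from a single helper. A minimal Python sketch, assuming a hypothetical `feed_link` function (the site's real templating code isn't shown in this thread):

```python
from html import escape


def feed_link(url: str, text: str) -> str:
    """Build an anchor for a feed URL with rel="nofollow" set.

    Hypothetical helper: the name and signature are made up for illustration.
    """
    # Escape both values so they are safe inside the attribute and the body.
    return f'<a rel="nofollow" href="{escape(url, quote=True)}">{escape(text)}</a>'


print(feed_link("http://xml.metafilter.com/rss.xml", "foo"))
```

Routing every feed link through one function like this means the attribute only has to be added once.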
Rhomboid: that's true for the spec, but Googlebot allows more flexibility:
Additionally, Google has introduced increased flexibility to the robots.txt file standard through the use of asterisks. Disallow patterns may include "*" to match any sequence of characters, and patterns may end in "$" to indicate the end of a name. To remove all files of a specific file type (for example, to include .jpg but not .gif images), you'd use the following robots.txt entry:
User-agent: Googlebot-Image
Disallow: /*.gif$
posted by rsanheim at 4:29 PM on March 22, 2007
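To make the quoted rules concrete, here is a small Python sketch of how a pattern with "*" and a trailing "$" could be matched against a URL path. It follows only my reading of the quoted text ("*" matches any sequence of characters, "$" marks the end of the name) and is not Googlebot's actual matcher; the function names are made up.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Disallow pattern using '*' and a trailing '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without '$' the pattern is treated as a prefix match (an assumption).
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(pattern: str, path: str) -> bool:
    """Return True if the path matches the pattern from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


print(is_disallowed("/*.gif$", "/images/a.gif"))
print(is_disallowed("/*.gif$", "/images/a.jpg"))
```

The same approach would apply to a pattern aimed at feed URLs, but whether that is the right rule for this site is a separate question.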
What I was referring to was that the official robots.txt spec doesn't allow for this. But Google does, apparently, which I didn't know until now. I was just fooling around with [-intitle:"posts tagged with"], and that seems to do a pretty good job, since it gets rid of tag pages and tag page feeds.
posted by Rhomboid at 5:29 PM on March 22, 2007
posted by bonaldi at 11:05 PM on March 21, 2007