4 posts tagged with Search and indexing.
Displaying 1 through 4 of 4.
Suggested use of robots.txt for better searching of MeFi and related subdomains
Robots.txt for the <tagname>.metafilter.com subdomains should exclude all robots. Currently, crawlers reindex the entire site once for each subdomain.
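Since robots.txt is served per-host, one file on each tag subdomain would cover it. A minimal sketch of what such a file might look like (an illustration, not the site's actual configuration):

    User-agent: *
    Disallow: /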
Yahoo vs Google
If it's true that Yahoo! has a more complete index of Metafilter than Google does - see here, here and here (just below) for tantalizing discussion - wouldn't it make more sense to point folks to Yahoo! rather than Google on the site's various search and posting pages? Anyone up for a semi-scientific test?
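One rough way to run that test, sketched here in Python with only the standard library: sample some thread IDs, then check whether each engine's results page for a scoped query mentions the thread URL. The query URL templates and the naive substring check are assumptions, and real engines may throttle or block automated requests, so treat this as an outline rather than a working harness:

    import random
    import urllib.parse
    import urllib.request

    # Placeholder sample; a real test would draw thread IDs from the
    # site's own archives.
    THREAD_IDS = random.sample(range(1, 30000), 20)

    ENGINES = {
        # Query URL templates are assumptions about each engine's interface.
        "Google": "https://www.google.com/search?q={q}",
        "Yahoo": "https://search.yahoo.com/search?p={q}",
    }

    def is_indexed(engine_url: str, thread_url: str) -> bool:
        """Naively check whether a results page mentions the thread URL."""
        query = urllib.parse.quote(f"site:metafilter.com {thread_url}")
        req = urllib.request.Request(
            engine_url.format(q=query),
            headers={"User-Agent": "Mozilla/5.0"},  # some engines reject bare clients
        )
        with urllib.request.urlopen(req) as resp:
            return thread_url in resp.read().decode("utf-8", errors="replace")

    for name, url in ENGINES.items():
        hits = sum(
            is_indexed(url, f"metafilter.com/mefi/{tid}") for tid in THREAD_IDS
        )
        print(f"{name}: {hits}/{len(THREAD_IDS)} sampled threads found")

A higher hit rate on one engine across the same sample would be at least semi-scientific evidence that its index of the site is more complete.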
AskMe indexing?
Is there a chunk of AskMe that has not been indexed? If not, why can't I find a certain question? (more inside)
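One quick diagnostic, assuming the question's wording is known: search for a distinctive exact phrase from the thread, scoped to the subdomain, e.g.

    site:ask.metafilter.com "exact phrase from the question"

If that returns nothing while the thread exists, the page likely has not been crawled, or is being excluded.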
To Index or Not?
The site is getting pummelled lately, so I ran stats on the past few days to see if there was a national news story or something. Of the 300k page views in the past four days, 100k, or one third of the traffic, were due solely to the googlebot.
It appears that having 13k threads filled with 200k comments of google-loving ASCII is acting as some sort of honeypot, attracting the google indexers like mad. Broken down by day, the Googlebot appears to visit over 25k pages at metafilter.com PER DAY. If you look at browser/OS stats, the Googlebot visits metafilter more often than all Netscape clients combined. It also exceeds all visits by people using Mac operating systems.
Although I'm impressed with the results (google searches are the #1 referrer), is it worth basically bringing down the machine and keeping humans from being able to access it? If I were to include a robots exclusion file and block all search bots, would the net community be at a loss for not being able to find information discussed here?
I guess the big question is, does the utility of having the site indexed outweigh the problems the indexing causes?
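For reference, a blanket exclusion is a two-line robots.txt; a softer option is to slow crawlers down rather than ban them outright. Crawl-delay is a nonstandard directive that some crawlers honor and others (notably Googlebot) ignore, so these are sketches of two alternative files, not a recommendation:

    # Alternative 1: block every compliant robot
    User-agent: *
    Disallow: /

    # Alternative 2 (assumes the crawler honors Crawl-delay):
    # request a pause between fetches instead of a full ban
    User-agent: *
    Crawl-delay: 10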