Why Yahoo Over Google? October 20, 2006 8:15 PM

Why does mefi search use yahoo instead of google? I am not particularly interested in individual opinions of the two engines; I'd rather base answers on factual results.
posted by negative1 to MetaFilter-Related at 8:15 PM (21 comments total)

Because the Google search wasn't cutting it and was causing lots of double posts. I believe this has been discussed numerous times.
posted by Manhasset at 8:18 PM on October 20, 2006


Yahoo has a better index of metafilter, iirc.
posted by jessamyn (staff) at 8:21 PM on October 20, 2006


my GOD i need to not post when i'm drunk. i'm talking about the search engine, and i didn't even use it to check for double (/triple/quadruple) posts. this has been answered many times in other MeTa threads. my sincere apologies for being lame.
posted by negative1 at 8:25 PM on October 20, 2006


Not to mention your choice of words :P

Anyway, some datapoints:

Google for "yahoo search" on metatalk - 577 results
Yahoo for "yahoo search" on metatalk - 1,570 results
Google for "google search" on metatalk - 1,610 results
Yahoo for "google search" on metatalk - 4,490 results

Google for "double post" on metatalk - 1,570 results
Yahoo for "double post" on metatalk - 3,990 results
posted by Chuckles at 8:33 PM on October 20, 2006
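

Counts like those presumably come from site-restricted phrase queries. A minimal Python sketch of building such query URLs; the exact queries Chuckles ran are an assumption, and both endpoints are the engines' standard web-search URLs:

    from urllib.parse import quote_plus

    def search_urls(phrase, site="metatalk.metafilter.com"):
        # Quote the phrase and restrict it to one site, e.g.
        # "double post" site:metatalk.metafilter.com
        q = quote_plus('"%s" site:%s' % (phrase, site))
        return {
            "google": "https://www.google.com/search?q=" + q,
            "yahoo": "https://search.yahoo.com/search?p=" + q,
        }

    for phrase in ("yahoo search", "google search", "double post"):
        print(search_urls(phrase))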


Anecdotally, Yahoo seems to have a complete index of all MetaFilter sites, whereas Google misses a lot.
posted by Chuckles at 8:35 PM on October 20, 2006


Maybe a Sitemap would help Google index everything.
posted by evariste at 8:49 PM on October 20, 2006


sitemaps don't work when you have 50,000 threads on multiple subdomains.
posted by mathowie (staff) at 9:15 PM on October 20, 2006


Out of curiosity then, why the difference between google and yahoo? Don't they both work by following links?
posted by Rumple at 12:27 AM on October 21, 2006


Matt, why not? Just make one for each of the subdomains, all based on a single common piece of sitemap-generating code. Treat publishing a sitemap.xml the same way you treat publishing subdomain RSS feeds, for instance. I don't see how it's conceptually much different...
posted by evariste at 12:46 AM on October 21, 2006


Correction: treat publishing multiple sitemaps.xml the same way you do RSS feeds.
posted by evariste at 12:48 AM on October 21, 2006


I admit it's a pain in the ass; I haven't made a sitemap for my own site either, because I can't be bothered. But if you want one, having multiple subdomains shouldn't stop you.

For some reason, Google indexes every single URL on my site, so I never had a problem with that. But if my website were on the scale of MetaFilter and Google's bot was for some reason too dumb to figure out all my URLs, you bet I'd make a sitemap.
posted by evariste at 12:51 AM on October 21, 2006


Why not just auto-generate a sitemap, on a nightly basis, for threads 1-foo_x for each subdomain, where foo_x is the current maximal thread index for a given subdomain x? Google will puke elegantly on the missing threads, all is well, and in the worst case we see no particular improvement because it turns out Google's Sitemap functionality is a grand Milgramesque behavioral study.

Which would be pretty cool.
posted by cortex at 1:52 AM on October 21, 2006
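

A minimal sketch of that nightly pass, which is also evariste's one-common-piece-of-code idea run once per subdomain. The subdomain list, the thread-URL pattern, and max_thread_id() are hypothetical stand-ins for the real schema:

    # Emit every thread URL from 1 up to the current max per subdomain,
    # letting the crawler 404 on the gaps, per cortex's scheme.
    SUBDOMAINS = ("www", "ask", "metatalk")

    def max_thread_id(subdomain):
        # Hypothetical database lookup, e.g. SELECT MAX(id) FROM threads.
        return 50000

    def nightly_sitemap(subdomain):
        lines = ['<?xml version="1.0" encoding="UTF-8"?>',
                 '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
        for tid in range(1, max_thread_id(subdomain) + 1):
            lines.append('  <url><loc>http://%s.metafilter.com/%d/</loc></url>'
                         % (subdomain, tid))
        lines.append('</urlset>')
        return '\n'.join(lines)

    for sub in SUBDOMAINS:
        with open('%s-sitemap.xml' % sub, 'w') as f:
            f.write(nightly_sitemap(sub))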


Google employees have popped up in the past here, but they've never bothered to explain this odd problem. Matt, have you considered emailing them to ask why Google's index of MeFi is so shitty?
posted by mediareport at 6:17 AM on October 21, 2006


Because they're NDAed?
posted by stet at 7:29 AM on October 21, 2006


Why not just auto-generate a sitemap, on a nightly basis, for threads 1-foo_x for each subdomain, where foo_x is the current maximal thread index for a given subdomain x?

Wouldn't this make Google index deleted threads? I assume that is counter-productive.
posted by popechunk at 9:42 AM on October 21, 2006


I blew off previewing because I figured it would nail the subscripts, but it happened anyway. Maybe we should be able to edit our posts.
posted by popechunk at 9:44 AM on October 21, 2006


Wouldn't this make Google index deleted threads? I assume that is counter-productive.

It might. Doing the extra bit of work to generate (and each night merely append) the lists excluding non-existent and deleted threads would probably be simple enough. But I'm trying to give Matt the elevator pitch here. :)
posted by cortex at 9:57 AM on October 21, 2006
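

And cortex's refinement as a short sketch: skip deleted or non-existent threads, and on later nights only append what's new. Both lookups here are hypothetical stand-ins for database queries:

    def max_thread_id(subdomain):
        return 50000  # hypothetical lookup, as in the sketch above

    def deleted_thread_ids(subdomain):
        # Hypothetical, e.g. SELECT id FROM threads WHERE deleted = 1
        return set()

    def new_live_ids(subdomain, last_listed=0):
        # Only IDs after the last one already in the published sitemap,
        # minus anything deleted, so each night merely appends.
        dead = deleted_thread_ids(subdomain)
        return [tid for tid in range(last_listed + 1, max_thread_id(subdomain) + 1)
                if tid not in dead]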


They'll do a much better job of indexing after they buy the place.
posted by thatweirdguy2 at 10:31 AM on October 21, 2006


The post when it was changed from Google to Yahoo.
posted by smackfu at 2:33 PM on October 21, 2006


Google sitemaps have an upper limit of something like 40,000 pages. Each wing of mefi has more than that, so it's off the table as a solution.
posted by mathowie (staff) at 11:22 PM on October 21, 2006
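

For what it's worth, the sitemap protocol itself caps a single file at 50,000 URLs, but it also defines index files that tie many per-file sitemaps together, which is the usual way around a per-file limit. Whether Google's tooling at the time handled index files for a site this size is an open question. A sketch, chunking below the limit mathowie mentions (the hosting path for the files is an assumption):

    # Chunk a long URL list into several sitemap files and point one
    # sitemap index at all of them; `subdomain` doubles as the filename
    # prefix here.
    CHUNK = 40000  # kept under the limit mentioned above

    def write_chunked_sitemaps(urls, subdomain):
        names = []
        for i in range(0, len(urls), CHUNK):
            name = '%s-sitemap-%d.xml' % (subdomain, i // CHUNK)
            with open(name, 'w') as f:
                f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
                for url in urls[i:i + CHUNK]:
                    f.write('  <url><loc>%s</loc></url>\n' % url)
                f.write('</urlset>\n')
            names.append(name)
        with open('%s-sitemap-index.xml' % subdomain, 'w') as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for name in names:
                f.write('  <sitemap><loc>http://%s.metafilter.com/%s</loc></sitemap>\n'
                        % (subdomain, name))
            f.write('</sitemapindex>\n')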


Oh, I didn't know that. Damn, that's unusually limiting.
posted by evariste at 11:43 PM on October 21, 2006

