Is an automated search feature possible? June 25, 2002 9:41 AM

Is an automated search feature possible? Before submitting, let's say, Mahir's "I kiss you" or "All your base are belong to us", the posted URL would be checked against existing posts.

The poster would receive a message saying something like: "Thanks for the effort, but someone posted that already," with a link to the original post.

Possible?

B.
posted by Baud to Feature Requests at 9:41 AM (13 comments total)

Too polite. It should read, "Listen, Fuckwit...

posted by ColdChef at 9:49 AM on June 25, 2002


It was my understanding that this was already implemented. (In fact, I was quite certain of it.) The recent lawfirm DP seems to indicate that it's not the case, however, unless the script is too stupid to see the difference between www.metafilter.com and www.metafilter.com/. Or unless the poster ignored the warning.
posted by Marquis at 9:52 AM on June 25, 2002


I did the law firm double-post earlier today. The confirmation page gave me the all-clear on the law firm address. I figured this was a pretty specific post (not like a news story where one article could be replicated on numerous sites), so I thought I was safe. Still, I went ahead and did a google search anyway. It gave me an all clear as well.
posted by mrbula at 10:10 AM on June 25, 2002


I tested it.

http://www.ppbfh.com/ returns a complaint that mrbula's post already exists.

http://www.ppbfh.com returns this one and mrbula's dp. So yeah, there's something going on with that bit of code.
posted by perplexed at 10:11 AM on June 25, 2002


Search does seem to treat "foo.com" and "www.foo.com" as different sites (the recent "origami boulder" thread fell into that trap.) Perhaps the search and the d'oh! check could be expanded to check for that situation -- and for the trailing "/" as well? Still wouldn't catch all double posts, of course, but it might help.
posted by ook at 10:17 AM on June 25, 2002
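
A minimal sketch of the normalization ook describes, in Python rather than whatever the site actually runs. The helper names `normalize_url` and `find_duplicate` and the post ID are hypothetical, and treating "foo.com" and "www.foo.com" as the same site is an assumption that will not hold for every domain. It folds away the scheme, a leading "www.", and a trailing "/" before comparing:

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparison key (hypothetical helper).

    Assumption: scheme, port, a leading "www." and a trailing "/" do not
    distinguish one page from another for double-post purposes.
    """
    url = url.strip()
    if "://" not in url:
        # urlsplit only recognises the host when a scheme is present
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def find_duplicate(new_url, existing_posts):
    """Return the ID of an earlier post with an equivalent URL, or None.

    existing_posts maps a post ID to the URL it linked (made-up structure).
    """
    key = normalize_url(new_url)
    for post_id, url in existing_posts.items():
        if normalize_url(url) == key:
            return post_id
    return None
```

With this, `find_duplicate("http://www.ppbfh.com", {101: "http://www.ppbfh.com/"})` finds post 101, which is the case perplexed's test tripped over. As ook says, it still would not catch the same story posted from a different site.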


bah! but i like when bored little freaks scream DOUBLE POST whenever possible!
posted by jcterminal at 10:25 AM on June 25, 2002


Hey! Who you calling "little"?
posted by ook at 10:39 AM on June 25, 2002


I've also wondered, anyone try to do something like www.goatse.cx? Did Matt block that? I was going to try one time, but I'm really paranoid that I'd click post instead of preview.

G.
posted by geoff. at 10:47 AM on June 25, 2002


It's already being done, Baud, but it needs to be improved.
posted by mathowie (staff) at 10:53 AM on June 25, 2002


www.goatse.cx doesn't even work any more at google... so Skippy the Search Engine™ don't stand a chance!
posted by dash_slot- at 11:27 AM on June 25, 2002


It's actually not that bad of a problem to solve, and it makes for an interesting algorithm. I wrote up a solution (in PHP, because I don't know CFML syntax that well, but I'm sure they're similar). It'll get rid of any double links, as long as there isn't a subdomain in the URL (read: the URL has to start with http://www.....).

I'll post it here, if anyone wants to take a look at it.
posted by SweetJesus at 2:03 PM on June 25, 2002


I think the regular search could use some tweaking.

I was trying to find the site with the searchable top 1000 names,
the Name o Meter.

So I search for
name top
and get no results.

But if I do a search for name I get 700 results, then just using find (ctrl-f) on those results, looking for top, I find it...
The search seems to assume you want the words together. Is there a way to get around this?

posted by Iax at 9:27 PM on June 25, 2002


Yes, there is. But it hammers the database server pretty badly on a decent-size data set.
posted by majick at 9:39 PM on June 25, 2002
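
A rough sketch of the difference, assuming the search boils down to a SQL substring match. The `posts` table, its columns, and the function names here are made up for illustration and are not the site's real schema. Matching every word separately means one `LIKE '%word%'` condition per term, and a pattern with a leading wildcard generally cannot use an ordinary index, which is one plausible reason it gets expensive on a large table:

```python
import sqlite3


def _like(term):
    """Wrap a term in wildcards, escaping LIKE metacharacters."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def search_phrase(conn, query):
    """Match the query as one contiguous string (the behaviour described above)."""
    sql = "SELECT id FROM posts WHERE body LIKE ? ESCAPE '\\' ORDER BY id"
    return [row[0] for row in conn.execute(sql, (_like(query),))]


def search_all_terms(conn, query):
    """Match posts containing every word, in any order or position."""
    terms = query.split()
    if not terms:
        return []
    where = " AND ".join("body LIKE ? ESCAPE '\\'" for _ in terms)
    sql = f"SELECT id FROM posts WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, [_like(t) for t in terms])]


def make_demo_db():
    """Build a tiny in-memory table with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO posts (id, body) VALUES (?, ?)",
        [
            (1, "The Name o Meter: searchable top 1000 names"),
            (2, "A post about something else entirely"),
        ],
    )
    return conn
```

On the demo table, `search_phrase(conn, "name top")` comes back empty while `search_all_terms(conn, "name top")` finds post 1, because "name" and "top" both appear but not next to each other.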

