Why does the link search find things the site search doesn't? May 4, 2006 8:33 AM

Why does the link search find things the site search doesn't?
posted by Gamblor to Bugs at 8:33 AM (27 comments total)

I find something that I think would make a good post.

I search on the entire URL - http://www.samuelsdesign.com/comics/

No results found.

I search for just the domain - samuelsdesign

No results found.

I spend the time to compose the post. I submit it. Now I'm told it's a double:

The link you entered (http://www.samuelsdesign.com/comics/) was found in 1 previous thread.

So why does the link search find this, when the site search doesn't?
posted by Gamblor at 8:34 AM on May 4, 2006


I also tried the "comics" tag, and searching for "comic book covers" and didn't find it, which is, I suppose, a separate issue.
posted by Gamblor at 8:34 AM on May 4, 2006


It's happened to me too. Now I always try the link search first.
posted by CunningLinguist at 8:39 AM on May 4, 2006


that's an excellent question. when we have the answer, I'd like to ask that someone make a note of it in the wiki and faq.
posted by shmegegge at 8:40 AM on May 4, 2006


Now I always try the link search first.

How do you do that without actually submitting a post?
posted by Gamblor at 8:42 AM on May 4, 2006


oh, and if I remember correctly, crunchland's excellent guide to posting mentions first putting each link you'd like in your post into the main link box, to make sure you're not doubling something important, or that you can at least mention the double if it's tangential to your main point. then, when you've checked all your important main links, you can safely go ahead and construct your post.

am i making sense? here's an example if I'm not:

my post could say "i hate broccoli." and the link behind "broccoli" (which I don't hate, btw) wouldn't be checked by the link search, because it only checks links in the first words of your post, and broccoli happens after some unlinked text. so, before i construct the post, i would first attempt a dummy post where the link for broccoli was in the main link text box and submit it to see if it's a double. if it's not, then i would hit back and make the real post this time, secure in my non-double-itude.

i know this doesn't answer your question, but hopefully it's helpful anyway.
posted by shmegegge at 8:44 AM on May 4, 2006


How do you do that without actually submitting a post?

the post form goes to a confirmation page, even if there's nothing wrong with your post that it can detect. just don't submit AFTER that confirmation page, and hit "back" instead, and you're all good.
posted by shmegegge at 8:45 AM on May 4, 2006


Ok, I get your point, shmegegge, but doesn't this seem like an awfully convoluted and elaborate process?

How many places does a person have to search before they can be reasonably confident they're not double posting? That's three by my count:

1. Site search (which doesn't check the urls, only the text)
2. Link search (which doesn't check the text, only the urls)
3. Tags (and this could be pretty elaborate depending on how many tags might apply to your post)

I try to be thorough to avoid doubles, but jeez. Couldn't this be consolidated?
posted by Gamblor at 8:54 AM on May 4, 2006


Gamblor, the link search is only hidden because I don't have the resources to run a full text search on a public interface. Hidden, it only gets hit a few dozen times a day when people go to make new posts, and meanwhile I hope that Yahoo and Google can keep up with recent threads.

The avoiding-of-doubleposts situation kind of sucks without a full text search, but it grinds the database into oblivion the moment I turn one on. So we have the current (inadequate) compromise.
posted by mathowie (staff) at 9:06 AM on May 4, 2006


3. Tags (and this could be pretty elaborate depending on how many tags might apply to your post)

Can someone remind me how to search just for tags? Also, how to view all tags, not just the top 150? Much obliged.
posted by tristeza at 9:25 AM on May 4, 2006


Ok, matt, consider me educated. I've been using the site search first, but from your answer, it would seem that it's not very effective. So when composing a post, we should:

1. Do a link search on any url you plan on using.
2. Do a site-wide search for specific terms.
3. View the posts tagged with any tags you're considering applying.

Step 3 is optional, I suppose, but that's the gist of it, right?
posted by Gamblor at 9:30 AM on May 4, 2006


"Can someone remind me how to search just for tags?"

Man, I wish I knew. I do a search for the tag I'm looking for, and pick posts that should be tagged with the tag, in the hopes that they are tagged, so I can get to the tag. It isn't a very effective system, but it works.
posted by graventy at 9:31 AM on May 4, 2006


(building it is left as an exercise for the reader, but...)

What about creating a special, small Anti-Doublepost database? This'd be a file that could be as simple as the contents of every post (just the post, not the comments) to the blue. Easy and lightweight to maintain—it gets updated only when a new post is made—and searching it would have to be tremendously less stressful on the server than doing a full text search.

Another notion: every time a new post or new comment is posted, search it for well-formed links and add them to a special Link Database—just comment-id :: url, nothing else. A little more overhead (updating this db for every comment, which is, what, once a second) but again it might provide a much smaller and hence much more usable link search.
posted by cortex at 9:31 AM on May 4, 2006


(For the latter notion, it'd really only be checking every comment; updating the db would only happen if the comment included links.)
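
A minimal sketch of that link-database notion, assuming SQLite stands in for whatever database the site actually runs. The table name `links`, the function names, and the regular expression used to spot "well-formed links" are all hypothetical, made up for illustration:

```python
import re
import sqlite3

# Hypothetical definition of a "well-formed link": an href holding an http(s) URL.
LINK_RE = re.compile(r'href="(https?://[^"]+)"', re.IGNORECASE)


def open_link_db(path=":memory:"):
    """Create the (hypothetical) link table: one row per comment-id :: url pair."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS links (comment_id INTEGER, url TEXT)")
    # An index on url keeps the double-post lookup cheap as the table grows.
    db.execute("CREATE INDEX IF NOT EXISTS links_url ON links (url)")
    return db


def record_links(db, comment_id, html):
    """Scan one new post or comment; write to the table only if it has links."""
    urls = LINK_RE.findall(html)
    if urls:
        db.executemany(
            "INSERT INTO links VALUES (?, ?)",
            [(comment_id, url) for url in urls],
        )
        db.commit()
    return urls


def find_previous(db, url):
    """Return ids of earlier posts or comments that used this exact URL."""
    rows = db.execute(
        "SELECT comment_id FROM links WHERE url = ? ORDER BY comment_id", (url,)
    )
    return [row[0] for row in rows]
```

Every comment gets scanned, but only comments with links cause a write, and the lookup is an exact match on an indexed column rather than a full text search. It would still miss the same page posted under a slightly different URL.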
posted by cortex at 9:33 AM on May 4, 2006


graventy, maybe there's a better way, but I just type them in manually:

http://www.metafilter.com/tags/xyz

Where xyz = whatever word you're looking for.
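
The same pattern can be built in code. This is only a sketch: the helper name `tag_url` is made up, and percent-encoding unusual characters is an assumption rather than anything the site documents.

```python
from urllib.parse import quote

# URL pattern as typed in manually above; xyz is replaced by the tag.
TAG_BASE = "http://www.metafilter.com/tags/"


def tag_url(tag):
    """Build the tag-page URL for one tag (encoding rule is an assumption)."""
    return TAG_BASE + quote(tag.strip(), safe="")
```

For example, `tag_url("comics")` gives the address of the page listing posts tagged "comics".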

But I would also appreciate being able to view all tags, instead of just the top 150. Could we turn this into a pony request?
posted by Gamblor at 9:36 AM on May 4, 2006


They don't even need to be resized by popularity, if that puts too much strain on the db. Just a complete, alphabetical list would be nice.
posted by Gamblor at 9:39 AM on May 4, 2006


"I don't have the resources to run a full text search on a public interface"

Is this something that could be resolved by making that specific search available to logged-in members only? Or are the members the "public" you were talking about?
posted by Eideteker at 9:55 AM on May 4, 2006


I presumed he was talking about the search box up in the corner, which I imagine gets most of its use from logged-in members anyway.
posted by cortex at 10:04 AM on May 4, 2006


Step 3 is optional, I suppose, but that's the gist of it, right?

Yeah, that's what I do.

What about creating a special, small Anti-Doublepost database?

The table all posts are in currently has over 50k records, and I made the description field an ntext field, which is really slow for searching.

Just a complete, alphabetical list would be nice.

The last time I let it run, the HTML file alone was several megabytes in size, showing thousands and thousands of tags. It's impractical to list every tag ever used.

Is this something that could be resolved by making that specific search available to logged-in members only?

Perhaps, but even when I just did the tag search page, the server bumped itself offline from the load, so I have a feeling we'll still suffer some strain with a link search.
posted by mathowie (staff) at 10:39 AM on May 4, 2006


The last time I let it run, the HTML file alone was several megabytes in size, showing thousands and thousands of tags. It's impractical to list every tag ever used.

Understood. It might be more feasible to break it up into separate pages by first letter, but that would have to be determined experimentally, and is no doubt a low priority.
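
Roughly what I have in mind, as a sketch only. The function name is made up, and lumping tags that start with a digit or symbol under "#" is just one assumption about how to handle them:

```python
from collections import defaultdict


def pages_by_first_letter(tags):
    """Group tags into one alphabetical page per first character."""
    pages = defaultdict(list)
    for tag in tags:
        tag = tag.strip().lower()
        if not tag:
            continue  # skip empty tags
        # Tags that do not start with a letter share a single "#" page.
        key = tag[0] if tag[0].isalpha() else "#"
        pages[key].append(tag)
    # Drop duplicates and sort both the pages and the tags within each page.
    return {letter: sorted(set(names)) for letter, names in sorted(pages.items())}
```

Each page could then be generated once and cached as a static file, so whether it is light enough on the server is still the part that needs testing.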

Thanks for the answers.
posted by Gamblor at 11:11 AM on May 4, 2006


Gamblor, no other site using tags as a free-for-all massive multi-user thing offers browseable tags aside from the popular ones. delicious, flickr, and others don't do it because it's too much data to deal with.
posted by mathowie (staff) at 11:21 AM on May 4, 2006


Ok, I see your point. Thanks again.
posted by Gamblor at 11:38 AM on May 4, 2006


graventy, maybe there's a better way, but I just type them in manually:
http://www.metafilter.com/tags/xyz


So, is this the answer - is this the best way to search for tags?
posted by tristeza at 1:36 PM on May 4, 2006


Hey, what ever happened to tag "x" being mapped to x.metafilter.com?
posted by Robot Johnny at 1:50 PM on May 4, 2006



"Hey, what ever happened to tag "x" being mapped to x.metafilter.com?"


Like the search, we traded it for some occasional uptime.
posted by Eideteker at 2:25 PM on May 4, 2006


The last time I let it run, the HTML file alone was several megabytes in size, showing thousands and thousands of tags. It's impractical to list every tag ever used.

Just curious, how many tags are there?
posted by MetaMonkey at 4:13 PM on May 4, 2006


graventy, maybe there's a better way, but I just type them in manually:
http://www.metafilter.com/tags/xyz

So, is this the answer - is this the best way to search for tags?


Anyone?
posted by tristeza at 8:24 AM on May 5, 2006

