taxonomy organization volunteers June 2, 2004 8:50 AM
As I understand them, Matt's plans for AskMe include some organization of the archives by topic. (I may be wrong about this; if so, stop reading here.) As the archive grows, it becomes a bigger and bigger job going through the individual posts and tagging them with the appropriate (sub)category. So my questions are, first, has any kind of organizing taxonomy been proposed for AskMe, and second, once that's done, would it make sense to organize an effort at distributing the tagging process across willing MeFi volunteers? If so, I volunteer.
Wouldn't people just specify the topic when they make the post? I don't see much wrong with leaving the archives topicless, effectively giving them a special "pre-topics" topic.
posted by reklaw at 9:33 AM on June 2, 2004
Taxonomies are great for random browsing, but they're next to useless for searching. Maybe it's just me, but I can't imagine someone going to the AskMe archives without already having some idea of what they are looking for. What problem are you looking to solve that you can't solve with Google?
posted by fuzz at 9:34 AM on June 2, 2004
I'm going to have categories that are very general, then free flowing keywords like del.icio.us
I will need help tagging everything when the time comes and thanks for volunteering.
posted by mathowie (staff) at 9:54 AM on June 2, 2004
If this is the volunteer list thread, I'm volunteering, too.
posted by taz at 10:01 AM on June 2, 2004
Why not a wiki?
I can't help though. I have too much crap of my own to organize, some of it on disc and much of it between my ears.
posted by troutfishing at 10:15 AM on June 2, 2004
Tagging rawks. We just added it to flickr, and it is so cool (and useful too!).
posted by ericost at 10:23 AM on June 2, 2004
I think fuzz is wrong about browsing; in a few years AskMe will be an excellent timesink for people just surfing through the topics and answers. You never opened an encyclopedia as a kid and just read random pages? (Assuming you're older than about 15.)
Matt, can you say more about what you mean by "free flowing keywords?" Is this a list of links to "common term" searches? If so, why not just put in the ask.metafilter.com google field and let people go nuts?
Yes, hyperizer. That's what I was thinking of.
Reklaw: That's good for going forward, but doesn't address the archive, which just gets bigger each day.
posted by luser at 10:27 AM on June 2, 2004
Woot, I'll help put the metadata in Ask Metafilter.
It would be nice to see some (maybe Wiki-afied) finding aids for each of the broad categories/keywords with links to AskMe queries. But that's a pony of an entirely different color.
posted by robocop is bleeding at 10:43 AM on June 2, 2004
If the data was available via a non-production database, you could take a first pass at it with a script. In fact, you could use a Bayes algorithm, McAfee patent be damned, and train the script to presort for you. Then, you'd simply have the volunteers double check the results.
For the record, I'd volunteer to write a script to do so. :-)
posted by sequential at 10:58 AM on June 2, 2004
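For anyone wondering what a Bayes presort might look like, here is a minimal sketch of a naive Bayes text classifier. It is only an illustration: the class name `NaiveBayesSorter`, the `tokenize` helper, and the example categories are all made up for this sketch, and nothing here reflects how the AskMe data is actually stored. The idea is that volunteers hand-tag a small sample, the script guesses a category for the rest, and people double check the guesses.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical helper: lowercase, keep letters only, drop very short words.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [w for w in cleaned.split() if len(w) > 2]


class NaiveBayesSorter:
    """Toy multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training posts seen
        self.vocab = set()

    def train(self, text, category):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def guess(self, text):
        # Returns the most likely category, or None if nothing was trained.
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, float("-inf")
        for category in self.doc_counts:
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus summed log likelihoods, to avoid underflow.
            score = math.log(self.doc_counts[category] / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = category, score
        return best


sorter = NaiveBayesSorter()
sorter.train("My wireless router drops the network connection", "computers")
sorter.train("Can my landlord keep my deposit, should I sue", "legal")
print(sorter.guess("How do I set up a home network router?"))
```

With only two training posts the guesses are obviously crude; the point is that the volunteers' early hand-tagging would double as training data.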
luser, that's what I was saying, taxonomies are great for random browsing. I was just wondering if there were any other uses. Sorry if I sounded contentious.
posted by fuzz at 10:58 AM on June 2, 2004
I'm all about helping to tag things, or if sequential's idea pans out...double checking the tags.
posted by dejah420 at 11:06 AM on June 2, 2004
Rather than a strict taxonomy (which is necessary when sorting physical items---stuff has to go into one and only one spot), I'd much prefer a pre-defined (with additions possible?) set of non-exclusive categories.
For example, setting up a home network would have both home-related and computer keywords; buying a new Mac would have buying-advice and computer (and probably Apple) keywords. Advice about legal-lawsuiting bad neighbours would be both home-related and legal, and so on...
When posting a new question, there would be a field of check boxes of keywords which could apply to the question. There could be a limit to three or four keys per question if necessary.
When structuring the archive, in a dmoz or Yahoo!-like fashion, you could have a select number of keywords as the top hierarchy, with the others on lower levels. A question thread would turn up as many times as it had keywords in the hierarchy, but by different routes. For example, you could find the home network question by browsing through home-related -> computer, or by the opposite route: computer -> home-related. Also, you would be able to search on a set (or subset) of keywords.
Is this the same thing as tagging?
Finally, how about making the keywords (and the keywords only) open to some sort of vote? If the original author miscategorizes a question, it would be a great help if people could petition (or have a threshold-vote) to modify its keywords.
posted by bonehead at 11:23 AM on June 2, 2004
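The non-exclusive keyword scheme described above can be sketched as a simple inverted index, where a thread is filed under every keyword it carries and a browse path is just a set intersection. This is a rough illustration under assumptions: the class `KeywordIndex`, its method names, the thread ids, and the limit of four keywords are hypothetical, not anything Matt has described.

```python
from collections import defaultdict


class KeywordIndex:
    """Hypothetical index: each thread may carry several non-exclusive keywords."""

    def __init__(self, max_keywords=4):
        self.max_keywords = max_keywords
        self.by_keyword = defaultdict(set)  # keyword -> ids of threads carrying it

    def tag(self, thread_id, keywords):
        keywords = set(keywords)
        if len(keywords) > self.max_keywords:
            raise ValueError("too many keywords for one question")
        for keyword in keywords:
            self.by_keyword[keyword].add(thread_id)

    def browse(self, *path):
        # Threads that carry every keyword on the path. Because this is an
        # intersection, home-related -> computer and computer -> home-related
        # arrive at the same threads.
        if not path:
            return set()
        sets = [self.by_keyword.get(keyword, set()) for keyword in path]
        return set.intersection(*sets)


index = KeywordIndex()
index.tag(1, ["home-related", "computer"])            # home network question
index.tag(2, ["buying-advice", "computer", "apple"])  # buying a new Mac
index.tag(3, ["home-related", "legal"])               # bad neighbours
print(index.browse("home-related", "computer"))
print(index.browse("computer", "home-related"))
```

Searching on a set of keywords is the same `browse` call with more arguments, which is one way to read the "is this the same thing as tagging?" question: structurally, it is.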
I'd help. I can probably even justify doing stuff like this at work.
posted by jessamyn at 11:39 AM on June 2, 2004
I'll volunteer, as well. Will we be awarded certificates of notable achievement for our contribution?
posted by cmonkey at 12:23 PM on June 2, 2004
I can help. And thanks, jessamyn, for the suggestion that it might be a tolerable work activity for someone in a library. Off to talk to my boss about a "public service" project.....
posted by donnagirl at 1:05 PM on June 2, 2004
I'll put my turbo typing skills to use and offer to help as well.
posted by amandaudoff at 1:59 PM on June 2, 2004
When I help, do I get credit for my MLIS? Seriously. About the helping, not so much the credit.
posted by stet at 2:24 PM on June 2, 2004
Totally, this is resume building experience for the Librarian Host that lurks on MeFi. Lord knows I'll be working on it while surfing lisjobs.com for my newly minted MLS position.
posted by robocop is bleeding at 2:45 PM on June 2, 2004
Tagging rawks. We just added it to flickr, and it is so cool (and useful too!).
Heh. I was just gonna mention that Flickr is doing it too, now. Very snazz.
posted by stavrosthewonderchicken at 4:27 PM on June 2, 2004
Tags I guess would only be applicable (after the initial retrofit) by the poster of the question, yeah?
'cause I could totally see the utility in expanding the tagging system to all threads (even ones in the blue) even after posting, which would then (like in del.icio.us) work like a personal bookmarking system (which I'd love) -- for example I could mark (with my own tags, maybe only visible to me) any thread as 'wonderchicken favorite' or 'wonderchicken pileon' or 'wonderchicken must NEVAR FORGET' or 'hama7 h8s me' or something, and hey hoopla, I've taxonomoficated the entire MeFi oeuvre (at least as far as last week) for myself, for later reference. How cool would that be?
We could even selectively expose taxonomies and share them, making it a friendly competition. Some of our more...obsessive...members might come up with some amazingly useful stuff!
Ooooh, I've got the pony-tingles. I accept that Matt probably wouldn't have time for something like this, but it might be a relatively trivial extension of his tag plans for AskMe, and man, it would rock...
posted by stavrosthewonderchicken at 4:35 PM on June 2, 2004
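As a rough sketch of the personal-tagging idea above, assuming nothing about the real site: tags are stored per user, stay private by default, and become visible to others only when the owner chooses to share them. The class `PersonalTags`, its methods, and the thread ids are invented for illustration.

```python
from collections import defaultdict


class PersonalTags:
    """Hypothetical store of per-user tags that are private unless shared."""

    def __init__(self):
        # user -> tag -> ids of threads that user marked with that tag
        self._tags = defaultdict(lambda: defaultdict(set))
        self._shared = set()  # (user, tag) pairs the owner has exposed

    def tag(self, user, thread_id, tag):
        self._tags[user][tag].add(thread_id)

    def share(self, user, tag):
        self._shared.add((user, tag))

    def threads(self, viewer, owner, tag):
        # Owners always see their own tags; others see only shared ones.
        if viewer != owner and (owner, tag) not in self._shared:
            return set()
        return set(self._tags.get(owner, {}).get(tag, set()))


tags = PersonalTags()
tags.tag("stavros", 7101, "wonderchicken favorite")
tags.tag("stavros", 7102, "wonderchicken pileon")
tags.share("stavros", "wonderchicken favorite")
print(tags.threads("scarabic", "stavros", "wonderchicken favorite"))
print(tags.threads("scarabic", "stavros", "wonderchicken pileon"))
```

Because tags are keyed by user rather than by thread author, they can be applied to any thread after the fact, which is the part that differs from poster-only tagging.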
mmmmkay... stavros, I'd call that taking someone else's pony for a ride
posted by scarabic at 5:38 PM on June 2, 2004
Well, sure, but you can't deny it'd be way cool!
posted by stavrosthewonderchicken at 5:41 PM on June 2, 2004
With all the coolness of AskMeFi since its inception, I'd also love to help give some back!
posted by jmd82 at 6:18 PM on June 2, 2004
One thing that's excited me about categories (or, even better, multiple keywords that can be assigned to each question) would be the potential for users to define certain keywords that interest them, or that they are experts on. I'm often disappointed when I browse ask.me and find a day-old question I could have given a good answer to but missed - it would be like, totally cool if, when you hit metafilter, the side-bar said "There are currently active ask.mefi posts on the topics of: guitars, ecology, cunnilingus, waiting for your expert advice".
posted by Jimbob at 7:13 PM on June 2, 2004
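The sidebar idea above boils down to intersecting a user's followed keywords with the keywords on currently active questions. A minimal sketch, assuming each active question is a dict with a `keywords` list (a made-up shape, as are the function names and sample data):

```python
def waiting_topics(user_keywords, active_questions):
    """Followed keywords that appear on at least one active question, sorted."""
    followed = set(user_keywords)
    matched = set()
    for question in active_questions:
        matched |= followed & set(question["keywords"])
    return sorted(matched)


def sidebar_message(user_keywords, active_questions):
    # Returns an empty string when nothing the user follows is active.
    topics = waiting_topics(user_keywords, active_questions)
    if not topics:
        return ""
    return (
        "There are currently active ask.mefi posts on the topics of: "
        + ", ".join(topics)
        + ", waiting for your expert advice"
    )


# Hypothetical sample data for illustration only.
active = [
    {"id": 1, "keywords": ["guitars", "buying-advice"]},
    {"id": 2, "keywords": ["ecology"]},
    {"id": 3, "keywords": ["computer"]},
]
print(sidebar_message(["guitars", "ecology", "cooking"], active))
```

This only works if keywords are assigned when the question is posted, since a day-old question is already too late for the use case described.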
There come these times, all too regularly, when I go ballistically enthusiastic about something, a 7-year-old with a new bike, all flashing eyes and floating hair, and everyone else just kind of stands back and goes 'uh, yeah, right.'
This is another one of those times. Ah well.
posted by stavrosthewonderchicken at 9:56 PM on June 2, 2004
The real question (as with so many things) is whether to do this with a better search indexer, or with hand-assignment of taxonomy flags and/or keywords.
#1 is hard to do as well as Google does it (much as we've now come to expect that level of quality) and #2 comes with probably even more problems, not the least of which is how to get 1,000 random MeFi volunteers to work on 3 billion random threads in any consistent way. There's not much point applying a taxonomy if the process by which that happens isn't standardized pretty well.
I think it would be feasible to break up AskMe into maybe 8 families of question, something along the lines of "Computers and Technology," "Commerce and Law," "The Sciences," "Entertainment," etc., but down to the level of "guitars, ecology, cunnilingus" would be incredibly difficult.
Taking those 3 examples, is there an existing structured taxonomy of all tangible objects, fields of study, and sexual practices that we can work from? No? Crap. I guess we have to develop one before we can get started. Where do you even begin?
I know I'm being pessimistic here, but I've worked on similar projects before (actually even narrower ones than this) and found them to be full of many unforeseeable challenges. They can be a huge investment of time and attention for unclear results. Usually, the wisest choice is to just improve search, improve search, improve search.
posted by scarabic at 11:23 PM on June 2, 2004
scarabic, have you looked at the arbitrary, user-selected way in which tags/keywords are created at del.icio.us and flickr? It's entirely different from the sort of top-down assignment of categories you're talking about here. And it works beautifully!
As Matt said 'I'm going to have categories that are very general, then free flowing keywords like del.icio.us' : this sounds perfect, and it's the second bit that's the fun part, and the powerful one, and it makes 'guitars, ecology, cunnilingus' very easy to do indeed.
Where I was getting excited was the idea of extending that system that has worked so well at del.icio.us in particular to include both private and public keyword/categories, allowing both public and private structuring of information, and to allow it (if only in private 'taxonomies') to be applied after the fact.
Perhaps by using the word 'taxonomy' I was throwing out a red herring. As Matt suggests and you discuss, that's only viable at the top level. The free-tagging is the good stuff, though!
This ancient Metatalk thread has some interesting discussion on categorization. I believe that most of the issues raised can be dealt with by the combination of approaches that Matt mentions upthread here.
Improving search would be good (god knows it's not great at the moment), but this idea of a rigid top layer with a 'free flowing keywords' system underneath is brilliant, and exciting.
posted by stavrosthewonderchicken at 2:34 AM on June 3, 2004
You know, I used to get paid the Big Bucks for this kinda evangelistic handwaving...strange days, those were.
posted by stavrosthewonderchicken at 5:14 AM on June 3, 2004
but down to the level of "guitars, ecology, cunnilingus" would be incredibly difficult.
I'd go with the mineral, vegetable, and animal supergroups, respectively.
posted by mbd1mbd1 at 9:13 AM on June 3, 2004
Top-level cats are already done. The only piece that needs to be worked out is how keyword navigation will work.
And then, uh, implementation.
posted by jjg at 12:12 PM on June 3, 2004
I haven't really sampled del.icio.us yet. I don't see a clear way to explore their keyword set. But I do see "tech, webdesign, web, webdev, blog, blogs, blogging, design, and programming" all in the same "most popular" list together, as if they were peers. This is a great example (I think) of unstandardized keywording and why it gets confusing.
posted by scarabic at 12:56 PM on June 3, 2004
NEIL: [picks up a large bag of seed packets] OK, I've plowed this bit, right. And now I'm going to sow it. [throws packets of seed down] This self-sufficiency thing really is amazing. We sow the seed, right. Nature grows the seed, and then, we eat the seed. And then, after that, we sow the seed, nature grows the seed, and then, we eat the seed. And then, after that again, we sow the seed, nature grows the seed....
RICK: Oh, shut up, Neil! Shut up! Shut up. It's pathetic. I mean, what about radical magazines? What about Kicker boots?! Can we grow them? No, we can't, can we?! The beauty of your plan, Neil, seems to rest on everyone being really into seeds.
NEIL: No no no, Rick. You don't understand the timeless wonder of the whole thing. We. Sow the seed! Nature grows the seed. We eat the seed. And then....
posted by stavrosthewonderchicken at 4:18 PM on June 3, 2004
whoa there, we should start talking nomenclature if this is going to happen. Otherwise we are looking at a huuuuge clusterfuck. Adding metadata without a nomenclature is the reason museum cataloguing really sucks. There are some librarians 'round here, how bout some help.
posted by jmgorman at 7:31 PM on June 3, 2004
whoa there, we should start talking nomenclature if this is going to happen. Otherwise we are looking at a huuuuge clusterfuck.
No. Nonononono. Not necessary with the ideas Matt mentioned and I've been trying (with little success, it would seem) to elaborate on and extend. Not. Necessary.
It's not strict top-down taxonomy, it's free-floating keywording (under a top level coathanger), emergent order outta chaos, alla that. And it works, or is being proven to do so by the aforementioned examples (perhaps arguably, but it's clear which side I'd argue on).
posted by stavrosthewonderchicken at 10:47 PM on June 3, 2004
Go Stavros, tell it to the people. Go free-floating keywords. I can't wait for this and I think it will be great.
However, someone will still need to manually tag the entire archive as it exists at the point this gets implemented.
posted by stupidsexyFlanders at 2:54 AM on June 4, 2004
A set nomenclature doesn't necessarily indicate a top down structure. It just narrows the choices for the index and limits and defines the vocabulary the user needs.
Really stupid example: search for canine and get nothing on dogs.
I know we are all pretty elitist in here, but indexes are for helping people find information, and the people who cannot find it now are probably the ones who need the index and could most greatly benefit from the authoritative nomenclature.
posted by jmgorman at 6:49 PM on June 4, 2004
Wow stavros, that's a pretty cool MeTa thread and I had to stop about a quarter of the way through 'cause my head was spinning. I think anything but free-floating keywords will get mired in politics and ego (what's a top-level category, what's an acceptable label, etc.). Assuming this pony were ever to be rideable:
1. Could it be used to censor unpopular opinions/ threads?
2. While we're all English speakers, is there a value to . . . I don't know, redundant labels so that dog == canine == puppy == another word that I can tack on after doing a search for that word and not getting any dog-related results?
I apologize for adding to the questions rather than the answers. I'm very interested in things like NLQ, but it's completely over my head.
posted by yerfatma at 8:59 AM on June 5, 2004
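One way to get the "dog == canine == puppy" behaviour without a full controlled vocabulary is a small synonym table applied both when threads are indexed and when someone searches. The sketch below is an assumption-laden illustration: the `SYNONYMS` table, the function names, and the thread ids are invented, and someone would still have to maintain the table by hand.

```python
# Hypothetical synonym ring: every variant maps to one preferred keyword.
SYNONYMS = {
    "canine": "dog",
    "puppy": "dog",
    "dogs": "dog",
    "blogs": "blog",
    "blogging": "blog",
}


def normalize(keyword):
    # Lowercase and trim, then fold known variants into the preferred term.
    key = keyword.strip().lower()
    return SYNONYMS.get(key, key)


def build_index(tagged_threads):
    """Map each normalized keyword to the ids of threads tagged with it."""
    index = {}
    for thread_id, keywords in tagged_threads.items():
        for keyword in keywords:
            index.setdefault(normalize(keyword), set()).add(thread_id)
    return index


def search(index, term):
    return index.get(normalize(term), set())


threads = {
    101: ["dogs"],
    102: ["Puppy", "training"],
    103: ["blogging"],
}
index = build_index(threads)
print(search(index, "canine"))
```

Free-form tagging stays free-form for the person posting; the table only decides which spellings land in the same bucket, so a search for canine finds the threads tagged dogs or puppy.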
Finishing that thread, I would say after-the-fact categorization would prevent the concerns Matt mentioned. And probably work a lot better than letting posters tag things themselves. Which you probably all already knew. I can talk to myself offline, so that's enough of that.
posted by yerfatma at 9:06 AM on June 5, 2004
posted by bingo at 8:54 AM on June 2, 2004