Blame library school, I guess December 23, 2010 8:31 PM
Metadata for Metafilter Pony request: would it be possible to have a box that pops up that says something like "We noticed you're using tags X and Y, here are some other tags commonly associated with X and Y," to improve the descriptive qualities of user provided tags?
I figure this would probably be pretty hard on the database, and probably difficult to code, but it seems like it would improve the descriptive qualities of tags. I always hesitate when making a post as I get down to the bottom of the page, since I want to make the metadata as descriptive and useful as possible, but I tend to blank when it gets down to that box. Would it be possible to automate the process and have a box of related tags that populates on preview and provides users with helpful suggestions?
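For illustration only, here is a minimal sketch of the "tags commonly associated with X and Y" idea, assuming every post's tags are available as a plain list. The function names and the sample data are hypothetical and are not taken from MetaFilter's code or database.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(posts_tags):
    """Count how often each pair of tags appears on the same post."""
    pair_counts = Counter()
    for tags in posts_tags:
        # sorted(set(...)) gives each unordered pair one canonical key
        for a, b in combinations(sorted(set(tags)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def suggest_related(pair_counts, chosen, limit=5):
    """Rank tags by how often they co-occur with the already-chosen tags."""
    chosen = set(chosen)
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a in chosen and b not in chosen:
            scores[b] += n
        elif b in chosen and a not in chosen:
            scores[a] += n
    return [tag for tag, _ in scores.most_common(limit)]


# Hypothetical sample data, not real MetaFilter tags.
SAMPLE_POSTS = [
    ["libraries", "metadata", "cataloging"],
    ["libraries", "metadata", "folksonomy"],
    ["metadata", "folksonomy", "tagging"],
    ["cars", "texas"],
]
```

The pair counts could be computed ahead of time, so the lookup on preview would only need to read a small table rather than scan every post.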
I'll say I have a hard time with the tags too and it drives me nuts. I'll spend a while thinking of clever headlines and that doesn't bother me, but the tags are frustrating. I figure if my lack of tags gets bothersome people will suggest them to me in memail.
You can add tags to your contacts' posts.
posted by cjorgensen at 8:34 PM on December 23, 2010 [1 favorite]
My first thought is that suggesting related tags based on the tags you've added so far is barking up the wrong tree: there's no guarantee that tags that co-occur often with the tags you've chosen for your question will in fact be tags that are appropriate to your actual question.
What would maybe be more effective is to do some sort of tag suggestion based on the content of your actual question, basically parsing the question text for keywords to correlate to the existing corpus of tags others have used. Which wouldn't have to be particularly database intensive at all (especially considering the relative infrequency with which it'd get used, on the order of a hundred or two times a day at most?), but would still be some work to implement.
It'd be an interesting challenge. How good any experimental version of it would be, and more to the point whether it would be good enough to seem really worthwhile, is an open question.
posted by cortex (staff) at 9:22 PM on December 23, 2010 [1 favorite]
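A rough sketch of the content-based approach described in this comment might look like the following. It assumes the existing tag corpus is available as a set of lowercase single-word tags; `KNOWN_TAGS`, `tokenize`, and `suggest_from_text` are made-up names for illustration, not anything the site actually runs.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def suggest_from_text(text, known_tags, limit=5):
    """Suggest known tags that literally appear in the post text.

    Tags are ranked by how many times they occur in the text,
    with ties broken alphabetically.
    """
    counts = {}
    for word in tokenize(text):
        if word in known_tags:
            counts[word] = counts.get(word, 0) + 1
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return ranked[:limit]


# Hypothetical tag corpus; a real one would come from tags others have used.
KNOWN_TAGS = {"tags", "metadata", "database", "python"}
```

This only touches the text of the one post being previewed plus a precomputed tag set, which fits the point above that it would not need to be database intensive.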
I'll take anal bum cover for $200.
posted by special-k at 9:24 PM on December 23, 2010 [3 favorites]
oops, wrong thread.
posted by special-k at 9:26 PM on December 23, 2010 [2 favorites]
Is there a problem with the current descriptive quality of tags? It seems like some people put in great time and effort into tags and some don't. And I'm not sure a tag-suggester would help the people who don't care much about tags anyway.
posted by pb (staff) at 9:31 PM on December 23, 2010
What would maybe be more effective is to do some sort of tag suggestion based on the content of your actual question, basically parsing the question text for keywords to correlate to the existing corpus of tags others have used. Which wouldn't have to be particularly database intensive at all (especially considering the relative infrequency with which it'd get used, on the order of a hundred or two times a day at most?), but would still be some work to implement.
I was talking about the Blue as well, but that is a good idea. There are certain words inside of posts that wouldn't necessarily be common enough to be ignored, but would probably be frequently used and likely not related to the real content of the thread (self-referential stuff like "metafilter" and "the blue").
I've also wondered if you guys had ever considered putting in something to allow other users to suggest tags, and given enough suggestions having the OP be memailed saying "X users thought this might be a worthwhile tag to add." But that also seems like it could be abused.
Just out of curiosity, how much interest and development time do you guys spend on the tagging system? It's a difficult thing to get right, even for people who are trained in descriptive authority work.
posted by codacorolla at 9:34 PM on December 23, 2010
That is: tagging anything definitively is a difficult task, not coding a tagging system. I'm not writing very clearly today.
posted by codacorolla at 9:35 PM on December 23, 2010
Simple formula for tagging:
Relevant proper nouns ("JohnSmith", "MassiveDynamic", "FordPrius", "TheRaftOfTheMedusa")
+ broader subject area ("Politics", "Television", "Cars", "Space")
+ geographical area ("Germany", "London", "Texas")
+ time period ("1940s", "80s", "1992")
Usually covers most bases.
posted by Artw at 9:50 PM on December 23, 2010
Bayesian techniques would be useful in suggesting additional/alternative tags.
posted by five fresh fish at 10:10 PM on December 23, 2010
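The comment does not say which Bayesian technique is meant. One possible reading is a naive Bayes scorer that treats each tag as its own class and asks how likely the post's words are under that tag. The sketch below is an assumption-laden illustration of that reading: the class name, its methods, and the add-one smoothing are choices made here, not anything specified in the thread.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTagger:
    """Toy naive Bayes tag scorer; each tag is treated as a separate class."""

    def __init__(self):
        self.tag_posts = Counter()             # tag -> number of posts carrying it
        self.tag_words = defaultdict(Counter)  # tag -> word counts across those posts
        self.vocab = set()
        self.n_posts = 0

    def train(self, words, tags):
        """Record one already-tagged post, given as a list of words."""
        self.n_posts += 1
        self.vocab.update(words)
        for tag in set(tags):
            self.tag_posts[tag] += 1
            self.tag_words[tag].update(words)

    def score(self, words, tag):
        """Log of P(tag) times the product of P(word | tag), add-one smoothed.

        Only meaningful for tags that were seen during training.
        """
        total = sum(self.tag_words[tag].values())
        logp = math.log(self.tag_posts[tag] / self.n_posts)
        for w in words:
            logp += math.log((self.tag_words[tag][w] + 1) / (total + len(self.vocab)))
        return logp

    def suggest(self, words, limit=3):
        """Return the highest-scoring trained tags for a new post."""
        ranked = sorted(self.tag_posts, key=lambda t: (-self.score(words, t), t))
        return ranked[:limit]
```

Because posts usually carry several tags, treating each tag as an independent class is a simplification; it is meant only to show the shape of the idea.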
There are certain words inside of posts that wouldn't necessarily be common enough to be ignored, but would probably be frequently used and likely not related to the real content of the thread (self-referential stuff like "metafilter" and "the blue")
Yeah, you'd need to do some sort of frequency analysis up front on a representative sample of posts (possibly just all posts ever, since it's not that big of a dataset) to establish the relative commonality of terms in a local context.
I've also wondered if you guys had ever considered putting in something to allow other users to suggest tags, and given enough suggestions having the OP be memailed saying "X users thought this might be a worthwhile tag to add." But that also seems like it could be abused.
I would say this sounds like serious overkill. It's a neat idea but it's a solution in search of a problem given that tags are an ancillary rather than a central part of the user experience on metafilter. We want tags to be moderately helpful, but engineering complex methods for crowdsourcing tag submissions is more effort in terms of implementation, maintenance, moderation, and user educating than I think we're at all interested in expending.
Just out of curiosity, how much interest and development time do you guys spend on the tagging system? It's a difficult thing to get right, even for people who are trained in descriptive authority work.
We talk about tagging off and on. We've actually explored the frequency/relevance idea with tagging before, for example—the sidebar on the MyAsk tab on the green is driven by some experimental heuristic tag-matching that me and pb built out a year or two back, based in part on some tag frequency computations I did at the time. It's one of those things that worked well enough that it seemed worth throwing up on that page, but was also not so robust and out-of-the-park great that we felt comfortable making it a part of the main user experience.
I find tagging stuff pretty interesting personally, but as far as what's actually ready for prime time we're generally going to opt on the side of the simplest thing that works well. Something significantly more complex than a current system would need to be really significantly better as well to really justify consideration, and I think pb's question up-thread—is there actually a problem with how tags work out right now?—is where that discussion would really have to start. I don't doubt that we could, if we tried, build a not-terrible tag suggestion system, but is the lack of such a system right now actually causing problems for the site or is it just leaving some duty-minded tag-friendly posters feeling a little underperforming now and then?
posted by cortex (staff) at 10:50 PM on December 23, 2010 [1 favorite]
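As a sketch of the up-front frequency analysis mentioned at the top of this comment, one common option is TF-IDF-style weighting: count how many posts each word appears in, then down-weight words that show up nearly everywhere (such as "metafilter"). This is a named stand-in technique, not a description of the MyAsk heuristic, and all function names and sample posts below are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def document_frequencies(posts):
    """Count, for each word, how many posts it appears in."""
    df = Counter()
    for post in posts:
        df.update(set(tokenize(post)))
    return df


def distinctive_terms(text, df, n_posts, limit=5):
    """Rank words in `text` by a TF-IDF-style weight against the corpus.

    Words that appear in every post get weight 0 and are dropped.
    """
    tf = Counter(tokenize(text))
    scores = {
        word: count * math.log((1 + n_posts) / (1 + df[word]))
        for word, count in tf.items()
    }
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return [w for w in ranked if scores[w] > 0][:limit]


# Hypothetical miniature corpus standing in for "all posts ever".
SAMPLE_CORPUS = [
    "metafilter thread about painting",
    "metafilter thread about cars",
    "metafilter question about texas",
]
```

Note that a word never seen in the corpus gets the highest weight under this formula, so a real version would need to decide whether such words are apt proper nouns or just noise.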
"HI, I'M CLIPPY, AND I SEE YOU ARE TRYING TO USE TAGS. MAY I SUGGEST...."
posted by edgeways at 2:16 AM on December 24, 2010 [7 favorites]
I always hesitate when making a post as I get down to the bottom of the page, since I want to make the metadata as descriptive and useful as possible
You can add tags after the post has gone up; you don't have to get it right when you make the post. If this hesitation causes you to abort a post that you'd otherwise make, then consider just going through with it and adding additional tags if they spring to mind later or are suggested in the thread.
posted by Rhomboid at 2:52 AM on December 24, 2010
cortex's suggestion of content-related tag suggestions is a good one, and an uber-tag-suggester that looked at both the content and the tags initially supplied by the poster would be a beautiful thing. I think, however, that if engineering time were to be devoted to the tagging, a higher priority would be to improve the tag search system. We discussed this a few weeks back, and there were some great suggestions for being able to refine searches or use Boolean terms etc. Of course, once searching for tags is made useful, then improving the tags on posts will become more important.
posted by nowonmai at 2:56 AM on December 24, 2010
YouTube does this. I find that the suggested tags frequently aren't appropriate; maybe 1 out of 5 works.
posted by HuronBob at 3:30 AM on December 24, 2010
The real solution to this problem is to let arbitrary users tag posts. To avoid spam, maybe they can also delete tags. Or possibly a tag will only be added if N people attempt to add it.
posted by DU at 3:38 AM on December 24, 2010
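The "only added if N people attempt to add it" rule could be sketched roughly as below. The `TagProposals` class, its method names, and the default threshold of 3 are assumptions for illustration; the thread does not specify a value for N or how users would be identified.

```python
from collections import defaultdict


class TagProposals:
    """Accept a proposed tag once enough distinct users have suggested it."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._voters = defaultdict(set)  # normalized tag -> set of user ids

    @staticmethod
    def _normalize(tag):
        """Treat tags as case-insensitive and ignore surrounding whitespace."""
        return tag.strip().lower()

    def propose(self, user_id, tag):
        """Record a suggestion; return True if the tag is now accepted."""
        self._voters[self._normalize(tag)].add(user_id)
        return self.is_accepted(tag)

    def is_accepted(self, tag):
        """True once at least `threshold` distinct users proposed the tag."""
        return len(self._voters.get(self._normalize(tag), ())) >= self.threshold

    def accepted_tags(self):
        """All tags that have reached the threshold, in alphabetical order."""
        return sorted(t for t, v in self._voters.items() if len(v) >= self.threshold)
```

Storing voters in a set means one user repeating a suggestion does not push a tag over the threshold, which addresses the simplest form of abuse but not coordinated groups.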
Isn't there already a tag suggester built for mefi, for the back-tagging project? IIRC, that worked rather well..
posted by carsonb at 7:16 AM on December 24, 2010
That was a part of Yahoo's API, I believe (a part which I think pb mentioned has since been shut down, too), not something we had built locally. I thought about DIYing it at the time but that was already functional so I figured, eh.
posted by cortex (staff) at 7:25 AM on December 24, 2010
Yes, we had two versions of the backtagging tool. One version provided suggestions by analyzing the post text with the Yahoo Term Extraction API. (Now defunct, but it looks like they have an alternative through another system.)
My hesitation with auto-suggest in general is that it could limit vocabulary. I think you might get a lot of clicking since it's so convenient. I think there's an argument that could work in our favor since we use tags for retrieval. It's easier to work with a set of 10 tags than 100 if you're trying to recall something. But would limiting the tag vocabulary make the tags more descriptive overall? If that's the problem we're trying to solve, I'm not sure that providing tags for people solves it. If lack of tags altogether is the problem, then auto-suggest would help quite a bit. That's why I think it was better suited for the backtagging project where thousands of posts needed to be tagged.
posted by pb (staff) at 7:35 AM on December 24, 2010
Yeah, the balance of tag variety vs. tag consistency is tricky.
One problem with any tag suggestion system that relies on a pre-computed vocabulary of possible tags is that it will never suggest a nonce tag that might be really apt for a given post (uncommon proper nouns, for example, when someone asks a question or makes a post about the work of super-obscure painter Kresblitz Snogram or whatever).
That said, hopefully that would be among the most obvious ideas a tagger would have themselves when posting.
If we used the whole established working vocabulary of the site as the tag pool, that would avoid the limited-vocab problem to some extent—any word that wasn't a nonce formation in the post text could be recognized as a known word (in the sense of "mefites have typed this on the site before" according to a pre-computed frequency table) and suggested if the heuristic thought it might be especially relevant.
That's still limiting tags to words that have been used in the text of a post, however; suggestions of tags that are semantically related to rather than literally occurring in the post text is a much more complicated problem that I wouldn't even know where to begin with.
posted by cortex (staff) at 7:49 AM on December 24, 2010
This is an interesting conversation, thanks guys.
posted by codacorolla at 7:52 AM on December 24, 2010
There's also, now that Yahoo Term Extraction is no longer with us, the ever-popular Natural Language Toolkit, which is seriously awesome and powerful as far as this sort of thing goes - although it's written in Python, and I don't think it has a centrally callable API, though I could be wrong. It can handle pretty much anything that you throw at it.
posted by jivadravya at 8:45 AM on December 24, 2010 [2 favorites]
There should be two levels of tagging: major tags from a limited vocabulary, and minor folksonomic tags to provide specialization.
My guess is that the major tags can be derived from an analysis of the folksonomy tags, along with some human oversight. There are going to be broad themes and categories that have arisen naturally; they just need some formal tweaking.
posted by five fresh fish at 9:57 AM on December 24, 2010
Consult this, this, this, this, or this as applicable.
(Why no tag cloud in Projects?)
posted by Sys Rq at 10:09 AM on December 24, 2010
Because that would destroy the universe? Is this a trick question?
posted by cortex (staff) at 10:14 AM on December 24, 2010
Oh, wait, I found it!
1. Click on a post
2. Click on a tag
3. Click "View popular tags"
posted by Sys Rq at 10:23 AM on December 24, 2010
Probably not, just an oversight I'd imagine. We should probably add a Tags link up in the header.
You can get at it from the page for any given tag, at least.
posted by cortex (staff) at 10:24 AM on December 24, 2010
Yep, just an oversight. I added the Tags link to the Projects header.
posted by pb (staff) at 10:53 AM on December 24, 2010
Just an oversight I'd imagine. We should probably add one million dollars up in my bank account.
posted by cortex (staff) at 11:02 AM on December 24, 2010 [3 favorites]
Artw: Simple formula for tagging:
Relevant proper nouns ( "JohnSmith", "MassiveDynamic", "FordPrius","TheRaftOfTheMedusa")
+ broader subject area ("Politics", "Television", "Cars", "Space")
+ geographical area ("Germany", "London", "Texas")
+ time period ("1940s", "80s", "1992")
Why not add something like this to the new post page? It seems to cover most everything and breaks tagging into simple steps rather than, "Add tags!"
posted by 47triple2 at 3:38 PM on December 24, 2010
i just want to say that while neologisms usually drive me up a wall, i really like the word "folksonomy"
posted by empath at 6:21 PM on December 24, 2010 [1 favorite]
I hated it when I first heard it. Since then it's proven itself to be a pretty useful descriptor and it's grown on me.
posted by Artw at 8:26 PM on December 24, 2010
posted by codacorolla at 8:32 PM on December 23, 2010