cant just search for "." August 25, 2007 3:43 PM

Morbid question I know, but here goes... I wanted to peruse ALL of the threads in the history of MeFi which are what I'd affectionately refer to as virtual wakes. Is there an ideal way to do this beyond just the obit tag, and how might one do it efficiently? [more inside]
posted by ZachsMind to MetaFilter-Related at 3:43 PM (31 comments total) 1 user marked this as a favorite

The tag RIP only shows thirty. The Dead tag shows fifty. The tag obituary shows 242 as of this writing, but is that really all of them? Is there a way of getting a list of the most favorited replies specifically from virtual wakes? So one could read the absolute best sentiments living MeFites had to say about dead nonMeFites? Is there a way to scan fast for the worst moments, without combing through the whole mess as if I were searching for needles in haystacks?

previously on 12.04.07, 30.07.07, 03.15.06, 13.11.04, 27.03.02, 07.01.01, etc., /me head asplodes.
posted by ZachsMind at 3:43 PM on August 25, 2007

Ask cortex. He has a lock on this sort of info.
posted by mathowie (staff) at 3:48 PM on August 25, 2007

Yeah, I was thinking about this a little while back. And, well:

There's a lot more than that. I had pb throw a query for me a couple weeks ago when I got curious about the ontogeny, so to speak, of the mefi dot. The output of that was every thread that contained a comment consisting of literally a single '.' character—which includes a lot of things with ironic or adaptive use of the dot for non-obits, and also misses any obit thread lacking that exact sort of dot comment, so it's probably not quite right—but the rough count of threads comes out to 1,053.

Caveats above considered, I'd say that's probably accurate give or take 20%. Many hundreds of obit threads, anyway; I'll try to clean up the data I have and make it available for casual perusal at some point, but I could provide an index of the matching threads pretty quickly if you'd like.
posted by cortex (staff) at 3:59 PM on August 25, 2007

posted by quonsar at 4:10 PM on August 25, 2007

Quonsar: "SEARCH ON '.' "

Check the upper left hand corner of your web browser, Quonsar. =)

Cortex: "There's a lot more than that. I had pb throw a query for me a couple weeks ago when I got curious about the ontogeny---"

*sound of brakes squealing*

*Zach glowers at Cortex*

*Zach reluctantly ambles over to*

Huh. Cool word. Mkay. Carry on.

*sound of scratching record mixed with slowmo of Cortex's voice sped up to normal speed*

"..soooo tooo speeak, of the mefi dot. The output of that was every thread that contained a comment consisting of literally ---"

*sound of needle scratched across grooves of vinyl*

Hey wait a second! What's a 'pb'?
posted by ZachsMind at 4:52 PM on August 25, 2007

"...I'll try to clean up the data I have and make it available for casual perusal at some point, but I could provide an index of the matching threads pretty quickly if you'd like."

Uh. Yeah. Sounds cool. Pretty please with sugar on it and all that. That'd be right neighborly and equitable of ya. Thanks.

..what's a pb?
posted by ZachsMind at 5:26 PM on August 25, 2007

What's a 'pb'?


Twenty bucks, same as in town.

I hope I beat cortex to that.
posted by languagehat at 5:32 PM on August 25, 2007 [2 favorites]

Rather than just publish the list in some obscure little corner, why not send it off to the crack team of backtaggers so that future searchers can just do what you did, look up all posts with the "obit" tag?
posted by Rhomboid at 5:50 PM on August 25, 2007

Zachers... p b.

and you CAN just search for "pb"...
posted by wendell at 5:51 PM on August 25, 2007

I think 'pb' is the chemical symbol for lead, but that doesn't make sense in this context.
posted by pb (staff) at 5:51 PM on August 25, 2007

Hey wait a second! What's a 'pb'?

Peanut Butter Blue!
posted by ericb at 6:19 PM on August 25, 2007

"why not send it off to the crack team of backtaggers..."

We have a crack team of backtaggers? Are they right next to the pb*? Is it bigger than a breadbox? Does it go good with chocolate? Are we playing twenty questions? Cuz twenty questions is fun! Was it alive before or after the Revolutionary War? Can it fly? Is it Animal, Vegetable or Mineral?

*belated congrats, Paul. I prefer to pretend Matt just has little fae creatures that run around magically taking care of MeFi with pixie dust. Thanks for ruining yet another illusion. I also used to pretend that Janeane Garofalo could actually go for me, but that's probably TMI.
posted by ZachsMind at 8:02 PM on August 25, 2007

I am so confused.
posted by lazaruslong at 8:19 PM on August 25, 2007

PB and the Crack Backtaggers would be a great name for a rock band.

Which one of you is Dave Barry?
posted by wendell at 8:37 PM on August 25, 2007

"Crack Backtaggers" sounds like the bullies who go around taping "kick me" messages to the backs of the other kids.
posted by amyms at 8:42 PM on August 25, 2007

"...sounds like the bullies who go around taping "kick me" messages ..."

...those weren't gnomes? I was told that was gnomes doing that.
posted by ZachsMind at 8:59 PM on August 25, 2007

We have a crack team of backtaggers?

Yes we do. (conception) (birth)
posted by Rhomboid at 9:20 PM on August 25, 2007

Sorry I don't keep up with the gossip, but that Free Quonsar stuff was just adorably evil. Thanks for the links Rhomboid. Hopefully I'm all caught up with the rest of MeFi.
posted by ZachsMind at 10:31 PM on August 25, 2007

For dot-thread exploration fun, as I alluded to above:

- an index of threads in which "." has appeared as a comment
- a full-text list of the posts' main text, from which the index above was derived with a little regexery.

The index is a quick hack; not tremendously useful, but at least a nice way to scan quickly for things that look like they aren't true obit threads—one could partition these threads into a few different bits: the death of a given person; of multiple people in a small, cohesive group of some sort (murder/suicide? lynyrd skynyrd?); of a non-cohesive group of people caught in a natural disaster, terror attack, or other catastrophe; of a non-person entity (band break up? software company? artists collective?); and, of course, threads where the . is wholly ironic or even a non-sequitur.

But there's no simple, 100% method for doing that programmatically, so index or full text, a human is going to have to look through it if we want to be sure.

I nominate Not Me.
posted by cortex (staff) at 11:13 PM on August 25, 2007

(And my initial reaction, from not-statistically-robust random eyeball sampling of the index, is that my supposition that most of the dot-threads are real obits of one sort or another may be wrong; there are a lot of threads that clearly have nothing to do with someone dying.

Take a look at, for example, threads 50013 through on up. The threads, in order, are about:

- Peter Tomarken dying
- Unflattering old footage of Rush Limbaugh
- Pending Welcome Back Kotter film
- Abu Ghraib torture expose
- Murder of five UNC security folks
- Maureen Stapleton dying
- Anti war speech on Boston Legal (I guess? Link is dead.)
- Kurt Cobain action figure
- Suicide failure rates and consequences
- Snakes on a Plane trailer

Mind, that could be a badly misrepresentative sample. But it's a lot more dots-in-non-obits than I would have guessed.)
posted by cortex (staff) at 11:28 PM on August 25, 2007

not sure what pb's data looks like, but if you had a count of the number of "."-only comments in the threads, you could probably filter out a lot of the non-sequitur usages by looking at the distribution.

i would expect a two-lobed distribution: a bunch of threads with 1 or 2 dots, and a second cluster of threads with a normally distributed number of dots, with a mean of about 20 or 30 or so. lop off the ones or twos and you should be left without much of the ironic usage (because once someone's already made the joke.. well, sometimes they do get run into the ground, but not that far.)
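
(A rough sketch of that filter in Python, for anyone who wants to poke at it. The `(thread_id, comment_text)` pairs, the function names, and the cutoff of 3 are all assumptions for illustration; nobody here has said what pb's data actually looks like.)

```python
from collections import Counter


def dot_counts(comments):
    """Count dot-only comments per thread.

    `comments` is an iterable of (thread_id, comment_text) pairs.
    That shape is an assumption, not the real MeFi schema.
    """
    counts = Counter()
    for thread_id, text in comments:
        # A "simple dot" comment: nothing but a single '.' once
        # surrounding whitespace is stripped.
        if text.strip() == ".":
            counts[thread_id] += 1
    return counts


def likely_wakes(counts, min_dots=3):
    """Lop off the threads with only one or two dots."""
    return sorted(t for t, n in counts.items() if n >= min_dots)


# Made-up sample data, purely to show the mechanics.
sample = [
    (101, "."), (101, "."), (101, "RIP."), (101, "."), (101, " . "),
    (202, "."), (202, "lol"),
    (303, "no dots here"),
]
counts = dot_counts(sample)
wakes = likely_wakes(counts)
print(counts[101], counts[202], wakes)
```

The cutoff is a knob, not a finding: plotting the counts first would show whether the two lobes really exist and where the gap between them sits.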

what the hell am i doing up this late?
posted by sergeant sandwich at 12:47 AM on August 26, 2007

posted by grouse at 2:44 AM on August 26, 2007

Yeah, I was thinking along similar lines, sergeant sandwich: I think that'd be a pretty effective way to measure things, since a lot of those "ironic dot" threads seem to have just the one or two. In fact, I'd bet at a sufficiently small threshold, you'd be left without more than a tiny percentage of ironic uses.

I'd be curious to see what the graph of that'd look like, charting proportion of ironic threads vs. number-of-dot-comments. I'd wager it'd be very steep, from the 70-80% range at 1, down below 50% by 2, around 10% for 3, and zeroed out (except for maybe a blip) by the time we hit 5 or 6.
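
(The numbers behind such a graph would be simple to compute once someone has hand-labeled the threads. A minimal sketch, assuming the labels arrive as `(dot_count, is_ironic)` pairs; the sample labels below are invented to show the mechanics and are not real MeFi data.)

```python
from collections import defaultdict


def ironic_share_by_dot_count(labeled):
    """Map each dot count to the fraction of threads labeled ironic.

    `labeled` is an iterable of (dot_count, is_ironic) pairs produced
    by a human reader; the format is an assumption.
    """
    totals = defaultdict(int)
    ironic = defaultdict(int)
    for dot_count, is_ironic in labeled:
        totals[dot_count] += 1
        if is_ironic:
            ironic[dot_count] += 1
    return {n: ironic[n] / totals[n] for n in sorted(totals)}


# Invented labels, for illustration only.
labeled = [
    (1, True), (1, True), (1, True), (1, False),
    (2, True), (2, False),
    (5, False),
]
shares = ironic_share_by_dot_count(labeled)
print(shares)
```

Feeding the resulting dictionary to any plotting tool would give the curve; whether it is as steep as wagered is exactly what the real labels would have to show.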

Another wager: the ironic dot-threads with the larger dot counts would skew heavily toward US executive/military policy threads.

That underlines how using the word ironic here is a convenience that might not sit right with some folks; while I consider a dot in a thread about e.g. willfully compromised civil liberties to be very different in character than one about the failure of a benighted software company, neither is really an obit or other person-oriented "wake", as ZachsMind put it, and so both are "ironic". That's not to trivialize the idea that, in the former case but not the latter, the commenter might be feeling some genuine despair or whatever.

I think someone giving the current index an afternoon's examination and coming up with some basic ironic-vs-non data would be fun, but to really do this right we should refine the query somewhat—I'd have to talk to pb and see if we couldn't capture comments with, for example, one or more dots followed by a line break, independent of the rest of the content of the comment.
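
(One way to read "one or more dots followed by a line break" is "any line of the comment that is nothing but dots". A hedged sketch of that reading as a Python regex; the pattern, the function name, and the guess that line breaks may be stored as `<br>` tags are all assumptions, not the query pb actually runs.)

```python
import re

# A line consisting only of one or more dots, allowing stray spaces,
# tabs, or a carriage return around them.
DOT_LINE = re.compile(r"^[ \t]*\.+[ \t\r]*$", re.MULTILINE)

# Assumption: stored comments may use <br> tags for line breaks.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def has_dot_line(comment_text):
    """True if any line of the comment is made up solely of dots."""
    text = BR_TAG.sub("\n", comment_text)
    return DOT_LINE.search(text) is not None


print(has_dot_line(".<br>He will be missed."))
print(has_dot_line("So it goes..."))
```

Note that this reading also catches a bare "..." on its own line, so it widens the net for ironic and trailing-off uses as well as for sincere dots.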

We'd also want to find a good way to locate obit threads—the backtagging effort has helped with that some: about 40% of the 35K pre-tagging mefi posts have now been tagged, and obit/obituary seems like one of the more intuitive tags to deploy, so it's probably fair to guess we've had solid coverage among those posts.

And if someone's going to bother with all that, then refinement and formalization of the schema for post classification on some Wake-vs-Irony continuum/hierarchy/graph would be very much worth hashing out, since that assignment will certainly have to be done by a human and it'd be silly to do it more than once.
posted by cortex (staff) at 7:26 AM on August 26, 2007

So what about the ontogeny of it? Did you find the genesis thread?
posted by Big_B at 7:31 AM on August 26, 2007

(Also, the reason the current query doesn't yield anything but what it does—links to threads containing one or more of what I've started thinking of as a "Simple Dot"—is that it's purportedly a hell of a lot easier on the db than a full-on comment-by-comment search.)
posted by cortex (staff) at 7:31 AM on August 26, 2007

Whoops nevermind. Following the links above it looks like this is it. Interesting that just before and after there are a lot of empty posts.
posted by Big_B at 7:33 AM on August 26, 2007

Did you find the genesis thread?

Mefi origin-point is something we actually covered earlier this month: tellurian had it on the nose when he ventured that it was this comment in the Kaycee Nicole obit thread. A bit more here, and in a few other comments in the vicinity in that thread.

So the question of point-of-origin, and of initial slow spread, is partly answered already—it's really what happened after those first few conspicuous uses that I think is a fascinating question, with a lot of meaty little branches: who used it first, and most? When did it go from getting used regularly to getting used ironically regularly, and was there ever a distinct period when the two weren't in lockstep? Who uses it ironically most? Does its use in different partitions of the wake-vs-irony continuum bunch to cohesive groups of users? Does an individual repeat-dotter show a trend toward ironic use over time? Where do milestone "What's with the '.'" metatalk threads—and likewise in-thread comments—fall in the dot-use history, and how did they influence it?

And so on.

Another question I'd like answered, going back to origins: where did the first person(s) who deployed it on metafilter prior to it being a meme get it from? Invention? Import from another site? Etc. I suppose I could write a few emails, but "why did you use this punctuation five years ago in a metafilter thread" might not yield a lot of fruit.

(Bonus: carsonb (energetically) on the soundness of dot-plus-content comments.)
posted by cortex (staff) at 7:53 AM on August 26, 2007

Also, a bit from the wiki. The 9/11 dash thing is an interesting progenitor.
posted by cortex (staff) at 9:17 AM on August 26, 2007

Speaking as a member of the crack back-tagging cabal team, as well as the back-cracking tag team, I think patience is in order, and soon enough all your dead shall be "brought out". As Jessamyn has mentioned previously, after tagging is complete there will be a brief round of standardization where all the poorly worded tags will be forced to convert to the new way of thinking, or put up against the wall, if you know what I mean. Think of it this way: the revolution is in progress, it is not being televised, and the final stage is killing all our enemies who seek to hide their precious knowledge behind poorly-tagged posts.

posted by blue_beetle at 9:31 AM on August 26, 2007

cortex, maybe make the obit list a wiki. This way everyone who's Not Cortex can improve on it.
posted by goodnewsfortheinsane at 10:37 AM on August 26, 2007

Yeah, might work. Every time I get to thinking about a project like this, I kind of get the feeling that I should really nail down the other twenty or so mefi research projects—with detailed, specific goals for each, and an outline of how to go about one or the other systematically enough that people don't end up duplicating a lot of ad hoc, piecemeal work—and then put 'em all in one place, and get some folks together to just Have At It.

A wiki for this would work pretty well, I suppose—import that index and let people update one or another entry at a time with genre/count/notes—though I'm not sure how wiki software would deal with concurrent work, and whether that'd be adding more headache than necessary for the intrepid research assistant.

An alternative would be to hack together a little form, something in the spirit of the existing backtaggers interface, that'd serve up a random handful of threads on demand and let someone just do that little piece by reading and filling in existing fields. A little more work to set it up, but more straightforward for the researcher and with a lower learning curve to get started. I know wikis aren't rocket science, but if I was going to try to harness some wonky good will I'd want to have people spend their energy on the work and not on formatting conventions and markup code.
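
(The "random handful on demand" part is the easy bit. A minimal sketch, assuming thread IDs are plain integers and that already-classified threads are tracked in a set; `next_batch` and its parameters are hypothetical names, not part of the existing backtaggers interface.)

```python
import random


def next_batch(thread_ids, classified, batch_size=5, rng=None):
    """Pick a random handful of threads nobody has classified yet.

    `thread_ids` is every candidate thread; `classified` is the set of
    IDs already done. Both shapes are assumptions for illustration.
    """
    if rng is None:
        rng = random
    pending = [t for t in thread_ids if t not in classified]
    # Never ask for more threads than remain.
    return rng.sample(pending, min(batch_size, len(pending)))


# Made-up IDs, seeded only so the example is repeatable.
threads = list(range(1, 11))
done = {2, 4}
batch = next_batch(threads, done, batch_size=3, rng=random.Random(0))
print(batch)
```

Two volunteers could still draw the same thread at the same moment with this approach; a real form would need to mark threads as checked out, or simply accept the occasional duplicate.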
posted by cortex (staff) at 11:11 AM on August 26, 2007
