MeFi user matching? June 19, 2009 7:59 PM

You know what would be neat? If MetaFilter could show you a list of users who are similar to you, based on your favoriting histories, the tags on your posts, the tags on AskMe posts to which you've replied, etc. Or if MeFi matched up users who favorited each other's stuff a lot. Sort of like OkCupid's matching engine, but simpler—and not necessarily for dating purposes.

Good idea? Not-so-good idea? Feasible? Not?

Apologies if this has been discussed before. I checked the MeTa archives and didn't see anything.
posted by ixohoxi to Feature Requests at 7:59 PM (79 comments total) 7 users marked this as a favorite

You frequently post a variation of "I'd hit it."

Here are some other users who would also hit it:

posted by ODiV at 8:01 PM on June 19, 2009 [8 favorites]


That kind of thing is computationally extremely difficult, and the size of the computing task rises as the square of the number of members. Which is approaching 100,000.
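
For a rough sense of scale, the number of distinct user pairs can be counted directly. This is a minimal sketch that assumes the 100,000 figure above and a naive compare-everyone-with-everyone approach (an assumption; nothing here says a matcher would have to work that way). The name `pair_count` is made up for illustration.

```python
import math

def pair_count(n_users: int) -> int:
    """Number of unordered pairs of distinct users."""
    return math.comb(n_users, 2)

members = 100_000  # the membership figure quoted above
print(f"{pair_count(members):,} candidate pairs")
```

That works out to roughly five billion pairs. A matcher that only looks at pairs with at least one favorite between them would probably have far fewer to examine.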
posted by Chocolate Pickle at 8:04 PM on June 19, 2009


I think the "similar users" idea is something that you notice on your own after a while - especially when you always end up in the same ridiculous threads with the same people.

I like the "favorites" idea, though I don't know how easy it would be to implement. Sometimes I'm too shy or intimidated to add someone as a contact, only to later find out they like my comments, too. It's like middle school! (omg you had a crush on me I had a crush on you)
posted by Solon and Thanks at 8:06 PM on June 19, 2009


Useless for those who truly believe that everyone needs a hug.
posted by carsonb at 8:10 PM on June 19, 2009


Also, one good workaround is obsessively checking up on every favorite you receive. Patterns eventually emerge.
posted by carsonb at 8:12 PM on June 19, 2009 [1 favorite]


Hello user. It appears you are married to Brandon Blatcher. Brandon Blatcher is also married to these 499 people that you may share other interests with.
posted by Pants! at 8:20 PM on June 19, 2009 [9 favorites]


Is this something that could be extracted from the next datadump? Because I'd like numerical confirmation of JHarris and my mutual appreciation society.
posted by jtron at 8:42 PM on June 19, 2009


Hmm. I realize this isn't a straightforward task, programmatically speaking, but I thought someone with a more formal background in algorithm design might be able to attack it.

How about something simpler: just a list of users who have favorited your stuff, in descending order by number of favorites? Like maybe the top ten users or something?

Of course, a plain count wouldn't be very meaningful. You're really interested in what percentage of a user's total favorite count was awarded to you. That will be a rather small number in most cases, but it's still perfectly suitable for this purpose, and you wouldn't necessarily have to display the percentage—just the ranking.

You could call the list a user's "Fan Club" or something.
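
The "share of their favorites" ranking could be sketched roughly like this. It assumes the favorites are available as plain (giver, recipient) pairs, one per favorite; the function name `fan_club` and the sample data are made up for illustration.

```python
from collections import Counter

def fan_club(favorites, target, top_n=10):
    """Rank users by the share of their own favorites given to `target`.

    `favorites` is an iterable of (giver, recipient) pairs, one per favorite.
    Returns a list of (giver, share) tuples, highest share first.
    """
    favorites = list(favorites)
    total_given = Counter(giver for giver, _ in favorites)
    given_to_target = Counter(
        giver for giver, recipient in favorites if recipient == target
    )
    ranked = sorted(
        ((giver, n / total_given[giver]) for giver, n in given_to_target.items()),
        key=lambda item: (-item[1], item[0]),  # share descending, then name
    )
    return ranked[:top_n]

# Hypothetical data: alice gives "you" more favorites in total,
# but bob gives "you" a larger share of his.
sample = [
    ("alice", "you"), ("alice", "you"), ("alice", "bob"),
    ("alice", "carol"), ("alice", "dave"),
    ("bob", "you"), ("bob", "carol"),
]
club = fan_club(sample, "you")
```

With the sample above, bob ranks ahead of alice even though alice has the higher plain count, which is the distinction between count and percentage.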

Eh?
posted by ixohoxi at 8:54 PM on June 19, 2009


We've never done something like this on the site, no. Probably not something that'd make sense as an official feature, really. But it's a fun idea as an independent experiment.

Ambitious folks could go after the (right now temporary and half a year out-of-date cache of the) Infodump to try and do the work that way—comparing shared/grouped favorites is certainly one thing you could try, looking for shared commenting habits is another (do you end up in the same threads?).

We're hoping to bring back the Infodump on-site soonish, and there's a couple of bits of data I've been meaning to add in, category info for metatalk and askme posts being the main thing. Both of those could in theory be useful as well (comparing not just thread-specific collocation inclinations but per-category shared interests to broaden the net).

Comparing shared interests based on the overlap in tags-on-commented-posts would be another avenue. Do you and person x both comment in threads tagged y, even if you don't comment in the same specific threads?

Ferreting useful results out of this could be Hard, depending on how you go about it and what you're hoping to accomplish, but it's certainly got some chewable-looking meat to it for the datamining-inclined.
posted by cortex (staff) at 8:56 PM on June 19, 2009


I experience this on Facebook, to a degree. I friended all of the cool kids from High School 25 years ago, and only the dorky kids respond to my status updates. And I am/was a dorky kid. It's like we are forever trapped in each other's orbits. I think I'm ok with that.
posted by mecran01 at 9:04 PM on June 19, 2009


I didn't know about the infodump. Perhaps a project for this weekend. I'll let y'all know if I come up with anything interesting.
posted by ixohoxi at 9:10 PM on June 19, 2009


I prefer to keep my own grudge list of users that I don't like. I find them much more interesting than the users that I do like.

I also like to listen to Rush Limbaugh at least once a month until my blood boils.
posted by double block and bleed at 10:17 PM on June 19, 2009 [2 favorites]


Better yet join this feature with an automatically generated kill file so you never have to be exposed to a fact you find uninteresting or an opinion with which you disagree. We could all enjoy our own echo chambers!
posted by LarryC at 11:07 PM on June 19, 2009


This is crazy talk, people! Who do you favorite? Who favorites you? Whose posts do you like? Who comments in your posts? No need for fancy algorithms or data dump machinations! Just, you know, be social and pay attention to who is hanging around your little scene. Use MeMail as necessary. Add contacts. Read profiles. Look at tags. Don't rely on technology all the time to do the stuff humans can do better.
posted by iamkimiam at 2:06 AM on June 20, 2009 [5 favorites]


Make a lot of posts and sort through the favorites you get.
posted by Pope Guilty at 5:15 AM on June 20, 2009


I did a version of this

See the second post on my data analysis blog, called Metafilter Power Users

https://sites.google.com/site/joecochrane1/

I only posted the big people because the Excel file (there were 16,000 users at the time with at least one favorite either way) was too big for sites.google. So I am happy to send you yours, or those you are interested in, if you mefimail me.

Wow, man, my metatalk is pink, this will be entertaining for another day or so.
posted by rakish_yet_centered at 5:59 AM on June 20, 2009 [2 favorites]


Ixohoxi, based on your most recent Ask question, I think you just have to look for the Mefites with dilated pupils and cups of tea.

(raises hand, begins staring at hand, wondering at its unusual size)
posted by Juliet Banana at 6:44 AM on June 20, 2009 [1 favorite]


Welcome, bobobox!

You have a high compatibility rating with 0 other users.
Sorry, there's not enough information to match you with other users at this time.
You can get started by favoriting or commenting on posts that you like.

View the 60,000 other unmatched users.
View the 50 users with the most matches.


Naw, rather not.

Though I might like to see professional guilds so I could keep track of when the überscientists drop into threads to lay down the facts.
posted by bobobox at 8:00 AM on June 20, 2009


We were listening to Pandora the other day and it wound up totally feeding me the music I listened to in my twenties, which is great because as I face forty I really want to be reminded that the music I like is twenty, twenty-five years old, and that my taste hasn't really evolved much, and I should probably not make fun of The Big Chill anymore because now those people are me, and I am them, listening to the Pogues and chopping up tarragon and cooking up organic whatever.

I wouldn't mind seeing my data trail type my demographic a little less well, honestly.
posted by A Terrible Llama at 9:46 AM on June 20, 2009 [2 favorites]


Or if MeFi matched up users who favorited each other's stuff a lot.

Using data from the out-of-date cache of the Infodump that cortex linked to, I present you my attempt at listing the top 20 members of the MetaFilter Mutual Appreciation Society:
Artw [123] ---- [123] fearfulsymmetry
matteo [105] ---- [73] Blazecock Pileon
nickyskye [69] ---- [95] madamjujujive
nickyskye [71] ---- [69] vronsky
nickyskye [67] ---- [81] UbuRoivas
Blazecock Pileon [50] ---- [137] Pope Guilty
loquacious [116] ---- [45] Ambrosia Voyeur
Avenger [44] ---- [94] Pope Guilty
nickyskye [75] ---- [43] flapjax at midnite
Miko [43] ---- [67] fourcheesemac
orthogonality [66] ---- [42] Blazecock Pileon
DU [42] ---- [153] Pope Guilty
delmoi [38] ---- [53] Blazecock Pileon
amberglow [37] ---- [38] Blazecock Pileon
Blazecock Pileon [41] ---- [36] DU
nickyskye [35] ---- [42] psmealey
nickyskye [71] ---- [35] homunculus
klangklangston [35] ---- [37] Blazecock Pileon
aleahey [34] ---- [61] ginagina
Artw [33] ---- [41] kittens for breakfast
What does all this mean? Using the last line as an example, Artw has favorited 33 posts of kittens for breakfast's, and kittens for breakfast has favorited 41 posts of Artw's. This list is the top 20 such pairings, ranked by whichever of the two favorite counts in the pairing is lower.

Oh yeah, and whoever's user number is lower gets to be on the left.
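
For anyone who wants to try the same ranking, here is a minimal sketch of the "lower of the two counts" idea. It assumes favorites come as (giver, recipient) pairs and it puts the alphabetically earlier name on the left rather than the lower user number, since user numbers aren't part of this toy data. The function name and sample names are hypothetical, and this is not the actual query used above.

```python
from collections import Counter

def mutual_pairs(favorites, top_n=20):
    """Rank pairs of users by the lower of their two favorite counts.

    `favorites` is an iterable of (giver, recipient) pairs.
    Returns (user_a, a_to_b, b_to_a, user_b) tuples, strongest pairs first.
    """
    counts = Counter(favorites)  # (giver, recipient) -> number of favorites
    rows = []
    for (a, b), a_to_b in counts.items():
        # Visit each unordered pair once and skip self-favorites.
        if a >= b:
            continue
        b_to_a = counts.get((b, a), 0)
        if b_to_a:
            rows.append((a, a_to_b, b_to_a, b))
    rows.sort(key=lambda r: (-min(r[1], r[2]), r[0], r[3]))
    return rows[:top_n]

# Hypothetical data
sample = (
    [("ann", "ben")] * 3 + [("ben", "ann")] * 3
    + [("cy", "dee")] * 2 + [("dee", "cy")] * 5
    + [("cy", "ann")]  # one-way favorite, so no pairing
)
pairs = mutual_pairs(sample)
```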

The top line is kind of spooky. "fearfulsymmetry" indeed.

I'm sure I screwed this up somehow.
posted by FishBike at 11:34 AM on June 20, 2009 [2 favorites]


This list also assumes similar rates of favoriting. For example, I know users who tend to favorite the same things as I do, and we can usually favorite each other's comments. But if either they or I tend to favorite significantly more or fewer things in general than the other, I don't think we'd show up in each other's lists like above. Also, if there's a commenting disparity, that's less content for one user of the pair to have available for favoriting (and the converse)...i.e. the more prolific user could really like the less prolific one, but that won't necessarily be reflected in the favoriting history, since there's less data for them to favorite from. All I'm saying is that there's more variables that go into this.

I don't know how to do an algorithmic approach, but heuristically I could name off the top of my head who my likely matches are. And of course, I could tell you which posters I like, and for which reasons. For me, that's a more reliable method than nuts and bolts analysis. Still, the analysis could be fun. Is the infodump no longer available?
posted by iamkimiam at 11:56 AM on June 20, 2009


One thing you might do is calculate each of those counts against the user in question's total number of favorites given, normalizing the value accordingly.
posted by cortex (staff) at 11:57 AM on June 20, 2009 [1 favorite]


One thing you might do is calculate each of those counts against the user in question's total number of favorites given, normalizing the value accordingly.

I just tried this. The list is topped by two users who have each given a whopping 2 favorites, and one of those is to each other. So they are the top of the list by virtue of having given 50% of their favorites to each other, which is kind of what I expected when I thought of this originally and thus didn't try it.

Maybe we need to introduce some further criteria, like only including users who have given at least 100 favorites in the analysis. What do you guys think?
posted by FishBike at 12:19 PM on June 20, 2009 [1 favorite]


Hmm, adding criteria that both users must have given at least 50 favorites, and given at least 10 to each other, produces an interesting list, in the sense that most of the names on it are familiar. Anybody want to see that one?
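
A rough sketch of the normalized version with those cutoffs, for the curious. It assumes (giver, recipient) pairs as input, and ranking by the lower of the two shares is an assumption carried over from the raw-count list. The function name, parameter names, and data are made up.

```python
from collections import Counter

def normalized_mutual(favorites, min_total=50, min_each_way=10):
    """Mutual pairs ranked by share of each user's favorites given to the other.

    A pair is kept only if both users have given at least `min_total`
    favorites overall and at least `min_each_way` to each other.
    """
    counts = Counter(favorites)  # (giver, recipient) -> count
    total_given = Counter()
    for (giver, _), n in counts.items():
        total_given[giver] += n
    rows = []
    for (a, b), a_to_b in counts.items():
        if a >= b:  # each unordered pair once, no self-favorites
            continue
        b_to_a = counts.get((b, a), 0)
        if min(a_to_b, b_to_a) < min_each_way:
            continue
        if min(total_given[a], total_given[b]) < min_total:
            continue
        rows.append((a, a_to_b / total_given[a], b_to_a / total_given[b], b))
    rows.sort(key=lambda r: -min(r[1], r[2]))
    return rows

# Hypothetical data: x and y have each given only 2 favorites, 1 to each other.
sample = (
    [("ann", "ben")] * 20 + [("ann", "zed")] * 80
    + [("ben", "ann")] * 10 + [("ben", "zed")] * 40
    + [("x", "y"), ("x", "zed"), ("y", "x"), ("y", "zed")]
)
filtered = normalized_mutual(sample)
unfiltered = normalized_mutual(sample, min_total=0, min_each_way=1)
```

Without the cutoffs, the two-favorite pair tops the list at 50% each; with them, only the more active pair remains.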
posted by FishBike at 12:30 PM on June 20, 2009 [1 favorite]


Jesus; I'm still importing the TSVs. 600K favorites and counting...
posted by ixohoxi at 2:47 PM on June 20, 2009


Jesus; I'm still importing the TSVs. 600K favorites and counting...

Importing them into what? There's about 1.5 million favorites in the Infodump, so whatever it is, sounds like it's going to take a while. I've got them in an SQL Server database and would be happy to try running any sort of queries people would like to see.
posted by FishBike at 3:06 PM on June 20, 2009 [1 favorite]


Don't rely on technology all the time to do the stuff humans can do better.

Damn straight. I'm thinking about giving up on metafilter too and just asking my friends "like, what's cool on the internet?" It's mostly youtube videos of fat kids falling over, so I'm pretty hip with it all.
posted by slimepuppy at 3:28 PM on June 20, 2009



One thing you might do is calculate each of those counts against the user in question's total number of favorites given, normalizing the value accordingly.

I just tried this. The list is topped by two users who have each given a whopping 2 favorites, and one of those is to each other. So they are the top of the list by virtue of having given 50% of their favorites to each other, which is kind of what I expected when I thought of this originally and thus didn't try it.


OK, nerding out here:
What I would do is to generate a synthetic data set to compare the actual distribution of favorites to. The idea is to generate a bunch of synthetic favorites that have the same distribution as the actual metafilter data - so I would take the actual favorite list but then randomize the comment authors who were favorited so that each user's total number of favorites is the same but the people who they favorite is random. If you were ambitious you could weight the randomization by the number of posts each user has made.

This is your null model, where each user just chooses comments to favorite at random. Generate a few hundred data sets in this way and calculate the average and standard deviation of reciprocal favorites from it, as a function of total number of comments. Now you can ask how many standard deviations from the null hypothesis the actual favorites are given how many comments each pair has made, and calculate a z-score.
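
Here is a minimal sketch of that null model, with some simplifications that are assumptions on my part: favorites are plain (giver, recipient) pairs, the recipient column is shuffled as a whole (so each user's totals given and received stay fixed, but a shuffle can produce self-favorites), and the z-score is computed for one pair at a time rather than as a function of comment counts. All names are hypothetical.

```python
import random
import statistics
from collections import Counter

def reciprocal(counts, a, b):
    """Lower of the two favorite counts between a and b."""
    return min(counts.get((a, b), 0), counts.get((b, a), 0))

def shuffled(favorites, rng):
    """Keep every giver in place but hand out the recipients at random."""
    givers = [g for g, _ in favorites]
    recipients = [r for _, r in favorites]
    rng.shuffle(recipients)
    return list(zip(givers, recipients))

def pair_z_score(favorites, a, b, n_sets=200, seed=0):
    """Standard deviations between the observed reciprocal count and the null mean."""
    rng = random.Random(seed)
    observed = reciprocal(Counter(favorites), a, b)
    null = [
        reciprocal(Counter(shuffled(favorites, rng)), a, b)
        for _ in range(n_sets)
    ]
    sd = statistics.pstdev(null)
    if sd == 0:
        return None  # the null sets never varied, so no z-score is defined
    return (observed - statistics.mean(null)) / sd

# Hypothetical data: a and b favorite each other heavily; 20 other users
# each give 5 favorites to a neighbour.
users = [f"u{i}" for i in range(20)]
sample = [("a", "b")] * 10 + [("b", "a")] * 10
for i, u in enumerate(users):
    sample += [(u, users[(i + 1) % 20])] * 5
z = pair_z_score(sample, "a", "b")
```

Weighting the shuffle by each user's post count, as suggested above, is left out of this sketch.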

I'm sure a statistician will come along with a better way to do this. I'd try it myself except that I have to teach in a half an hour.
posted by pombe at 3:36 PM on June 20, 2009


I'm just going to go ahead and post the revised MetaFilter Mutual Appreciation Society listing since it seems there are still people reading this thread.

So this one is by percentage of each user's total favorites given to the other person. To get into the Society you have to have given at least 50 total favorites, and to be considered for this list, you and your admirer both have to have given each other at least 10 favorites.
aleahey [20.61%] ---- [45.52%] ginagina
sleepy pete [4.09%] ---- [7.60%] cog_nate
Stynxno [41.74%] ---- [4.00%] ThePinkSuperhero
melissa may [3.96%] ---- [6.13%] sleepy pete
loquacious [3.98%] ---- [3.87%] Ambrosia Voyeur
Artw [3.86%] ---- [30.60%] fearfulsymmetry
jessamyn [3.70%] ---- [4.30%] not_on_display
matteo [3.55%] ---- [3.59%] Mayor Curley
jonmc [3.04%] ---- [4.92%] Divine_Wino
yerfatma [2.79%] ---- [3.44%] Mayor Curley
ColdChef [2.57%] ---- [4.14%] yhbc
NortonDC [10.49%] ---- [2.56%] onlyconnect
languagehat [2.39%] ---- [2.48%] Kattullus
ThePinkSuperhero [2.89%] ---- [2.32%] hermitosis
sleepy pete [2.20%] ---- [9.04%] micayetoca
batmonkey [2.12%] ---- [8.85%] SaintCynr
Miko [1.96%] ---- [4.14%] fourcheesemac
DU [1.75%] ---- [1.94%] Pope Guilty
quonsar [1.70%] ---- [1.77%] Krrrlson
languagehat [1.65%] ---- [2.11%] nasreddin
posted by FishBike at 4:01 PM on June 20, 2009 [2 favorites]


And just because this is fun and interesting, here's a list of the top fans/stalkers... that is, who has given the greatest percentage of their favorites to one specific person (provided they have given at least 20 total favorites, or else everyone who has given 1 favorite tops the list):
davey_darling: 91.4% (6953 of 7611) to ThePinkSuperhero
mattfn: 88.0% (22 of 25) to asavage
Horken Bazooka: 77.1% (101 of 131) to Dave Faris
flarbuse: 68.9% (31 of 45) to Poolio
Loto: 65.1% (330 of 507) to Firas
ginagina: 45.5% (61 of 134) to aleahey
Stynxno: 41.7% (139 of 333) to ThePinkSuperhero
fearfulsymmetry: 30.6% (123 of 402) to Artw
banjo_and_the_pork: 25.2% (92 of 365) to robocop is bleeding
up in the old hotel: 23.5% (20 of 85) to Mutant
pruner: 23.0% (44 of 191) to Poolio
aleahey: 20.6% (34 of 165) to ginagina
Tennyson D'San: 20.0% (301 of 1507) to afu
wheelieman: 19.5% (42 of 215) to y2karl
drhydro: 18.7% (23 of 123) to y2karl
bru: 18.2% (63 of 346) to mathowie
contraption: 17.4% (59 of 340) to Ambrosia Voyeur
Rafaelloello: 16.8% (47 of 280) to dawson
bru: 15.6% (54 of 346) to jessamyn
pb: 13.8% (30 of 218) to mathowie
posted by FishBike at 4:10 PM on June 20, 2009 [4 favorites]


jessamyn [3.70%] ---- [4.30%] not_on_display

Cute.
posted by gman at 4:12 PM on June 20, 2009 [2 favorites]


davey_darling: 91.4% (6953 of 7611) to ThePinkSuperhero

Creepy.
posted by gman at 4:22 PM on June 20, 2009


I'm just going to go ahead and post the revised MetaFilter Mutual Appreciation Society listing since it seems there are still people reading this thread.

Heh. Thanks to Recent Activity, I'll never not be reading this thread.

It's interesting, a good third of the pairs in the revised Mutual list are things that I can actually say "well, yeah, because..." in a fairly concrete fashion—whether because there's SOs involved or because of some subject-specific kindred-spirits type thing.

Neat stuff, regardless. Keep crunchin' by all means.
posted by cortex (staff) at 4:40 PM on June 20, 2009


jessamyn [3.70%] ---- [4.30%] not_on_display

Cute.


If memory serves, Stynxno and teeps, sleepy pete and melissa may, banjo and robocop, and contraption and AV are all real-life couples. There are probably some others I don't know about, too.
posted by box at 4:54 PM on June 20, 2009


You forgot Joe Beese and billysumday.
posted by gman at 5:08 PM on June 20, 2009 [1 favorite]


I thought a better version of the fans/stalkers list might be had by looking at what percentage of person B's posts and comments were favorited by person A (instead of what percentage of Person A's favorites were given to Person B). Of course this required importing all the posts and comments data so as to count them per user...

If we just consider everyone, then a person who has made 1 comment and got it favorited will show up with 100% of their comments and posts favorited by this other person. So instead I am only looking at users who have at least 50 posts+comments. Who's favoriting high percentages of a reasonably active user's comments and posts?
davey_darling: 96.4% (6953 of 7214) of ThePinkSuperhero's comments+posts
ginagina: 84.7% (61 of 72) of aleahey's comments+posts
Tennyson D'San: 33.7% (301 of 893) of afu's comments+posts
davey_darling: 25.9% (30 of 116) of cobra_high_tigers's comments+posts
divabat: 23.7% (14 of 59) of Ash3000's comments+posts
tehloki: 23.6% (13 of 55) of haunted by Leonard Cohen's comments+posts
nasreddin: 22.4% (26 of 116) of dyoneo's comments+posts
Loto: 20.6% (330 of 1604) of Firas's comments+posts
kid ichorous: 16.0% (12 of 75) of Law Talkin' Guy's comments+posts
divabat: 13.7% (7 of 51) of miltoncat's comments+posts
divabat: 13.6% (9 of 66) of kaizen's comments+posts
shotgunbooty: 13.6% (24 of 177) of AceRock's comments+posts
divabat: 11.5% (6 of 52) of dkleinst's comments+posts
AceRock: 11.3% (7 of 62) of shotgunbooty's comments+posts
blueberry: 11.2% (28 of 251) of neroli's comments+posts
tehloki: 10.9% (128 of 1178) of East Manitoba Regional Junior Kabaddi Champion '94's comments+posts
divabat: 10.7% (6 of 56) of Chorus's comments+posts
divabat: 10.7% (6 of 56) of joshuak's comments+posts
mattfn: 10.6% (22 of 208) of asavage's comments+posts
divabat: 10.1% (7 of 69) of jacobean's comments+posts
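
In case anyone wants to reproduce this kind of list, a rough sketch follows. It assumes favorites as (fan, author) pairs plus a separate per-author count of posts+comments, and it assumes a fan can favorite any given item at most once. The names `top_fans` and `item_counts` are made up, and this is not the actual query.

```python
from collections import Counter

def top_fans(favorites, item_counts, min_items=50, top_n=20):
    """Rank (fan, author) pairs by the share of the author's items favorited.

    `favorites` is an iterable of (fan, author) pairs, one per favorite.
    `item_counts` maps each author to their total posts+comments.
    Authors with fewer than `min_items` items are left out.
    """
    counts = Counter(favorites)
    rows = []
    for (fan, author), n in counts.items():
        total = item_counts.get(author, 0)
        if fan == author or total < min_items:
            continue
        rows.append((fan, author, n, total, n / total))
    rows.sort(key=lambda r: -r[4])
    return rows[:top_n]

# Hypothetical data: "newbie" has one comment, favorited once (100%),
# and is excluded by the 50-item cutoff.
item_counts = {"pat": 100, "newbie": 1}
sample = [("fan1", "pat")] * 25 + [("fan2", "newbie")]
fans = top_fans(sample, item_counts)
```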
posted by FishBike at 5:18 PM on June 20, 2009 [3 favorites]


You forgot Joe Beese and billysumday.

No, but I did leave out Anonymous from the fans/stalkers list. Ironically, a lot of people are stalking Anonymous.
posted by FishBike at 5:19 PM on June 20, 2009 [2 favorites]


Importing them into what?

MySQL. There's probably some easier way to handle TSVs, but I just knocked together a quick PHP script to parse and insert. Didn't take long for the users table...the favorites import is still running :)

It's about to hit the one million mark, so I guess I'll be doing my analysis tomorrow. Time to hit the bar.
posted by ixohoxi at 5:28 PM on June 20, 2009


[...] I just knocked together a quick PHP script to parse and insert [...]

Ah, this is obviously some strange usage of the word "quick" that I wasn't previously aware of. (With apologies to the late Douglas Adams.)

But seriously, with SQL Server there's a bulk insert command that can read directly from a text file. Takes just seconds to import these that way. I'd be really surprised if MySQL didn't have something equivalent. Just in case you decide you want to import the posts and comments data as well.
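
For what it's worth, MySQL's counterpart is the LOAD DATA INFILE statement. The general idea (hand the database the whole file, or at least one big batch in one transaction, rather than one insert at a time) can be sketched in Python with the built-in sqlite3 module standing in for MySQL. The table layout, column names, and header row here are assumptions, not the actual Infodump format.

```python
import csv
import io
import sqlite3

def load_favorites(tsv_file, conn):
    """Bulk-load a tab-separated favorites file in a single transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS favorites "
        "(giver_id INTEGER, recipient_id INTEGER)"
    )
    reader = csv.reader(tsv_file, delimiter="\t")
    next(reader)  # skip the header row (assumed to be present)
    with conn:  # one commit at the end, or a rollback on error
        conn.executemany(
            "INSERT INTO favorites (giver_id, recipient_id) VALUES (?, ?)",
            ((int(row[0]), int(row[1])) for row in reader),
        )
    return conn.execute("SELECT COUNT(*) FROM favorites").fetchone()[0]

# Tiny made-up file standing in for the real TSV
sample_tsv = io.StringIO("giver_id\trecipient_id\n1\t2\n2\t1\n3\t1\n")
conn = sqlite3.connect(":memory:")
loaded = load_favorites(sample_tsv, conn)
```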
posted by FishBike at 5:34 PM on June 20, 2009 [1 favorite]


I don't know if I should be relieved or disappointed that I'm not on any of those lists.
posted by deborah at 10:06 PM on June 20, 2009


I don't know if I should be relieved or disappointed that I'm not on any of those lists.

OMG ME TOO!

We have so much in common!
posted by nebulawindphone at 8:13 AM on June 21, 2009


Ok, one more table. Now that I've got the posts and comments statistics, I thought maybe the Mutual Appreciation Society ought to be based on percent of each other's comments+posts favorited (instead of percentage of own favorites given to each other). Similar to the fans/stalkers list, this is limited to only those users with at least 50 combined posts and comments:
AceRock [11.29%] ---- [13.56%] shotgunbooty
batmonkey [6.60%] ---- [3.22%] SaintCynr
shiu mai baby [2.43%] ---- [7.48%] Marisa Stole the Precious Thing
Grlnxtdr [2.34%] ---- [3.17%] St. Alia of the Bunnies
fourcheesemac [6.37%] ---- [1.98%] shiu mai baby
roger ackroyd [2.73%] ---- [1.88%] Lieber Frau
melissa may [1.87%] ---- [2.66%] sleepy pete
yath [1.85%] ---- [1.77%] roystgnr
sarahkeebs [1.75%] ---- [2.04%] t2urner
sleepy pete [2.51%] ---- [1.69%] cog_nate
abulafa [2.68%] ---- [1.65%] Wonderwoman
nasreddin [22.41%] ---- [1.61%] dyoneo
kidsleepy [4.20%] ---- [1.60%] thetenthstory
lucyleaf [1.92%] ---- [1.59%] scission
lucyleaf [1.96%] ---- [1.59%] jkl345
Eudaimonia [1.75%] ---- [1.59%] Punctual
wireless [1.96%] ---- [1.59%] Dreamcast
billy_the_punk [1.72%] ---- [1.54%] Javed_Ahamed
boyinmiami [1.96%] ---- [1.52%] jkl345
Blazecock Pileon [1.50%] ---- [1.53%] Pope Guilty
In case further explanation is needed, this last line indicates that Blazecock Pileon has favorited 1.50% of Pope Guilty's posts and comments, and Pope Guilty has favorited 1.53% of Blazecock Pileon's posts and comments.

I was thinking I should further limit this to only those posts and comments made since the other person joined MetaFilter. But then it occurred to me all those posts and comments are still there and able to be favorited... if you really admire the other person, wouldn't you go back through all their history and favorite all the good stuff they posted before you signed up? ;)

Ok, but really I guess this tends to bias the results towards pairings of two users who have been on the site about the same length of time, since then neither misses out on the opportunity to favorite a significant percentage of the other's earlier activity. So maybe I should generate another reciprocal favorites table that only considers favorites given since both users have been on the site. There is a lot more computation needed for that, though--storing a simple per-user count of postings+comments won't do!
posted by FishBike at 1:20 PM on June 21, 2009 [2 favorites]


Argh:
another reciprocal favorites table that only considers postings and comments made (not "favorites given") since both users have been on the site.
FTFM.
posted by FishBike at 1:23 PM on June 21, 2009 [1 favorite]


I probably ought to stop soon, but here are a couple more versions of the tables. Now instead of the percentages being calculated as "percent of B's postings and comments favorited by A", they're calculated as "percent of B's postings and comments since A joined MetaFilter favorited by A." In other words, it now considers only those posts and comments by B that A had a reasonable opportunity to favorite.

This turned out to be too computationally intensive and I sprained my little computer with the heavy lifting of trying to calculate a table of "number of posts+comments since user B joined" for every user. There are about 40,000 users, so that would give 40,000 x 40,000 = 1.6 billion rows in the resulting table.

So instead of doing that, I cheated slightly and produced a table of "number of posts+comments since (year,month)" for each user. This only produced about 5 million rows. It means I can't get an exact count of how many items user B contributed since user A joined, but I can get how many items user B contributed since the month when user A joined... close enough I think.
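
The month-bucket trick could be sketched along these lines. This version only stores months in which a user actually posted (a variation on the table described above, not the same thing), and it assumes the posts and comments are available as (author, date) pairs. All function names are made up.

```python
from bisect import bisect_left
from collections import Counter, defaultdict
from datetime import date

def month_key(d):
    """Map a date to a single increasing integer per calendar month."""
    return d.year * 12 + (d.month - 1)

def build_month_index(items):
    """items: iterable of (author, date) pairs, one per post or comment.

    Returns author -> (sorted month keys, items from that month onward).
    """
    per_month = defaultdict(Counter)
    for author, d in items:
        per_month[author][month_key(d)] += 1
    index = {}
    for author, months in per_month.items():
        keys = sorted(months)
        since, running = [], 0
        for k in reversed(keys):  # accumulate from the newest month backward
            running += months[k]
            since.append(running)
        since.reverse()
        index[author] = (keys, since)
    return index

def items_since(index, author, joined):
    """Items `author` contributed from the start of the month of `joined`."""
    keys, since = index.get(author, ([], []))
    i = bisect_left(keys, month_key(joined))
    return since[i] if i < len(keys) else 0

# Hypothetical activity for one user
items = [
    ("b", date(2008, 1, 15)),
    ("b", date(2008, 3, 1)),
    ("b", date(2008, 3, 20)),
    ("b", date(2008, 6, 2)),
]
index = build_month_index(items)
```

Someone who joined on March 25 gets credited with all three of b's items from March onward, including the two posted earlier that month, which is the "close enough" approximation.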

Here's the new fans/stalkers top 20 list based on this new methodology:
davey_darling: 96.9% (6953 of 7179) of ThePinkSuperhero's comments+posts
Tennyson D'San: 40.8% (301 of 738) of afu's comments+posts
Loto: 27.1% (330 of 1217) of Firas's comments+posts
davey_darling: 25.9% (30 of 116) of cobra_high_tigers's comments+posts
divabat: 24.6% (14 of 57) of Ash3000's comments+posts
tehloki: 23.6% (13 of 55) of haunted by Leonard Cohen's comments+posts
nasreddin: 22.4% (26 of 116) of dyoneo's comments+posts
Tennyson D'San: 20.9% (200 of 958) of eyeballkid's comments+posts
shotgunbooty: 16.0% (24 of 150) of AceRock's comments+posts
kid ichorous: 16.0% (12 of 75) of Law Talkin' Guy's comments+posts
nicolin: 14.3% (16 of 112) of Upton O'Good's comments+posts
divabat: 13.7% (7 of 51) of miltoncat's comments+posts
Philby: 13.7% (22 of 161) of Smedleyman's comments+posts
divabat: 13.6% (9 of 66) of kaizen's comments+posts
mattfn: 13.4% (22 of 164) of asavage's comments+posts
orrnyereg: 13.3% (92 of 693) of Kattullus's comments+posts
peacheater: 13.1% (8 of 61) of thehmsbeagle's comments+posts
tehloki: 12.2% (261 of 2141) of quonsar's comments+posts
Cassilda: 12.2% (9 of 74) of Upton O'Good's comments+posts
tehloki: 11.6% (33 of 285) of 31d1's comments+posts
And similarly, yet another version of the MetaFilter Mutual Appreciation Society:
AceRock [11.29%] ---- [16.00%] shotgunbooty
batmonkey [6.62%] ---- [3.22%] SaintCynr
nasreddin [22.41%] ---- [2.81%] dyoneo
shiu mai baby [2.43%] ---- [9.09%] Marisa Stole the Precious Thing
abulafa [2.68%] ---- [2.39%] Wonderwoman
fourcheesemac [6.37%] ---- [2.32%] shiu mai baby
palmcorder_yajna [3.20%] ---- [2.26%] Smilla's Sense of Snark
pruner [3.31%] ---- [2.02%] Poolio
roger ackroyd [2.73%] ---- [1.99%] Lieber Frau
Amanojaku [5.17%] ---- [1.96%] R_Nebblesworth
cashman [1.96%] ---- [2.70%] Marisa Stole the Precious Thing
banjo_and_the_pork [1.92%] ---- [2.00%] Rinku
ferociouskitty [2.00%] ---- [1.89%] rivenwanderer
nebulawindphone [1.89%] ---- [2.00%] If only I had a penguin...
madamjujujive [1.87%] ---- [1.93%] hadjiboy
melissa may [1.87%] ---- [3.29%] sleepy pete
Pope Guilty [1.86%] ---- [2.38%] Marisa Stole the Precious Thing
yath [1.85%] ---- [2.33%] roystgnr
sotonohito [1.79%] ---- [2.44%] vibrotronica
tzikeh [1.75%] ---- [1.95%] runincircles
posted by FishBike at 2:21 PM on June 21, 2009 [3 favorites]


If I'm overdoing this, somebody please tell me to stop.

I decided to stop looking at the favorites data, and instead of that, have a look at the posting and comments data. Just to define the terms, I'm referring to a user as being "active" in a post if they either posted it, or commented in the discussion attached to it. I thought it might be interesting to look at who else is active in the same threads that I am.

Unfortunately, both I and ixohoxi are too new around here to be represented in the somewhat out-of-date Infodump that we have to work with. So I decided to post some lists based on posts where cortex has been active. Pre-emptive apologies if that's inappropriate and please delete this comment if so.

First, a list of the top 10 users who have been active in posts where cortex has been active (by simple count of posts they are both active in):
jessamyn [2065]
mathowie [1838]
languagehat [1779]
stavrosthewonderchicken [1438]
mr_crash_davis [1349]
delmoi [1281]
blue_beetle [1242]
loquacious [1235]
quin [1206]
Alvy Ampersand [1195]
Hmm, probably no surprises there, eh? But as usual this would seem to favor people who've been on the site the longest (or at least, longer than cortex). So how about a list that only includes cortex's activity after the comparison user signed up, and ranks by percentage of posts where cortex and the comparison user were both active (rather than simple count)?
jessamyn: 28.9% [2065 of 7152]
mathowie: 25.7% [1838 of 7152]
languagehat: 25.1% [1779 of 7094]
stavrosthewonderchicken: 20.1% [1438 of 7152]
Joe Beese: 19.7% [24 of 122]
Alvy Ampersand: 19.2% [1196 of 6230]
mr_crash_davis: 18.9% [1349 of 7141]
klangklangston: 18.0% [1178 of 6546]
delmoi: 17.9% [1281 of 7144]
loquacious: 17.9% [1235 of 6889]
Quite similar to the first list, but not identical. Note that for users who joined before cortex, we're comparing activity in all 7152 of the posts where cortex has been active. But for users who joined up later, we're only comparing against the posts where cortex was active after their joinup date.

Ok, now how about the other way around? Rather than looking at who has been active in all the posts where cortex has been active... how about we look at everyone else's activity? In what percentage of everyone else's posting activity has cortex also been active? This time, limited to only those posts after cortex signed up, since we are doing the comparison the other way around. Also limited to people who have been active in at least 50 posts (otherwise it's all people active in 1 post where cortex was active):
and hosted from Uranus: 57.5% [289 of 503]
Kwine: 53.3% [416 of 781]
Secretariat: 53.0% [35 of 66]
Duncan: 51.0% [77 of 151]
pb: 48.2% [172 of 357]
waraw: 46.3% [165 of 356]
It's Raining Florence Henderson: 45.4% [864 of 1905]
[@I][:+:][@I]: 43.5% [37 of 85]
Dave Faris: 42.5% [605 of 1424]
cgc373: 41.1% [436 of 1061]
I'm kind of surprised to see that this list is completely different from the first two!
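
To make the two directions concrete, here is a minimal sketch that computes both percentages at once. It leaves out the join-date limits and the 50-post cutoff described above, and it assumes activity is available as (user, post_id) pairs. The function name and sample data are hypothetical.

```python
from collections import defaultdict

def co_activity(activity, target):
    """Compare every user's threads with the threads `target` is active in.

    `activity` is an iterable of (user, post_id) pairs.
    Returns user -> (shared posts,
                     share of target's posts the user is also in,
                     share of the user's posts the target is also in).
    """
    posts_by_user = defaultdict(set)
    for user, post_id in activity:
        posts_by_user[user].add(post_id)
    target_posts = posts_by_user.get(target, set())
    result = {}
    for user, posts in posts_by_user.items():
        if user == target:
            continue
        shared = len(posts & target_posts)
        if shared:
            result[user] = (
                shared,
                shared / len(target_posts),
                shared / len(posts),
            )
    return result

# Hypothetical data: "regular" posts widely, "shadow" only where "mod" does.
sample = (
    [("mod", p) for p in (1, 2, 3, 4)]
    + [("regular", p) for p in (1, 2, 5, 6, 7, 8)]
    + [("shadow", p) for p in (3, 4)]
)
overlap = co_activity(sample, "mod")
```

In this toy data both users share two threads with "mod", so they tie in the first direction, but "shadow" scores 100% in the second direction while "regular" scores about a third. That may be part of why the third list looks so different.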

This requires way too much computation to produce some kind of calculation for all users, but if anybody wants one of these lists for their own account instead of cortex's, just MeFiMail me. And let me know if you want the results posted here as a comment, or sent privately.
posted by FishBike at 5:46 PM on June 21, 2009 [4 favorites]


While you're working out the details on this datadump business, I'd like to ask someone to recommend a MetaFilter Contacts yenta who will set me up with some people to link to (and who will perhaps link back to me, too. sigh...)
posted by NikitaNikita at 9:43 PM on June 21, 2009


Ha! Nice. The first two lists definitely read to me as, first and foremost, Metatalk regulars. Longtimers, at that, with Joe Beese as the recent-but-broadly-talkative exception in the second list.

I have a harder time pinning down the final list in general, but it's more interesting what with generally lower raw numbers than the first two. Secretariat, okay, we're married, and pb doesn't comment so much in metatalk but when he does I think it's not uncommon for the two of us to end up dorking out back and forth a little. Flo is hard for me not to pun off of, cgc373 and I seem to just have some dorkiness in common in general, and Dave and I both tended to talk a lot about metatalk policy stuff in general (though as time went on it started to feel more like arguing than discussing a lot of the time).

[@I][:+:][@I] is an IRCer, and I don't think that's his only mefi account but I'm not sure. Maybe there's some IRC-related vector for some of the commonality there?

Kwine I think may be another case of shared-dorkhood in part.

The rest I don't have any first-blush explanations for.
posted by cortex (staff) at 9:46 PM on June 21, 2009


The first two lists definitely read to me as, first and foremost, Metatalk regulars. Longtimers, at that, with Joe Beese as the recent-but-broadly-talkative exception in the second list.

That was my impression too. Although I've only had an account here for a short while, I've been lurking on MetaFilter for years and years. So that was kind of a "hey, I recognize all those names" sort of thing.

The rest I don't have any first-blush explanations for.

I'm relieved there are explanations for many of the names, though. That suggests I haven't completely screwed up the analysis. The queries behind the common activity lists are sufficiently complicated that I can't use the OTLAR test ("oh, that looks about right") as a quality control measure any more.

The thing about being active in the same threads is that it doesn't necessarily suggest agreement or similar world view (the way I think mutual favoriting does). I could imagine appearing on somebody's list because of a thought process like "Oh look, another post from that idiot FishBike. I'd better post a rebuttal." So people who tend to argue fairly often might rank quite highly on the activity in common threads scale.

Yesterday I thought I was out of ideas for this. But I realized this morning, one thing I haven't looked at yet is common favorites--in other words, of the posts and comments I have favorited, who also tends to favorite the same ones? However I would like to control for those people who favorite damn near everything.

I'm thinking something along the lines of treating each user's favorites as a set, A and B, and calculating the ratio of (intersection of A and B) / (union of A and B). With some date range limitations on what gets included in the sets, so that only favorites given during the period both users were signed up are considered. I'll have to see if I can do that for everyone in the database, or if I have to pick an example user again.
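
That intersection-over-union ratio is usually called the Jaccard index. A minimal sketch in Python, assuming each user's favorites are available as a dict of item id to the date the favorite was given (the function name and input shapes are assumptions, not anything from the Infodump):

```python
from datetime import date

def jaccard_favorites(favs_a, favs_b, joined_a, joined_b):
    """Ratio of |A intersect B| to |A union B| over favorites given while both users were signed up.

    favs_a, favs_b: dict mapping item id -> date the favorite was given (hypothetical format).
    joined_a, joined_b: each user's signup date.
    """
    # Only count favorites given once the later of the two users had joined.
    start = max(joined_a, joined_b)
    a = {item for item, when in favs_a.items() if when >= start}
    b = {item for item, when in favs_b.items() if when >= start}

    union = a | b
    if not union:
        return 0.0  # neither user favorited anything in the shared period
    return len(a & b) / len(union)
```

Because the union grows with every favorite either user gives, someone who favorites nearly everything ends up with a large denominator, which is the control for heavy favoriters described above.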
posted by FishBike at 7:38 AM on June 22, 2009 [1 favorite]


Another specific thing you could throw at the shared-favorites view is the question of how heavily-favorited the items that Alice and Bob have co-favorited are.

If A and B have both favorited an item that has only been favorited ten times, that seems more meaningful than them both favoriting something that's been favorited a hundred times, for example.
posted by cortex (staff) at 10:37 AM on June 22, 2009


Is it better to be loved much by one, or a teensy bit by many?
posted by carsonb at 12:01 PM on June 22, 2009


(Statistical analysis on that, pls.)
posted by carsonb at 12:01 PM on June 22, 2009


For the shared favorites stuff, I'm going to start off with the easiest numbers to crunch. First, here's a list of the top 20 pairs of users ranked by shared favorites (that is, the number between their names is a count of how many posts and comments they've both favorited):
tehloki --- [1584] --- Pope Guilty
JHarris --- [1078] --- tehloki
divabat --- [962] --- melorama
ifjuly --- [907] --- divabat
Blazecock Pileon --- [816] --- tehloki
schyler523 --- [795] --- tehloki
scrump --- [764] --- tehloki
tehloki --- [721] --- sebastienbailard
arcticwoman --- [718] --- tehloki
divabat --- [651] --- limeonaire
tehloki --- [625] --- flatluigi
axon --- [623] --- tehloki
perilous --- [614] --- tehloki
nasreddin --- [593] --- tehloki
blueberry --- [588] --- tehloki
lalochezia --- [576] --- tehloki
BrotherCaine --- [565] --- tehloki
divabat --- [564] --- yohko
MikeKD --- [550] --- tehloki
schyler523 --- [532] --- scrump
I notice that 15 of 20 involve tehloki. That suggests perhaps tehloki favorites a lot. Indeed, his profile shows "28670 favorites" right now. Good god, man! So the second list is going to be the same thing, with tehloki disqualified from the running. ;)
divabat --- [962] --- melorama
ifjuly --- [907] --- divabat
divabat --- [651] --- limeonaire
divabat --- [564] --- yohko
schyler523 --- [532] --- scrump
dejah420 --- [507] --- blueberry
ifjuly --- [499] --- melorama
scrump --- [485] --- Pope Guilty
blueberry --- [485] --- schyler523
nicolin --- [475] --- Cassilda
scody --- [459] --- blueberry
blueberry --- [458] --- cashman
SisterHavana --- [450] --- divabat
JHarris --- [446] --- Pope Guilty
schyler523 --- [430] --- Pope Guilty
hot soup girl --- [427] --- divabat
divabat --- [424] --- rmm
nickyskye --- [420] --- nicolin
divabat --- [413] --- flibbertigibbet
ifjuly --- [409] --- yohko
Now divabat shows up on quite a few of the top 20 pairs. Some additional experimenting showed that whenever I exclude the top N who appear in this list, the next version will still show one or two people much more than anyone else. Interestingly, tehloki and divabat don't show up in the top 20 list of shared favorites, so even though they both favorite a lot, they don't tend to favorite the same things.
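
The pair counts above can be sketched roughly like this in Python. This is only an illustration of the idea, not the SQL that produced the lists; the function name, the `(user, item_id)` input format, and the `exclude` parameter are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

def shared_favorite_pairs(favorites, exclude=()):
    """Count, for every pair of users, how many items both have favorited.

    favorites: iterable of (user, item_id) pairs (hypothetical input format).
    exclude:   users to leave out entirely, e.g. known heavy favoriters.
    Returns a Counter keyed by (user_a, user_b) with names in sorted order.
    """
    users_by_item = defaultdict(set)
    for user, item in favorites:
        if user not in exclude:
            users_by_item[item].add(user)

    pair_counts = Counter()
    for users in users_by_item.values():
        # Every pair of users who favorited this item shares one more favorite.
        for pair in combinations(sorted(users), 2):
            pair_counts[pair] += 1
    return pair_counts
```

Calling `shared_favorite_pairs(rows).most_common(20)` would give a top 20 list, and passing `exclude={"tehloki"}` corresponds to the second list. The inner loop does work proportional to the square of the number of favoriters on each item, so heavily favorited items dominate the running time.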

Getting back to the original point of this thread, is a count of shared favorites a good way to find other users that one might have something in common with? Well, here's the top 10 list of cortex's shared favorites (by simple count at this point).
deborah [27]
divabat [27]
tehloki [27]
flibbertigibbet [26]
scrump [26]
Ambrosia Voyeur [25]
graventy [23]
nasreddin [23]
iamkimiam [21]
schyler523 [21]
I notice a lot of the same names as on the previous two lists, so this suggests this is mainly just a list of "people who favorite a lot of stuff". Which is kind of what I thought would happen, so next I'm going to work on that ratio of intersection vs. union idea and see if I can make that work.

I also like cortex's idea of making widely-favorited posts count for 'less' than ones where just two people have favorited them. Perhaps something arbitrary like instead of each shared favorite counting as "1", it counts for 1/sqrt(totalfavorites-1)... if only 2 of you favorited it, it still counts as 1, but if 101 people favorited it, it counts as 0.1
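
A minimal sketch of that weighting in Python, assuming each user's favorites are a set of item ids and that the total favorite count per item is available in a dict (the names here are hypothetical):

```python
import math

def weighted_shared_score(favs_a, favs_b, total_favorites):
    """Sum 1/sqrt(total - 1) over items both users favorited.

    favs_a, favs_b:  sets of item ids favorited by each user.
    total_favorites: dict mapping item id -> total number of favorites on that item.
                     Any shared item has at least 2 favorites, so total - 1 >= 1.
    """
    score = 0.0
    for item in favs_a & favs_b:
        score += 1.0 / math.sqrt(total_favorites[item] - 1)
    return score
```

With this weighting, one item favorited by only the two users in question counts as much as ten shared items that each have 101 favorites.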
posted by FishBike at 12:10 PM on June 22, 2009 [3 favorites]


Ok, slight side trip to answer carsonb's inquiry, since the queries are easy. Here are the top 10 users ranked by largest single favoriter (the "loved much by one" list):
ThePinkSuperhero: 11350 (6953 from single user)
Anonymous: 21392 (890 from single user)
puke & cry: 1519 (587 from single user)
Firas: 708 (330 from single user)
afu: 879 (301 from single user)
cortex: 11563 (291 from single user)
DU: 7587 (287 from single user)
Astro Zombie: 14429 (266 from single user)
quonsar: 2634 (261 from single user)
delmoi: 4713 (235 from single user)
And here are the top 10 users ranked by total number of favorites received, minus those from their biggest single favoriter (the "teensy bit by many" list):
Anonymous: 20502 (subtracted 890 from single user)
Astro Zombie: 14163 (subtracted 266 from single user)
cortex: 11272 (subtracted 291 from single user)
Pastabagel: 9278 (subtracted 105 from single user)
jessamyn: 9098 (subtracted 128 from single user)
Miko: 7500 (subtracted 73 from single user)
loquacious: 7464 (subtracted 139 from single user)
DU: 7300 (subtracted 287 from single user)
Blazecock Pileon: 7072 (subtracted 199 from single user)
nickyskye: 6857 (subtracted 95 from single user)
Now assuming that favorites indicate love, and more is better, then it's the total number of favorites that matter. The average for the first list is 7677.4, while the average for the second list is 10050.6, so clearly it's better to be loved a teensy bit by many than much by one.

If we exclude Anonymous and whatever is going on with ThePinkSuperhero, the difference becomes even more clear.
posted by FishBike at 12:31 PM on June 22, 2009 [3 favorites]


Favorited so hard I peed a little.
posted by carsonb at 12:37 PM on June 22, 2009


So the second list is going to be the same thing, with tehloki disqualified from the running.

tehloki is an outlier on legs, yeah, god bless him.

If we exclude Anonymous and whatever is going on with ThePinkSuperhero, the difference becomes even more clear.

For reasons I cannot recall, davey_darling committed himself to favoriting the unholy hell out of TPS's stuff for a while (possibly still ongoing, I haven't checked in in a while), so, yeah, another thing to put on the outlier list.

It occurs to me that it wouldn't be a terrible idea to put together an actual sheet of data notes for this kind of thing, maybe maintain it on the wiki as an adjunct to the existing datawankery page.
posted by cortex (staff) at 1:20 PM on June 22, 2009


jtron MeFiMailed me to request some user-specific data for his account similar to what I've already posted here. He said it was OK to post here if it was interesting. Since I've pretty much been looking at user-specific results for only cortex so far, data for a second user is automatically interesting. Although not all the reports I tried were interesting due to some low numbers.
Who does jtron favorite the most?
(simple count of favorites)

Astro Zombie [64]
orthogonality [64]
Ambrosia Voyeur [46]
languagehat [32]
Blazecock Pileon [31]
loquacious [27]
Smedleyman [26]
Meatbomb [25]
cortex [25]
Pope Guilty [24]

Who does jtron favorite the most?
(percent of their comments+posts since you joined)
(limited to users you've favorited 5+ times)

3.5% (7 of 199) of anansi's comments+posts
1.7% (5 of 295) of Ragma's comments+posts
1.7% (9 of 532) of ignignokt's comments+posts
1.6% (5 of 306) of Bora Horza Gobuchul's comments+posts
1.5% (8 of 539) of Shepherd's comments+posts
1.4% (12 of 861) of Pater Aletheias's comments+posts
1.3% (5 of 373) of greenie2600's comments+posts
1.2% (18 of 1446) of Avenger's comments+posts
1.1% (15 of 1331) of adipocere's comments+posts
1.1% (6 of 535) of billyfleetwood's comments+posts

Of the threads where jtron has been active, who else has been active in the highest percentage?
(limited to threads active after jtron and the comparison user had both joined MetaFilter)

The Whelk: 31.3% [20 of 64]
Potomac Avenue: 23.4% [15 of 64]
Astro Zombie: 22.3% [120 of 538]
turgid dahlia: 21.0% [34 of 162]
cortex: 20.1% [126 of 626]
Blazecock Pileon: 19.6% [107 of 545]
DU: 18.7% [78 of 418]
filthy light thief: 18.6% [11 of 59]
languagehat: 18.4% [115 of 626]
klangklangston: 18.0% [107 of 595]

Of threads where other users are active, in whose has jtron also been the most active by percentage?
(limited to threads active after jtron and the comparison user had both joined MetaFilter)

daniel_charms: 12.5% [8 of 64]
vapidave: 12.4% [12 of 97]
Combustible Edison Lighthouse: 11.1% [7 of 63]
Minus215Cee: 11.1% [6 of 54]
Reggie Knoble: 10.9% [7 of 64]
every_one_needs_a_hug_sometimes: 10.5% [6 of 57]
saslett: 10.1% [7 of 69]
Donnie VandenBos: 9.6% [5 of 52]
podwarrior: 9.4% [5 of 53]
Flex1970: 9.3% [8 of 86]

Who has favorited the same items as jtron the most?

tehloki [313]
Pope Guilty [252]
scrump [169]
schyler523 [152]
blueberry [133]
JHarris [129]
lalochezia [128]
RussHy [124]
nasreddin [122]
BrotherCaine [118]
Getting fancier with the shared favorites is proving to be more computationally intensive than I expected, so I don't have anything but the straight numbers of shared favorites to show yet.
posted by FishBike at 2:29 PM on June 22, 2009 [2 favorites]


I've received a couple of other requests now, the first from Juliet Banana. Here's what I could get that was interesting:
Of the threads where Juliet Banana has been active, who else has been active in the highest percentage?
(limited to threads active after Juliet Banana and the comparison user had both joined MetaFilter)
Anonymous: 15.7% [44 of 281]
Ambrosia Voyeur: 12.2% [21 of 172]
ikkyu2: 8.5% [24 of 281]
Forktine: 8.3% [12 of 145]
bassjump: 8.2% [11 of 134]
jessamyn: 8.2% [23 of 281]
klangklangston: 8.2% [23 of 281]
desjardins: 7.9% [11 of 139]
thinkingwoman: 7.7% [14 of 183]
gjc: 7.2% [5 of 69]

Of threads where other users are active, in whose has Juliet Banana also been the most active by percentage?
(limited to threads active after Juliet Banana and the comparison user had both joined MetaFilter)


Chimp: 6.0% [3 of 50]
zippity: 6.0% [4 of 67]
rorycberger: 5.9% [5 of 85]
JackarypQQ: 5.8% [3 of 52]
JDC8: 5.0% [5 of 101]
dorisfromregopark: 4.0% [2 of 50]
toaster: 4.0% [2 of 50]
TurkishGolds: 3.9% [4 of 102]
roundrock: 3.9% [2 of 51]
phrits: 3.8% [3 of 78]
(I left out all the favorites-based stats because there don't seem to be enough of them in the Infodump to get anything useful... still trying to figure out if that is really right or if it's a sign of a technical problem somewhere.)
posted by FishBike at 5:04 PM on June 22, 2009


And here are some stats for carsonb:
Who does carsonb favorite the most?
(simple count of favorites)

loquacious [21]
cortex [17]
madamjujujive [13]
y2karl [13]
It's Raining Florence Henderson [12]
Kattullus [12]
goodnewsfortheinsane [11]
klangklangston [11]
jonson [10]
Ambrosia Voyeur [9]

Who favorites carsonb the most?
(simple count of favorites)

tehloki [42]
muymuy [16]
nicolin [15]
nickyskye [14]
madamjujujive [11]
not_on_display [11]
vronsky [11]
DevilsAdvocate [10]
jack_mo [10]
melorama [9]

Who are carsonb's top 10 mutual favoriters?
(by simple count of whoever has favorited the other the least)

carsonb [13] ---- [11] madamjujujive
carsonb [21] ---- [8] loquacious
carsonb [7] ---- [14] nickyskye
carsonb [4] ---- [7] flapjax at midnite
carsonb [8] ---- [4] miss lynnster
carsonb [12] ---- [4] Kattullus
carsonb [4] ---- [11] not_on_display
carsonb [9] ---- [4] mathowie
carsonb [4] ---- [11] vronsky
carsonb [3] ---- [6] katillathehun

Who does carsonb favorite the most?
(percent of their comments+posts since you joined)
(limited to users you've favorited 5+ times)

0.37% (12 of 3267) of Kattullus's comments+posts
0.30% (13 of 4405) of madamjujujive's comments+posts
0.24% (12 of 4967) of It's Raining Florence Henderson's comments+posts
0.22% (21 of 9568) of loquacious's comments+posts
0.22% (11 of 5058) of goodnewsfortheinsane's comments+posts
0.19% (10 of 5250) of jonson's comments+posts
0.18% (9 of 5129) of Anonymous's comments+posts
0.17% (6 of 3603) of Wolfdog's comments+posts
0.17% (9 of 5433) of Ambrosia Voyeur's comments+posts
0.14% (5 of 3549) of tellurian's comments+posts

Who favorites carsonb the most?
(percent of your comments+posts since they joined)

tehloki: 1.73% (42 of 2434) of carsonb's comments+posts
orrnyereg: 1.01% (4 of 395) of carsonb's comments+posts
MsCoco@6:58: 1.01% (4 of 395) of carsonb's comments+posts
not_on_display: 0.92% (11 of 1192) of carsonb's comments+posts
HFSH: 0.83% (1 of 121) of carsonb's comments+posts
pyrex: 0.74% (7 of 951) of carsonb's comments+posts
nicolin: 0.73% (15 of 2065) of carsonb's comments+posts
muymuy: 0.68% (16 of 2364) of carsonb's comments+posts
phllip.phillip: 0.66% (2 of 305) of carsonb's comments+posts
Cassilda: 0.59% (7 of 1192) of carsonb's comments+posts

Of the threads where carsonb has been active, who else has been active in the highest percentage?
(limited to threads active after carsonb and the comparison user had both joined MetaFilter)

cortex: 34.8% [746 of 2143]
jessamyn: 24.3% [520 of 2143]
languagehat: 21.4% [445 of 2083]
mathowie: 19.1% [409 of 2143]
klangklangston: 18.0% [368 of 2046]
quin: 18.0% [374 of 2083]
Alvy Ampersand: 17.9% [356 of 1985]
stavrosthewonderchicken: 17.4% [373 of 2143]
loquacious: 17.2% [353 of 2053]
delmoi: 16.6% [353 of 2129]

Of threads where other users are active, in whose has carsonb also been the most active by percentage?
(limited to threads active after carsonb and the comparison user had both joined MetaFilter)

pb: 18.3% [65 of 356]
Kwine: 17.2% [134 of 781]
and hosted from Uranus: 16.7% [84 of 503]
Duncan: 15.9% [24 of 151]
plant: 15.9% [10 of 63]
phoque: 15.0% [18 of 120]
vapidave: 14.3% [14 of 98]
psmith: 14.1% [19 of 135]
dorisfromregopark: 14.0% [7 of 50]
h00py: 14.0% [35 of 250]

Who has favorited the same items as carsonb the most?

nicolin [71]
nickyskye [60]
flibbertigibbet [56]
ifjuly [54]
divabat [49]
tickingclock [46]
roll truck roll [45]
Cassilda [42]
Haruspex [41]
flapjax at midnite [40]
posted by FishBike at 5:24 PM on June 22, 2009 [3 favorites]


If you're taking requests, FishBike, I'd love to see some data with my name in it.
posted by box at 5:29 PM on June 22, 2009


FishBike, that's amazing. The stats, yes, but also seeing my name so many times. Glorious. In fact, you may have opened Pandora's box with this venture. Good luck handling the memail/data flood.
posted by carsonb at 5:51 PM on June 22, 2009


This is like porn to me. Thanks again, FishBike.
posted by jtron at 5:54 PM on June 22, 2009


Thanks folks, I am presently trying to automate the process a little more, so that it can be a single cut-and-paste affair rather than the "multiple cut and paste with adjusting of headings" process that it is right now.

I trust that some moderator type person will let me know if this per-user stuff is getting out of hand, at which point I can start sending people stuff by MeFiMail instead of posting it here.
posted by FishBike at 6:10 PM on June 22, 2009 [1 favorite]


How fun is this!? Where can I get in line?
posted by iamkimiam at 6:31 PM on June 22, 2009


Ok box, here are your stats. This is pretty much everything I can generate, too, because it's now all in one script that just runs all the queries in sequence. Although all these numbers might be completely wrong, I am rather chuffed at getting SQL Server to generate output suitable for pasting into a MetaFilter comment box without further editing.
Who does box favorite the most?
(simple count of favorites)

Anonymous [60]
jessamyn [38]
flapjax at midnite [19]
ThePinkSuperhero [16]
klangklangston [15]
Miko [15]
cashman [14]
ericb [13]
mathowie [13]
Astro Zombie [12]

Who does box favorite the most?
(percent of their comments+posts since you joined)
(limited to users you've favorited 5+ times)

2.97% (9 of 303) of sixcolors's comments+posts
1.19% (60 of 5038) of Anonymous's comments+posts
0.73% (8 of 1096) of middleclasstool's comments+posts
0.69% (14 of 2038) of cashman's comments+posts
0.68% (8 of 1181) of blahblahblah's comments+posts
0.66% (5 of 757) of four panels's comments+posts
0.64% (5 of 776) of escabeche's comments+posts
0.64% (5 of 778) of Metroid Baby's comments+posts
0.51% (5 of 978) of jbickers's comments+posts
0.51% (5 of 980) of DecemberBoy's comments+posts

Who favorites box the most?
(simple count of favorites)

Rock Steady [12]
not_on_display [10]
blueberry [9]
limeonaire [9]
cashman [8]
orrnyereg [7]
Dreama [7]
tehloki [6]
Nattie [6]
klangklangston [6]

Who favorites box the most?
(percent of your comments+posts since they joined)

Joe Beese: 3.81% (4 of 105) of box's comments+posts
orrnyereg: 1.07% (7 of 657) of box's comments+posts
clark: 0.95% (1 of 105) of box's comments+posts
secondhand: 0.95% (1 of 105) of box's comments+posts
quosimosaur: 0.95% (1 of 105) of box's comments+posts
abc123xyzinfinity: 0.57% (1 of 174) of box's comments+posts
rachaelfaith: 0.57% (1 of 174) of box's comments+posts
not_on_display: 0.55% (10 of 1809) of box's comments+posts
PhoBWanKenobi: 0.46% (3 of 657) of box's comments+posts
Nattie: 0.44% (6 of 1354) of box's comments+posts

Who are box's top 10 mutual favorites?
(by simple count of whoever has favorited the other the least)

box [14] ---- [8] cashman
box [6] ---- [10] not_on_display
box [15] ---- [6] klangklangston
box [38] ---- [4] jessamyn
box [11] ---- [4] Kattullus
box [4] ---- [4] Alvy Ampersand
box [15] ---- [4] Miko
box [5] ---- [4] phrontist
box [4] ---- [4] Pope Guilty
box [5] ---- [3] kittens for breakfast

Who are box's top 10 mutual favorites?
(by percentage favorited of others' posts since joining)

box [0.69%] ---- [0.35%] cashman
box [0.33%] ---- [0.28%] blueberry
box [0.27%] ---- [0.32%] limeonaire
box [1.19%] ---- [0.26%] streetdreams
box [0.23%] ---- [0.55%] not_on_display
box [1.49%] ---- [0.20%] polexa
box [0.36%] ---- [0.19%] wongcorgi
box [0.16%] ---- [0.18%] Solomon
box [0.18%] ---- [0.16%] mudpuppie
box [0.29%] ---- [0.15%] milarepa

Of the threads where box has been active, who else has been active in the highest percentage?
(limited to threads active after the comparison user has joined MetaFilter)

Joe Beese: 22.7% [22 of 97]
jessamyn: 12.1% [300 of 2475]
turgid dahlia: 11.1% [83 of 747]
cortex: 10.7% [265 of 2475]
The Whelk: 10.6% [32 of 302]
Fuzzy Skinner: 9.8% [86 of 881]
klangklangston: 9.6% [229 of 2374]
Blazecock Pileon: 8.3% [174 of 2094]
languagehat: 8.0% [198 of 2475]
quin: 8.0% [198 of 2475]

Of the threads where other users have been active, in whose has box also been the most active by percentage?
(limited to threads active after box has joined MetaFilter)

DelusionsofGrandeur: 16.0% [12 of 75]
every_one_needs_a_hug_sometimes: 15.8% [9 of 57]
johnofjack: 14.1% [9 of 64]
Pax: 13.9% [24 of 173]
studentbaker: 13.2% [16 of 121]
An Infinity Of Monkeys: 13.1% [24 of 183]
doncoyote: 12.5% [7 of 56]
Balonious Assault: 12.3% [7 of 57]
punchdrunkhistory: 12.2% [14 of 115]
waraw: 12.1% [43 of 356]

Who has favorited the same items as box the most?

divabat [260]
ifjuly [178]
limeonaire [172]
melorama [169]
yohko [136]
flibbertigibbet [121]
LobsterMitten [113]
nicolin [106]
agregoli [104]
tehloki [93]
posted by FishBike at 6:33 PM on June 22, 2009 [1 favorite]


Of the threads where box has been active, who else has been active in the highest percentage?

I'm starting to suspect that most of the time the presence of my name on this list is a roundabout way of saying that the user in question spends a fair amount of time in Metatalk in general. I'm probably a significant outlier myself in terms of Metatalk commenting activity (jessamyn too).
posted by cortex (staff) at 6:45 PM on June 22, 2009


Stats for: iamkimiam
Who does iamkimiam favorite the most?
(simple count of favorites)

cortex [51]
Miko [49]
jessamyn [39]
Pastabagel [31]
fourcheesemac [28]
Astro Zombie [24]
Anonymous [24]
grumblebee [24]
ericb [22]
mathowie [22]

Who does iamkimiam favorite the most?
(percent of their comments+posts since you joined)
(limited to users you've favorited 5+ times)

2.82% (5 of 177) of tractorfeed's comments+posts
2.79% (7 of 251) of neroli's comments+posts
2.62% (5 of 191) of DaShiv's comments+posts
2.27% (12 of 528) of billyfleetwood's comments+posts
1.58% (8 of 507) of Secret Life of Gravy's comments+posts
1.54% (6 of 389) of L. Fitzgerald Sjoberg's comments+posts
1.48% (5 of 338) of shiu mai baby's comments+posts
1.45% (12 of 830) of Dee Xtrovert's comments+posts
1.35% (5 of 370) of Naberius's comments+posts
1.23% (31 of 2524) of Pastabagel's comments+posts

Who favorites iamkimiam the most?
(simple count of favorites)

nasreddin [14]
roll truck roll [12]
divabat [10]
limeonaire [10]
elisynn [9]
ifjuly [9]
melorama [8]
chicainthecity [7]
flibbertigibbet [7]
sondrialiac [7]

Who favorites iamkimiam the most?
(percent of your comments+posts since they joined)

iamkimiam: 1.01% (17 of 1690) of iamkimiam's comments+posts
St. Alia of the Bunnies: 0.97% (1 of 103) of iamkimiam's comments+posts
Philby: 0.97% (1 of 103) of iamkimiam's comments+posts
MeowForMangoes: 0.97% (1 of 103) of iamkimiam's comments+posts
illenion: 0.97% (1 of 103) of iamkimiam's comments+posts
nasreddin: 0.83% (14 of 1696) of iamkimiam's comments+posts
orrnyereg: 0.78% (4 of 514) of iamkimiam's comments+posts
roll truck roll: 0.71% (12 of 1696) of iamkimiam's comments+posts
liza: 0.70% (7 of 1003) of iamkimiam's comments+posts
Solon and Thanks: 0.63% (4 of 638) of iamkimiam's comments+posts

Who are iamkimiam's top 10 mutual favorites?
(by simple count of whoever has favorited the other the least)

iamkimiam [49] ---- [6] Miko
iamkimiam [6] ---- [12] roll truck roll
iamkimiam [8] ---- [5] cashman
iamkimiam [11] ---- [5] nickyskye
iamkimiam [9] ---- [5] shmegegge
iamkimiam [5] ---- [7] tractorfeed
iamkimiam [12] ---- [5] Blazecock Pileon
iamkimiam [6] ---- [5] madamjujujive
iamkimiam [6] ---- [4] Solon and Thanks
iamkimiam [4] ---- [14] nasreddin

Who are iamkimiam's top 10 mutual favorites?
(by percentage favorited of others' posts since joining)

iamkimiam [1.01%] ---- [1.01%] iamkimiam
iamkimiam [0.59%] ---- [0.53%] terranova
iamkimiam [0.50%] ---- [0.63%] Solon and Thanks
iamkimiam [0.49%] ---- [0.71%] roll truck roll
iamkimiam [2.82%] ---- [0.41%] tractorfeed
iamkimiam [0.54%] ---- [0.41%] flibbertigibbet
iamkimiam [0.40%] ---- [0.70%] liza
iamkimiam [4.88%] ---- [0.39%] electrasteph
iamkimiam [0.82%] ---- [0.38%] papafrita
iamkimiam [1.19%] ---- [0.35%] Miko

Of the threads where iamkimiam has been active, who else has been active in the highest percentage?
(limited to threads active after the comparison user has joined MetaFilter)

cortex: 21.1% [246 of 1164]
jessamyn: 21.0% [244 of 1164]
Brandon Blatcher: 17.1% [199 of 1164]
quin: 16.7% [194 of 1164]
turgid dahlia: 16.4% [78 of 476]
languagehat: 15.5% [181 of 1164]
Ambrosia Voyeur: 14.0% [163 of 1164]
Blazecock Pileon: 13.4% [156 of 1164]
blue_beetle: 12.2% [142 of 1164]
ThePinkSuperhero: 12.2% [142 of 1164]

Of the threads where other users have been active, in whose has iamkimiam also been the most active by percentage?
(limited to threads active after iamkimiam has joined MetaFilter)

every_one_needs_a_hug_sometimes: 25.0% [13 of 52]
Combustible Edison Lighthouse: 19.4% [12 of 62]
sambosambo: 16.9% [20 of 118]
shiu mai baby: 15.2% [12 of 79]
Surfurrus: 14.9% [13 of 87]
julen: 13.7% [7 of 51]
JoeXIII007: 13.5% [7 of 52]
tangerine: 13.3% [24 of 180]
OneOliveShort: 13.2% [7 of 53]
educatedslacker: 13.2% [7 of 53]

Who has favorited the same items as iamkimiam the most?

schyler523 [303]
tehloki [297]
flibbertigibbet [275]
scrump [272]
blueberry [248]
graventy [189]
Pope Guilty [187]
LobsterMitten [183]
madamjujujive [179]
shmegegge [174]

posted by FishBike at 6:54 PM on June 22, 2009 [2 favorites]


I'm probably a significant outlier myself in terms of Metatalk commenting activity (jessamyn too).

I have also been wondering to what extent your moderator status skews the common threads statistics? In the sense that I think you are statistically more likely to show up in those incandescent threads (those that generate more heat than light) and maybe certain users are also statistically more likely to be involved in those?
posted by FishBike at 6:54 PM on June 22, 2009 [1 favorite]


Also, when Chocolate Pickle said this is computationally extremely difficult, that was true. This takes about 9-10 minutes to run for one user. (Twice as long when I forget to change the user name and re-run box's a second time, d'oh!)

I can't imagine trying to pre-compute all these stats for the 40,000 active users, nor trying to do it on the fly from a button on the profile page or something. It could probably be made faster with some effort at optimization and caching some more derivative data... but that's like being at work.
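
To put rough numbers on that, here is a back-of-the-envelope calculation in Python. The 9.5 minutes is just the midpoint of the "9-10 minutes" figure above, and running the users one after another on a single machine is an assumption.

```python
MINUTES_PER_USER = 9.5   # midpoint of the 9-10 minutes per user quoted above
ACTIVE_USERS = 40_000    # active user count quoted above

# Time to run the per-user script once for everyone, one user after another.
total_minutes = MINUTES_PER_USER * ACTIVE_USERS
total_days = total_minutes / (60 * 24)

# Number of distinct user pairs a full all-against-all comparison would cover.
user_pairs = ACTIVE_USERS * (ACTIVE_USERS - 1) // 2

print(f"{total_minutes:,.0f} minutes, about {total_days:.0f} days of sequential runs")
print(f"{user_pairs:,} distinct user pairs")
```

That works out to roughly 264 days of continuous computation for a single pass, and just under 800 million user pairs.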
posted by FishBike at 6:58 PM on June 22, 2009 [1 favorite]


Oh how embarrassing. I'm my biggest fan.
posted by iamkimiam at 7:04 PM on June 22, 2009


Oh, and thanks. :D
posted by iamkimiam at 7:04 PM on June 22, 2009


Thanks kindly for the stats, FishBike.

After seeing all this data, I'm doing a lot of wondering about favorites, and how people use 'em.
posted by box at 7:08 PM on June 22, 2009


(Hiya, favoriters! I think you're all awesome!)
posted by box at 7:11 PM on June 22, 2009 [1 favorite]


...but that's like being at work.

Huh. I was wondering whether or not this was something you did for a living.

I was maybe hoping you were actually a crocodile wrangler leading a secret double life filled with quantitative analysis.
posted by elizardbits at 7:17 PM on June 22, 2009


I think people use favorites in two distinctly different ways:
  • As bookmarks for posts or comments that they like or find useful, so they can go back and browse through them and refer to them again later.
  • As a show of support for the post or comment, with no intent to ever go back and look at it later.
They're kind of similar though, both indications that "I like this." I sort of suspect most users enjoy receiving favorites, but I gather this is not seen as a particularly good thing as it might lead people to adjust their posting style towards whatever gets them the most favorites. I'm trying not to care so much about my favorites count, but I am definitely susceptible to the sense of validation that comes with receiving them.

Anyway, I'm mentioning this because there are all sorts of other favorites-based statistics that could be calculated. Who has the most? Who averages the most per day since they signed up? Who averages the most per posting? And because I suspect treating the favorites count as a kind of competition is seen as a generally bad thing, I deliberately avoided calculating anything like that.

Then I posted some stats in reply to carsonb's (I'm assuming joking) inquiry and ended up breaking my own self-imposed guideline. Oops. Hopefully the end justifies the means in that case.
posted by FishBike at 7:20 PM on June 22, 2009 [1 favorite]


I was wondering whether or not this was something you did for a living.

Well, sort of. SQL database work used to be an official part of my job description, but it isn't any more. Nevertheless I still find myself getting sucked back into it, especially the performance analysis and optimization stuff, I suppose because I am halfway decent at it.

I especially get sucked into it when people blame stuff that is in my job description (servers and networks) for the poor performance of their application. It is sometimes handy to be able to analyze the problem, and prove that it's due to database design or inefficient code by solving it, rather than just arguing that it isn't the server or the network.

Actually that's not in my job description any more either, as I am theoretically supposed to be managing the people who do that stuff, and not doing it myself any more. But that is still a highly theoretical thing, which is OK because it's still fun in moderation. As proof of that I offer the following observation: I'm actually on vacation at the moment.
posted by FishBike at 7:31 PM on June 22, 2009 [1 favorite]


I use Favorites as a form of positive reinforcement: people like getting them, so I figure that if I give them to comments I like, that will serve as a motivating force for more comments I like to be made.
posted by Pope Guilty at 10:33 AM on June 23, 2009 [1 favorite]


I really enjoy it when people break rules on my behalf. Thank you, FishBike, you're a low-down stand-up kind of guy. ;)
posted by carsonb at 12:07 PM on June 24, 2009


FishBike,

To show my appreciation for all of your hard work, I just tehlokied all of your comments in this thread. If you rerun your queries after the next infodump, please rest assured that I have no interest in finding out your home address :)
posted by double block and bleed at 6:05 PM on June 28, 2009


Heh, thanks, I noticed.

I am still keeping an eye on this thread for any further requests, too. I wonder if regular updates of the Infodump are in the works?
posted by FishBike at 6:16 PM on June 28, 2009

