MeFi clustering analysis.
I ran a clustering algorithm on some data from the MeFi InfoDump, matching each user with the tags used in her posts. I created a dendrogram (same image as in the first link). Here's a plain text version that's not as pretty but is searchable.
The data was pared down by selecting the users with 25 or more posts and the tags that were used at least 10% as often as the most-used tag, which gives me 556 users and 80 tags.
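
A minimal sketch of that paring-down step, not the actual extraction script. It assumes the posts are already available as (user, tags) pairs, one per post, and that tag usage is counted over all posts before any users are dropped. The function name, the parameter names and the data layout are all hypothetical.

```python
from collections import Counter, defaultdict

def build_user_tag_counts(posts, min_posts=25, tag_fraction=0.10):
    """Filter posts into a user -> tag -> count table.

    posts: iterable of (user, [tags]) pairs, one per post (assumed layout).
    Returns (table, kept_tags).
    """
    posts = list(posts)
    posts_per_user = Counter(user for user, _ in posts)
    tag_totals = Counter(tag for _, tags in posts for tag in tags)
    if not tag_totals:
        return {}, []

    # Keep tags used at least tag_fraction as often as the most-used tag.
    cutoff = tag_fraction * max(tag_totals.values())
    kept_tags = sorted(t for t, n in tag_totals.items() if n >= cutoff)
    kept_set = set(kept_tags)

    table = defaultdict(Counter)
    for user, tags in posts:
        if posts_per_user[user] < min_posts:
            continue  # user has too few posts
        row = table[user]  # creates the row even if no kept tag appears
        for tag in tags:
            if tag in kept_set:
                row[tag] += 1
    return dict(table), kept_tags
```

Calling it with min_posts=5 and tag_fraction=0.01 would correspond to the looser thresholds, under the same assumptions.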
I also inverted the data and clustered the tags themselves, which gives some idea of thematic areas.
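
Inverting the data here just means swapping rows and columns, so each tag gets a vector of per-user counts instead of the other way round. A small sketch, assuming the data is held as a nested dict of user -> tag -> count (a hypothetical layout, not necessarily what the script uses):

```python
def invert(table):
    """Turn a user -> tag -> count table into tag -> user -> count."""
    inverted = {}
    for user, tag_counts in table.items():
        for tag, count in tag_counts.items():
            inverted.setdefault(tag, {})[user] = count
    return inverted
```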
If I accept users with >= 5 posts, and tags used >= 1% as often as the most-used tag, I get 2268 users and 1274 tags, and a very tall dendrogram (and plain text).
The Python script I used to extract the data from the dumps is here. The clustering algorithm was taken from Programming Collective Intelligence, and the clustering & dendrogram drawing script is available from the author's site, under chapter 3.
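
For anyone who wants the general idea without downloading anything, here is a rough sketch of bottom-up (agglomerative) clustering. It is not the book's code. The Pearson-based distance, the averaging of merged vectors and the nested-tuple tree are assumptions chosen for illustration, and the function names are made up.

```python
from math import sqrt

def pearson_distance(v1, v2):
    """1 minus the Pearson correlation; 0 means perfectly correlated."""
    n = len(v1)
    sum1, sum2 = sum(v1), sum(v2)
    sum1_sq = sum(x * x for x in v1)
    sum2_sq = sum(x * x for x in v2)
    p_sum = sum(x * y for x, y in zip(v1, v2))
    num = p_sum - sum1 * sum2 / n
    var1 = max(sum1_sq - sum1 ** 2 / n, 0.0)
    var2 = max(sum2_sq - sum2 ** 2 / n, 0.0)
    den = sqrt(var1 * var2)
    if den == 0:
        return 1.0  # a constant vector is treated as uncorrelated
    return 1.0 - num / den

def hcluster(rows, labels, distance=pearson_distance):
    """Repeatedly merge the two closest clusters.

    Returns a nested tuple: a leaf is a label, a merge is (left, right).
    """
    clusters = [(label, list(vec)) for label, vec in zip(labels, rows)]
    if not clusters:
        return None
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        tree_i, vec_i = clusters[i]
        tree_j, vec_j = clusters[j]
        # The merged cluster is represented by the average of the two vectors.
        merged_vec = [(a + b) / 2.0 for a, b in zip(vec_i, vec_j)]
        del clusters[j]  # j > i, so remove j first
        del clusters[i]
        clusters.append(((tree_i, tree_j), merged_vec))
    return clusters[0][0]
```

Each row would be one user's tag counts (or one tag's user counts for the inverted version). The pairwise search is recomputed on every pass, so this is slow on a couple of thousand rows; caching distances would help.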
posted by signal to MetaFilter-Related at 9:01 PM (81 comments total)
6 users marked this as a favorite
Whoa.
posted by Miko at 9:05 PM on July 22, 2008 [1 favorite]