Carnegie Mellon Study Ranks Most Informative Blogs October 30, 2007 8:12 AM

Carnegie Mellon Study Ranks the 100 Most Informative Blogs. MetaFilter ranks as #33.
posted by ericb to MetaFilter-Related at 8:12 AM (33 comments total) 4 users marked this as a favorite

for the uberlazy, here's the list. My question is how many of the other informative bloggers are also metafilter members?
posted by jessamyn (staff) at 8:19 AM on October 30, 2007

I am uber-uber lazy. Links plz.
posted by hydrophonic at 8:24 AM on October 30, 2007 [1 favorite]

informative?
posted by mathowie (staff) at 8:28 AM on October 30, 2007 [1 favorite]

You don't expect me to copy and paste all those, do you!?
posted by Alvy Ampersand at 8:29 AM on October 30, 2007


their guiding question is: "Which blogs should one read to be most up to date, i.e., to quickly know about important stories that propagate over the blogosphere?" Huh? What blogs should you read to efficiently learn about what blogs are talking about?

Their algorithm uses number of posts, number of inlinks and outlinks from other blogs, and number of outlinks in general. So the resulting list is not the blogs that are the "most informative," but seems to be the blogs that are most connected to others. So it's a popularity contest. Kinda explains how Malkin came in #5. In fact, there sure seem to be a lot of conservative blogs at the top of the list.

In all, seems more like a quantification of echo-chamber value than quality of content.
posted by googly at 8:30 AM on October 30, 2007 [1 favorite]

hydrophonic writes "I am uber-uber lazy. Links plz."
  1. Instapundit
  2. Don Surber
  3. Science & Politics
  4. Watcher of Weasels
  5. Michelle Malkin
  6. National Journal's Blogometer
  7. The Modulator
  9. Boing Boing
  10. Atrios
  11. A Blog for All
  12. Gothamist
  13. mparent777
  14. TFS Magnum
  15. Alliance of Free Blogs
  17. Micropersuasion
  18. Pajamas Media
  19. BlogHer
  20. The Jawa Report
  21. Reddit
  22. Soccer Dad
  23. Nose on Your Face
  24. aHistorically
  25. The Anchoress
  26. AmericaBlog
  27. SFist
  28. TBogg
  29. HorsePigCow
  30. Why Homeschool
  31. The Daou Report
  32. Sisu
  33. MetaFilter
  34. Megite
  35. LAist
  36. Captain's Quarters
  37. Shakesville
  38. Guy Kawasaki
  39. Lucy by Lucy
  40. Blue Star Chronicle
  41. Official Google Blog
  42. The Glittering Eye
  44. Read/WriteWeb
  45. Hullabaloo
  46. The Conservative Cat
  47. Phillyist
  48. The Social Customer Manifesto
  49. The Next Net
  50. Gateway Pundit
  51. Crooks and Liars
  52. Right Wing News
  53. 10,000 Birds
  54. O'Reilly Radar
  55. Cowboy Blog
  56. Business Opportunities Weblog
  57. DCist
  58. Creating Passionate Users
  59. Citizens For Legitimate Government
  60. What About Clients?
  61. Rough Type
  62. The Unofficial Apple Weblog
  63. Dans la cuisine d'Audinette
  64. The London Fog
  65. Bostonist
  66. Seattlest
  67. Austinist
  68. Indian Writing
  69. Power Line
  70. Firedoglake
  71. Blog d'Elisson
  72. Rhymes With Right
  73. Written World
  74. The Jeff Pulver Blog
  75. blog d'eMeRY
  76. Hugh MacLeod's gapingvoid
  77. Catymology
  78. Hugh Hewitt
  79. Lifehacker
  81. Econbrowser
  82. A Socialite's Life
  83. Gates of Vienna
  86. A Life Restarted
  87. The Volokh Conspiracy
  88. See Also...
  89. Dr. Sanity
  90. Mudville Gazette
  92. Privacy Digest
  93. Londonist
  94. Shanghaiist
  95. Catholic and Enjoying It
  96. Single Serve Coffee
  97. Jeremy Zawodny's blog
  98. ScienceBlogs
  99. Basic Thinking Blog
  100. Scobleizer

posted by Mitheral at 8:30 AM on October 30, 2007 [5 favorites]

The list with each entry being hyperlinked (BTW -- it's the first hyperlink in the FPP).
posted by ericb at 8:32 AM on October 30, 2007

Man, those weasels work fast.
posted by y2karl at 8:33 AM on October 30, 2007

googly writes "In fact, there sure seem to be a lot of conservative blogs at the top of the list. "

Also the list is very American content heavy. I wonder if it is proportional to internet consumption or if the US is disproportionally represented in the blogosphere.
posted by Mitheral at 8:35 AM on October 30, 2007

I am kind of interested by their second take on "most influential", which penalizes blogs with large numbers of posts. Lot of lesser-known, disproportionately influential, smaller sites on there.
posted by ormondsacker at 8:42 AM on October 30, 2007

A recent Carnegie Mellon study used higher mathematics to answer the question: if you want to be informed about what the entire blogosphere is talking about, but you can only read 100 blogs (out of the millions available), which blogs should you read?

Based on that, how did single serve coffee make it on to the list?
posted by birdlady at 8:47 AM on October 30, 2007
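
An illustrative aside on the question quoted above: "which 100 blogs should you read" can be treated as a budgeted selection problem. The following is a minimal sketch, assuming each story is modeled as the set of blogs that mentioned it, and using plain greedy maximum coverage. The function name `pick_blogs` and the toy data are invented for illustration; this is not necessarily how the study computed its list.

```python
def pick_blogs(stories, budget):
    """Greedy maximum coverage: repeatedly pick the blog that covers the most
    stories not yet covered by the blogs already chosen."""
    # Invert the input: blog -> set of story ids that the blog mentioned.
    coverage = {}
    for story_id, blogs in stories.items():
        for blog in blogs:
            coverage.setdefault(blog, set()).add(story_id)

    chosen, covered = [], set()
    while len(chosen) < budget:
        # Sort names so that ties are broken deterministically.
        candidates = sorted(b for b in coverage if b not in chosen)
        if not candidates:
            break
        best = max(candidates, key=lambda b: len(coverage[b] - covered))
        if not coverage[best] - covered:
            break  # no remaining blog adds a new story
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


# Toy data, invented for illustration: story id -> blogs that mentioned it.
toy_stories = {
    "s1": {"blog_a", "blog_b"},
    "s2": {"blog_a"},
    "s3": {"blog_a", "blog_c"},
    "s4": {"blog_c"},
    "s5": {"blog_d"},
}
chosen, covered = pick_blogs(toy_stories, budget=2)
print(chosen, sorted(covered))
```

Under this kind of objective a niche blog can make the cut simply because it is the only one in the dataset covering some stories, which may be one way a small single-topic site ends up on such a list.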

Also, as is the custom here, here is a GreaseMonkey Script to turn plain text urls into links.
posted by blue_beetle at 8:48 AM on October 30, 2007
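
For anyone curious what "turn plain text urls into links" involves, here is a rough sketch of the idea. It is written in Python rather than as a browser userscript, and it is not blue_beetle's script: the regular expression and the `linkify` helper are assumptions made for illustration, and real-world URL matching has many more edge cases.

```python
import html
import re

# Deliberately simple pattern: http(s) URLs up to whitespace, quotes or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def linkify(text):
    """Return HTML in which each plain-text URL is wrapped in an <a> tag."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:!?)")
        end = match.start() + len(url)
        parts.append(html.escape(text[last:match.start()]))
        parts.append('<a href="{0}">{1}</a>'.format(html.escape(url), html.escape(url)))
        last = end
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(linkify("See http://example.com/list, then reply."))
```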

Oh, I was joking, but thanks, Mitheral and blue_beetle.
posted by hydrophonic at 9:06 AM on October 30, 2007

How come Metafilter is seventeen places behind, which is nothing more than a spam site?
posted by Aloysius Bear at 9:11 AM on October 30, 2007

Number 3 is dead.
posted by weapons-grade pandemonium at 9:18 AM on October 30, 2007

Number 33 is dead to me.
posted by found missing at 9:34 AM on October 30, 2007

On a quick scan, seems very weighted towards the US and US-centric concerns, which to my mind would make it less informative for American readers than venturing further afield. One of the things I enjoy most about MeFi is reading about things I'd not even heard of before.
posted by Abiezer at 10:22 AM on October 30, 2007

In practice, the cost of reading a blog is not simply proportional to the number of posts, since we also need to navigate to the blog (which takes constant effort per blog). Hence, a combination of unit and NP cost is more realistic.

Someone should tell them about RSS feeds.
posted by roofus at 10:29 AM on October 30, 2007
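
To make the quoted cost model concrete, here is a minimal sketch, assuming the "combination of unit and NP cost" means a fixed per-blog cost plus a cost proportional to the number of posts (NP). The weights `per_blog` and `per_post`, the function names, and the toy numbers are all invented for illustration and are not taken from the paper.

```python
def reading_cost(num_posts, per_blog=1.0, per_post=0.1):
    """Cost of following one blog: a fixed visit cost plus a per-post cost."""
    return per_blog + per_post * num_posts


def rank_by_value(blogs, per_blog=1.0, per_post=0.1):
    """Order blogs by stories caught per unit of reading cost, best first."""
    def ratio(name):
        stories_caught, num_posts = blogs[name]
        return stories_caught / reading_cost(num_posts, per_blog, per_post)
    return sorted(blogs, key=ratio, reverse=True)


# Toy numbers, invented for illustration: name -> (stories caught, posts).
toy_blogs = {
    "prolific": (50, 1000),
    "selective": (10, 20),
    "quiet": (2, 1),
}
# Post-count cost only: the quiet blog is almost free, so it ranks first.
print(rank_by_value(toy_blogs, per_blog=0.0, per_post=0.1))
# Adding a fixed per-blog cost favours the blog that earns its visit.
print(rank_by_value(toy_blogs, per_blog=1.0, per_post=0.1))
```

Setting `per_blog` to zero roughly corresponds to roofus's point: with a feed reader the navigation cost per blog mostly disappears, and only the per-post term is left.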

Instapundit, informative?
What a load of crap this list is.

honorable exceptions: BB, Atr, MeFi, C&L, FdL
posted by Bletch at 10:46 AM on October 30, 2007

cuteoverload was ROBBED.
posted by Ambrosia Voyeur at 10:59 AM on October 30, 2007 [3 favorites]

I have a hard time taking Carnegie Mellon seriously. It sounds too much like a lecherous lush asking to cop a feel.
posted by srboisvert at 11:30 AM on October 30, 2007

I have a hard time taking Carnegie Mellon seriously because I graduated from there. Most people seem to take them seriously though.
posted by ludwig_van at 12:02 PM on October 30, 2007

Number 3 is dead.

From Bora Zivkovic over at Number 3:

"And how useful it is to read a dead blog - this one you are at right now, my old blog ranked #3? I abandoned it in June 2006. I occasionally use it for testing stuff or for Google-bombing ;-) If you want to read a really useful blog, go check my current blog, not this one!

How useful it is to rank blogs according to the 2006 data anyway - that is eons ago in Internet time?

This must have been some fuzzy math. I hope the blogosphere responds with a big laugh."

And indeed, the "algorithm" used seems focused on Google-bombing and log-rolling.
posted by 3.2.3 at 12:08 PM on October 30, 2007

So the list is bogus? Phew. For a moment there I thought MetaFilter might actually be informative.
posted by Deathalicious at 12:46 PM on October 30, 2007

We would have been 14... but then I joined. Sorry.
posted by tkchrist at 1:55 PM on October 30, 2007

One of my first thoughts upon reading this was "where did they get their data?" So I skimmed the paper and the expanded tech report, which says:
Here we are interested in blogs that actively participate in discussions, we biased the dataset towards the active part of the blogosphere, and selected a subset from the larger set of 2.5 million blogs of [8].
[8] turned out to be this paper, which was published in 2005. That list of blogs was created by getting lists of updated blogs from "centralized services," and the "services include the update lists from:,,,,, and" It's implied that the data was collected in 2004, but it's also implied that the research might be ongoing.

Still, if we assume that the blog list came directly from the data for this paper, the authors of the current study were working off a list that was two years old. Two years is a very long time as far as weblogs are concerned. Their methodology excludes blogs that had died before 2006, but it wouldn't include any blogs that weren't indexed by one of the source services in 2004. In total, their dataset consisted of 45,000 blogs. As far as testing their outbreak detection algorithm goes, this is probably fine, but you can't really say that it identifies the "most informative blogs" period. At best, it identifies the "most informative" blogs within that set (though there are still the issues mentioned by googly and 3.2.3 above).

The authors seem to understand this, saying "In this work we are not explicitly modeling the spread of information over the network, but rather consider cascades as input to our algorithms." Unfortunately, that remark is easy to miss, and even if they had advocated more caution about generalizing their results (which they should have, IMHO) it's easy for important details like that to get lost in translation. Case in point: the linked blog post from Bloggers Blog in turn links to this post from Data Mining, which says: "It must be noted that this work is a theoretical exploration - the dataset mined to create the list is not a live corpus of blogs; thus some of the blogs may be stale or even abandoned." Bloggers Blog left that bit out, perhaps because it didn't really fit the tone of a "we're in the top 10!" post.

This comment ended up being a lot longer than I intended, but as a researcher myself I tend to get passionate about these sorts of things.
posted by I Said, I've Got A Big Stick at 2:21 PM on October 30, 2007

I am uber-uber lazy. Links plz.

If you use Firefox, the single best extension ever (copying and extending functionality from IE2/Maxthon, which kind of sucks now) allows you to drag and drop anything from a page -- text, links, urls, images, whatever -- and customize what happens when you drop them (like search, define, save, go to, go to in a background tab, save to your fave bookmarking tool, and a million other things, directionally). ROCK.
posted by stavrosthewonderchicken at 5:02 PM on October 30, 2007

That extension looks like it's worth trying or you could copy paste the source of the page.
posted by Mitheral at 6:09 PM on October 30, 2007

33rd? Obviously a Masonic conspiracy. 33rds rule!
posted by Tube at 6:15 PM on October 30, 2007

I laugh out loud at the idea that Boing Boing could be better than metafilter at anything...

...well anything other than showcasing the latest disney-themed, electronic, do-it-yourself, automated, crypto-zoological, automated copyright rant machine.
posted by milarepa at 7:38 PM on October 30, 2007 [2 favorites]

Isn't the owner of, number 85 on the list, a mefite? I seem to recall some MeFi statistic being hosted at
posted by rjs at 1:52 PM on October 31, 2007

Of course. Everybody's a member of Metafilter, or was. Then again, once a MeFite, always a MeFite. waxpancake, aka Andy Baio.
posted by stavrosthewonderchicken at 10:33 PM on October 31, 2007

Yes, of course. Thanks.
posted by rjs at 11:07 PM on October 31, 2007
