Maybe long threads should be broken into chunks May 21, 2001 8:17 AM

Over the past couple of days, it's occurred to me that mefi does not scale particularly well when large numbers of comments are posted to a thread. I think that loading 100 comments or more makes for a fairly unwieldy webpage, and probably puts a lot of strain not just on the source database for the comments, but also on mefi's bandwidth. This inefficiency is particularly bad for such popular threads as the Kaycee ones that have been posted recently, since many people will be accessing them.

I think that comments should be cut up into somewhat more manageable chunks. That is, rather than having all of the comments on the same page, they should be split into pages of 25 or 30 comments; clicking a "new comments" link would take you to the correct page and the correct id on that page, bypassing most older comments.

What do you think?
posted by moz to Bugs at 8:17 AM (13 comments total)
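[The paging scheme proposed above can be sketched as follows. This is a hypothetical illustration, not MetaFilter's actual code; the function names, URL format, and 30-comment page size are all assumptions drawn from the proposal.]

```python
# Hypothetical sketch of the proposed paging scheme: given the index of a
# reader's first unread comment, compute which page to send them to, so the
# "new comments" link can skip past the older pages entirely.

COMMENTS_PER_PAGE = 30  # illustrative chunk size (the post suggests 25-30)

def page_for_comment(comment_index, per_page=COMMENTS_PER_PAGE):
    """Return the 1-based page number containing the given 0-based comment."""
    return comment_index // per_page + 1

def new_comments_url(thread_id, first_unread_index):
    """Build a link that jumps straight to the page holding the new comments."""
    page = page_for_comment(first_unread_index)
    return "/metatalk/%d?page=%d#comment-%d" % (thread_id, page, first_unread_index)
```

[For example, a reader whose first unread comment is number 75 would be sent to page 3, skipping the 60 comments on pages 1 and 2.]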

I think someone needs to start a new Kaycee thread. It's worked in the past. If 100+ comment threads were much more common than they are now, I could see automatically paging them out in chunks of 30 or 40 (a good solution, I think), but for now the tried and true approach seems warranted.
posted by sudama at 9:39 AM on May 21, 2001

I know it's much harder technically, but I think I'd prefer that, since we have the "new since your last visit" feature working already, a page could just show the initial post and the comments that are new. elegant, and simple.

Ya know, except for the programming part.
posted by anildash at 9:51 AM on May 21, 2001

sudama, i disagree.

metafilter has, in the past month or so, grown by about 1500 members. it will likely continue to grow. by this token, the number of postings will also, in all likelihood, grow. i think that, if the growth and the consequential strain on metafilter are inevitable, then metafilter should be modified to scale simply to the added strain.

of course, i speak from a programmer's point of view: if it's leaking, patch it now before the water begins to rush. and, speaking as a modem user, i think it's bad for the readers as well to have to download all 250 or 300 comments, when the only ones you will pay any attention to are the 30 or so most recent postings.
posted by moz at 10:17 AM on May 21, 2001

moz, I've always operated with the philosophy that things should be as simple as they possibly can be. Start simple, and complicate when necessary, is my motto.

I'd agree, when threads get to be >100 comments, pagination seems like it could save the server load a bit. I've never thought of doing it because I hate being on a site that chunks up every 10-20 posts (like a UBB) because it's exhausting to click through on each one.

It's kind of a pain to program, but I'll look into it tonight.
posted by mathowie (staff) at 11:10 AM on May 21, 2001

is the metafilter source available? perhaps i can look into making the necessary modifications.
posted by moz at 11:18 AM on May 21, 2001

If the paging code gets written, I'd suggest that it be a feature that's disabled by default, or enabled at some huge number like 100 or 150 posts. The user could set the number of posts in a text field or by drop down. Drop down may be preferable so that matt, eric and whoever else works on this can fine-tune things so nobody can place an undue load on the server.
posted by sudama at 12:30 PM on May 21, 2001

don't forget the pony and free cake too Sudama, that will be in an optional dropdown.

posted by mathowie (staff) at 1:49 PM on May 21, 2001

sudama, why keep the number so large? and why disable it by default? if, as you say, 100+ comment threads are not common, keeping the size to a low number (say 25-30) should not cause most threads to be chunked more than once. if you disable it by default, you severely limit its usefulness--you've essentially reduced code, written to enhance the efficiency of metafilter, to a matter of goodwill on the user's part. which i'm sure induces a warm and fuzzy feeling in us all, except for the server that you're stepping all over.

and now that the server's down, and my gut feeling is that it's directly related to all of the kaycee threads, i think my argument is only reinforced.
posted by moz at 4:13 PM on May 21, 2001

I like the self limiting nature of this. Long threads are rare and that's good.

It's interesting to watch the folks in Neale's thread (sorry, no newbie link, check the archives ;-j) battling against this limit as we speak. With over 1000 posts, it's kind of like watching a bunch of people making an assault on Everest. The conditions are tough and bleak, especially for the modem users.


posted by lagado at 5:24 PM on May 21, 2001

Matt, I've done this with Oracle and MySQL and it's really easy. The only variables you need to fill in are the page number (set in the URL) and posts per page (set either on the client or server side).

The responses to this thread look really positive.

As far as the comments-per-page variable goes, you can either set it as a constant on the server side or set it as a user preference and store it in the cookie, in which case it should probably be global to all threads that user views (in other words that user ALWAYS sees X posts per page).

Of course, I'm sure you probably already knew all of that...
posted by fooljay at 8:30 PM on May 21, 2001
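[The page-number-plus-posts-per-page approach described above can be sketched with a LIMIT/OFFSET query. This is a hypothetical illustration using Python's built-in sqlite3 as a stand-in for the Oracle/MySQL setups mentioned; the table and column names are invented, not MetaFilter's real schema.]

```python
import sqlite3

def fetch_page(conn, thread_id, page, per_page=30):
    """Fetch one page of a thread's comments; `page` is 1-based (e.g. from the URL)."""
    offset = (page - 1) * per_page
    cur = conn.execute(
        "SELECT author, body FROM comments "
        "WHERE thread_id = ? ORDER BY posted_at LIMIT ? OFFSET ?",
        (thread_id, per_page, offset),
    )
    return cur.fetchall()

# Tiny demo with an in-memory database and 75 fake comments in one thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (thread_id INTEGER, author TEXT, body TEXT, posted_at INTEGER)"
)
conn.executemany(
    "INSERT INTO comments VALUES (1, ?, ?, ?)",
    [("user%d" % i, "comment %d" % i, i) for i in range(75)],
)
```

[With 75 comments and 30 per page, pages 1 and 2 are full and page 3 carries the remaining 15, so a reader following a "new comments" link downloads only that last chunk.]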

An alternative approach is to put the newest comments at the top of the page. Then modem users can kill the download once they've got the new data.

Just a thought. You could even limit the display to only the most recent n comments, with a tail link to show everything.

posted by andrew cooke at 9:21 AM on May 22, 2001

Great idea for modem users, but it doesn't help the site. By the time the first few comments are shown the "damage" has already been done on the database side, which seems to be the bottleneck.

Hmmmmm, an expiring server-side cache.... Hmmm.... Update triggers.... Hmmmmmm.... Matt, are you listening?
posted by fooljay at 11:01 AM on May 22, 2001
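[The expiring server-side cache being hinted at could look something like this. A hypothetical sketch: rendered thread HTML is served from memory until it either expires or a new comment invalidates it (the analogue of an update trigger), so most reads never touch the database. All names here are illustrative.]

```python
import time

class ThreadCache:
    """Toy expiring cache for rendered thread pages."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._entries = {}  # thread_id -> (rendered_html, expires_at)

    def get(self, thread_id):
        entry = self._entries.get(thread_id)
        if entry and entry[1] > time.time():
            return entry[0]  # fresh cache hit: no database work needed
        return None          # miss or expired: caller re-renders from the db

    def put(self, thread_id, html):
        self._entries[thread_id] = (html, time.time() + self.ttl)

    def invalidate(self, thread_id):
        # The "update trigger" analogue: posting a comment evicts the stale page.
        self._entries.pop(thread_id, None)
```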

here is my understanding. right now, the thread's comments table is being iterated over linearly: if there are 300 comments, the table will be iterated over 300 times. that's actually not so bad.

before text is sent over a network, using bandwidth, it needs to be accessed by the sending function and iterated over. each byte is accessed in this fashion. as far as the load regarding the database is concerned, this is the bottleneck.

if there are 300 comments, with between... let's say, on average, 300 characters per comment, you're transferring 300 x 300 = 90,000 bytes of text from the database, not counting html tags. 90kb per hit. with lots of people constantly updating, and lots of people simply reading. yah, it's bad from a bandwidth perspective--the html page itself amounts to plenty more text than the comments alone. i think it's fair to say that at points, the html pages being transferred per hit were in excess of 130 kb. all of that needs to be processed by the server, per hit.

bandwidth can handle that kind of thing, because its potential throughput does not weaken over time. damage to servers, of course, is cumulative.

so all that is why i think it'd be a good idea to write pagination code into mefi. and like i said, i'd be willing to volunteer time working on the problem if you'd like.
posted by moz at 2:01 PM on May 22, 2001
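[The back-of-the-envelope arithmetic above can be checked directly. The per-page figure assumes the 30-comment chunks proposed earlier in the thread and one byte per character; both are illustrative assumptions.]

```python
# Checking the back-of-the-envelope numbers: 300 comments averaging
# 300 characters each, ignoring HTML markup, at 1 byte per character.

comments = 300
avg_chars = 300
payload = comments * avg_chars
print(payload)        # 90000 bytes, i.e. roughly 90 kb of comment text per hit

# With paging at 30 comments per page, a "new comments" hit transfers only:
page_payload = 30 * avg_chars
print(page_payload)   # 9000 bytes, a tenth of the full-thread cost
```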
