How big is MetaFilter? February 8, 2002 9:54 AM

How big is MetaFilter? There's a description of the server on the about page, but I'm interested in how much disk space the site consumes, and how big the DB is. (more inside)
posted by mr_crash_davis to MetaFilter-Related at 9:54 AM (22 comments total)

You don't have to give away any confidential stuff, Matt. I'm just curious. I run a site on my company intranet that's Access-driven rather than SQL, and I've been watching the database grow and grow, and watching the number of pages grow as I add more functionality, and it's getting harder to keep track of all the time. And I've only been doing it for a year and a half. As I was making my daily backups, it occurred to me that MeFi must be relatively huge compared to my little site.

[double entendre type = "obvious"]
So, Matt, how big is it?
[/double entendre]
posted by mr_crash_davis at 9:59 AM on February 8, 2002


you and matt run the same servers, crash. (access = microsoft SQL, or mSQL, which is what matt uses with coldfusion.)
posted by moz at 10:53 AM on February 8, 2002


moz, actually in the grand scheme of things, access is the little kid's toy of databases: it's easy to set up and modify and works well for low-to-moderate traffic, and MS SQL is the "enterprise solution" that's supposed to scale up infinitely (everything on Blogger you've ever posted is in a gigantic multi-gig MS SQL db spread across multiple machines).

I think the SQL db is about 200MB, the files themselves for the site are only a couple MB total. The server is pumping out almost 100GB of HTML each month, just for MetaFilter.com.
posted by mathowie (staff) at 11:02 AM on February 8, 2002


This kind of stuff comes up a lot. I maintain that Jason should sell walking tours of his apartment, so we can all have a chance to see It in person.
posted by jpoulos at 11:09 AM on February 8, 2002


Errrr, so for us non-tech types, what does that mean? Can you guesstimate Amazon or Google's SQL db (no idea what that is) or how many GB of HTML a site like Yahoo is pumping out? Just so we can put it in perspective? Pretty please with sugar on top?
posted by remlapm at 11:11 AM on February 8, 2002


If you have real work on Access, I shall be forced to point and laugh.

*points, laughs*

Oops, too late.
posted by NortonDC at 11:11 AM on February 8, 2002


matt:

well, all that i am saying is that both systems use mSQL as the backbone. i can go into access and execute some mSQL-specific code and it goes just fine. (i've done it, in fact.) that's not to say that the systems are both distributed (e.g. blogger) or as fancied up as can be, but they do share a common codebase.

don't laugh at access. i have to work with paradox.

posted by moz at 11:19 AM on February 8, 2002


yo, crash - you might try compacting and rebuilding the access mdb fairly often (nightly or more). don't know how your app is built, but access sux when it comes to cleaning up after itself in terms of temp tables and the like.

as you get anywhere over 100MB your stability goes to heck.

mySQL can run under winDoze, so an ODBC port of that might be in order - good luck!
posted by crankyrobot at 11:25 AM on February 8, 2002


PARADOX!!!
my goodness, I didn't think [anyone] still futzed with that. Do have heart, moz - I just converted a database for one of the first clients I ever had - it was a system originally written in 4.0DOS which then was migrated forward. Hey - a check's a check, yes?

Remember when PDOX was the ONLY windows database? eeewww.
posted by crankyrobot at 11:29 AM on February 8, 2002


Moz: Actually, just to be picky, mSQL != SQL Server.

mSQL is "mini SQL".

Although one might say MSSQL = SQL Server.

posted by bshort at 11:48 AM on February 8, 2002


crankyrobot:

I said it was a small site, and I wasn't kidding. The DB right now is less than 20MB, and there are usually less than 10,000 page views a week. It's basically a trouble-ticket tracker for our network operations. I compact it weekly anyway, just to have a routine. Like I said, it's been running a year and a half and I'm only up to about 6800 entries.

I'm thinking of porting everything over to MSSQL 7 anyway (it's already in use on some of our products), just because I need to learn something new and it should be fairly uncomplicated but challenging enough to keep me awake on swing shift, and it looks better on a resume. Plus, there's always the hope that our parent corporation will want to use it company-wide, since they're already incorporating a lot of our procedures as the standards. It would be nice to be ready for a larger-scale solution if I needed it.
posted by mr_crash_davis at 11:54 AM on February 8, 2002


i have to work with paradox.

Oh my Christ. Even as a non-techie, I recognize what kind of soul-death this represents. This is like trying to drive a Jag with hamster-wheel propulsion.
posted by Skot at 11:56 AM on February 8, 2002


Actually, moz, you can use Access as a front end to MSSQL. In doing so, you're just sending SQL from Access and using MSSQL's relational engine to provide a result set. If you use Access as a standalone, you're using MS's Jet Database engine, which is really only suited for smaller, non enterprise, non distributed databases. Access is 100% not MSSQL.

MSSQL runs its own services, SQLServer for data access and relational rules, SQLAgent for automated "jobs" and Distributed Transaction Coordinator, all of which allow for constant upkeep. Access' Jet Database Engine acts more like a file access method.

Also, sorry about your Paradox situation... you should try Filemaker sometime.... eeeeck.
posted by eyeballkid at 12:35 PM on February 8, 2002


eyeballkid, bshort, matt:

oops, perdonme. i have a reference book on SQL, and i could have sworn that it stated that microsoft SQL Server is mSQL, but looking through it now i don't see anything like that. i'm sorry -- i get into these feedback loops, every now and then, where i latch onto the wrong answer and tell people they're wrong. sorry, again.
posted by moz at 1:05 PM on February 8, 2002


Errrr, so for us non-tech types, what does that mean?

remlapm, let me give that a try. "db" is shorthand for database - in this case, this is what stores all of the threads, comments, and users on MeFi. 200 megabytes would easily fit three times over on a single CD, so that's not too bad. My guess for the size of Amazon's databases (product listings plus accounts plus all that stuff that lets them tell you what you want to buy plus who knows what else) would be at least a thousand times that, probably more.

As for traffic... I don't know about Yahoo, but just as an idea of how much a site can put out, there is cdrom.com. Almost two years ago, they managed to send out one terabyte (approx. a thousand gigabytes) in a single day (that's 300 times what MeFi pushes out). They may well be up beyond that, now. It's a pure-download site, though, so it basically has people downloading large files continuously. 100 GB per month is pretty impressive for text and a few small images.
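For anyone who wants to check that ratio, here is a rough back-of-the-envelope sketch in Python. It assumes a 30-day month and decimal units (1 TB = 1000 GB), neither of which is stated in the thread, and the variable names are only for illustration.

```python
# Rough comparison of MeFi's monthly output with cdrom.com's one-day record.
# Assumptions (not from the thread): a 30-day month, 1 TB = 1000 GB.
DAYS_PER_MONTH = 30
GB_PER_TB = 1000

mefi_gb_per_month = 100                  # "almost 100" GB/month, quoted earlier in the thread
mefi_gb_per_day = mefi_gb_per_month / DAYS_PER_MONTH

cdrom_gb_per_day = 1 * GB_PER_TB         # one terabyte sent in a single day

ratio = cdrom_gb_per_day / mefi_gb_per_day

print(f"MeFi: about {mefi_gb_per_day:.1f} GB/day")
print(f"cdrom.com: {cdrom_gb_per_day} GB/day, roughly {ratio:.0f}x MeFi")
```

So the "300 times" figure holds up, give or take the rounding in "almost 100 GB".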
posted by whatnotever at 1:09 PM on February 8, 2002


whatnot- that definitely helps. My view of MeFi was a little blinking box and a zip drive somewhere in San Francisco. If someone spilled their beer accidentally, hundreds of lives would be ruined.

posted by remlapm at 1:24 PM on February 8, 2002


I've got to think that eBay's database is a mother.... I was bidding on 2 auctions by the same seller last night. The second auction was posted about 20 seconds after the first one. The two auction numbers differed by 375. Unless eBay uses non-consecutive item numbers, it's pretty staggering.
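Taking those two numbers at face value, here is a quick Python sketch of the listing rate they would imply. It assumes item numbers really are consecutive and that the rate holds steady all day, and both are big assumptions that nothing in this thread confirms.

```python
# Implied eBay listing rate from two observed item numbers.
# Assumes consecutive item numbers and a constant rate all day,
# neither of which is confirmed anywhere in this thread.
item_number_gap = 375      # difference between the two auction numbers
seconds_apart = 20         # approximate time between the two listings

listings_per_second = item_number_gap / seconds_apart
listings_per_minute = listings_per_second * 60
listings_per_day = listings_per_second * 60 * 60 * 24

print(f"{listings_per_second:.2f} listings/second")
print(f"{listings_per_minute:.0f} listings/minute")
print(f"{listings_per_day:,.0f} listings/day")
```

The 20-second gap was observed during the evening, so the daily figure is probably an overestimate if that is a peak period.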
posted by crunchland at 1:30 PM on February 8, 2002


FWIW, MS SQL and Access do not share code. Access runs on a database engine called Jet, which is used by a lot of little desktop apps. (CityDesk comes to mind.)

The MS SQL (not, as noted, mSQL) database engine does exist in a desktop version, called MSDE, but that is only used in extremely rare instances, usually just by developers or when an enterprise application built on MS SQL is being scaled down to a desktop or workgroup.

The MeFi server, though? About a foot and a half tall.
posted by anildash at 2:02 PM on February 8, 2002


I use MySQL. How does that rate?
posted by jpoulos at 4:15 PM on February 8, 2002


To further complicate things, there is a little known stripped down version of SQL Server called Microsoft Data Engine (MSDE). It runs standalone (no server process needed) and it's fully compatible with SQL Server. You can use Access as a front-end to it very easily. I believe future versions of Access will be using this engine, rather than Jet. (moz, perhaps this is what you're thinking of)

If you're thinking of using Access for a project, I definitely recommend taking a look at MSDE if there is any chance that you might someday be moving it to SQL Server. The default data store for Access is Jet, and the SQL syntax (Jet SQL) is slightly different than the syntax for SQL Server (TSQL). Migrating from MSDE to SQL Server requires no code changes, whereas Access to SQL Server will require some work (albeit not too difficult). MSDE ships with Office 2000, and presumably Office XP.

And as crankyrobot pointed out, you probably want to compress your Access database on a regular basis. When you delete a record in Jet, it doesn't actually remove it from the database, it just marks the record as deleted. "Compressing" the database actually goes through and removes the record. I've seen 400MB Access DBs drop down to 10MB file size.
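As a rough illustration of that mark-as-deleted behavior, here is a sketch using SQLite from Python's standard library, with `VACUUM` playing the role of compacting. This is an analogy only, not Jet's actual mechanism. The file name, table name, and row sizes are all made up.

```python
import os
import sqlite3
import tempfile

# Analogy only: SQLite stands in for Jet, VACUUM stands in for "compact".
path = os.path.join(tempfile.mkdtemp(), "tickets.db")  # hypothetical file name
conn = sqlite3.connect(path)

# Make sure freed pages are NOT reclaimed automatically (the usual default).
conn.execute("PRAGMA auto_vacuum = NONE")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, notes TEXT)")
conn.executemany(
    "INSERT INTO tickets (notes) VALUES (?)",
    [("x" * 1000,) for _ in range(5000)],
)
conn.commit()
size_full = os.path.getsize(path)

# Delete almost everything; the freed pages stay inside the file.
conn.execute("DELETE FROM tickets WHERE id > 100")
conn.commit()
size_after_delete = os.path.getsize(path)

# VACUUM rebuilds the file and actually releases the free pages.
conn.execute("VACUUM")
conn.close()
size_after_vacuum = os.path.getsize(path)

print("full:", size_full)
print("after delete:", size_after_delete)
print("after vacuum:", size_after_vacuum)
```

The file only shrinks after the rebuild step, which is the same reason a regularly compacted Access file stays small.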
posted by kaefer at 6:03 PM on February 8, 2002


A key point if you're writing apps and selling them is that MSDE is redistributable, and is in fact just a cut-down, throttled MSSQL Server. If you're wanting to sell db-based apps to a client base that is (with good reason) hesitant to drop the large $$ involved in purchasing and licensing SQL server itself, and you want more grunt than Access, with an easy upgrade path to the big(ger) iron of SQL Server itself, MSDE is a good solution.
posted by stavrosthewonderchicken at 8:51 PM on February 8, 2002


"I was bidding on 2 auctions by the same seller last night. The second auction was posted about 20 seconds after the first one. The two auction numbers differed by 375."

This sounds completely believable to me. LiveJournal averages around 100 posts per minute... often considerably more during peak evening hours. This doesn't count all the comments people make, either.

Just as a means of comparison, the entire LiveJournal database is 42 GB (approx. 1/30th the size of the Google database) divided amongst several clustered databases running MySQL. We currently serve ~13 Mbps/day.

Not too shabby for MySQL, eh?!
posted by insomnia_lj at 9:58 PM on February 8, 2002

