care to explain the server issues? March 21, 2006 3:45 PM

Oh great #1, care to explain the server issues?
posted by wheelieman to Uptime at 3:45 PM (38 comments total)

Please?
posted by wheelieman at 3:49 PM on March 21, 2006


He did a couple days ago.
posted by crunchland at 3:55 PM on March 21, 2006


To use a metaphor, cfmx is a haunted patchwork house of cards built on an indian burial ground.
posted by holloway at 6:10 PM on March 21, 2006


BOOGY WOOGY WOOGY WOOGY
posted by holloway at 6:12 PM on March 21, 2006


Man, this is rough living! Now I know why the government says to keep tuna under my bed.
posted by evariste at 6:17 PM on March 21, 2006


I think I've narrowed down the issues to the database server. There's a long story about how once in a while the db server can't be reached and mefi dies, but I'm lining up a new db server with a bit more horsepower that should fix the issues (and be in the same cage at the colo).
posted by mathowie (staff) at 6:17 PM on March 21, 2006


...so it's not the government then?

Dammit.

*loses $20*
posted by Smedleyman at 6:24 PM on March 21, 2006


mathowie, what database does MeFi use?
posted by evariste at 6:25 PM on March 21, 2006


(Just curious)
posted by evariste at 6:26 PM on March 21, 2006


SQL Server.
posted by killdevil at 6:26 PM on March 21, 2006


Though I didn't know it was running on a separate server. That's good to hear.
posted by killdevil at 6:26 PM on March 21, 2006


It's running SQL Server 2000.
posted by mathowie (staff) at 6:28 PM on March 21, 2006


*wanders in looking gaunt, shaking visibly*
posted by loquacious at 6:33 PM on March 21, 2006


Oh wow, we're a Windows shop? Eek...
posted by evariste at 6:38 PM on March 21, 2006


*looks at Loquacious* JOIN A RPG BOARD ALREADY!!!
posted by wheelieman at 6:39 PM on March 21, 2006


coldfusion
--posted by holloway


SQL Server 2000
--posted by mathowie


Windows
--posted by evariste


I think we've isolated the problem!
posted by cyrusdogstar at 6:57 PM on March 21, 2006


That's like saying you've isolated the source of your car trouble because you know it's somewhere in the car.
posted by nebulawindphone at 7:22 PM on March 21, 2006


...somewhere in the haunted car.
posted by holloway at 7:28 PM on March 21, 2006


Are you using the same server as Slashdot?

Every once in a while this weird thing happens. Internal links get their host appended. E.g. mefi links are normally something like:
/mefi/50248
right now they appear as:
http://www.metafilter.com/mefi/50248
At the same time Slashdot has gone from
//linux.slashdot.org/article.pl?sid=06/03/20/163216
to
http://linux.slashdot.org/article.pl?sid=06/03/21/2245243
This simultaneous switching of Mefi & slashdot has happened at least once before.
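For readers wondering why both link styles work: a browser resolves a host-relative or scheme-relative link against the page it appears on, so both forms end up at the same absolute URL. A minimal sketch using Python's standard `urljoin`, assuming the pages are served from the base URLs shown (the base URLs and the `resolve` helper are illustrative, not taken from either site's code):

```python
from urllib.parse import urljoin

# Assumed URLs of the pages the links appear on (illustrative only).
MEFI_PAGE = "http://www.metafilter.com/"
SLASHDOT_PAGE = "http://linux.slashdot.org/"


def resolve(base: str, href: str) -> str:
    """Resolve a link against the page URL, as a browser would."""
    return urljoin(base, href)


# Host-relative link: scheme and host are taken from the page.
print(resolve(MEFI_PAGE, "/mefi/50248"))

# Scheme-relative link: only the scheme is taken from the page.
print(resolve(SLASHDOT_PAGE, "//linux.slashdot.org/article.pl?sid=06/03/20/163216"))
```

This only shows that the two link styles are equivalent for the reader; it says nothing about why either site switches between them.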
posted by MonkeySaltedNuts at 7:32 PM on March 21, 2006


he- hello? anyone here? look, I really need a fix. I'm dying, man! It's been hours!
posted by shmegegge at 7:50 PM on March 21, 2006


That's like saying you've isolated the source of your car trouble because you know it's somewhere in the car.

More like, he's isolated the source of his low gas mileage and constant flipping over and tires exploding in the fact that he's driving a Ford Explorer with Firestone tires.
posted by evariste at 7:53 PM on March 21, 2006


BOOGY WOOGY WOOGY WOOGY

::: does the Electric Slide :::
posted by Johnny Assay at 8:27 PM on March 21, 2006


evariste wins for best Product Recall-reference that is curiously analogous to a server running Microsoft code!

also,

Metafilter: built on an indian burial ground.
posted by tweak at 9:06 PM on March 21, 2006


More like, he's isolated the source of his low gas mileage and constant flipping over and tires exploding in the fact that he's driving a Ford Explorer with Firestone tires.

HFS I spat beer and inhaled a potato chip on that one. You almost killed me, sir, I'll have you know.
posted by scarabic at 9:07 PM on March 21, 2006


::Accidentally does "New Electric Slide" and injures self by bumping into Johnny Assay who is doing the original.::
posted by drezdn at 9:33 PM on March 21, 2006


Easy, scarabic :)
posted by evariste at 9:37 PM on March 21, 2006


Hey! Yeah, you! Does it look like I'm talkin' to someone else? Put down those pageloads. 'What pageloads?', you ask all doe-eyed and innocent and shit. Look at you! That's my stash, sucka. You itchin' to get shived or what, man? I got no issues with stickin' you if you got no issues with walkin' off with my satchel, dig?
posted by loquacious at 10:53 PM on March 21, 2006


coldfusion... SQL Server 2000... Windows... Ford Explorer... Firestone.

Those are all US products, right?

Just askin'
posted by Meatbomb at 1:05 AM on March 22, 2006


I think we've isolated the problem!

Yeah, I'll start porting it to c-shell.
posted by eriko at 5:53 AM on March 22, 2006


arrgh holloway is making me laugh so hard... trapped...in cubicle... must try to keep quiet...
posted by Baby_Balrog at 7:17 AM on March 22, 2006


Curious, because I've done some work on high-availability hosting before... have you ever given consideration to hosting MeFi on more than one server? I know this is money out of pocket for you, especially if you expanded the footprint by a couple of servers and the colo costs along with it... but the reality is that no matter how good/bad an OS, etc, you ultimately have to scale out, not up- and it can be cheaper to get several low-end boxes to load balance than to have one beefy box on which the whole of Metafilter rests. Given your frustrations with CFMX, etc, having 2-3 el-cheapo boxes that monitor and reboot each other automatically could be a better solution than figuring out what's actually wrong with the servers, platform, or code.

Even SQL2k can be loadbalanced/striped across multiple servers, and I don't even mean clusters necessarily- distributed partitioned views and things like Webstore can allow the data to be striped across by access level or date (especially since 80/90% of all mefi data is historical data that is rarely searched or accessed) so if you haven't gotten some hapless MeFite SQL DBE to offer advice, you should FPP a demand that the peons give you their time and expertise selflessly. :)
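As a rough illustration of striping by date, the routing decision could look something like the sketch below. It is application-level Python, not SQL Server syntax: a real distributed partitioned view is declared as a UNION ALL view over member tables with CHECK constraints. The 30-day window and the partition names are assumptions made up for the example.

```python
from datetime import date, timedelta

# Assumed cutoff: anything older than this counts as rarely accessed history.
ACTIVE_WINDOW = timedelta(days=30)


def partition_for(posted: date, today: date) -> str:
    """Pick a (hypothetical) partition name for a post based on its age."""
    if today - posted <= ACTIVE_WINDOW:
        return "active"
    # Older posts are grouped by year so each archive partition stays static.
    return f"archive_{posted.year}"


today = date(2006, 3, 23)
print(partition_for(date(2006, 3, 21), today))
print(partition_for(date(2003, 5, 1), today))
```

The point of the split is that the small, hot "active" set can live on fast hardware while the archive partitions sit on cheaper boxes.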

I think holloway mentioned in that other thread a similar idea to something that's been kicking around in my head for some time; some variation of the posting domain being separate from the reading domains, making a middle-tier layer interpreter with secured transactions so that the bulk of MeFi hosting can be pushed out to a few relatively trusted MeFites to host, and a low-cost self-built Akamaization of the mefi domains implemented. The bulk of CPU and tcp connecting would occur on distributed front-ends, and your server(s) could be dedicated to storing a read/write cache of the active posts/comments in memory, and only periodically flushing them to disk on a separate SQL server(s). With a little work, you could even do a Paxos-like DB implementation with some protection from "poison" DB hosts, so that both the DB/middletier and front-end could be widely distributed without fear that someone who wasn't your wife was editing posts or wreaking havoc with the data/user experience.

It would be a most interesting experiment, and I think eminently doable: a globally distributed, open, yet trustable, read/write high-volume weblog. Have 20 or 30 people running mefi on spare desktops in their apartments and a dynamically auto-managed system such that to the average user, it would be highly available and responsive, even TCP aggregating to nearby hosts to improve the user experience and responsiveness. I do believe it'd be the first such site of its kind, whoever built it.
posted by hincandenza at 1:13 AM on March 23, 2006


hmmm. metafilter and slashdot have just switched back to using hostless internal links.
/mefi/34936
//slashdot.org/article.pl?sid=06/03/23/1249235
sure looks like they are doing something in tandem.
posted by MonkeySaltedNuts at 5:54 AM on March 23, 2006


Hrm... Haunted car... Firestone tires... Indian burial ground...

Does this explorer have the built-in TVs? If so, monitor them carefully for white noise that attracts little girls.

Metafilter: Bail out now before the bad spirits get you.
posted by mystyk at 8:15 AM on March 23, 2006


Sounds pretty cool hincandenza :)

In an MVC most Vs can be distributed, so yeah then the issue is about synchronising updates across the nodes. Typically I've done this by each node talking to a centralised server, then the node passes out an invalidate cache message (a granular message, invalidating a story rather than a whole site) which tells each node to refresh from the centralised server. The chatter latency would probably be a few seconds across the globe, but I've only done it locally as REST-style stuff (nothing really complex).
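A minimal in-process sketch of that invalidate-and-refresh pattern, assuming one central store and a handful of nodes that know about each other. The class and method names (`CentralServer`, `Node`, `update`, `invalidate`) are hypothetical, and the network calls are replaced by plain method calls:

```python
class CentralServer:
    """Single source of truth for story content."""

    def __init__(self):
        self.stories = {}

    def write(self, story_id, body):
        self.stories[story_id] = body

    def read(self, story_id):
        return self.stories[story_id]


class Node:
    """Front-end node that caches stories and refreshes them on demand."""

    def __init__(self, central):
        self.central = central
        self.peers = []   # other nodes to notify
        self.cache = {}   # story_id -> cached body

    def get(self, story_id):
        # Cache miss (or invalidated entry): refresh from the central server.
        if story_id not in self.cache:
            self.cache[story_id] = self.central.read(story_id)
        return self.cache[story_id]

    def update(self, story_id, body):
        # Write through to the central server, then pass out a granular
        # invalidate message for this one story only.
        self.central.write(story_id, body)
        self.invalidate(story_id)
        for peer in self.peers:
            peer.invalidate(story_id)

    def invalidate(self, story_id):
        self.cache.pop(story_id, None)


central = CentralServer()
a, b = Node(central), Node(central)
a.peers, b.peers = [b], [a]

central.write(50248, "first version")
print(b.get(50248))            # b caches the story
a.update(50248, "edited")      # a writes and invalidates b's copy
print(b.get(50248))            # b refreshes from the central server
```

In a real deployment the invalidate call would be a message over the network, which is where the few seconds of chatter latency would come from.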

What it should do though to relieve some of the bottleneck is to get the centralised server to gpg sign an update, return it to the node, and for the node itself to push it out. That'd allow you to deny hacked hosts too. You'd need some way of refreshing a whitelist.

(I wouldn't suggest it here -- it's probably overkill and mefi's problems could be solved through alternate tech choices)
posted by holloway at 3:57 PM on March 23, 2006


Yes, but as an experiment... wouldn't it be interesting if you could create a trusted, widely distributed "P2P" website? I've built highly automated sites in multiple DCs, but there we could assume a trusted level of control over the servers, and didn't have to guard against someone hijacking an individual server and putting sniffers/mod-rewrites on it, or other methods of hacking the host.
posted by hincandenza at 7:02 PM on March 23, 2006


Hmm.. yeah it's interesting.

How would you distribute load across the nodes? ...I guess DNS would do it cheaply. Do ip2country mapping and choose an initial redirect from mainsite.com to node23.mainsite.com which is near them.
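The redirect step might look roughly like this, assuming the ip2country lookup has already produced a country code. The country-to-node table is invented for the example (only mainsite.com and node23.mainsite.com come from the comment above), and the function names are hypothetical:

```python
# Hypothetical mapping from country code to the nearest node.
NODES_BY_COUNTRY = {
    "NZ": "node23.mainsite.com",
    "US": "node01.mainsite.com",
}
DEFAULT_HOST = "mainsite.com"  # fall back to the main site if no node is near


def pick_node(country_code: str) -> str:
    """Choose a host for a visitor from their (already resolved) country."""
    return NODES_BY_COUNTRY.get(country_code.upper(), DEFAULT_HOST)


def redirect_url(country_code: str, path: str) -> str:
    """Build the initial redirect target for a request path."""
    return f"http://{pick_node(country_code)}{path}"


print(redirect_url("nz", "/mefi/50248"))
print(redirect_url("fr", "/mefi/50248"))
```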

The nodes would need some way of complaining about load and shuffling users off them.

Moving sessions would need some syncing. Perhaps you'd centralise the session and cache it locally, so that it was ok to bounce users off because the server software would grab it from the central location.

You could allow people to have ads to help encourage hosts.

And you could use javascript to hash some page content and compare that with some other value, to warn users of tampering (assuming the tampered site didn't strip the javascript).
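The hash-and-compare idea, sketched here in Python rather than javascript to keep the examples in one language. It assumes the trusted digest reaches the reader through some channel the node cannot alter, which is the hard part and is not shown; the function names are hypothetical:

```python
import hashlib
import hmac


def content_digest(page_content: str) -> str:
    """SHA-256 digest of the page content, as a hex string."""
    return hashlib.sha256(page_content.encode("utf-8")).hexdigest()


def looks_tampered(page_content: str, expected_digest: str) -> bool:
    """True if the served content does not match the trusted digest."""
    actual = content_digest(page_content)
    return not hmac.compare_digest(actual, expected_digest)


original = "Oh great #1, care to explain the server issues?"
trusted = content_digest(original)  # assumed to come from the central site

print(looks_tampered(original, trusted))
print(looks_tampered(original + " <script>evil()</script>", trusted))
```

As the comment notes, a hostile node could simply strip the check, so this warns about accidental or lazy tampering more than a determined attacker.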
posted by holloway at 8:57 PM on March 23, 2006


as of now, MeFi and slashdot have again switched to including hosts in their internal links.
posted by MonkeySaltedNuts at 11:51 AM on March 26, 2006


as of now, MeFi and slashdot have again switched to including hosts in their internal links.

Now back to hostless.
posted by MonkeySaltedNuts at 8:04 AM on March 27, 2006

