MTU settings = No Metafilter May 5, 2007 3:47 AM

Admins: Metafilter is blocked for people with incorrect MTU (Maximum Transmission Unit) settings on their router/computer(s). See this thread. I had this problem for several weeks, during which Metafilter's banner would load, but nothing else. Something needs to be tweaked on either the Metafilter server, or the Metafilter routing system (firewall settings?). My router was set to the default values for my ISP when this happened (BT Broadband in the UK). In other words, this isn't caused by people tweaking their settings—some ISPs are effectively incompatible with Metafilter in its current setup.
posted by humblepigeon to Bugs at 3:47 AM (47 comments total) 3 users marked this as a favorite

I've never even looked at MTU settings before. Is there a problem on my end with the DNS setup? Can someone dig the nameservers and tell me which one is messed up?

Or is this problem more on the web server end than a DNS server?
posted by mathowie (staff) at 7:36 AM on May 5, 2007


I'll put $5 on a firewall that is dropping IP fragments. Lowering the sizes of the packets you send means they're less likely to be split up on their way to MF.
posted by popechunk at 8:02 AM on May 5, 2007


Well I'm with BT and I get through fine without tweaking anything. Which doesn't mean you're wrong, maybe not all BT set-ups were created equal.
posted by londongeezer at 8:06 AM on May 5, 2007


Well I'm with BT and I get through fine

I think some hop in other folks' path to MF thinks it needs to break ~1500 byte packets into smaller pieces. I think these IP fragments are getting dropped by a firewall that thinks these packets are from a bad guy.

I can't get anything larger than 1464 bytes in an ICMP to www.metafilter.com if I set the Don't Fragment Bit.


$ ping -s 1464 -D www.metafilter.com
PING metafilter.com (74.53.68.130): 1464 data bytes
1472 bytes from 74.53.68.130: icmp_seq=0 ttl=116 time=61.996 ms
1472 bytes from 74.53.68.130: icmp_seq=1 ttl=116 time=63.251 ms


$ ping -s 1465 -D www.metafilter.com
PING metafilter.com (74.53.68.130): 1465 data bytes
92 bytes from homeportal.gateway.2wire.net (192.168.0.1): frag needed and DF set (MTU 1492)


posted by popechunk at 8:17 AM on May 5, 2007
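
The arithmetic behind those two pings can be sketched in a few lines of Python. This is only an illustration: it assumes a plain 20-byte IPv4 header with no options plus the 8-byte ICMP echo header, and the function names are made up for the example.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def packet_size(ping_payload):
    """Total IP packet size produced by `ping -s ping_payload`."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def largest_payload(mtu):
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(packet_size(1464))      # 1492, fits the 1492 MTU the gateway reported
print(packet_size(1465))      # 1493, one byte too many
print(largest_payload(1492))  # 1464
```

Under those assumptions, the 1464-byte limit and the "MTU 1492" in the gateway's reply are the same number seen from two sides.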


It's caused by people not tweaking their settings when they have to. Like you said, it's blocked for people with incorrect settings on their router. If your router doesn't advertise the right MTU to its next hop then it's time to fix your router.

None of this is Metafilter's problem. The way to fix a PMTU blackhole is at the client end, because the client only has to deal with one blackhole, and the server would have to deal with every possible blackhole a client might discover.
posted by mendel at 8:17 AM on May 5, 2007 [1 favorite]


When I had the problem I got the Metafilter page header, and the top menu bar, but nothing else. It's not a DNS issue. popechunk's answer sounds right from the little research I did. Everybody I spoke to said it was a firewall issue. The packets are getting fragmented and the firewall is interpreting this as some kind of attack, so it locks up after serving just a little data. I guess this could be a router hardware issue, before the requests get to the server? I'm at the limit of my knowledge here.

I had this problem on Metafilter and also on one or two other sites. Every other site worked OK.

I fixed the problem by setting the MTU on my router to 1500, at which point it vanished. I think it had been set at something weird like 1967. I can't remember. I did a few tests, switching settings, to verify the problem.
posted by humblepigeon at 8:19 AM on May 5, 2007


When I had the problem I got the Metafilter page header, and the top menu bar, but nothing else

Then it sounds like your firewall :-)
posted by popechunk at 8:23 AM on May 5, 2007


It's caused by people not tweaking their settings when they have to.

I can confirm that I didn't tweak settings before this problem arose. I wasn't even aware that my router allowed me to tweak the MTU figure, and had to spend some time exploring various menus.

The other reason why your comment doesn't hold water is that virtually every other site worked fine. You can't blame the user for this if Metafilter, out of 1,000,000 sites, is the only one that doesn't work.

I also did several complete resets of my router to try and get back to the beloved blue page, which should have returned settings to default. Still no dice.

At best, then, Metafilter is incompatible with certain router default settings, and at worst, incompatible with certain combinations of router/ISP.
posted by humblepigeon at 8:26 AM on May 5, 2007


When I had the problem I got the Metafilter page header, and the top menu bar, but nothing else

Then it sounds like your firewall :-)


I have no firewall. The router has a NAT, which is all the protection I need. I've just checked and my MacBook's OS X firewall is turned off. Not sure about my other Mac, but the problem occurred on both computers.
posted by humblepigeon at 8:28 AM on May 5, 2007


I have no firewall.

I was mostly kidding. I don't know what the actual problem is. Maybe the traffic is going to two load-balanced firewalls with different frag policies set or something. I'm sure a network professional will show up in this thread soon and school us all.
posted by popechunk at 8:34 AM on May 5, 2007


This is caused by some hop along the way filtering ALL of ICMP, because somebody told them that stopping pings (or some other inane bullshit) was a good idea. But successful Path MTU Discovery requires a certain ICMP message if any hop along the link has an MTU of less than 1500. So whoever is filtering all ICMP indiscriminately needs to stop. This could be any hop along the path.

See also: pMTU eyechart
posted by Rhomboid at 8:37 AM on May 5, 2007 [1 favorite]


And by the way, if you are using PPPoE or some other tunneling (common with DSL), your proper MTU really is 1492, not 1500, and so forcing it to 1500 because there's a site out there with broken pMTU is the wrong solution!!
posted by Rhomboid at 8:38 AM on May 5, 2007
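
Where the 1492 comes from, as a small sketch. The 8 bytes are the standard PPPoE overhead (a 6-byte PPPoE header plus a 2-byte PPP protocol field); the constant and function names are invented for illustration, and other tunnels will have different overheads.

```python
ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8  # 6-byte PPPoE header + 2-byte PPP protocol field


def tunnel_mtu(link_mtu, overhead):
    """MTU left for IP packets once the tunnel's own headers are subtracted."""
    return link_mtu - overhead


print(tunnel_mtu(ETHERNET_MTU, PPPOE_OVERHEAD))  # 1492
```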


It's a problem for Metafilter if people give up rather than reconfigure their routers. Maybe.

There are always going to be people who will have problems. But if the percentage is really low and the effort to support them is really great, it isn't really worth it. If "people" give up -- say, 90% -- then it's a problem. If "people" give up -- say, 0.01% -- then it is not.

That's why a lot of people, back in the day, gave up on trying to make their web pages look correct in Netscape 4. Not too many people used it, and it was even more funky than IE 6 in terms of interpreting standard HTML formatting. It just wasn't worth the effort, even though it meant that there might be a fractional decline in the potential audience.
posted by Steven C. Den Beste at 8:40 AM on May 5, 2007


Oh shit, I just realized... about a week after I set up the new server, I set up Apache's gzip stuff in Apache 2.0x, so packets are gzipped when sent to clients.

I could turn off the zipping and see if that fixes things for people with weird routers. That's the only thing I changed from the old server to the new one and since it's directly related to packet size, I bet it might be the cause.
posted by mathowie (staff) at 9:19 AM on May 5, 2007


Ok, I think I just turned off all the gzip stuff. People that had a problem, lemme know if you still have any trouble.
posted by mathowie (staff) at 9:26 AM on May 5, 2007


I learned something from a MeTa thread?
posted by sciurus at 9:50 AM on May 5, 2007


I thought Apache gzip compressed the page as a whole. You might end up sending fewer packets, but surely the packet size doesn't change, since it's the page that's being gzipped, not the packets. (I'm not a network or Apache expert, though.)
posted by Aloysius Bear at 10:00 AM on May 5, 2007


GZip doesn't affect packet size, so far as I know. It just reduces the total amount of data sent. (I've got it enabled on my server, and it cut way down on the load on my pipe.)
posted by Steven C. Den Beste at 10:04 AM on May 5, 2007


I've been getting the random hang behavior pretty much since the server move back in March. Just at home, though, and even that has seemed restricted to only a couple of machines. I've hesitated to post about it (and god knows, I've been tempted numerous times) since I was assuming it was just something crazy on my side that I had yet to figure out.

I tried the MTU change. No luck.

I'm still getting random hangs, but fewer than before. I thought the GZip change helped at first, but I was still hanging a bit (especially on the Music, Projects and Jobs sections).

My symptoms have mirrored humblepigeon's pretty much exactly. I'd get the banner and then hang hang hang hang. I'd wait, go do something else, and reload. After a few reloads, things might come up.

Just this morning I was thinking "Maybe this is nature's way of telling me that my metafilter addiction needs some curbing." So, hey, now I don't feel quite as crazy or frustratingly isolated as I did before with the problem.
posted by smallerdemon at 10:21 AM on May 5, 2007


" I just realized... about a week after I setup the new server, I setup apache's gzip stuff in apache 2.0x, so packets are gzipped when sent to clients."

So do I have to have WinZip installed on my computer to decompress Metafilter now?
posted by mr_crash_davis at 11:29 AM on May 5, 2007 [1 favorite]


*somewhere in the middle of the cold atlantic, under miles of crushing seawater, a giant squid tightly caresses a fiber optic cable as it convulses in orgasm*
posted by quonsar at 11:36 AM on May 5, 2007 [8 favorites]


So do I have to have WinZip installed on my computer to decompress Metafilter now?

mod gzip

It doesn't have anything to do with WinZip. Some browsers support it, some do not. With each HTTP request the browser sends an Accept-Encoding header listing the compression it understands, and the server only gzips the response if the browser has said it can handle it. It's all transparent to the user. About the only thing a user will notice is that compressed pages load faster.

But it's all happening up at layer 7. It doesn't affect TCP at layer 4 at all.
posted by Steven C. Den Beste at 11:44 AM on May 5, 2007
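
A rough way to see the point about layers, sketched in Python. The page content is made up and the 1460-byte segment size is an assumption (a 1500-byte MTU minus 40 bytes of IP and TCP headers); real TCP stacks segment data with more subtlety than this.

```python
import gzip

MSS = 1460  # assumed TCP payload per segment on a 1500-byte MTU path


def segment_sizes(body, mss=MSS):
    """Sizes of the TCP segments needed to carry body, filled greedily."""
    return [min(mss, len(body) - i) for i in range(0, len(body), mss)]


# Made-up, highly repetitive page body purely for illustration.
page = b"<div class='comment'>posted by someone</div>\n" * 1000

plain = segment_sizes(page)
zipped = segment_sizes(gzip.compress(page))

print("uncompressed:", len(plain), "segments, largest", max(plain))
print("gzipped:     ", len(zipped), "segments, largest", max(zipped))
```

Compression changes how many segments are needed, not how big a segment is allowed to be, so on its own it should not push packets over a path's MTU.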


he was kidding, best dente. sheesh.
posted by quonsar at 2:06 PM on May 5, 2007


matt said the packets were now gzipped. you missed that gaffe.
posted by quonsar at 2:07 PM on May 5, 2007


Ths cmmnt s cmprssd s mch s t cn b.
posted by dhartung at 2:27 PM on May 5, 2007


For the more technically inclined - wouldn't setting Metafilter's webserver to have an artificially low outbound MTU (say, 1300?) solve this neatly? This is clearly a packet size/fragmentation issue, so it seems like Mefi could achieve max compat. by just sending out smaller packets that won't get fragmented.
posted by Ryvar at 3:51 PM on May 5, 2007


(which is also to suggest that mod gzip should be turned back on - it makes the experience cheaper for Matt and faster for the rest of us)
posted by Ryvar at 3:52 PM on May 5, 2007


Apparently this might be a Microsoft problem. I'm installing some hotfixes to hopefully remedy it.
posted by mathowie (staff) at 4:45 PM on May 5, 2007


Ok, I patched the box. Anyone that was having issues, lemme know if they're continuing still.
posted by mathowie (staff) at 4:58 PM on May 5, 2007


I'm glad to hear I wasn't the only person having this problem. I was hesitant to post, because I figured it was something I'd done on my end. I just couldn't figure out what that might have been. With everyone's help yesterday, I reset my MTU which fixed things for me.

I just tested your patch by switching everything back to the way it was, and the blue came up just as pretty as ever. I think we can consider this problem fixed.
posted by Eddie Mars at 6:13 PM on May 5, 2007


I am still unable to reach MeFi with a default MTU of 1492.
posted by hoverboards don't work on water at 6:43 AM on May 6, 2007


Here's one for you... I have this problem on my Mac at home, but not the XP PC, and they're both on the same switch.

It usually seems to hang on the prototype.js load.
posted by smackfu at 8:42 AM on May 6, 2007


I have this problem on my Mac at home, but not the XP PC, and they're both on the same switch.

In that case, your problem is that you're using a Mac.
posted by eyeballkid at 8:52 AM on May 6, 2007


There is something between me and MeFi that's keeping me from using an MTU of 1492. (note that there's 8 bytes of header information....)

eriko# ping -D -s 1484 www.metafilter.com
PING metafilter.com (74.53.68.130): 1484 data bytes
ping: sendto: Message too long
ping: sendto: Message too long

1480 bytes, however, works:

eriko# ping -D -s 1472 www.metafilter.com
PING metafilter.com (74.53.68.130): 1472 data bytes
1480 bytes from 74.53.68.130: icmp_seq=0 ttl=117 time=106.856 ms
1480 bytes from 74.53.68.130: icmp_seq=1 ttl=117 time=108.695 ms

Wonder who's eating it? Further testing, elided here, points the finger at www.metafilter.com itself, but that was just a quick test, I Could Be Wrong™.

However, it tells me that MeFi is doing the right thing with a limited MTU. If I allow fragmentation, it works. This means that Path MTU Discovery is going to work as well -- at least, it works from my network to MeFi's.

I suspect the problem is that somebody is not allowing ICMP Type 3, Code 4 "Fragmentation Required" packets. This kills Path MTU discovery. Note that if you disable Path MTU Discovery in Windows, Windows drops the packet size to about 500 to try to make sure you get through, but if you have a router or a firewall between the client and server that's trapping the packets, you're stuck. This is a "Black Hole" for Path MTU Discovery.

There is still hope, though, if Path MTU Blackhole Detection is running on the client. If it isn't, it can't detect that Path MTU Discovery isn't working and work around it. End result? You keep sending packets that are too big for the connection, and you never get the word that you either need to shrink the MTU, or allow fragmented packets. Thus, you get little-to-nothing from the far end.

In short:

Kids, let ICMP through, in particular, ICMP type 3, code 4. ICMP type 0 (echo reply) is handy as well. Heck, the entire ICMP type 3 set is too useful to block blindly.

Yes, they can DDOS with ICMP. They can DDOS you with anything, if they try.

If you have Path MTU Discovery running (and you should), make sure you have Black Hole detection running as well.

If you can't run Path MTU Discovery, knock your MTU down to 500 or so. (Windows, as noted, does this automagically for you.) Really, however, the right answer is ICMP. There are many reasons that we have the Internet Control Message Protocol, and this is one of them.
posted by eriko at 11:31 AM on May 6, 2007 [2 favorites]
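
The black hole scenario described above can be modelled with a toy simulation. This is a sketch under stated assumptions, not how any real TCP stack is written: the function names are invented, packets always carry the Don't Fragment bit, and the 576-byte fallback is just a conservative value chosen for the example.

```python
def deliver(size, hop_mtus, icmp_blocked):
    """Try to push one DF-set packet of `size` bytes across the hops.

    Returns ("ok", None), ("frag_needed", mtu) when the sender is told the
    limiting MTU, or ("lost", None) when the ICMP error is filtered.
    """
    for mtu in hop_mtus:
        if size > mtu:
            return ("lost", None) if icmp_blocked else ("frag_needed", mtu)
    return ("ok", None)


def negotiate(start_size, hop_mtus, icmp_blocked, blackhole_detection,
              fallback=576, max_tries=5):
    """Return the packet size that finally gets through, or None."""
    size = start_size
    for _ in range(max_tries):
        status, mtu = deliver(size, hop_mtus, icmp_blocked)
        if status == "ok":
            return size
        if status == "frag_needed":
            size = mtu          # normal Path MTU Discovery
        elif blackhole_detection:
            size = fallback     # no ICMP came back, so guess something small
        # otherwise: keep resending the same oversized packet
    return None


path = [1500, 1492, 1500]  # hypothetical path with one PPPoE-sized hop
print(negotiate(1500, path, icmp_blocked=False, blackhole_detection=False))
print(negotiate(1500, path, icmp_blocked=True, blackhole_detection=False))
print(negotiate(1500, path, icmp_blocked=True, blackhole_detection=True))
```

In this model the first case settles on 1492, the second never gets anything through, and the third limps along at the fallback size, which matches the three outcomes described in the comment.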


Here's one for you... I have this problem on my Mac at home, but not the XP PC, and they're both on the same switch.

Possibilities:

One: Path MTU discovery is working, but the OS X box isn't seeing it.

Two: Path MTU discovery isn't working, blackhole detection is working on XP, but not on the Mac. This is my guess, because a quick check doesn't show a way to set blackhole detection on the mac.

Check your router/firewall, set it to let ICMP message through.
posted by eriko at 11:47 AM on May 6, 2007


I solved mine. Multiple router updates for my old Linksys BEFSR41. Turns out I was several, uh... *coughyearscough* I was behind.

Things seem to be zoom, zoom zoomin' along now.
posted by smallerdemon at 2:41 PM on May 6, 2007


Caveat: I had to disable MTU, which was also set at the aforementioned 1492.
posted by smallerdemon at 2:56 PM on May 6, 2007


this internet, i thought it routed around stuff like this?
posted by quonsar at 3:54 PM on May 6, 2007


This is really weird. At school, one of our campuses is connected via a VPN where I have to manually change each windows-based PC to 1200 in order to access most websites. Going over an encrypted VPN, the default packet sizes weren't getting through. Because of the way our ISP configured the VPN, packets being sent were > 1500, causing a bounce-back from some websites (notably, Microsoft) because of the increased likelihood of a DDOS. Either way, our computers with an MTU of 1200 still connect to Metafilter.
posted by jmd82 at 5:22 PM on May 6, 2007


At school, one of our campuses is connected via a VPN where I have to manually change each windows-based PC to 1200 in order to access most websites.

It's one of the something-encapsulated-into-IP variants of a VPN. Thus, you need a smaller MTU. This should be picked up on by Path MTU Discovery, but if ICMP isn't routed through the tunnel, that's the problem.
posted by eriko at 7:06 PM on May 6, 2007

this internet, i thought it routed around stuff like this?

Only if nobody has decided to firewall off the routing information. (That quote has always bothered me.)
posted by hattifattener at 12:13 AM on May 7, 2007


So glad I found this thread, I had to access Metafilter via coral cache for weeks and even that way the pages never loaded completely.

Turned out it was the MTU thing for me too, and I don't even know what I'm talking about here, I just followed instructions. Thanks in particular to eriko for the terminal command on how to find out which MTU to set.

(for anyone else who doesn't have the faintest clue where to start from, and with apologies to everyone who thought this was too obvious to mention, the MTU setting is under 'WAN' on your router, see here)
posted by pleeker at 12:14 PM on May 8, 2007
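
For anyone repeating the ping test by hand, the trial and error can be organised as a binary search. The sketch below is only an outline: `probe` is a hypothetical callable standing in for "send one ping with the Don't Fragment bit set and see whether a reply comes back", and the fake probe used here simply mimics the 1464-byte limit popechunk reported earlier in the thread.

```python
def largest_unfragmented_payload(probe, low=0, high=1472):
    """Binary search for the largest payload for which probe(size) is True.

    Assumes probe(low) succeeds and that once a size fails, every larger
    size fails too.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid got through, try bigger
        else:
            high = mid - 1   # mid was too big, try smaller
    return low


def fake_probe(size):
    """Stand-in for a real DF-set ping, modelled on the results above."""
    return size <= 1464


best = largest_unfragmented_payload(fake_probe)
print(best, best + 28)  # 1464 1492: payload, and the MTU it implies
```

The 28 added at the end is the 20-byte IPv4 header plus the 8-byte ICMP header, assuming no IP options.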


Turned out it was the MTU thing for me too,

Glad the thread helped, but you shouldn't have still been having problems as of a few days ago when the server was patched. Can I ask, when did you have to change the MTU value?

the MTU setting is under 'WAN' on your router

That's right but on some routers (my 3Com model, for example), it might be found under the "Internet" heading of the setup page.
posted by humblepigeon at 12:28 PM on May 8, 2007


humblepigeon - I know, I had matt's comment that the server was patched, but it changed nothing for me (sorry should have probably sent an email!). I still had to use coral cache.

I changed the MTU setting on the router today, a few minutes before posting that comment above. It worked instantly.
posted by pleeker at 1:43 PM on May 8, 2007


(I had +read+ matt's comment)
posted by pleeker at 1:44 PM on May 8, 2007


Thought that it might be worth pointing out that some people are still experiencing this problem.

Is there any way to find out what server along the path is dropping the ICMP Type 3 packets, or at least figure out if it's on the customer's ISP's side, or MeFi's ISP's side (the two places where complaining might possibly get anything done about it)?

It seems a little odd to tell people that they have a problem in their router, when it's only one website that's inaccessible out of the whole Internet. Also, if people just force their MTU to a very low value, seems like that's going to generate a lot more traffic than is strictly necessary, assuming they'd be okay with an MTU of ~1500 for most other sites.

I guess I'm just bothered because this seems like a really inelegant solution.
posted by Kadin2048 at 12:33 PM on May 11, 2007


Up until a few minutes ago, I was still experiencing this problem. It only works now because I added an iptables rule on my router to reduce the TCP MSS to 1400 for connections to metafilter's IP. Like popechunk, I can't get a ping bigger than 1464 bytes through to metafilter with the DF bit set.
posted by narge at 7:12 AM on May 21, 2007
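
Why an MSS of 1400 is enough, as a quick sketch. It assumes 20-byte IPv4 and TCP headers with no options, and the function names are made up for the example; the actual iptables rule narge used is not shown in the thread.

```python
IPV4_HEADER = 20  # bytes, no options assumed
TCP_HEADER = 20   # bytes, no options assumed


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of `mtu` bytes."""
    return mtu - IPV4_HEADER - TCP_HEADER


def packet_for_mss(mss):
    """Size of the IP packet carrying a full segment of `mss` bytes."""
    return mss + TCP_HEADER + IPV4_HEADER


print(mss_for_mtu(1500))     # 1460
print(mss_for_mtu(1492))     # 1452
print(packet_for_mss(1400))  # 1440
```

With the MSS held at 1400, full-size segments come out at 1440 bytes, comfortably under the 1492-byte limit implied by the 1464-byte ping, so nothing on the path needs to fragment them or send an ICMP error.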

