MTU values still causing problems? May 30, 2007 5:07 AM

MeFi (and sub-domains) are still blocked to people with the wrong MTU settings using OS X (previously). Right now I can't access any of the sites following a complete reset of my router last night. Safari halts on * when the status bar says, "... Completed 3 of 4 items". This happens on both my computers connected to the router. I'm currently accessing via an anonymous proxy. To be honest, I'm not sure if the MTU value is causing the problem, but there's clearly something wrong. Admins: Please look at the comment in the previous threads about firewalls blocking network discovery traffic (from user eriko). That's your cure. Windows computers work around this automatically. Other computers don't, unless the user tweaks!
posted by humblepigeon to Bugs at 5:07 AM (17 comments total) 1 user marked this as a favorite

Here's the 'previously' for people who aren't using
posted by chrismear at 5:25 AM on May 30, 2007

I changed the link back to a non-siteglove one. We'll have to wait for mathowie/pb to wake up to look into the rest of this.
posted by jessamyn (staff) at 6:51 AM on May 30, 2007

humblepigeon, I need a copy of a traceroute and a ping for from you. The host doesn't believe that the firewall is the problem, and I need some data to back that up.
posted by mathowie (staff) at 7:45 AM on May 30, 2007

(heh) ...Get a windows PC! (heh-heh-heh-heh)
posted by Steven C. Den Beste at 9:32 AM on May 30, 2007 [1 favorite]

I'm having the time of my life exploring the world of proxies in order to access Metafilter. If I ever regain proper access to the site I might post a FPP about what I've found. Seems they're mainly about accessing banned sites from inside school/college firewalls. Fight the establishment!

Back to the matter in hand.

Ping works but is slow:

$ ping
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=107 time=273.186 ms
64 bytes from icmp_seq=1 ttl=107 time=273.102 ms
64 bytes from icmp_seq=2 ttl=108 time=157.470 ms
64 bytes from icmp_seq=3 ttl=107 time=250.814 ms
64 bytes from icmp_seq=4 ttl=107 time=282.312 ms
--- ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 157.470/247.377/282.312/46.138 ms

Traceroute stalls at (

$ traceroute
traceroute to (, 64 hops max, 40 byte packets
1 ( 1.110 ms 0.901 ms 0.862 ms
2 ( 231.206 ms 179.543 ms 214.265 ms
3 ( 242.484 ms 251.991 ms 273.886 ms
4 ( 244.511 ms 283.528 ms 305.461 ms
5 ( 304.385 ms 316.052 ms 303.561 ms
6 ( 334.994 ms 407.199 ms 365.065 ms
7 ( 304.758 ms 398.888 ms 425.662 ms
8 ( 456.906 ms 472.599 ms 455.253 ms
9 ( 605.937 ms 611.072 ms 580.511 ms
10 ( 492.881 ms 408.057 ms 486.382 ms
11 ( 488.199 ms 520.817 ms 486.563 ms
12 ( 558.049 ms 614.605 ms 597.268 ms
13 * * *
14 ( 493.196 ms 541.323 ms 546.990 ms
15 ( 304.314 ms 185.209 ms 272.086 ms
16 ( 245.289 ms 255.448 ms 176.404 ms
17 ( 300.096 ms ( 127.607 ms 242.553 ms
18 ( 274.727 ms 303.482 ms ( 305.911 ms
19 ( 330.539 ms * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
(and so on, until it reaches 64)

I'm running OS X 10.4, fully updated. My router is a 3Com 3CRWDR100A-72. I updated the firmware today to try and solve this problem but no joy.

Both Safari and Firefox jam up when trying to load any * page. They get the page banner, but no actual content. I don't think it's a browser problem because I can access via a web proxy.

All other websites work fine. This is a 2Mbit ADSL connection provided by BT Broadband here in the UK. I think location might be an issue.
posted by humblepigeon at 9:45 AM on May 30, 2007

(heh) ...Get a windows PC! (heh-heh-heh-heh)

I can break out my Windows PC from storage and try the site on it, if necessary. But that's a lot of messing around. Let me know.
posted by humblepigeon at 9:47 AM on May 30, 2007

Also, the output of dig:

$ dig

; <<>> DiG 9.3.4 <<>>
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: query, status: noerror, id: 59418
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

; IN A


;; Query time: 356 msec
;; WHEN: Wed May 30 17:51:50 2007
;; MSG SIZE rcvd: 48

posted by humblepigeon at 9:53 AM on May 30, 2007

Hello. I'm back, without a proxy. In other words, I can access * again.

Rather embarrassingly, it looks like the problem was with my router, although it's still a mystery why was the only site that was affected.

In short, my router's firewall component was enabled, probably after the complete reset I did last night. Somehow this switched off black hole detection, which meant the connections to were lost.

So if you're reading this thread in the future, and trying to find out why Metafilter isn't working for you, disable your router's firewall. Or, at the very least, look for an option to enable Path MTU Discovery/Blackhole Discovery.
posted by humblepigeon at 10:17 AM on May 30, 2007

I've been reading this and the previous threads b/c I started having this problem on my PowerBook back whenever the servers changed for Metafilter. Only here's the thing--I don't think my problem is router related, b/c *only* the PowerBook (and not the Mini) is affected, and it's affected at home and at school. I fiddled around with the router settings to no avail, but discovered that if I turn off Norton Personal Firewall (which is not installed on the Mini) all Metafilter sites work. So can anyone tell me why my firewall hates Metafilter all of a sudden (no other sites have this issue), and more importantly, how I can fix it (other than disabling)?
posted by DiscourseMarker at 1:20 PM on May 30, 2007

It's probably the same issue that I had -- path MTU discovery/blackhole discovery.

I don't really know what I'm talking about but I'm guessing that some of us, depending on geographic location, don't have a clear Internet route to the MeFi server. The packets get lost. This can be worked around using the path MTU discovery/blackhole discovery feature built into most routers and operating systems. But some firewalls disable this function.

What OS do you have installed on the PowerBook? I'm guessing not 10.4, because that has a firewall built-in, so you wouldn't need Norton. I think I recall reading that OS X didn't have blackhole discovery until relatively recently, but I might be talking out of my arse.
posted by humblepigeon at 1:56 PM on May 30, 2007

I'm using the latest release of 10.4 on both the PowerBook and the Mini. I use the Norton instead of the built-in firewall, I guess b/c I'm obsessively paranoid (and it was free when I was in grad school). I know the simplest answer is "turn off Norton and turn on the Mac OSX firewall," but it just seems like there should be some way to make Norton do what I want (i.e. let Metafilter through). Especially since it *used* to work just fine on the PowerBook with Norton.
posted by DiscourseMarker at 5:39 PM on May 30, 2007

For god's sake don't use the Norton crap. The Mac versions are on par with the windows ones (pretty damn terrible), and there's no damn reason to use it in the first place. For one thing there's nothing listening for traffic on OS X by default, and furthermore the built-in firewall is perfect. If you want an application-level firewall (to control outbound traffic) use Little Snitch.

The only thing I could imagine anyone needing AV for on a Mac is to scan Microsoft Office files for macro viruses. Not that they'll affect your computer (sure, go ahead and deposit a file in C:\WINNT\SYSTEM32), but just so that people downstream won't complain.
posted by blasdelf at 7:22 PM on May 30, 2007

DiscourseMarker, did you customize your HOSTS file at any time on that powerbook?
posted by mathowie (staff) at 7:24 AM on May 31, 2007

Okay. Lecture again.

The problem is you can't know the Maximum Transmission Unit (MTU) of a given virtual link in a packet switched network like the Internet, because you don't know all the details of the various and sundry physical links.

1492 is common, because a great deal of the backbone traffic is on ATM or ATM over SONET connections. 1492 bytes of data + various headers leaves you with 1536 bytes, which is 32 ATM cells (48 bytes each). Nice and smooth. Given that Ethernet's MTU is 1500 (by specification), 1492 is the largest combination of ATM cells and Ethernet frames.
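The cell arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch: the 44 bytes of header overhead is simply the difference between the 1536 and 1492 figures quoted there, not a value taken from any specification, and `cells_needed` is a made-up helper name.

```python
import math

ATM_CELL_PAYLOAD = 48   # bytes of payload per ATM cell, as stated in the post
HEADER_OVERHEAD = 44    # assumption: 1536 - 1492, implied by the post's figures


def cells_needed(mtu):
    """Return (cell count, padding bytes) for one full-size packet."""
    total = mtu + HEADER_OVERHEAD
    cells = math.ceil(total / ATM_CELL_PAYLOAD)
    padding = cells * ATM_CELL_PAYLOAD - total
    return cells, padding


for mtu in (1492, 1500):
    cells, padding = cells_needed(mtu)
    print(f"MTU {mtu}: {cells} cells, {padding} bytes of padding")
```

Under that assumption, 1492 fills 32 cells exactly, while 1500 spills into a 33rd cell that is mostly padding.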

However, the Internet isn't all Ether to ATM. There are other links, and they may have MTUs smaller than 1492. What can we do?

Option one: Fragment. Break the IP packet into two packets, set the Fragment bit in the header, and send the two packets. This is inefficient -- you double the number of IP headers, and you often end up sending one big frame and one little frame, which is even more inefficient.
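To make the "one big frame and one little frame" point concrete, here is a rough Python sketch of how a single IPv4 packet gets split. It assumes a 20-byte IPv4 header with no options; `fragment` is a hypothetical helper, not anything from a real networking library.

```python
IP_HEADER = 20  # bytes; assumes an IPv4 header with no options


def fragment(packet_len, link_mtu):
    """Split one IPv4 packet into (offset, data_len, more_fragments) tuples.

    Offsets are in bytes here; on the wire they are stored in 8-byte units,
    which is why every fragment except the last carries a multiple of 8.
    """
    data_left = packet_len - IP_HEADER
    max_data = (link_mtu - IP_HEADER) // 8 * 8
    fragments = []
    offset = 0
    while data_left > 0:
        size = min(max_data, data_left)
        data_left -= size
        fragments.append((offset, size, data_left > 0))
        offset += size
    return fragments


# A full 1500-byte Ethernet packet crossing a link with a 1492-byte MTU.
for offset, size, more in fragment(1500, 1492):
    print(f"offset={offset} data={size} on_wire={size + IP_HEADER} more={more}")
```

The 1500-byte packet becomes one 1492-byte fragment and one 28-byte fragment, most of which is header.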

Option two: Path MTU discovery. As you talk to the site, you set the Do Not Fragment bit and slowly increase your MTU. When you get the ICMP message that says "Fragmentation needed, but Do Not Fragment is set", you back down to the last good MTU. That gives you the largest possible MTU for that link.
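A toy simulation may help here. Note the swap: this sketch uses the start-big-and-step-down form described in RFC 1191 (begin at the first link's MTU, shrink each time an ICMP error comes back) rather than the ramp-up described in the paragraph before it. The path values and function names are invented for illustration.

```python
def send_df(size, path_mtus):
    """Simulate sending a packet of `size` bytes with Do Not Fragment set.

    Returns None if it fits every link, otherwise the MTU of the first link
    that is too small (standing in for the ICMP 'fragmentation needed' reply).
    """
    for link_mtu in path_mtus:
        if size > link_mtu:
            return link_mtu
    return None


def discover_path_mtu(path_mtus):
    """Start at the first link's MTU and shrink on every ICMP error."""
    mtu = path_mtus[0]
    while True:
        reported = send_df(mtu, path_mtus)
        if reported is None:
            return mtu
        mtu = reported  # always smaller than before, so the loop terminates


# Hypothetical path: Ethernet, a 1492-byte hop, a smaller tunnel, Ethernet.
print(discover_path_mtu([1500, 1492, 1400, 1500]))
```

Every step depends on the ICMP reply actually arriving, which is exactly what a firewall that drops ICMP takes away.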


1) How do we know that this is happening? ICMP, the Internet Control Message Protocol. Amateurs will tell you that ICMP is a drastic security hole, and you must block it.

Note: ICMP runs the internet. Blocking ICMP means that the Internet can't tell you about problems. This is the primary reason that Path MTU fails -- you never ever see the ICMP message that says that you need to fragment the packet.

With Path MTU set, and ICMP blocked, the Internet sort of works. The initial sync works, because there's little to no data involved, thus, the packets make it. When the data starts flowing, you get full packets, which can't reach you because of the combination of Path MTU and ICMP blocked, and you get hangs.

With ICMP running, the Internet does work, mostly. The problem is broken routers that will take a too-large packet with Do Not Fragment set, drop it (correct) and not send out the ICMP message saying they'd done so (WRONG!).

Because of this, we use Black Hole detection. If we can detect that remote routers aren't sending ICMP, we can set a very small MTU -- usually 576 bytes, which should get everywhere. But if you are blocking all ICMP, it'll take a while to figure out that your router is the black hole. The Internet will be slow and flaky, unless you give up and set a small MTU by hand.
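Here is a rough sketch of the black hole fallback, again as a simulation rather than real networking code. The fallback size (576 bytes, the datagram size every IPv4 host must be able to accept) and the retry count are assumptions of this sketch; real stacks choose their own values and timers.

```python
FALLBACK_MTU = 576   # assumption: the size every IPv4 host must accept
MAX_RETRIES = 3      # assumption: silent losses tolerated before giving up


def send_df(size, hops):
    """Simulate one DF packet. `hops` is a list of (mtu, sends_icmp) pairs.

    Returns ("ok", None), ("icmp", mtu) or ("timeout", None).
    """
    for mtu, sends_icmp in hops:
        if size > mtu:
            return ("icmp", mtu) if sends_icmp else ("timeout", None)
    return ("ok", None)


def find_mtu(start_mtu, hops):
    """Return (usable MTU, whether a black hole was suspected)."""
    mtu = start_mtu
    timeouts = 0
    while True:
        status, reported = send_df(mtu, hops)
        if status == "ok":
            return mtu, False
        if status == "icmp":
            mtu, timeouts = reported, 0
            continue
        timeouts += 1
        if timeouts >= MAX_RETRIES:
            # No ICMP ever arrived: assume a black hole and play it safe.
            return FALLBACK_MTU, True


# A well-behaved path, then one whose small hop drops packets silently.
print(find_mtu(1500, [(1500, True), (1400, True)]))
print(find_mtu(1500, [(1500, True), (1400, False)]))
```

The second path could carry 1400-byte packets, but without the ICMP message the sender never learns that and ends up at the conservative fallback.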

BTW: Do you use PPPoE or PPPoA? If so, you will need a lower MTU. Yes, Ethernet can handle 1500, and ATM likes 1492, but your Ethernet frames will have PPP headers surrounding your IP packets. Thus, you need to drop your MTU or MSS to compensate. How much? Depends on the implementation. Try 1452, if that doesn't help, 1400.

I know that handles fragmentation (and by extension, Path MTU Discovery) correctly. You can test your MTU to MetaFilter by using a combination of the -D and -s XXX switches on ping (at least, the FreeBSD and OS X versions...) where XXX is the MTU you want to test *minus 28* (20 bytes of IP header plus 8 bytes of ICMP header). If you get "sendto: Message too long" then that's the problem. Ping that size again, but without the -D (do not fragment). If that returns nothing, you've got a black hole problem. If it's your router, fix the ICMP filter. If it isn't, you're stuck with a lower MTU or Black Hole detection.
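For anyone who wants to try a few sizes, this small sketch works out the -s value for a given MTU and prints the matching ping command. On the OS X and FreeBSD ping, -s counts only the ICMP data bytes, so the packet on the wire is 28 bytes larger (20 bytes of IPv4 header plus 8 bytes of ICMP header). The host `example.com` is a placeholder and the helper names are made up; the commands are only printed, not run.

```python
import shlex

IP_HEADER = 20    # bytes, IPv4 without options
ICMP_HEADER = 8   # bytes


def ping_payload(mtu):
    """Data bytes to pass to `ping -s` so the whole IP packet is `mtu` bytes."""
    return mtu - IP_HEADER - ICMP_HEADER


def ping_command(host, mtu, dont_fragment=True):
    """Build an OS X / FreeBSD style ping command line (not executed here)."""
    args = ["ping", "-c", "3"]
    if dont_fragment:
        args.append("-D")  # set the Do Not Fragment bit
    args += ["-s", str(ping_payload(mtu)), host]
    return shlex.join(args)


# "example.com" is a placeholder; substitute the site you are testing.
for mtu in (1500, 1492, 1452, 1400):
    print(ping_command("example.com", mtu))
```

Work down the list until a size gets replies with -D set; that size is a reasonable guess at the usable MTU for the path.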
posted by eriko at 9:52 AM on May 31, 2007 [1 favorite]

Less lecture:

1) Don't block ICMP willy nilly. Ping and Traceroute are handy, and ICMP Type 3 (Destination Unreachable) with its various options to tell you why are critical to data flow.

2) If you're running Ethernet to a router to the net, you can probably safely use an MTU of 1492. If you are using PPPoE or PPPoA, you need less, try 1452, but don't be surprised if 1400 is needed. If you're using a modem, you need *much* less, but mainly for latency reasons.

3) The very clever answer is MSS clamping, which allows a router to rewrite all the packets to a smaller MSS, which implicitly shrinks your MTU. Most home routers don't do this.
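The MSS arithmetic behind point 3 is simple enough to sketch. This assumes plain 20-byte IPv4 and TCP headers with no options, and that the router rewrites the MSS option in the TCP SYN packets passing through it; `clamp_mss` is a hypothetical name, not a real router API.

```python
IP_HEADER = 20   # bytes, IPv4 without options
TCP_HEADER = 20  # bytes, TCP without options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of `mtu` bytes."""
    return mtu - IP_HEADER - TCP_HEADER


def clamp_mss(advertised_mss, link_mtu):
    """What a clamping router would write into a passing SYN's MSS option."""
    return min(advertised_mss, mss_for_mtu(link_mtu))


# A host on Ethernet advertises MSS 1460; the router's uplink MTU is 1492.
print(clamp_mss(1460, 1492))
# A host already advertising a small MSS is left alone.
print(clamp_mss(1360, 1492))
```

Because both ends then agree on smaller segments up front, no oversized packet is ever sent and nobody has to wait for an ICMP message that may never arrive.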
posted by eriko at 9:58 AM on May 31, 2007 [1 favorite]

Thanks eriko. That's really helpful info. I think I'm starting to understand now.

What I still can't understand is why (and subdomains) was the only site that didn't work when my router's firewall kicked in. All other sites worked. Well, obviously I can't prove that all other sites worked, because I didn't get around to trying the billion sites out there. But I didn't find any other site inaccessible, or even slow.
posted by humblepigeon at 12:40 PM on May 31, 2007

I had this same problem until a few weeks back when my ISP changed something on their side. Metafilter was the only site that I found to be affected.

Once a solution is hashed out, could a sidebar link be added about this? I believe there have been a few metatalk and ask.meta questions about this and it continues to be brought up. When it was happening to me, I didn't know to mention it here (during my work hours when I could bring it up) because I thought the problem was isolated to me.
posted by aburd at 9:42 PM on June 6, 2007
