Unavailable from comcast? September 27, 2014 11:18 AM

I have a weird MeFi availability issue. Can't establish a connection to it from home, but I can get to it using work's VPN. I'm not driving in to work today to verify, but I'm willing to assume it works from my desktop at work too.

If you're posting about something on the site (complaint, bug, issue, etc), please describe the problem with as much detail as you can. If it's a bug, describe exactly the steps you went through to see it and what browser/OS you are using.

I'm using Firefox/Ubuntu, but it also triggers on Chromium. And wget:

laptop:~$ wget http://www.metafilter.com/user/82435
--2014-09-27 11:03:04-- http://www.metafilter.com/user/82435
Resolving www.metafilter.com (www.metafilter.com)... 54.186.13.33
Connecting to www.metafilter.com (www.metafilter.com)|54.186.13.33|:80... connected.
HTTP request sent, awaiting response...

I have a workaround, and plenty of other things to do over the weekend, but if I'm not the only person affected then Mefi is losing hits and clicks =/ Plus, metafilter only available at work is not a great productivity lifehack.

I have some errands to run in a few hours, but before I leave I'll set up httping to request that URL once a minute in case that helps debugging.
posted by pwnguin to Uptime at 11:18 AM (66 comments total) 1 user marked this as a favorite
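
The once-a-minute check pwnguin describes can be approximated without httping. A minimal Python sketch, assuming only the standard library; `probe` and `monitor` are hypothetical helper names, and the 60-second interval and 10-second timeout are illustrative defaults, not values from the thread:

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Fetch url once and return (outcome, seconds elapsed)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            outcome = "HTTP %d" % resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        outcome = "HTTP %d" % exc.code
    except (OSError, http.client.HTTPException) as exc:
        # Covers refused connections, timeouts, and malformed replies.
        outcome = "failed: %s" % exc
    return outcome, time.monotonic() - start


def monitor(url, interval=60.0, count=3, probe_fn=probe, sleep_fn=time.sleep):
    """Probe url `count` times, sleeping `interval` seconds between probes."""
    results = []
    for i in range(count):
        results.append(probe_fn(url))
        if i + 1 < count:
            sleep_fn(interval)
    return results
```

Calling `monitor("http://www.metafilter.com/user/82435", count=5)` would make five requests a minute apart. A hang like the one in the wget output should show up as a `failed:` entry once the timeout expires.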

Yeah, we've had scattered reports of weirdness for folks since yesterday; pb's looking into it.
posted by cortex (staff) at 11:21 AM on September 27, 2014 [1 favorite]


If it helps, we had this problem too. Time Warner cable, so with the recent merger, I don't know what it really is. We ended up buying a new router, and that fixed it. I have no idea why that would be the case. We were able to get out to metafilter using another network as well, and our router was dodgy. So the upgrade was not a big deal.
posted by [insert clever name here] at 11:36 AM on September 27, 2014


My partner and I have Optimum as our provider, and for most of yesterday, neither of us could access MetaFilter on our phones over wifi. Desktops seemed not to have the problem, oddly enough.
posted by ocherdraco at 11:56 AM on September 27, 2014


Yeah, this is a really odd one. As cortex mentioned we have scattered reports of people not being able to reach MetaFilter at all from across the globe.

We asked Amazon to investigate and a few people have posted their failing traceroutes there. Amazon just got back to us in that thread and said they're investigating. So there's some progress.
posted by pb (staff) at 12:06 PM on September 27, 2014 [1 favorite]


If it was just the router, I wouldn't expect to be able to open a connection to Apache or the load balancer at all. And if the router were hacked and doing funny HTTP level stuff, it's doing a bad job of hiding itself.

In fact, a test with nc shows I get a 301 redirect to www if I connect to metafilter.com. To me this suggests the flaw isn't at the network level, or if it is, is between whatever is serving www and apache.
posted by pwnguin at 12:34 PM on September 27, 2014
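
The nc test mentioned above amounts to typing an HTTP request by hand and reading the status line that comes back. A rough Python equivalent, assuming plain HTTP on port 80 and only the standard library; `build_request`, `parse_head`, and `raw_get` are hypothetical names for this sketch:

```python
import socket


def build_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    return (
        "GET %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Connection: close\r\n"
        "\r\n" % (path, host)
    ).encode("ascii")


def parse_head(raw):
    """Return (status_code, Location header or None) from raw response bytes."""
    if not raw:
        raise ConnectionError("connection closed before any response bytes")
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    location = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status, location


def raw_get(host, port=80, path="/", timeout=10.0):
    """Send one GET over a bare TCP socket and parse the response head."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            if b"\r\n\r\n" in b"".join(chunks):
                break  # the full header block has arrived
    return parse_head(b"".join(chunks))
```

Comparing `raw_get("metafilter.com")` with `raw_get("www.metafilter.com")` would separate "the bare domain redirects" from "the www host never answers", which is the distinction the comment draws.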


I reported this yesterday morning, but since then, it's been OK.

Also, scattered reports of weirdness with occasional outbreaks of oddity.
posted by arcticseal at 12:45 PM on September 27, 2014 [2 favorites]


Amazon can't find the problem and the traceroutes posted are not helping. They got back to us with this:
[...]

As far as the traces you've provided, unfortunately they aren't a very good indication as to what the issue is because they appear to be a UDP based traceroute. In order to troubleshoot this issue we will need TCP based traceroute. TCP traceroutes can be run from Windows as well as Linux.

I don't know if it is possible, but if you could acquire this it would put us in a much better position to troubleshoot this with you. Windows users can download http://sourceforge.net/projects/tracetcp/ and issue the following:

C:\tracetcp>tracetcp www.metafilter.com:80

Linux users can use:

sudo traceroute -T -p80 www.metafilter.com

[...]

This way we'll have the traceroute that shows either the packets are being dropped some place along the way or we'll see them make it to your instance and we can verify that with the packet capture.
If anyone who is experiencing this can get a TCP traceroute and post it here or on the AWS forum thread that would be a big help.
posted by pb (staff) at 1:09 PM on September 27, 2014


Could it be fallout from Amazon rebooting a bunch of their servers over the next few days?
posted by FreezBoy at 1:55 PM on September 27, 2014


Yes, that's when this problem started.
posted by pb (staff) at 2:50 PM on September 27, 2014


this has repeatedly happened to me with comcast over the past month or two.

i can always access the site over verizon LTE(or the DSL at my work), but it cuts in and out regularly on comcast at home. it's pretty much the only site that does that, too. and oddly, i've had the same issue connecting to the site at work at other locations we have that use comcast business...

i haven't isolated whether this is a DNS issue or not, and i totally should...
posted by emptythought at 3:12 PM on September 27, 2014


Heh, we've gotten a couple emails from folks worried that Something Bad Had Happened, and have assured them that, no, they have not been silently and mysteriously banned.

If you got banned from Metafilter you will probably have a pretty clear idea as to why and it won't take the form of strange connectivity issues.
posted by cortex (staff) at 3:56 PM on September 27, 2014 [10 favorites]


I have Comcast Chicago for my 'net and MeFi was unavailable for me for a few hours yesterday morning before returning to normal. During that time, a downforeveryoneorjustme.com check also said MeFi was down, so I didn't think anything of it.
posted by DirtyOldTown at 4:09 PM on September 27, 2014


Unfortunately, I don't have an AWS acct or the inclination to set one up right now and potentially waste that 1yr free tier.

laptop:~$ sudo traceroute -T -p80 www.metafilter.com
traceroute to www.metafilter.com (54.186.13.33), 30 hops max, 60 byte packets
1 tomato (192.168.2.1) 2.293 ms 2.652 ms 3.043 ms
2 96.120.60.25 (96.120.60.25) 13.466 ms 15.600 ms *
3 * * *
4 * * *
5 * * *
6 * * *
7 * 205.251.226.183 (205.251.226.183) 17.071 ms 205.251.226.179 (205.251.226.179) 20.122 ms
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 ec2-54-186-13-33.us-west-2.compute.amazonaws.com (54.186.13.33) 30.379 ms 28.845 ms 26.131 ms

I don't know what's going on, but if I had root I'd be looking at what's going on higher up the stack than tcp/udp. Varnish rules that got reset on VM reboot, unexpected firewall persistence, etc. Because I can talk HTTP to something over netcat and get a response back. But hey, maybe AWS implements some batshit insane HTTP load balancer that operates at layer 4.
posted by pwnguin at 4:58 PM on September 27, 2014 [1 favorite]


> Could it be fallout from Amazon rebooting a bunch of their servers over the next few days?
> Yes, that's when this problem started.


Sounds similar to how the big troubles with Google began when they changed their search algorithm a couple years back.

Looks like a lesson about the State of the Internet today - if you're doing business with one of the Big Big Web Companies, you're gonna get stepped on like an ant with one of their Big Big Web Boots. Accidentally, of course; MetaFilter isn't big enough to be more than 'collateral damage'. Just please tell me you have no relationship with Facebook (because after the Ello thread, Zuck and company are probably looking for a way to "oops" us).
posted by oneswellfoop at 5:52 PM on September 27, 2014 [1 favorite]


I'm another of the "can't get there from here" squad; I've been doing some digging and twittering with pb and Matt off and on this afternoon, but thought I might summarize what I'm seeing here:

* The traffic is making it from here to the metafilter servers, and back - it doesn't seem like a routing issue. Watching a packet capture, when I connect with an incognito session, the TCP 3-way handshake completes, my browser sends a GET /, and just hangs for 3 minutes, and finally, after 3 minutes, I get a TCP reset packet back from the metafilter address. During that 3 minutes, my local system sends a TCP Keep-Alive every 45 seconds, and receives a response from the mefi server.

* This is the weird part: when I connect with my normal browser, where I'm logged in, if I connect to port 80, I get back a perfectly good HTTP/301 redirect to port 443. The browser then connects on port 443, completes the 3-way-handshake, sends the TLS Client Hello, and times out after 20 seconds of waiting for the Server Hello. After 20 seconds, the client gives up and sends a FIN-ACK, and receives a valid ACK response.

I'm not 100% sure what it means; I don't know the Amazon hosting environment well enough to know what this might imply, but hopefully the information is useful in troubleshooting.
posted by jferg at 8:00 PM on September 27, 2014
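
The pattern in that packet capture (handshake completes, request goes out, nothing comes back) can be checked without a capture tool by timing each stage separately. A minimal sketch, assuming a plain TCP exchange and only the standard library; `diagnose` and its result labels are hypothetical, and the port 443 case described above would need a TLS handshake on top of this:

```python
import socket


def diagnose(host, port, payload, connect_timeout=5.0, read_timeout=20.0):
    """Report how far a plain TCP exchange with host:port gets."""
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except OSError:
        return "connect-failed"  # the 3-way handshake never completed
    with sock:
        sock.settimeout(read_timeout)
        try:
            sock.sendall(payload)
            first = sock.recv(1)
        except socket.timeout:
            # Connected and sent, but no reply: the hang described above.
            return "connected-no-response"
        except ConnectionResetError:
            return "reset"
        except OSError:
            return "io-error"
        return "responded" if first else "closed-without-data"
```

A result of `connected-no-response` from an affected network, next to `responded` from a VPN, would point away from routing and toward whatever sits behind the accepted connection.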


The mods should add something on status.metafilter.com about this. I can only access this site on my mobile data connection, but I can access the status page fine.
posted by Gary at 8:36 PM on September 27, 2014 [8 favorites]


Yeah, I could not get to MetaFilter at all from my office on Friday, which made for a strangely productive day. Home connection is just fine.
posted by Curious Artificer at 8:44 PM on September 27, 2014


Oh, good, it's not just me. I can get here on my phone, but I've got Comcast at home and have been blocked all day.
posted by rewil at 9:49 PM on September 27, 2014 [1 favorite]


I also couldn't get to Mefi from my office! I felt lost all day at work!
posted by radioamy at 9:50 PM on September 27, 2014 [1 favorite]


Clearly anti-Modern theme forces are working to waste valuable pb bug-fixing time.
posted by Dr Dracator at 1:19 AM on September 28, 2014 [2 favorites]


Amazingly, I am thousands of miles away in England and have no problems accessing Mefi! How does that even work?

Functional & Working connection - traceroute from England:

traceroute to www.metafilter.com (54.186.13.33), 30 hops max, 60 byte packets
1 10.239.88.1 (10.239.88.1) 9.395 ms 9.394 ms *
2 * * *
3 * * *
4 * * *
5 * * *
6 * xe-8-0-0.cr2.dca2.us.above.net (64.125.27.33) 93.315 ms 97.422 ms
7 ae4.er2.iad10.us.above.net (64.125.21.58) 98.031 ms 97.855 ms 97.900 ms
8 64.125.31.150 (64.125.31.150) 101.537 ms 101.532 ms 101.511 ms
9 * * *
10 72.21.220.17 (72.21.220.17) 97.773 ms 205.251.244.9 (205.251.244.9) 97.769 ms *
11 * * *
12 * * *
13 * * *
14 * * 205.251.232.38 (205.251.232.38) 157.007 ms
15 * * *
16 * * *
17 * * *
18 ec2-54-186-13-33.us-west-2.compute.amazonaws.com (54.186.13.33) 154.344 ms 158.945 ms *

(why do some just have 3 *'s in there?)
posted by marienbad at 2:35 AM on September 28, 2014


> (why do some just have 3 *'s in there?)

Because not everything on the internet responds to ICMP ping, but we know it's there because packets have an expiration (Time To Live) measured in hops.
posted by pwnguin at 2:51 AM on September 28, 2014
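
To make the Time To Live point concrete: traceroute sends probes with TTL 1, 2, 3, and so on, and each router on the path decrements the TTL. The router where it reaches zero is expected to report back, and a router that stays silent shows up as `* * *`. A toy simulation, assuming a fixed path and no real networking; `send_probe`, `simulate_traceroute`, and the sample path are hypothetical:

```python
def send_probe(path, ttl):
    """Walk one probe along path.

    path is a list of (name, replies) pairs ending with the destination.
    Returns (name of the hop that reported back or None, destination reached).
    """
    for index, (name, replies) in enumerate(path):
        ttl -= 1  # every hop decrements the TTL
        is_destination = index == len(path) - 1
        if ttl == 0 or is_destination:
            return (name if replies else None), is_destination
    return None, False


def simulate_traceroute(path, max_hops=30):
    """Return one output line per TTL, stopping once the destination is hit."""
    lines = []
    for ttl in range(1, max_hops + 1):
        responder, reached = send_probe(path, ttl)
        lines.append("%d %s" % (ttl, responder or "* * *"))
        if reached:
            break
    return lines
```

With a four-hop path whose third router never reports, the third line comes out as `3 * * *` while hop 4 still appears, which is why the silent hops are known to exist.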


> The mods should add something on status.metafilter.com about this.

Is this too fine-grained or sporadic a thing for status? I had the non-connect early yesterday for blue, grey, and green, after which I gave up. Could connect to status OK but saw nothing about it there. No complaint on metachat either, or anyway no thread about it. I concluded it couldn't be the servers, must be just me or something on the path from metafilter to me and I'd just wait it out. My fix worked.
posted by jfuller at 6:08 AM on September 28, 2014


PS, fwiw I am on AT&T, not Comcast.
posted by jfuller at 6:11 AM on September 28, 2014


FYI, I'm also having this problem and have had it for almost two days. I'm on Time-Warner. But I'm able to get here via my VPN.
posted by Ivan Fyodorovich at 7:20 AM on September 28, 2014


Oh, also able to connect on my phone via Sprint so that's how I knew MeFi wasn't down.
posted by Ivan Fyodorovich at 7:23 AM on September 28, 2014


Glad to see I'm not alone, I haven't been able to get mefi to resolve from my home computer for a few days (Cogeco cable in Ontario) but using my neighbour's wifi and my iPad it loads just fine, though I know they have Cogeco as well. Weird.

I'd post a traceroute but y'know, iPad.
posted by angerbot at 7:36 AM on September 28, 2014


I posted a note to the status blog.
posted by pb (staff) at 7:57 AM on September 28, 2014 [3 favorites]


I don't know if this is related or unrelated, but when I try to go to the music subsite, it just times out the way that it does for everything mefi when I'm not using my VPN.
posted by Ivan Fyodorovich at 8:33 AM on September 28, 2014


Sorry, I think that was unrelated. Disregard. :)

If anyone's interested, the explanation is that earlier I wanted to find a way to connect to MeFi without doing the VPN thing because I just yesterday switched to Win8.1 and Microsoft freaked out when my login location suddenly changed to another city right after I installed my VPN client manager and connected with it. For unrelated reasons, I had to do another OS reinstall -- which always happens -- and so I wanted to avoid that problem with VPN until I researched it. So what I did instead was to go looking for a proxy for my browser. And one of the first things I saw was actually just the Chrome extension that runs everything through Google's compression server. My thinking was that if it's a routing problem, that would allow me to connect. And so I installed the extension and, yeah, the first time I tried to browse here, it connected. And then after that it didn't because of reasons. But I thought, well, I'll leave the extension running. So then I installed the VPN client for the service I subscribe to, and that's how I'm here.

But after posting my previous comment, I realized that I'd violated the cardinal rule of not complicating the problem environment, and so I turned off the extension and tried the music page again, and it works fine.

So I'm thinking that the content on the music page is just causing a problem with that extension and/or Google's compression servers.
posted by Ivan Fyodorovich at 8:49 AM on September 28, 2014


Just to add to the dataset, I haven't been able to connect to Metafilter from home at all in the past couple of days (Verizon FiOS). I can get to it on my phone if I use cellular data instead of wifi, and I haven't tried at work yet. I'm just going through a proxy for the time being.

Totally possible this one is an issue on my end, though. I rent and don't have access to the wireless router, so there's only so much I can do to troubleshoot internet things without breaking into my landlord's house.
posted by pemberkins at 9:05 AM on September 28, 2014


> For unrelated reasons, I had to do another OS reinstall -- which always happens --

Everything changes. Except that.
posted by jfuller at 9:15 AM on September 28, 2014


I've spent several days being confused, checking to see if it's down for everyone or just me, and then occasionally checking through a proxy to see what the hell is going on. It's a pain in the ass to configure a proxy to work with my iTouch, though.

I'm on CenturyLink in Albuquerque, New Mexico.

I have no idea what this means or even if the information it's returning is useful, since I'm on OSX, but the traceroute thing brings back the following for me:

Version 1.4a12+Darwin
Usage: traceroute [-adDeFInrSvx] [-A as_server] [-f first_ttl] [-g gateway] [-i iface]
[-M first_ttl] [-m max_ttl] [-p port] [-P proto] [-q nqueries] [-s src_addr]
[-t tos] [-w waittime] [-z pausemsecs] host [packetlen]
posted by NoraReed at 1:20 PM on September 28, 2014


The OS X version of traceroute doesn't have the -T flag, but I believe you can use this instead:

sudo traceroute -P TCP -p80 www.metafilter.com
posted by Celsius1414 at 2:43 PM on September 28, 2014
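
The three TCP traceroute invocations quoted in this thread differ only by platform, so they can be collected in one place. A small sketch that just builds the command line; `tcp_traceroute_command` is a hypothetical helper, the flags are taken as posted above (the OS X form was offered tentatively), and the Windows branch assumes tracetcp is already installed:

```python
import sys


def tcp_traceroute_command(host, port=80, platform=sys.platform):
    """Return the argument list for a TCP traceroute on the given platform."""
    if platform.startswith("linux"):
        return ["sudo", "traceroute", "-T", "-p%d" % port, host]
    if platform == "darwin":
        # OS X traceroute has no -T flag; -P selects the protocol instead.
        return ["sudo", "traceroute", "-P", "TCP", "-p%d" % port, host]
    if platform.startswith("win"):
        return ["tracetcp", "%s:%d" % (host, port)]
    raise ValueError("no known TCP traceroute command for %r" % platform)
```

The list can be passed to `subprocess.run` as-is, or joined with spaces to paste into a terminal.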


We just made some changes behind the scenes to try to address this. If you were having trouble reaching the site and can read this, can you give it another try and let me know if it helped?
posted by pb (staff) at 3:17 PM on September 28, 2014


I can't check my office connection (Comcast) until tomorrow, but will report back when I do.
posted by Curious Artificer at 3:21 PM on September 28, 2014


I've been unable to access the site from my home internet connection all weekend. But now it appears to be back. So, you know, whatever you did, good job.
posted by wabbittwax at 3:25 PM on September 28, 2014


I'm able to access the site from home now, thanks!
posted by angerbot at 3:35 PM on September 28, 2014


Tadaima!

Seems like the problem has been fixed, at least for me!
posted by Chocolate Pickle at 3:38 PM on September 28, 2014


Looks like it's fixed for me. Do you know what the problem was?
posted by Just this guy, y'know at 3:47 PM on September 28, 2014


Nope, still don't know what the problem is. We moved to an existing Amazon instance that wasn't affected by the recent restart. This is only part of the fix, but it's a good indication that the problem exists only with specific instances.
posted by pb (staff) at 3:50 PM on September 28, 2014


For the first time in several days I have been able to log in.
posted by One Hand Slowclapping at 3:54 PM on September 28, 2014


Works for me now. Hadn't a couple of hours ago.
posted by neuron at 3:57 PM on September 28, 2014


Ditto. I am once again able to access.
posted by jferg at 4:03 PM on September 28, 2014


One troubling thing: I still can't ping www.metafilter.com, even though I can now access it.
posted by Chocolate Pickle at 4:08 PM on September 28, 2014 [2 favorites]


MetaFilter: scattered reports of weirdness with occasional outbreaks of oddity.
posted by Cranberry at 4:11 PM on September 28, 2014 [3 favorites]


I was unable to get here between sometime late Friday/early Saturday and the middle of Sunday afternoon. Comcast in Minneapolis.
posted by gimonca at 4:11 PM on September 28, 2014


IT'S ALIVE

thank god, I was replacing MetaFilter time with BoJack Horseman and I just ran out
posted by NoraReed at 4:26 PM on September 28, 2014


Yep, I can connect, too.
posted by Ivan Fyodorovich at 4:49 PM on September 28, 2014


Oh man, I thought it was just me! Things are fixed it seems now.
posted by Hermione Granger at 5:25 PM on September 28, 2014


Might it have had anything to do with people editing their hosts file to access the beta redesign?
posted by Rhaomi at 5:36 PM on September 28, 2014


Nah, swapping hosts doesn't affect anything.
posted by pb (staff) at 5:42 PM on September 28, 2014


Welp, I can get in again. Good luck with the whole AWS thing! Maybe you can stop by Devops Daycamp some time and tell war stories ;)
posted by pwnguin at 7:28 PM on September 28, 2014


I can access again! Callooh callay!
posted by pemberkins at 8:25 PM on September 28, 2014


Yep; all set here too.

So what did you do, pb? Unplug it and plug it back in again? 'cause that usually works.
posted by Curious Artificer at 6:29 AM on September 29, 2014


Oh, thank god for that. I was losing my mind here.
posted by robself at 8:54 AM on September 29, 2014


Status blog? How neat. I had no idea.
posted by Too-Ticky at 11:33 AM on September 29, 2014


Yup, came back for me today. I had seen some posts on Twitter about the new design effort and thought maybe you were cleaning out the riffraff to support that.
posted by yerfatma at 12:14 PM on September 29, 2014


No blue for two days freaked me right the fuck out!
posted by vrakatar at 9:27 PM on September 29, 2014


> But hey, maybe AWS implements some batshit insane HTTP load balancer that operates at layer 4.

It's called a reverse application proxy, and I'm not sure if it's something AWS applies to traffic, or if it's how Metafilter's stack is set up to accept and process traffic, or something else.

I encounter them most often as web application firewalls (and man, did they ever save my bacon during this shellshock fiasco) and in other security contexts, but I've seen them used for load balancing on layer 4 and up.
posted by Slap*Happy at 5:18 AM on September 30, 2014


Digging into it, AWS Elastic is indeed a Layer 4 and Layer 7 load balancer.
posted by Slap*Happy at 5:22 AM on September 30, 2014


Thanks for doing all the back end work, pb et al! The site was down for us all weekend in the UK (though weirdly enough, I could still access it via 4G), but now it's accessible again.
posted by Sonny Jim at 7:06 AM on September 30, 2014


> Digging into it, AWS Elastic is indeed a Layer 4 and Layer 7 load balancer.

Which makes sense. My point was that it's likely two separate layers, not a single application, and given I was getting HTTP redirects, the layer capable of issuing them was a more likely candidate than ucarp or an F5 or whatever.

Anyways, any more public speculation and we might end up volunteered to fix it.
posted by pwnguin at 8:42 PM on September 30, 2014 [1 favorite]


"...and thought maybe you were cleaning out the riffraff to support that."

Nope. I'm still here, yerfatma.
posted by terrapin at 4:39 AM on October 2, 2014 [1 favorite]


Hey, we moved servers again yesterday to get a bigger, speedier box. I haven't heard any reports of people not being able to reach the site, but if you spot anything weird like the original poster of this thread did, let us know?
posted by mathowie (staff) at 11:34 AM on October 2, 2014


> Could it be fallout from Amazon rebooting a bunch of their servers over the next few days?

My first thought for necessitating reboots was shellshock, of course. But for a cloud company like Amazon, Xen probably surpasses it. A bug that breaks hypervisors, that's just deadly to them.
posted by scalefree at 7:31 PM on October 2, 2014

