Working offline? June 14, 2009 10:03 AM

Can I download askmefi questions and answers?

For the next few months, I won't have regular internet access except when I drive out to starbucks to get free wifi. If I download askmefi with RSS I can download all the questions, but is there a simple way to download both the questions and answers so that I can read them offline?
posted by btkuhn to Feature Requests at 10:03 AM (15 comments total) 5 users marked this as a favorite

This isn't something that comes up very often so we're not likely to build something specifically for offline browsing. It sounds like you need an outside tool. You might try a Google search for "offline browser" and see if any of the results can help you out.
posted by pb (staff) at 10:29 AM on June 14, 2009


From a terminal:
wget -r -l2 -w2 http://ask.metafilter.com

Notes:
- the "-w2" causes it to wait two seconds between requests - it'll take a bit longer, but is the polite thing to do, to avoid crushing the server

- If you're on Windows, you'll first need to download wget for Windows
posted by chrisamiller at 10:50 AM on June 14, 2009 [9 favorites]


If you want to get fancy, you could set up some filters to exclude the /user/ and /tag/ pages, which you probably won't need and which would add a lot of time and bandwidth.

There are also many GUIs for wget that will make this all easier for someone not familiar with the command line.
posted by chrisamiller at 11:00 AM on June 14, 2009
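
A rough sketch of the filtering idea above, assuming the /user/ and /tag/ paths named in the comment (not checked against the live site). It uses wget's -X (exclude directories) option, plus -k to rewrite links for offline reading, which is an addition not mentioned in the thread. The mirror_cmd name is made up for this example, and the function only prints the command so it can be reviewed before anything is downloaded.

```shell
# Sketch only: prints the wget command instead of running it.
# The excluded paths come from the comment above and are an assumption.
SITE="http://ask.metafilter.com"
EXCLUDE="/user,/tag"

mirror_cmd() {
    # -r   recurse, -l2 follow links two levels deep
    # -w2  wait two seconds between requests
    # -k   rewrite links in saved pages so they work offline
    # -X   comma-separated list of directories to skip
    echo "wget -r -l2 -w2 -k -X $EXCLUDE $SITE"
}

mirror_cmd
```

Paste the printed line into a terminal to start the download; adjust EXCLUDE if the site uses different paths.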


I've used Winwget before. It hasn't been updated in a while, but it did the job. There are a million download-related extensions for Firefox, too, but I don't know which ones are good.
posted by box at 11:17 AM on June 14, 2009


Scrapbook is a Firefox add-on that will download pages, and will also follow embedded links, so if someone posts a link to Amazon, for example, it will also download that Amazon page. You can decide how deep to follow links, and whether you want to download images and Javascript too.

It works well.
posted by SuperSquirrel at 11:26 AM on June 14, 2009 [1 favorite]


Awesome. I want this thread in my Recent Activity, for chrisamiller's comment.
posted by iamkimiam at 12:53 PM on June 14, 2009


wget is powerful but it can be a bit clunky to use on dynamic sites. I use HTTrack as my offline browser and it works very well. One of the advantages is you can save sessions configured to download what you want and then update them with just the new content in the future.
posted by Mitheral at 1:40 PM on June 14, 2009


If you are on a Mac, this quickie script can help:

http://nansi.org/delete-me/scrape.bash

You might need to use fink to install some stuff. Once fink is installed, open Terminal and type this:
sudo fink -y install wget links

If you are on Windows, install Cygwin, and then you can use this script.
posted by popechunk at 4:50 PM on June 14, 2009 [1 favorite]


If you use Google Reader to subscribe to individual questions that interest you (if you have the time and inclination to do so), it has an offline mode.
posted by IndigoRain at 6:23 PM on June 14, 2009

I want this thread on my recent activity because I'm a greedy motherfucker.
A feature you may not know about: if you favorite a post, you can see it in your recent activity page by clicking the My Favorites tab.
posted by JDHarper at 9:58 PM on June 14, 2009


Re: scrape.bash - Huh. Maybe I've been spoiled by Perl, but grep should have some equivalent to "(\d)+".

I've seen a Yahoo Pipe for getting all the comments for the most recent posts. I'll see if I can find it again.
posted by Pronoiac at 10:45 PM on June 14, 2009


Okay, that (those?) Yahoo Pipe(s?) wouldn't work unless you think that context is for the weak.
posted by Pronoiac at 11:11 PM on June 14, 2009


grep should have some equivalent to "(\d)+"

Assuming you have GNU grep, [0-9]+ (with grep -E) does that job, but it is not equivalent to [0-9][0-9][0-9][0-9][0-9][0-9], which is more concisely expressed as [0-9]{6}.
posted by flabdablet at 5:23 AM on June 15, 2009
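
To make the difference between the two patterns concrete, here is a small sketch using grep -E. The sample paths and the count_matches helper are invented for illustration and are not taken from scrape.bash.

```shell
# Hypothetical sample paths, one per line; the thread IDs are made up.
paths='/124823/working-offline
/99/short-id
/1000000/seven-digit-id
/tags/wget'

count_matches() {
    # -E enables extended regexes, where + and {6} need no backslashes;
    # -c prints the number of matching lines.
    printf '%s\n' "$paths" | grep -E -c "$1"
}

count_matches '^/[0-9]+/'     # one or more digits: prints 3
count_matches '^/[0-9]{6}/'   # exactly six digits: prints 1
```

The six-digit pattern skips both the short ID and the seven-digit one, while [0-9]+ accepts an ID of any length.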


You're mad about the length of the regex, but you completely ignored the year 2100* bug!?!

That's how I'm going to make my retirement money: Y2100 AskMe script consulting.

* assuming 10k threads per year
posted by popechunk at 6:43 AM on June 15, 2009


Eh, it would work, but I was just taking off points for style, not trying to do a reasoned critique, which would require actually looking at the wget man page. *shudder* flabdablet very carefully preserved the y21h/pre-August 2008 bug, which I didn't.
posted by Pronoiac at 8:50 PM on June 15, 2009

