Has anyone thought of grabbing a snapshot...? March 12, 2006 6:07 PM
Has anyone thought of grabbing a snapshot of pages linked from MetaFilter, a la Yahoo MyWeb?
quasshole?
You mean a cache of everything ever linked? That would be difficult and problematic to store and retrieve, not to mention possible copyright concerns on the part of site owners that get cached (and they don't get ad revenue when their pages are loaded from this site).
posted by mathowie (staff) at 6:45 PM on March 12, 2006
That's kind of the web in a nutshell isn't it? That's just how it goes... I think the server resources (not to mention copyright issues) needed to support such a project would be gigantic.
posted by Rhomboid at 6:47 PM on March 12, 2006
I can't see the point of trying to replicate archive.org if that's what you mean.
posted by keijo at 7:44 PM on March 12, 2006
one day, someone's (or some team of ppl) going to have to go thru all the old posts and either flag em if they're dead, or change the links to go to archive.org, no?
posted by amberglow at 7:55 PM on March 12, 2006
bah! you revisionist, you!
posted by crunchland at 7:58 PM on March 12, 2006
one day, someone's (or some team of ppl) going to have to go thru all the old posts and either flag em if they're dead, or change the links to go to archive.org, no?
No.
posted by dg at 8:31 PM on March 12, 2006
You could write a clever little script to test for (1) unresponsive domain, (2) 404s, (3) apparently unrelated content (based on some, I dunno, bayesian hash of the thread versus the current content of the target page), and provide an alternative archive.org link automatically.
Implementation left as an exercise for the reader.
posted by cortex at 10:04 PM on March 12, 2006
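A minimal sketch of checks (1) and (2) from the comment above, using only the Python standard library. The function names (`classify_failure`, `check_link`, `wayback_link`) are made up for illustration, and check (3), spotting unrelated content, is not attempted. It also assumes that a Wayback Machine URL carrying a timestamp resolves to the closest capture the archive holds, which is how it appears to behave.

```python
import urllib.error
import urllib.request
from datetime import datetime


def classify_failure(exc):
    """Map an exception raised by urlopen to a rough failure category."""
    if isinstance(exc, urllib.error.HTTPError):
        # The server answered, but with an error status.
        return "not_found" if exc.code in (404, 410) else "http_error"
    # DNS failures, refused connections and timeouts all end up here.
    return "unresponsive"


def check_link(url, timeout=10):
    """Return "ok" or a failure category for one linked URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except (urllib.error.URLError, OSError) as exc:
        return classify_failure(exc)


def wayback_link(url, posted_at):
    """Build an archive.org link stamped with the time of the post."""
    stamp = posted_at.strftime("%Y%m%d%H%M%S")
    return "https://web.archive.org/web/{}/{}".format(stamp, url)
```

Calling `check_link` makes a real request, so a batch run over old posts would want rate limiting and retries before declaring anything dead.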
I had the same thought as cortex. The problem is that archive's URLs are dependent on exactly the date it was backed up. You'd have to search through their archive list for a given page and link to the closest one. Which would be Righter, the last cache before the MeFi link, or the first cache after it?
posted by Plutor at 7:14 AM on March 13, 2006
the first after, i'd say...that way we get whatever response occurred.
posted by amberglow at 9:40 AM on March 13, 2006
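The before-or-after choice can be sketched as a lookup over a sorted list of capture timestamps. This assumes the list has already been fetched somehow and that each entry is a 14-digit `YYYYMMDDhhmmss` string, the form archive.org uses in its URLs. The function name `snapshots_around` is hypothetical.

```python
from bisect import bisect_right


def snapshots_around(timestamps, posted_at):
    """Return (last_before, first_after) relative to posted_at.

    timestamps must be sorted ascending. Fixed-width digit strings sort
    chronologically, so plain string comparison is enough. A capture taken
    at exactly posted_at counts as "before".
    """
    i = bisect_right(timestamps, posted_at)
    before = timestamps[i - 1] if i > 0 else None
    after = timestamps[i] if i < len(timestamps) else None
    return before, after
```

Returning both lets the caller take the first capture after the post, as suggested above, and fall back to the last one before it when nothing later exists.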
posted by MarkO at 6:09 PM on March 12, 2006