
Bringing out the dead
September 1, 2011 6:35 PM

My first pony.

An organized effort - along the lines of the backtagging project - to go through all past threads, replacing dead links with their archive.org equivalent where possible.

Maybe some kind of script could be written to automate the process.
posted by Trurl to MetaFilter-Related at 6:35 PM (44 comments total) 2 users marked this as a favorite
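
As a rough illustration of the first step such a script would need, here is a minimal sketch that pulls the link targets out of a post's HTML using only Python's standard library. The names `LinkExtractor` and `extract_links` and the sample post body are hypothetical, made up for this example; nothing here reflects how MetaFilter actually stores or serves posts.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a chunk of post HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; anchors without an href are skipped.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(post_html):
    """Return the link targets in post_html, in document order."""
    parser = LinkExtractor()
    parser.feed(post_html)
    parser.close()
    return parser.links


# Hypothetical post body, for illustration only.
sample = (
    '<p>See <a href="http://example.com/a">this</a> and '
    '<a href="http://example.org/b">that</a>.</p>'
)
print(extract_links(sample))
```

Deciding whether each extracted link is actually dead, and what should replace it, is the hard part, and this sketch does not attempt it.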

Now that there... that's a big pony.
posted by MrVisible at 6:37 PM on September 1, 2011 [3 favorites]


This is a big pony.
posted by phunniemee at 6:41 PM on September 1, 2011 [18 favorites]


No.
posted by Brandon Blatcher at 6:41 PM on September 1, 2011


Ok yes, but you have to do the first 2. After that, someone will dive in.
posted by Brandon Blatcher at 6:42 PM on September 1, 2011 [1 favorite]


I would love, love, love to see something like this. There are so many great older posts where many of the links are dead, even if the same content still exists somewhere on the web. Heck, even one I posted six months ago has a couple of dead links, and it's the most popular one I've made. It makes trawling the archives a headache and reduces the value and usefulness of the site's history, especially for people who don't know about Wayback/Google cache.

If we could sign up backtagging-style to fix posts, even just our own posts (with oversight or double-checking of some sort to prevent abuse), I'd have all of mine mended lickety-split.
posted by Rhaomi at 6:43 PM on September 1, 2011 [2 favorites]


We do have dead links marked from when we went through and did backtagging. That said, I'm a little happier with the cruft model, only because fixing links then presumes that we're doing some sort of retroactive curation, at which point every twitchy poster will ask us to please replace whatever old link in their post with a new and better one [I've been tempted to do this myself though I have resisted] and that way lies madness.

There are a few other concerns if we were thinking this way:

- what about pages where the website is still an active URL but the content has changed?
- what about pages where the URL resolves to a holding page but is technically not "dead" [there are so few 404s on the web today]?
- what about when archive.org is slow (it happens) and these links don't work?
- what about times where there's a changed URL but it's not at archive.org?
- what about pages archive.org doesn't have indexed?
- what about multi-link posts?

We're easily talking about thousands of links, all so people who are, I guess, surfing through the back list don't have to type a URL in to archive.org themselves? I could see this operating better as some sort of a Greasemonkey right-click to open URL at archive.org addon something. Oh look.
posted by jessamyn (staff) at 6:44 PM on September 1, 2011 [12 favorites]


Could an MEU of modern day Mefites sent back to 1999 take over the web?

Or would they just buy Apple stock and goof off for 20 years?
posted by Brandon Blatcher at 6:46 PM on September 1, 2011 [7 favorites]


This is a big pony.

This is the greatest link in the history of Metafilter.
posted by furiousxgeorge at 6:47 PM on September 1, 2011


Automation definitely has issues, but it's not the only solution. If backtagging is worthy of an organized, people-powered effort, this certainly is -- after all, what's the point of being able to find old posts if the links are broken? Even the finest posts become content graveyards after a couple of years, which is problematic for well-known ones or those with high "referenceability." Restoring even a fraction of them would really improve the utility of the site, especially when it comes to searching old posts and browsing the most popular from the last month/year/all-time.
posted by Rhaomi at 6:53 PM on September 1, 2011 [1 favorite]


(Not to mention I missed most of the backtagging project, and it would feel good to have another opportunity to tangibly FIX ALL THE THINGS!!)
posted by Rhaomi at 6:56 PM on September 1, 2011 [2 favorites]


Honestly, i think it would be far more productive and useful to the site if we went back and fixed all the misspellings, left out words etc.

Maybe the grammar too.
posted by Brandon Blatcher at 6:59 PM on September 1, 2011 [1 favorite]


I once started a project to compile all the folklore links on MetaFilter. I actually gave up a short way out of the barn because most of the links were dead. I'm not sure that's really ours to fix, though. If you're passionately interested in the topic, you can search archive.org yourself, though so often archive.org doesn't have the content that was deep in a site, just the outward-facing pages. If not, you can kind of be like "Oh well, the past is the past."
posted by Miko at 7:01 PM on September 1, 2011


But it's also good to find them gone, and echoes of a crowd foregathered once to talk about what's lost.
posted by Abiezer at 7:01 PM on September 1, 2011 [2 favorites]


Miko: "I'm not sure that's really ours to fix, though. "

Even if we could only fix our own posts, I'd be super-pleased. I take pride in making good posts and it sucks seeing them slowly erode and become useless.

(Plus, the original poster would be better able to remember what the old content was and find accurate replacement links, even if they're on another site instead of on one of the standard caches.)
posted by Rhaomi at 7:19 PM on September 1, 2011


But how would you do that for unique things which are simply gone (as opposed to articles or cartoons that are maybe mirrored elsewhere)? It could never be comprehensive.
posted by Miko at 7:26 PM on September 1, 2011


Could an MEU of modern day Mefites sent back to 1999 take over the web?

Or would they just buy Apple stock and goof off for 20 years?


This would create a temporal paradox. If this mefi MEU were to go back in time and buy Apple stock, the returns would not yield as much as Google stock, for example. In other words, there would be no "sitting back", only a paced growth unless you "roll it in". However, the unit could purchase an oil platform and construct a new E-spiritual retreat, a temple of sorts, and, if required, spy on the Andorians. In summation, 20 years have not passed in your scenario, and logic dictates that a future involving "sitting back" based on Apple stock returns alone may never exist.
posted by clavdivs at 7:29 PM on September 1, 2011


Miko: "It could never be comprehensive."

It doesn't have to be. But there are plenty of ones that can be repaired, even partially. Like my Cartoon Network post or stavros' New Years YouTube extravaganza -- YouTube links die a lot, but there's almost always another version of the same video on YouTube or elsewhere. Even if some links were unique, it's a lot better to have one dead link in ten than one in two.

It just makes it hard to point to older Mefi posts as a good reference. They almost always were at the time, but after a while every recommendation has to caveat itself with "here's a working version of broken link X, btw." I even had to re-do the big Craig Ferguson post as a comment, so I could link to that instead of the original post whenever I wanted to share a round-up of the show's highlights with people. A few links stayed dead, but it was a heck of a lot more useful than before.
posted by Rhaomi at 7:38 PM on September 1, 2011


I'm going to do this for all of my posts by going back, adding a 'seemyprofile' tag to each, and then reproducing each one with recently-updated/fixed links. Tomorrow, maybe.
posted by carsonb at 8:04 PM on September 1, 2011


I would volunteer for this.
posted by zarq at 8:08 PM on September 1, 2011


I would volunteer for this and totally get in trouble at work.
posted by elizardbits at 8:25 PM on September 1, 2011 [2 favorites]


What crazy timing. I just booted up an old computer and found the zombie pics from an AskMe reply from long ago and put the pics up on Flickr. The links are in my profile.
Also: I hate Flickr.
posted by herrdoktor at 8:36 PM on September 1, 2011 [1 favorite]


some sort of a Greasemonkey right-click to open URL at archive.org addon something. Oh look.

Woah, that looks like the best thing ever - great find, jessamyn!
posted by UbuRoivas at 8:37 PM on September 1, 2011


My first pony.

Hm.
posted by UbuRoivas at 8:41 PM on September 1, 2011


I feel like it'd be a bit saner (and much less of a potentially endless moving target) to approach this in terms of a What To Do About Linkrot Awareness Day thing—put together a nice page with pointers to resources for working around linkrot, like archive.org or the greasemonkey script Jessamyn linked to, sort of a quick "how to try and find it" primer—and let folks do the work on an ad hoc basis when they want to find something.

Backtagging was really useful because it provided a sort of fundamental completeness to the mefi-side archives; we put in a bunch of collective effort to get our stuff in order in a way that we know won't decay because we have complete control over it. The difference in control (to say nothing of effort and complexity) in trying to fix linkrot link-by-link is kind of considerable, and it's a difference in kind besides.
posted by cortex (staff) at 8:41 PM on September 1, 2011 [4 favorites]


I think the biggest problem with trying to fix link rot is that, unlike the backtagging project, this would have to be a perpetual project, since links rot every day.
posted by crunchland at 9:24 PM on September 1, 2011 [1 favorite]


My biggest concern would actually be, "What about situations where archive.org's version of a link doesn't actually contain the same information that was at the link when the poster made the post?"

There wouldn't be any way for an archivist to know this, but it would constitute a sort of "false positive": You'd think you'd found the archive.org version of a dead link (hooray!) but you wouldn't realize that, eight years ago, the poster had actually posted something entirely different, even if it happened to reside at the same URL at some point.

A very dumb and basic example would be, if someone had linked to the front page of nytimes.com (for some reason), and later nytimes.com went defunct, how would the archivist know which version of the nytimes.com link at archive.org was right? And perhaps none of them would be right.
posted by Conrad Cornelius o'Donald o'Dell at 9:30 PM on September 1, 2011 [1 favorite]


Ah, we had a discussion on this a few months ago.

Going forward, I'm setting up a Mefi-specific Wayback Machine on my own server, capturing what we link. Video would not be the first priority, to say the least.

For items that have already died, there are caches like the Wayback Machine. I have an idea for how to work better with it, but it will take quite a bit of work and discussion. This is also less of a priority, because the Internet Archive is relatively stable.
posted by Pronoiac at 10:48 PM on September 1, 2011



My biggest concern would actually be, "What about situations where archive.org's version of a link doesn't actually contain the same information that was at the link when the poster made the post?"


Archive.org has dated archives (for just this sort of reason), so you can match them with the post date and be reasonably assured you are seeing the same thing. It's not foolproof, but it seems to do a reasonable job of it.
posted by tallus at 2:49 AM on September 2, 2011
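
To make the date-matching idea concrete, here is a minimal sketch that builds a Wayback Machine address from a link and the time of the post. It relies on the Wayback Machine's URL pattern of a 14-digit timestamp followed by the original address, which serves the capture closest to that timestamp. The function name `wayback_url` and the example link are hypothetical, and the sketch makes no network request, so it cannot tell whether any capture exists or whether the nearest one matches what the poster saw.

```python
from datetime import datetime


def wayback_url(original_url, posted_at):
    """Build a Wayback Machine URL asking for the capture nearest posted_at."""
    # The Wayback Machine expects a YYYYMMDDhhmmss timestamp in the path.
    stamp = posted_at.strftime("%Y%m%d%H%M%S")
    return f"https://web.archive.org/web/{stamp}/{original_url}"


# Hypothetical dead link, paired with the timestamp of this thread's post.
posted = datetime(2011, 9, 1, 18, 35)
print(wayback_url("http://example.com/page", posted))
```

The nearest capture can still be months away from the post date, so a person would need to check the result by eye.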


my old links are dead
and it just ain't fair
you can click and click
they don't go nowhere
they say "can't be found"
they say "four oh four"
all these dead links, babe
i can't take no more

but it's no use y'all
to try'n bring 'em back
they'd just be zombies
whole thing'd be whack
so let's make new links
yeah, i'll make one right now!
can you guess what it is?
it's a link to a ____ !
posted by flapjax at midnite at 5:06 AM on September 2, 2011


phunniemee: "This is a big pony"

Actually, Seth Green is really small, so that's not really a big pony.
posted by Grither at 5:33 AM on September 2, 2011 [1 favorite]


Also, could the mods go back through all previous threads and ensure they are in line with current thinking on moderation?
posted by biffa at 7:26 AM on September 2, 2011


I'd like a program to replace all my poorly thought out petty comments with something measured and considerate.
posted by The Whelk at 7:44 AM on September 2, 2011 [3 favorites]


We can rebuild them; make them better, stronger, faster.
posted by owtytrof at 7:50 AM on September 2, 2011


Also, could the mods go back through all previous threads and ensure they are in line with current thinking on moderation?

We tried to get Paphnuty to do that back in the day, but it didn't work out very well.
posted by cortex (staff) at 8:34 AM on September 2, 2011 [1 favorite]


I could see a footer section where only the OP and maybe the thread participants could add notes, which might include new links, better links, references, archive links, etc. I completely see the policing challenge with that, but if everyone involved in a post were me-mailed when a note was added, perhaps the community itself could police that element. We could become shepherds of our own posts, at least footer shepherds.

I totally support the effort to make this site more archival and I hope we can figure something out.
posted by Toekneesan at 9:44 AM on September 2, 2011 [1 favorite]


I know it's sort of a wacky ambitious thing for someone to do, but an approach to that idea that wouldn't involve making mefi itself more complicated would be for someone to put together a well-structured The Annotated Metafilter blog/wiki that did thread-specific writeups in a way that'd be easy for a companion browser script to grab and display for interested readers. It could be handled as a multi-user project or with submissions screened by a single curator.
posted by cortex (staff) at 9:59 AM on September 2, 2011


If we come across an old thread with dead links and go to the trouble to find the replacements, would it be possible to email the mods and have them fixed? That seems reasonable on an occasional basis, though obviously not as a project.
posted by maryr at 10:29 AM on September 2, 2011


I tried to do this once last summer. Via the Contact sheet, I sent some proposed (and, in my opinion, improved) titles for those posts that seemingly had no titles, plus some new links for one past post, but I got this from pb:

Thanks for these, I added them. We're hesitant to correct/update old links once the post has been archived because they're a core part of what people were discussing. Changing them after the fact--even though the intentions are good--could make for a confusing archive.

Only the links were not changed in that post* -- nor were any titles changed.

(*The new and improved version would have been thus:

Folk Music. Stefan Wirz and Hideki Watanabe pay homage to their favorites. Check out Hideki's Muscle Shoals page for another slice of his Americana pie. )

Now, not changing the titles, I can understand but not changing the links--that made no sense at all. They were essential to the post and they were the same pages, only with a new URL. I decided this must be a policy thing and did not pursue it thereafter.
posted by y2karl at 11:19 AM on September 2, 2011


Ouch! ...after clicking the ENTER box on those links, I can see why. They lead to NSFW sex stuff. OK, now I understand.
posted by y2karl at 11:24 AM on September 2, 2011


But then again, the ENTER box is only on the Woodstock Legends splash page, and if one clicks on the barn rather than the ENTER box, one goes to the original page sans any porno link. How weird. That does not seem to be the case with the rest of the pages on his site--all those work fine. What a mystery...
posted by y2karl at 11:28 AM on September 2, 2011


For the record, I have emailed Mr. Watanabe regarding the ENTER box on that page just now.

I do not recall that box being there when I wrote about the dead links, be that as it may.
posted by y2karl at 11:36 AM on September 2, 2011


I think if we all had the ability to go back and edit our own posts, MetaFilter would end up looking like a George Lucas remake.
posted by CancerMan at 11:56 AM on September 2, 2011


What if we limited candidates for updating to:

1) high-visibility posts -- ones that have been mentioned on the podcast, featured on the sidebar, or have a lot of favorites (100+, say; there were only seven of those in the last month)

2) that have significant linkrot affecting many links or a few of the most important links

3) which the OP has requested after preparing an accurate replacement in advance?

This would target the posts most likely to have traffic down the line -- people catching up on the podcast, browsing the sidebar archive, searching their favorites or reading the most-popular lists -- and would reduce the burden on the mods by both limiting potential repairs at the outset and requiring the fix to be prepared before any request to fix is made.

And if even that is too many, maybe the mods could deputize some trusted users (there have been several volunteers in this thread already) to help manage requests. I'm picturing a basic Google Docs form where people can submit fix requests (consisting of a link to the thread, a link to the "visibility indicator," an explanation of what fix is needed, and the replacement post). Then any mod or deputy can check the collected requests at their leisure, test out the links, and then implement them (or if that's too much power, simply mark them as acceptable for a mod to implement themselves later). With insta-bans for any spamming, abuse, or changes made other than the requested repairs, of course.

This way, the only additional mod action would be copy-pasting the vetted replacement into the linked thread. With enough motivated people writing up and testing out each other's fixes, I think this could really help improve the usefulness of the archives in a low-hassle way.
posted by Rhaomi at 2:33 PM on September 2, 2011
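
The three criteria in the proposal above could be sketched as a simple screening check on each submitted request. Everything here is hypothetical: the `FixRequest` fields, the `is_candidate` name, and especially the 50% dead-link threshold, which is only a stand-in for "significant linkrot" and does not capture the "few of the most important links" case. The 100-favorites cutoff is the one figure taken from the proposal.

```python
from dataclasses import dataclass


@dataclass
class FixRequest:
    """One submitted repair request (hypothetical fields)."""
    thread_url: str
    favorites: int
    on_sidebar: bool
    on_podcast: bool
    dead_links: int
    total_links: int
    requested_by_op: bool
    replacement_text: str


def is_candidate(req, min_favorites=100, min_dead_fraction=0.5):
    """Return True if the request meets all three proposed criteria."""
    # 1) High visibility: podcast mention, sidebar feature, or many favorites.
    high_visibility = (
        req.on_podcast or req.on_sidebar or req.favorites >= min_favorites
    )
    # 2) Significant linkrot: an assumed share of the post's links are dead.
    significant_rot = (
        req.total_links > 0
        and req.dead_links / req.total_links >= min_dead_fraction
    )
    # 3) The original poster asked, and supplied a non-empty replacement.
    prepared = req.requested_by_op and bool(req.replacement_text.strip())
    return high_visibility and significant_rot and prepared
```

A check like this would only sort the queue. A mod or deputy would still have to test the replacement links before anything was changed.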


My First Pony: Deadlinking Is Magic
posted by hippybear at 4:52 PM on September 2, 2011 [2 favorites]

