Spies from the future February 17, 2011 7:48 PM

I think that there ought to be a caveat somewhere about AskMe questions, pseudonymity, privacy, and social network scraping tools.

For years, the possibilities of social network scraping tools have weighed heavily on my paranoid reclusive internet guy mind. Now that we have some proof that these sorts of things are actively being developed¹ (see the thread here, good Ars Technica summary here, discussion of someone who would be really good at doing this here) it seems like we ought to have a note somewhere that essentially says
      If you are going to be posting any sensitive personal details such as financial information or anything intimate or embarrassing about yourself or anyone else such as a roommate, family member, or friend, remember that the security and privacy of such information depends not only upon your real identity remaining a secret now, in the present, but also continuing to be secret in the future. Small details in any post or comment you write, even years from now, can connect your MetaFilter account to other online accounts and eventually to your real identity, possibly as a result of an automated analysis carried out by a software tool... (blah, blah, quick explanation of the concept of a social network scraping tool that is simultaneously doing searches in tax records and other databases)
My primary concern here isn't so much the welfare of the AskMe posters themselves, since you kinda have to be in charge of your own privacy on the internet (in fact it should be driven home that mathowie & co. are in no way responsible for maintaining anyone's privacy or pseudonymity beyond maybe protecting credit card transactions and server logs, the reasonable stuff a site privacy policy would cover) but because AskMe posters often include lots of sensitive information about other people, family members or roommates who would also be compromised if the AskMe poster's real identity were uncovered.

So I think that we should have a note like the above either when you go to post an AskMe or maybe just in the FAQ. (Or perhaps a compromise, a FAQ note that is linked to prominently when you go to post.)

Sorta previously, sorta previously, sorta previously, related.

1. See? See? I was right all along! JUST BECAUSE I'M PARANOID DOESN'T MEAN THAT THEY AREN'T OUT TO GET ME!!!1eleven!!
posted by XMLicious to Etiquette/Policy at 7:48 PM (62 comments total) 12 users marked this as a favorite

This isn't going to stop me from saying, when opportunity dictates, that I once had a roommate who habitually didn't flush his poop down the toilet.

If necessary, I can give his name and address. I'm sure there's a toilet out there somewhere in desperate need of a flushing!
posted by phunniemee at 7:54 PM on February 17, 2011 [5 favorites]


...mathowie & co. are in no way responsible for maintaining anyone's privacy or pseudonymity beyond maybe protecting credit card transactions and server logs...

Although I should note that this pony for https on everything would be nice at some point in the future, even if only with free self-signed certs that we'd have to manually install in our browsers.
posted by XMLicious at 7:58 PM on February 17, 2011


it seems like most people already don't read the posting page or the faq (which is already linked). adding a paragraph or more of text isn't going to make that problem better.
posted by nadawi at 8:02 PM on February 17, 2011 [5 favorites]


Where would you think this caveat should appear? And not saying it's a bad idea, but even if it did appear, do you think that people would actually read it?

And while it's a good point to remember, going forward, what about all the stuff that's already been written? Could people point to this passage, and reasonably expect the askme question they posted 5 years ago to be deleted?
posted by crunchland at 8:06 PM on February 17, 2011


This is too long of a disclaimer to actually go on the posting page [and if everyone got to add disclaimers, the page would be 7000 words long and at the end, not let you ask a question] but maybe it's worth writing up for the wiki or the faq in a way that we could link to it?

The only major thing along these lines that we've told people is that the Anon feature is not intended to be absolutely anonymous. That is, admins may know which account asked the question (though we could only tell by digging, not by just seeing your username next to it) and we're not prepared to go to jail over your right to ask a totally anonymous question.

At some level, this is a concern with writing anything on the internet ever. So if you're serious about this, it would have to be something for the entire site.
posted by jessamyn (staff) at 8:07 PM on February 17, 2011 [1 favorite]


Yes, it applies to the entire site; so I think the last option I suggested, a FAQ item that is linked to prominently when you go to post an AskMe, is probably the best option.
posted by XMLicious at 8:11 PM on February 17, 2011


I think this caveat is a bad idea for AskMe, but it's a good point in general. Someone I know IRL followed me on my desjardins Twitter account yesterday, which should have been solidly walled off from my Facebook account and my real name in general. She hasn't replied as to how she found my account, and I'm sure she has no ill intent, but it's a little disconcerting.
posted by desjardins at 8:12 PM on February 17, 2011


(Also, to further reply to jessamyn: yes, I think that this is definitely a concern with writing anything on the internet ever, and indeed with writing anything anywhere that might eventually end up on the internet, and in fact with saying anything within range of a video camera or microphone that might eventually get its output collected in a central archive. I would expect that at some point there are going to be multiple projects by multiple governments and corporations around the world that involve warehouses full of Watson-type computers continuously churning through the internet and every database available to them running these kinds of searches. Wish this only were a dystopian fantasy of my deranged mind but it actually seems pretty objectively possible and even likely.)
posted by XMLicious at 8:26 PM on February 17, 2011


at some point there are going to be multiple projects by multiple governments and corporations around the world

At that point we should have multiple MeTa threads about it. For now, I feel like the Venn-diagram overlap between "people who are likely to post revealing information about others to AskMe" and "people who carefully read lengthy disclaimers before posting to AskMe" is fairly small.
posted by staggernation at 8:36 PM on February 17, 2011 [4 favorites]


Something Penelope Trunk wrote once about self-disclosure (online and off) that really resonated with me: "...if I am living an honest life, and my eyes are open, and I’m trying my hardest to be good and kind, then anything I’m doing is fine to tell people." ... "And when you think you cannot tell someone something about yourself, ask yourself, 'Really, why not?'"
posted by Jacqueline at 8:40 PM on February 17, 2011 [3 favorites]


God that was long and boring. If I'm signing up for the site or something? No way I read it.
posted by J. Wilson at 8:49 PM on February 17, 2011


jacqueline - i can see your point and agree with it mostly - but trying your hardest to be good and kind doesn't matter to an employer who has fucked up rules. it sort of smacks of "if you're not doing anything wrong then you have nothing to worry about" which is pretty flawed when a person in a position of power is using that power to control. just because you or i are honest and good and kind doesn't mean those pulling the strings are.
posted by nadawi at 8:55 PM on February 17, 2011 [9 favorites]


Another little note to give you a further taste of my paranoid musings: I remember seeing a news story, which I can't find in a cursory search, from way back in the nineteen hundreds about an MIT student who had used a custom-built laptop in a backpack with a crude digital video camera and heads-up display mounted on a visor, incorporating face recognition software and a database he'd compiled from yearbooks, to create a system that allowed him to walk around campus and have the computer recognize the faces of the people he met and automatically pull their info from the yearbook database and show it to him.

So the info from social network scraping databases won't just be available to the wealthy and the secret police types, (though they'll be the ones accessing it at first) rather eventually all of the strangers you meet on the street will probably have iPhone-type devices that will immediately tell them everything about you, and the perverts¹ will be able to leaf through every embarrassing photo of you ever taken as you walk past...

All of this makes me somewhat of an adherent of David Brin's "transparent society" idea, that now or sometime soon we need to start adapting our culture and social mores to the idea that no one any longer has anything like the normal expectations of privacy that existed in the past.
posted by XMLicious at 8:58 PM on February 17, 2011 [2 favorites]


If I'm signing up for the site or something? No way I read it.

I agree that many people won't read the FAQ, even if important parts are prominently linked elsewhere, but that's not the same thing as saying that it shouldn't be there.

Not to mention that just by discussing this issue in MeTa and more people in the community being cognizant of it, maybe some AskMe responders will be more likely to include passing remarks saying, "You really ought to have this question anonymized..." when it's appropriate. Too late for Google, as Google almost immediately indexes every new MeFi page (on the blue and on AskMe at least), but soon enough perhaps to escape other bots and far-future bots.

(Though as jessamyn notes and the FAQ observes, "Anonymous questions are for basic privacy, not for hiding from Interpol.")
posted by XMLicious at 9:10 PM on February 17, 2011


"And when you think you cannot tell someone something about yourself, ask yourself, 'Really, why not?'"

Hmm... Because maybe you have interests that are not among the accepted interests of mainstream society? Maybe because the society in which you live deems some of your activities unwholesome or outright criminal?

Anyone who would be comfortable giving their parents, neighbors, or the police complete access to information about everything they've ever done has got to be a) crazy, b) someone who's lived a very, very vanilla life, c) overly trusting, or d) someone who I can't honestly comprehend existing.
posted by Ghidorah at 9:20 PM on February 17, 2011 [7 favorites]


Ah, the vanilla life.
posted by clavdivs at 9:30 PM on February 17, 2011 [3 favorites]


The Trunk quote sounds like a prettified version of "if you've done nothing wrong you've got nothing to fear".
posted by We had a deal, Kyle at 9:33 PM on February 17, 2011


"...if I am living an honest life, and my eyes are open, and I’m trying my hardest to be good and kind, then anything I’m doing is fine to tell people." ... "And when you think you cannot tell someone something about yourself, ask yourself, 'Really, why not?'"

Well, back in the early 1900s people might have said to the Jews living in Germany that there was no reason to hide the fact they were Jewish. 35 years later, I bet a lot of them would have been even worse off if there was some sort of undeletable online record of their association with that religion and with other Jews.

The point is, we can never really predict what innocent, honest, totally innocuous feature of our lives and personality might become the next trigger that sets off some government agency or private organisation's alarm bells. We don't know who is going to be in power 20 or 30 or 50 years down the track.
posted by lollusc at 9:34 PM on February 17, 2011 [12 favorites]


Paranoia fuel: You know that robot that's winning Jeopardy? I'm pretty sure this is the sort of thing it's designed for. Whereas previously, the [agency redacted] would have to rely on a defined trigger vocab list or whatever to figure out whose car to plant their tracking devices, now they've got Watson going all A.I. on the internet's ass.

(You're probably OK if you live in a U.S. city, however. Watson's not very good at those.)
posted by Sys Rq at 9:37 PM on February 17, 2011 [1 favorite]


My paranoia is way ahead of ya, Sys Rq: check out the third link in the OP.
posted by XMLicious at 9:54 PM on February 17, 2011


You know that robot that's winning Jeopardy?

Get'cher spoilers here, get'em now while they're hot! Drat, hadn't seen the last episode yet.
posted by arnicae at 10:12 PM on February 17, 2011


I will take Toronto for 948$
posted by clavdivs at 10:13 PM on February 17, 2011 [1 favorite]


My paranoia is way ahead of ya, Sys Rq: check out the third link in the OP.

Oh.

Well, at least you know I'm not a net-scraping robot.

*pulses eerie red light to simulate human wink*

*updates firmware*
posted by Sys Rq at 10:24 PM on February 17, 2011 [9 favorites]


Because maybe you have interests that are not among the accepted interests of mainstream society? Maybe because the society in which you live deems some of your activities unwholesome or outright criminal?

And as the less-noble cases of the Chinese human flesh search engine phenomenon demonstrate, (and the similar activities of 4chan, etc. in the English-speaking world) it's not even just the general attitudes of your society that are primarily dangerous: if you gain the attentions of enough fuckwads you can come to harm even for details of your personal life that wouldn't be publicly condemned by a sober or reasonable normative person from your culture.
posted by XMLicious at 10:32 PM on February 17, 2011 [2 favorites]


Nice use of the collateraldamage tag. Not used since 2002 and Poppa Bush.

While I agree 100% with your paranoia and fears on this as I actually lose sleep over this on a regular basis, what you are suggesting is that someone stupid or ignorant enough to post personal information about others here thinking that because they are anonymous or use a fake handle cannot be associated with the person about whom they are speaking would actually read the disclaimer, understand the disclaimer and take rational reasoned precautions based on the disclaimer. While one of my favorite sayings is "sooner or later the crazy always comes out", the related one is more relevant here, "You can't unstupid someone."
posted by AugustWest at 10:36 PM on February 17, 2011


This isn't something that magically just became a possibility. There has never been a guarantee that your boss/ISP/government/parents/spouse/sysadmin couldn't trace personal information back to you. Putting up an AskMe disclaimer to that effect would be sort of like putting one up that says "There's a chance someone will give a boneheaded response to your question. We'll do everything we can to prevent it, but it may still happen."

Because, yes, of course that's true, but it's not specific to this site, and it's really something people need to internalize about the Internet at large.
posted by kagredon at 10:45 PM on February 17, 2011


I'm sorry but this is a little weird.
posted by clavdivs at 10:46 PM on February 17, 2011


...what you are suggesting is that someone stupid or ignorant enough to post personal information about others here thinking that because they are anonymous or use a fake handle cannot be associated with the person about whom they are speaking would actually read the disclaimer...

I think that some people, at least, aren't doing it out of stupidity or any great degree of ignorance; the general importance of online anonymity and the technical details of what it takes to be even moderately successful in achieving it just haven't gotten high enough on society's radar yet. (Despite Brin and others having begun banging the drum more than two decades ago.) Some of the people I've noticed pseudonymously posting potentially-problematic information about acquaintances or loved ones are clearly intelligent and sensitive people who obviously care a great deal about the people who might be accidentally put at risk in this scenario.

I think that some people do actually read the instructions, follow the links, and approach posting carefully. These sorts of people might heed even a small parenthetical warning that linked to a larger FAQ item and choose to post anonymously or might discard their account for safety's sake after posting pseudonymously or perhaps choose not to post at all once more information is available to them.

I personally think that this issue poses a real and substantial danger in some cases and a sort of "genie out of the bottle" danger that can't be fixed later on with hindsight. IMO it's actually more important than some of the instructions and advice that are currently there when you go to post an AskMe since as a community we're sort of asking the posters (many of whom are totally new to MeFi) to trust us. But I could understand others disagreeing with me and this issue only being mentioned in the FAQ somewhere.

(I also figured that discussion here might aid mathowie or whoever in composing a FAQ entry.)

As far as it not being an issue specific to AskMe or MeFi, that is true; the thing that I think makes it salient here is that simply because AskMe is one of the most successful (if not the most successful) question-and-answer type sites on the internet people are much more likely to come here, post an important question with sensitive details pseudonymously, and subsequently post extensively elsewhere on MeFi because MeFi is interesting and engrossing. Note that this isn't a consequence of anything that Matt, any mods, or anyone in the MeFi community have asked anyone to do - no one is forced to come and ask about their important stuff here: it's more a victim-of-Matt's-own-success, with-great-power-comes-great-responsibility thing.

kagredon: It's true that this isn't a new thing. Two factors prompted me to post this MeTa now rather than earlier: first, a number of recent developments have indicated to me as a web software engineer that the transition between default-anonymity-with-luck on the internet and default-pretty-easy-for-everyone-to-figure-out-who-you-are is probably going to happen during the next decade or so, a transition period where people are more likely to screw up and leave search engine bait than they were earlier or will be later; and second, having these two threads going on right now - Watson's success in distilling the internet and dominating Jeopardy that way and HBGary as a U.S. government contractor developing social network scraping tools and using them against American citizens - made it seem less likely that I would be dismissed as (just) paranoid. (I could easily be wrong, of course, I'm generally pretty hit-or-miss at predicting how people will react to posts on MeFi.)
posted by XMLicious at 12:04 AM on February 18, 2011


One other anecdote I thought of: another reason this stuff has been on my mind lately is because I think I may have been able to give a little helping hand in derailing a well-financed U.S. Congressional campaign this past year by unearthing and posting to political blogs some juicy dirt from the nooks and crannies of the net that hadn't previously been covered in the media or on blogs. It was a story that was suddenly all over the blogs and got entire stories devoted to it in the local media a few months later when the race became tight, which was prominently mentioned by the candidate's opponents and which the campaign put out ads specifically responding to. (Not mentioning that as a teaser, just as an anecdata point, I won't be mentioning any more details.)

As much fun as watching all of that was, I would expect that as time passes this kind of problem will be more and more frequently harming average people rather than the bad guys.
posted by XMLicious at 12:38 AM on February 18, 2011


Aww, hell, y'all knew every one of my comments not focused on my superhuman sexual prowess and superlatively detailed lovemaking instructions for maximum level five orgasm satisfaction (referred to in textbooks as "the hammer") were pretty much bullshit anyway, right?
posted by klangklangston at 12:42 AM on February 18, 2011 [2 favorites]


*pulses eerie red light at klangklangston*
posted by Sys Rq at 12:50 AM on February 18, 2011 [4 favorites]


(referred to in textbooks as "the hammer")

psst... The hammer is his penis.

/stage whisper
posted by Ghidorah at 1:28 AM on February 18, 2011 [3 favorites]


I don't really care. I'm not running for office.
posted by empath at 4:15 AM on February 18, 2011


Not to discount the problem, but you also need to appreciate the incompetence of the [agency redacted], who will link you up with someone else's embarrassing photos and compromising data. So we need to have a publicly known disinformation server to continually alter everything on the web to either sanitize it, or make it so it can never be trusted to be correct.
posted by Obscure Reference at 4:30 AM on February 18, 2011 [1 favorite]


crunchland: And while it's a good point to remember, going forward, what about all the stuff that's already been written? Could people point to this passage, and reasonably expect the askme question they posted 5 years ago to be deleted?

Realistically, it's probably too late for that. I would expect that there are already multiple private and government archives of all the sites that Google pays lots of attention to, which includes MeFi. Just think of all of the different start-up search engine companies like Cuil. They undoubtedly had an archive like this, probably current up until last September when they shut down, and they either sold it off when they went under or no one bought it because something better is already available for cheaper. Or, for another example, I doubt that Watson was trained on the live internet; it probably would have been trained on a partial archive of the internet and various databases.
posted by XMLicious at 4:35 AM on February 18, 2011


"If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him."

--Cardinal Richelieu

Also, XMLicious, you probably want to go read The Numerati.
posted by MonkeyToes at 4:42 AM on February 18, 2011 [4 favorites]


Not to discount the problem, but you also need to appreciate the incompetence of the [agency redacted], who will link you up with someone else's embarrassing photos and compromising data. So we need to have a publicly known disinformation server to continually alter everything on the web to either sanitize it, or make it so it can never be trusted to be correct

You mean Google's incompetence? ;^)

I was just giving HBGary as an example of someone who's in the news and has been caught red-handed doing sketchy things with a social network scraper. With the amount of search-engine-building and data warehousing and analysis experience and technology that's at large these days, doing something like this even moderately well probably isn't rocket science, so I'm sure that there are much more competent people already doing it. You probably won't hear much about them, at least not any bad press, because they're competent of course...

I would love to believe that some sort of disinformation mechanism could be a general solution for everyone, but I'm pretty sure that such an approach would probably only work for individual people who know what they're doing.
posted by XMLicious at 4:55 AM on February 18, 2011


"Don't post anything you wouldn't post under your real name and show to your boss, your mom, and your friends."

That's what the disclaimer basically boils down to.
posted by smackfu at 6:42 AM on February 18, 2011


Just to give a bit of an "other team" view of scraping... I'm working with scrapers right now. My team's business, and the business of many other businesses that are looking into scraping, is mainly about finding places where people are talking about our customers' products and distilling those conversations into data that informs the client about the strengths and weaknesses of those products. It's the social media buzz thing(tm), very big business right now, and it's really cool from a business perspective, to learn about the image of their products and brands from places where people are speaking freely.
So... it's not always about collecting data about YOU*. A lot of the time it's collecting data about what you're talking about when you talk about some kinds of things, and I personally think it's kind of neat to know that if I bitch about [company] on a message board, [company] might actually see the essence of my bitching and learn that they have something to fix.

*At the same time, I've also worked on products that got used by gov't agencies who do like to use this kind of technology to focus on people so.. ymmv.
posted by L'Estrange Fruit at 6:55 AM on February 18, 2011 [1 favorite]


Honestly, this seems like overkill. You can't escape from modern society, everything you do or say or write will be archived. At some point everything you think will be archived also. It's just the way the world is going, as storage and connectivity gets cheaper, it's just too easy to save and connect everything. Plan on that, as opposed to trying to escape from something that is inescapable.
posted by Brandon Blatcher at 6:55 AM on February 18, 2011


Honestly, I'm more concerned about forward-tracing (just made that up) than backtracing. If someone figures out who I am based on my mefi name, which I'm sure is staggeringly easy, well so what, they already know all my deep dark secrets because I've posted them on mefi. I'm more concerned about people IRL finding my online username, especially employers, because a lot of the stuff I post is TMI. So I don't link my Facebook account with anything, I don't use my real name in profiles so it's not searchable, etc.
posted by desjardins at 7:26 AM on February 18, 2011 [1 favorite]


smackfu: Almost, except that I would also add something like "...or wouldn't show to anyone who absolutely hates your guts, has no morals, and would do everything in their power to use it against you and / or scam you, your boss, your mom, or friends, at any point in the future."

Brandon Blatcher: What you say there is basically what I think the warning or FAQ entry should say. I'm not proposing helping anyone to escape from anything, just that we should try to ensure that when someone posts sensitive information on AskMe they fully understand what they're doing.

Because the availability or use of comprehensive social network scrapers, et cetera, is relatively limited now, you don't yet see in the headlines average people dealing with the kinds of things celebrities or other prominent people have to deal with: everyone they ever meet (and lots of people besides) being immediately familiar with everything they've ever said or done, their work history, their financial details, etc. So I think that most people don't understand what it is that's inevitable and that they have to deal with - they're constantly making the kinds of mistakes that would ruin their careers, lives, etc. if they were celebrities. Those mistakes go on record, and they aren't going to get hit with them until some point in the future when this sort of technology has wider availability.

L'Estrange Fruit: Do you do the thing trying to match up identities across multiple sites and integrate the information with other sources? If so, do you know how common it is at this point? (I would assume that the companies checking for people bitching about their product would do this if they easily could, if only to get better demographics about who's complaining. I know that Facebook gives pretty good demographics if you're using their analytics and other systems and that people drool over that.)

desjardins: The sort of stuff we're talking about would do both forward-tracing and back-tracing, to use your terminology, if that wasn't clear.
posted by XMLicious at 7:41 AM on February 18, 2011


L'Estrange Fruit: Do you do the thing trying to match up identities across multiple sites and integrate the information with other sources? If so, do you know how common it is at this point? (I would assume that the companies checking for people bitching about their product would do this if they easily could, if only to get better demographics about who's complaining. I know that Facebook gives pretty good demographics if you're using their analytics and other systems and that people drool over that.)

Relating user info to posts is a big client demand, yes. It serves a couple of purposes - one, if the same person is saying the same thing on many sites, it helps them to know that there aren't actually 25 people complaining about x problem, just one guy saying it many times. Or they can identify influencers. It's important again to remember that this is still going product > person - they're focusing on mentions of their product, not on you, so they're not gathering all info about you, they just want to understand who is talking about them a lot. This would extend into trying to get some basic demographic data (gender and geography), but not much deeper than that, and it's really just username matching, not mastermind levels of Tony-on-siteA=Anthony-on-siteB.
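
As a rough illustration of what "just username matching" can mean when counting how many people are really complaining, here is a minimal Python sketch. The (site, username, text) tuple shape, the function name distinct_complainers, and the sample data are all made up for illustration; it is not a description of any actual product.

```python
from collections import defaultdict


def distinct_complainers(mentions):
    """Group mentions by a normalized username.

    `mentions` is an iterable of (site, username, text) tuples. That
    shape is an assumption made for this example only.
    """
    by_user = defaultdict(list)
    for site, username, text in mentions:
        # Plain username matching: trim whitespace and ignore case.
        # No attempt is made to decide that "Tony" and "Anthony"
        # are the same person.
        key = username.strip().lower()
        by_user[key].append((site, text))
    return dict(by_user)


# Hypothetical sample data.
mentions = [
    ("siteA", "Tony", "x is broken"),
    ("siteB", "tony", "x is broken"),
    ("siteC", "Anthony", "x is broken"),
    ("siteA", "maria", "x works fine for me"),
]

groups = distinct_complainers(mentions)
print(len(mentions), "mentions from", len(groups), "distinct usernames")
```

With this sample, "Tony" and "tony" collapse into one entry while "Anthony" stays separate, which is the limitation described above: the grouping only catches the same handle reused across sites.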
posted by L'Estrange Fruit at 7:52 AM on February 18, 2011


This level of paranoia makes me uncomfortable.

The "oh, I'm not talking about YOU guys, I'm talking about everyone else" thing also makes me uncomfortable. A default setting of "everyone is a moron except a few of us" is disrespectful and misanthropic, and isn't appropriate.

It is, to my layperson's mind, nucking futz.


"...or wouldn't show to anyone who absolutely hates your guts, has no morals, and would do everything in their power to use it against you and / or scam you, your boss, your mom, or friends, at any point in the future."


Nothing metafilter can do would change a situation like this one little bit. They could throw the servers in the river, and there would still be a crazy person out there tormenting [someone].
posted by gjc at 8:36 AM on February 18, 2011 [1 favorite]


L'Estrange Fruit: ...not mastermind levels of Tony-on-siteA=Anthony-on-siteB.

The sort of stuff I'm thinking of that a Watson-type computer might do to match up different accounts would be much more sophisticated than matching similar usernames.

Here's an article that perhaps explains better the particular thing that HBGary was trying to do that I'm talking about. I've been using "social network scraping" as shorthand for that and I probably should have come up with a better term, since there are many simpler applications of social network scraping that go on today.

The author of the above article notes that while making the sort of analysis Aaron Barr was thinking of - to match up identities across sites and extrapolate information about the activities of a group of people - can currently be done by a human investigator, you can't write a script to do it with current tools based purely upon a statistical analysis. He specifically says "scripted", I think, because a more sophisticated approach involving natural language processing, machine learning, and other AI techniques would be more successful. I think that he's correct that Aaron Barr did not understand that the more sophisticated approach was necessary and that's a major reason why the situation has been catastrophic for HBGary:
Aaron got his clock cleaned not only from the hack (which [he] now claims to have been partially a social engineering attack on the company) but also from the perspective of his faulty methodologies to harvest this data being published to the world by Anonymous.
What I'm saying is that I think at some point the more sophisticated approach will be realized by someone, somewhere, possibly using supercomputers like Watson. If that approach were to draw on an archive of the entire internet and many other sources of information, and if it were used on a site like MeFi, where people reveal all sorts of information about themselves over the course of years and years (rather than on groups of astroturfing activists or jihadis, people trying to conceal themselves, like the ones HBGary and the Infosec Island article were talking about), I think that it would be able to correctly identify the real-life names of at least some pseudonymous MeFi users.

I could be wrong; I don't have personal experience working with NLP or the other technologies Watson and Google use, though I've got friends who do some of that stuff.

gjc: I'm not saying that I think anyone is stupid - here I tried to say exactly the opposite. I apologize for making you uncomfortable.

In the bit you quoted there, I'm not referring to crazy people. I was trying to add on to smackfu's rendition of a warning message, to emphasize that if sensitive information about you becomes public, there may be concerns beyond getting your boss, your mother, and your friends to accept something scandalous; a warning about this issue should express that too. smackfu's rendition isn't how I'd say it but I was trying to run with it.
posted by XMLicious at 11:16 AM on February 18, 2011


The sort of stuff I'm thinking of that a Watson-type computer might do to match up different accounts would be much more sophisticated than matching similar usernames.

I've read a bunch of whitepapers in the course of my work about mapping social networks, which sometimes touches on the stuff that HBGary was trying to do along those lines. It's important to note that the work in this field is still largely in the study stage, and that the studies themselves cite scaling as the big hurdle that is yet to be overcome (which is why I laughed in the HBGary exchanges when the developer said exactly that). By scaling I mean less the overall amount of data to analyze, but more expansion across social networks - confidence factors for the relationships plummet when you try to cross-map different communities.

It's cool stuff (if you don't think about the abuse possibilities) but very much in its infancy. Also, it's expensive as hell from a developmental and computational POV, which is why, while it's theoretically possible, most of the companies in the social mining space don't really do it; there's some utility in it but the ROI would not necessarily be worth it.
posted by L'Estrange Fruit at 11:46 AM on February 18, 2011


Something Penelope Trunk wrote once

UNDER A PSEUDONYM--"Penelope Trunk" is not this person's actual legal name or name of use.

about self-disclosure (online and off) that really resonated with me: "...if I am living an honest life, and my eyes are open, and I’m trying my hardest to be good and kind, then anything I’m doing is fine to tell people." ... "And when you think you cannot tell someone something about yourself, ask yourself, 'Really, why not?'"

I mean, for serious, it is ironic in the extreme that someone is all "Oh, self-disclosure" when writing under a pseudonym.

posted by Sidhedevil at 12:53 PM on February 18, 2011 [2 favorites]


Also ironic that I messed up those italics like mad. At least in the Alanis Morissette sense.
posted by Sidhedevil at 12:54 PM on February 18, 2011


L'Estrange Fruit: Thanks for your assessment. It's good to know that there are some practical hurdles in the way for doing these sorts of things today.
posted by XMLicious at 2:15 PM on February 18, 2011


Someone I know IRL followed me on my desjardins Twitter account yesterday, which should have been solidly walled off from my Facebook account and my real name in general. She hasn't replied as to how she found my account, and I'm sure she has no ill intent, but it's a little disconcerting.

desjardins, could it be as simple as "find your friends by plugging in your email contacts here" kind of thing perhaps?

and XMLicious, I'm fighting the urge to spouse you right now.
posted by infini at 2:32 PM on February 18, 2011


You *said* you think they aren't stupid, but then you go on to say exactly how they aren't smart enough or competent enough or paying enough attention to protect their own interests.

You don't need to apologize - you are not offending me. My discomfort is of the "this guy seems manic and paranoid and we shouldn't be encouraging him" variety. Maybe I'm wrong, but at least switch to decaf for a while.
posted by gjc at 4:34 PM on February 18, 2011


Dude, maybe I'm paranoid, and I certainly have lots of other mental quirks, but I feel like I'm getting competition on that count: you are imagining out of thin air things I haven't said.

At least we can all agree that there are lots of crazy people in the world. ;^)
posted by XMLicious at 5:29 PM on February 18, 2011


"And when you think you cannot tell someone something about yourself, ask yourself, 'Really, why not?'"

A lot of the time? Because it's none of their goddamned business.

And sometimes? Because they don't want to know. There's a reason for the term "TMI," among other considerations.

And other times? Because posting that something, however innocent it may be, can be used in un-innocent ways.

I am about the square-est vanilla-est person in the whole world, but there are still things I don't discuss with the world in general. There are things I don't discuss with particular people. Information, in and of itself, is not simply safe to expose even if there's nothing about it that is wrong.

Example: Would you actually publicly post, unmunged, anywhere, an email address you wanted to be able to use again? That's an entirely wholesome, upright, vanilla piece of information. As soon as it's been publicly available for 5 minutes, though, it's going to be receiving more spam than any spam-blocker can cope with. It may be picked up by a stalkery creep who happens to think you sound nice. People you don't know may decide that you sound like someone who can help them for one reason or another and flood your inbox.
posted by galadriel at 7:54 PM on February 18, 2011 [2 favorites]


This is why I use different usernames and email addresses across different sites. While I know that someone with even a mild amount of determination could most likely link them, it makes me slightly less trackable to the casual passerby. So like if my family finds out one of my usernames they probably won't find all the sites I'm registered on (under other names) unless they go digging.
posted by IndigoRain at 11:51 PM on February 18, 2011 [1 favorite]


Found a better way to explain why what gjc is saying above seems odd to me: the way it looks to me, he or she is saying that the average AskMe poster perceives a continuous, nebulous, inspecific risk of being recorded or watched all the time on the internet, and that because I won't just assume this assertion to be true or because I'm interfering with it by trying to communicate real, specific risks I'm aware of, I'm being unusually or especially paranoid and insulting.

If that really is true, I'm having the refreshing experience of feeling unusually sane in some respects. At least I really know that I can't be recorded sometimes, or I know when the risk is genuinely low, but those poor bastards have to feel that way all the time.
posted by XMLicious at 9:29 AM on February 19, 2011


desjardins - Twitter also has suggested followers based on people who follow you and people you follow. So a common friend (or a couple common celebrities) could do it.
posted by maryr at 9:42 AM on February 19, 2011


and that because I won't just assume this assertion to be true or because I'm interfering with it by trying to communicate real, specific risks I'm aware of, I'm being unusually or especially paranoid and insulting.


Imho, sounds more like your cogent arguments based on experience, knowledge and observations are being undermined by nebulous feints just enough to seed doubt and make us all think you're weird.
posted by infini at 10:05 AM on February 19, 2011 [1 favorite]


Re-reading that Infosec Island article I linked to above, I just noticed a comment which the author made in the attached thread: (responding to something about Anonymous's ideological motives)
...I did not say they were completely justified. Nor have I said that the tools would not work. In the implementation and scale that Aaron wanted to do it, the programmer was right. I too perform this type of intelligence gathering so I know this.
posted by XMLicious at 10:18 AM on February 19, 2011


desjardins, you also need to count on no one you know linking your real name to desjardins. I think that's the tricky part of trying to stay truly anonymous online.
posted by chunking express at 7:55 AM on February 20, 2011


You also need to count on no computer ever linking your real name to desjardins. That's the significance of these sorts of social network scraping tools (which the security expert in the quote above thinks are workable to some degree today) - that it's not merely a possibility that pseudonymous people might be outed; it's virtually a certainty at some point in the future. (Unless the scaling problem described by L'Estrange Fruit and the security expert is inherently insurmountable, which seems unlikely to me.)

(Now, we're not talking about every single pseudonymous account everywhere, just ones which have provided enough personal details over the years to be successfully matched by one of these programs - but lots of people on MeFi, including me for example, fall into that category. Also, remember that every electronic record concerning you, like your tax records, credit records, and electronic medical records, may be publicly available by that time too, if it's any consolation; so you don't need to worry so much about having revealed details on MeFi that are going to be in those other databases anyways.)
posted by XMLicious at 1:03 PM on February 20, 2011


Well, if it's inevitable I'll just stop worrying about it then. It's too late for me.
posted by desjardins at 6:05 AM on February 21, 2011 [1 favorite]


Completely unrelated to the topic of this thread but apropos to online paranoia in general, someone just pointed me to this WSJ blog looking at the amount of tracking done by particular web sites:

Wall Street Journal: What They Know

possibly connected to this July 2010 story?

...well, although it has some interesting information in a well-designed interactive graphic, upon a closer look that story and blog are actually an advertisement for some sort of privacy tool or service, as evidenced by the "powered by..." and links at the bottom. Sigh.

.

(for journalism)
posted by XMLicious at 7:33 PM on February 23, 2011

