HTML formatted text goes poof. November 29, 2011 9:06 AM

It would be really nice if we got a warning when HTML formatted text is not going to show up on a post preview instead of it just being deleted from the text box. Thanks for all the hard work!
posted by Brent Parker to Feature Requests at 9:06 AM (51 comments total)

Can you elaborate on what happened and where? Preview and post are generally not destructive (though you shouldn't trust Live Preview with some of the more esoteric stuff since it's not perfectly in sync with what actual preview and post produce), but there are occasional fake-open-tag situations where using < as a literal character can confuse the parser.
posted by cortex (staff) at 9:08 AM on November 29, 2011


From my just-asked question: I was trying to put this in the more inside text area and it goes poof.
posted by Brent Parker at 9:11 AM on November 29, 2011


Ah, if you're pasting actual tags then they won't ever show up as literal text—they'll either be interpreted as code (like typing out <b> would) or stripped if they're on the tag blacklist.

That'll paste fine if you replace the literal brackets with named entities: &lt; for < and &gt; for > . Previewing should preserve those named entities in place rather than convert them to literals (which used to be a problem), though it's possible that is goofed up somehow on the askme more inside field. pb or Matt would be able to say for sure on that front.
posted by cortex (staff) at 9:16 AM on November 29, 2011
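
A minimal sketch of the entity replacement cortex describes above, written in Python for illustration rather than taken from anything MetaFilter runs. It uses the standard library's html.escape; the sample string is made up.

```python
import html

# Hypothetical snippet a user wants to show as literal text.
snippet = '<frame src="nav.html">'

# html.escape swaps &, < and > for named entities.
# quote=False leaves the double quotes alone.
escaped = html.escape(snippet, quote=False)
print(escaped)

# html.unescape reverses the swap, which is roughly what a browser
# does when it renders the entities as visible brackets.
assert html.unescape(escaped) == snippet
```

Pasting the escaped form into the text box is what lets the brackets show up as text instead of being read as a tag.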


Okay, it didn't go poof, it's just that all you pasted are HTML tags. So they either drop out as disallowed HTML tags, or they're just rendered invisible because they're tags. So if you want to include HTML tags as visible items in a question/comment you either need to escape out all the HTML or do what you did which is take the code and host it someplace offsite that is set up to manage HTML content. This is the short entry on the FAQ about it, maybe we should beef it up some?
posted by jessamyn (staff) at 9:16 AM on November 29, 2011


I understand that HTML tags aren't allowed and that our rendering of them is persnickety - I just think it would be pleasant to have a warning that "Hey metafilterian we see you are doing something we can't do, here's a FAQ that explains why we don't do this -- here is the text we won't render highlighted in RED -- if this is code, use a pastebin." But more friendlier and full of unicorns and narwhals.
posted by Brent Parker at 9:23 AM on November 29, 2011


er some HTML tags.
posted by Brent Parker at 9:24 AM on November 29, 2011


Since we're on the subject, when was the <center> tag eliminated from acceptable FPP html code? I tried using it yesterday when I was working on my post about Lana Peters and couldn't. It was stripped out when I clicked Preview. But I know I've used it in the past.
posted by zarq at 9:27 AM on November 29, 2011


center


It works on preview.

posted by nangar at 9:49 AM on November 29, 2011


You probably just typed it wrong, or something.

posted by nangar at 9:50 AM on November 29, 2011


FPP, not comments.

Probably removed because it's dumb.
posted by smackfu at 9:51 AM on November 29, 2011


Pretty sure it works for everyone but you.

posted by Sailormom at 9:52 AM on November 29, 2011


No, smackfu's probably right. It might get stripped from FPP like br tags are.
posted by nangar at 9:54 AM on November 29, 2011


Sailormom: "Pretty sure it works for everyone but you."

nangar: "You probably just typed it wrong, or something."

F. P. P.

Not comments.
posted by zarq at 9:57 AM on November 29, 2011


smackfu: " Probably removed because it's dumb."

I formatted the post without it and put more content above the [more inside]. But the wording on the version I posted doesn't flow well as a result.
posted by zarq at 9:59 AM on November 29, 2011


Maybe this is relevant:
if i use <center> again matt will murder me
posted by cortex at 12:35 AM on April 25, 2006
posted by smackfu at 10:01 AM on November 29, 2011


It looks like it's specifically filtering center tag from the above-the-fold portion of post text, yeah. Still kosher on the more inside. We're trying to figure out what happened there, it's possible this was a miscommunication on the mod side when we were talking specifically about stripping a distracting center tag from a specific post, it's possible Matt or Paul thought we were talking about stripping the tag as a matter of course. Looking into it.

It might get stripped from FPP like br tags are.

The tags aren't stripped, I don't believe; we just don't auto-interpolate break tags for line breaks above the fold like we do for more inside or for comments, because while we're not fundamentally against breaks appearing in a post we really want it to be an occasional thing, not something folks are doing all the time.

Which is kind of where we are on most formatting above the fold, tags stripped or not: keeping it simple 99% of the time is the way to go, and it feels like much of the time when someone throws in a little extra tag action it serves more to distract than anything.
posted by cortex (staff) at 10:01 AM on November 29, 2011


And based on email, it looks like the stripping probably would have started right at the end of July. So we're about four months in and I guess this is the first it's come up?
posted by cortex (staff) at 10:07 AM on November 29, 2011


cortex: "It looks like it's specifically filtering center tag from the above-the-fold portion of post text, yeah. Still kosher on the more inside.

Ah! Okay. Thanks. Didn't occur to me that there would be two different sets of rules for the above and below fields.

We're trying to figure out what happened there, it's possible this was a miscommunication on the mod side when we were talking specifically about stripping a distracting center tag from a specific post, it's possible Matt or Paul thought we were talking about stripping the tag as a matter of course. Looking into it."

Good to know. Thank you.

Was curious because even though I rarely use it, (like say, the blockquote tag,) I had an idea in my head of how I wanted the post to look. Had moment of surprise when I realized I couldn't. No biggie.
posted by zarq at 10:08 AM on November 29, 2011


Huh, so to the original poster, I'd say that you might be used to various coding sites like StackExchange where if you set off some code with pre or code tags, they won't parse and will be set off in a nice blockquote with some color, but we're not really a code-centric site so we never built in that featureset.

I was going to say we should do this, but looking at actual HTML, if I pop your code into a browser from a .html text file, Google Chrome doesn't escape the HTML within. It doesn't for code or tt tags either. I have a feeling this feature is something we've seen some programming forums adopt but browsers don't actually do this in the HTML spec.

In general if you want to share code on Ask MeFi it does help to use outside helper sites like pastebin to share complex code. We have to process thousands of comments daily to go with the millions we already have; doing analysis and reformatting on the fly for HTML output would be a big pain.
posted by mathowie (staff) at 10:08 AM on November 29, 2011


the stripping probably would have started right at the end of July.

Is it still $20?
posted by desjardins at 10:15 AM on November 29, 2011 [1 favorite]


I'm really not used to any coding site at all - my coding/scripting skills are rudimentary at best. However for people who don't know about those sites at all, ask.me is likely going to be one of the first places they turn if they are interested in solving a very small scripting/HTML problem, so a warning of some sort would be nice when HTML is escaped.
posted by Brent Parker at 10:16 AM on November 29, 2011


...so a warning of some sort would be nice when HTML is escaped.

The problem is that we don't know when you're using HTML to format your comment and when you're posting example HTML. We offer both a Live Preview and a more formal "Preview" option so you can see what your HTML will look like before it's posted. That will let you know what is stripped and what isn't. We also have a link next to every comment form that says, "HTML help". If you click that you'll get the FAQ entries that explain how HTML is used here.

I know it's a drag to put time into something and then not have it turn out the way you wanted it. But I do think we try to make it clear how your comment will look before it's posted.
posted by pb (staff) at 10:20 AM on November 29, 2011


You could put up a warning when the HTML is sanitized though:

WARNING: Some of the tags you used are not allowed on Metafilter and have been removed. If you wish to include the tags in your post as text, you must use &lt; and &gt; to replace < and > respectively.
posted by smackfu at 10:27 AM on November 29, 2011


I just think that's getting into overkill territory that has the potential to annoy people more than it helps. You can clearly see what you can use and what you can't in the Live Preview or the HTML Preview.
posted by pb (staff) at 10:29 AM on November 29, 2011 [1 favorite]


I just think it would be pleasant to have a warning

The problem is there's more than one use case here. While you're posting HTML because you want to discuss the specific HTML, others are using it to format their content. There's no easy way to determine which case is in play. And this is a more complicated problem than it appears at first glance. I'd say the vast majority of users are trying to format content, so the site is designed to make that easy.

Put it this way: if you're savvy enough to know about Pastebin-type sites, you're savvy enough to use them in these cases going forward.
posted by yerfatma at 10:36 AM on November 29, 2011


The original example also has a missing greater than sign at the end of the second frame tag, which is the kind of thing that will kill an entire comment too.

As a side note, this actually illustrates an odd behavior. Enter in a comment with the following text and hit the preview button:

<b
abc
<\b>

The resulting preview has a bold abc then <>, but abc is nowhere in the comment box anymore. Isn't HTML parsing fun!
posted by smackfu at 10:41 AM on November 29, 2011


Yeah, if you have malformed HTML all bets are off. Our parser will do its best to try to understand what's going on, but if it gets something that looks like a tag it's going to assume it's a tag and not text.
posted by pb (staff) at 10:44 AM on November 29, 2011
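
To make the "looks like a tag, so it is a tag" behavior concrete, here is a rough sketch in Python. The pattern is deliberately naive and is an assumption for illustration only, not MetaFilter's actual parser; the sample strings are invented. It shows how one missing > can swallow the text that follows it.

```python
import re

# Deliberately naive: anything from "<" up to the next ">" counts as a tag.
# Illustration only, not MetaFilter's real parser.
NAIVE_TAG = re.compile(r"<[^>]*>")

def strip_tags(text: str) -> str:
    """Remove everything the naive pattern believes is a tag."""
    return NAIVE_TAG.sub("", text)

well_formed = "before <frame src='a.html'> middle <frame src='b.html'> after"
missing_gt = "before <frame src='a.html' middle <frame src='b.html'> after"

# With both tags closed, the word "middle" survives.
print(strip_tags(well_formed))

# With the first ">" missing, everything up to the next ">" is treated
# as one long tag, so "middle" disappears along with it.
print(strip_tags(missing_gt))
```

A real parser is smarter than this, but the failure mode is the same in spirit: once an opening bracket is seen, whatever follows is read as markup until something closes it.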


There's no easy way to determine which case is in play.

Is there some reason you can't make MetaFilter sentient?
posted by Brandon Blatcher at 10:49 AM on November 29, 2011


Job security?
posted by jessamyn (staff) at 10:51 AM on November 29, 2011 [7 favorites]


Job security?
posted by jessamyn (staff) at 6:51 PM on November 29


What? Job what? Oh wait... I vaguely remember my grandfather saying something about that. Damned if I can remember what it was all about, though.
posted by Decani at 10:58 AM on November 29, 2011


> The tags aren't stripped, I don't believe; we just don't auto-interpolate break tags for line breaks above the fold like we do for more inside or for comments ...

Heh. You're right! I would have used that on my last post, actually. I could swear I tried it before and it didn't work, I must have been thinking of something else.

nangar: "You probably just typed it wrong, or something."

> F. P. P.

Not comments.


Sorry, zarq, I snarked first, then thought about it.
posted by nangar at 11:02 AM on November 29, 2011


nangar: " Sorry, zarq, I snarked first, then thought about it."

Thanks, but no worries! :)
posted by zarq at 11:05 AM on November 29, 2011


"...so a warning of some sort would be nice when HTML is escaped."

No, you mean when it isn't escaped.

Which is the point. I think you are not understanding something essential and basic about MetaFilter. MeFi, including AskMetaFilter, allows users to hand-write some text markup in posts and comments. That is to say, by design MeFi passes along (much of the) HTML it sees in user submissions right to the database and when you view the site, that HTML markup is sent right to your browser...where your browser interprets it as HTML and handles it accordingly.

To not do this requires some extra action, something special. You're thinking the reverse is true, that MeFi is doing something special when it doesn't display the HTML markup tags you submit in a comment right there on the page. But it isn't. It's sending it to your browser...it's your browser that's not showing it because it's interpreting it as HTML markup tags.

To avoid this happening, the tags have to be made to look like something that isn't a tag to the browser while still looking like tags to the human reader. Using named HTML entities for the left and right brackets instead of the actual characters solves the problem, as do some other ways of making the tags not be actual tags.

"Escaping" some text is a technical term meaning "to make it not be acted upon as if it were some command the way it otherwise would be acted upon and instead treat it as just text". If MeFi were to regularly escape all HTML markup, it would have to parse all post and comment submissions for HTML markup and change the submitted text to make it act like just regular text and not markup. But that would also mean that users couldn't use markup in their comments to, say, center text (as above) or hand-write links or italicize or whatever. All the buttons below the text input box do is add the tags to the comment. Escaping all HTML markup would mean that users couldn't use any HTML markup in their comments at all, including links and italics and bold.
posted by Ivan Fyodorovich at 11:11 AM on November 29, 2011 [1 favorite]


To be clear, I think what Brent Parker is specifically asking for is some sort of error-reporting hook in the tag-filtering process that bubbles up as user feedback in lieu of or as a complement to the actual tag-filtering.

So if the preview step as is would pull out a <frame> tag because that's blacklisted, it would either (a) decline to move forward with preview, instead offering a "hey, frame tag isn't okay" error, or (b) move forward with preview as usual but also provide a "hey, this bit of your comment was filtered" so that the text doesn't just disappear.

Which, I get that idea but I agree with pb on this that it's just not even worth doing because the use cases are narrow as heck. Live Preview can be slightly wonky with edge cases but mostly will give a reliable picture of what your comment will look like, so this really only applies to post construction, and we don't have a lot of posts with elaborate outside-typical-basics html going into them.

It is entirely a bummer if the only copy of your dependent-on-blacklisted-tags content was what was in the comment or post text box before you hit preview, I'm sympathetic as heck there. But it's kind of a one-time learn-the-hard-way sort of thing that is not likely to come up so often in a really problematic way that it makes sense to try and wrap the kind of explicit check-and-warn functionality above into the posting or commenting process.
posted by cortex (staff) at 11:20 AM on November 29, 2011
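
For the curious, option (b) above could look something like the following Python sketch. Everything here is hypothetical: the blocklist contents, the function name, and the wording of the note are made up for illustration, and a regex is a stand-in for whatever the real filter does.

```python
import re

# Hypothetical blocklist; the site's real list is not given in this thread.
BLOCKED_TAGS = {"frame", "frameset", "blink"}

# Matches an opening or closing tag and captures its name.
TAG = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)\b[^>]*>")

def filter_with_report(text: str) -> tuple[str, list[str]]:
    """Strip blocked tags; return the cleaned text plus the names removed."""
    removed: list[str] = []

    def replace(match):
        name = match.group(1).lower()
        if name in BLOCKED_TAGS:
            if name not in removed:
                removed.append(name)
            return ""            # drop the blocked tag
        return match.group(0)    # keep allowed tags untouched

    return TAG.sub(replace, text), removed

cleaned, removed = filter_with_report(
    '<b>hi</b> <frame src="x.html"> <blink>yo</blink>'
)
if removed:
    print("Note: these tags were filtered from your comment:", ", ".join(removed))
```

The filtering itself is unchanged; the only addition is that the function remembers what it dropped so the preview page has something to report.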


You can clearly see what you can use and what you can't.

Just a heads up not to mistake what's clear to an expert developer for what's clear to members of Metafilter who happen not to develop the site for a living. Even if the preview were reliable, I wouldn't be at all surprised to find that your level of insight into the cause and effect is informed by your, you know, having built the feature, and populated the tag blacklist; the same goes for anyone who uses the term "tag blacklist." See here. See also, this MetaTalk thread.
posted by Jeff Howard at 11:21 AM on November 29, 2011


Jeff Howard, I hear what you're saying. But if Brent Parker would have glanced at Live Preview before posting, he would have seen a big blank area where a comment should have been. If you're new to the site and not used to using Live Preview, then yeah, lesson learned. My point is that we provide some tools to check on this kind of thing.
posted by pb (staff) at 11:24 AM on November 29, 2011


Hmm ... what would be the overhead of adding something to the Live Preview javascript where if I type in "<blink>", a warning appears under the live preview like "The following tags will be automatically stripped from this comment: <blink>."?

If the answer is, hardly anyone needs that and it doesn't justify the dev time and complexity, then I'm down. (And maybe it creates unreasonable expectations about the accuracy of Live Preview?) But it seems like an unobtrusive and low-server-load feature that would save confusion now and then.
posted by jhc at 11:53 AM on November 29, 2011


cortex, he wrote HTML that he expected to show up as text in his post and then complained that it didn't. Even if MeFi didn't try to filter out any tags at all, this is what would have happened. His expectation, then, was either that MeFi doesn't allow users to hand-code HTML at all and instead always escapes it, or he didn't even know that much and just expects to be able to type HTML markup in a post/comment and have it display as text in the browser.

This isn't an unusual problem for people new to writing HTML. They post it to places on the web as code samples and then don't understand why it doesn't show up on the page because it's interpreted by the browser. Or they don't understand why it's filtered out even before it's saved to wherever it's saved. If someone knows enough to try to be writing HTML, then they should know enough not to expect to be able to merely write HTML and have it appear as text (or not be stripped).
posted by Ivan Fyodorovich at 11:57 AM on November 29, 2011


The leap here is in deconstructing why there's a big blank area where the comment should have been. What's causing the discrepancy?

I'm not really advocating for this feature, and it's probably an edge case, but what I periodically advocate for (when I notice the problem) is a staff member who understands as much about the population of Metafilter as you, cortex, jessamyn or mathowie but who knows much, much less about the internal mechanics of the site. Otherwise, concerns about dev time and complexity tend to carry the day without always recognizing the bias.
posted by Jeff Howard at 11:59 AM on November 29, 2011


Cortex is correct on the feature that I was asking to be implemented and it seems to be something that is not going to happen because it would be too processor intensive according to matt. Thanks for the consideration of the feature though.

In regard to my making a mistake with the HTML, that is exactly the kind of thing I was hoping this feature would pick up.
posted by Brent Parker at 12:01 PM on November 29, 2011


Ah I see a problem here, "Live Preview" shows up only in the first page, if you "Preview" repeatedly - as I do to revise posts - the next preview pages leave out the "Live Preview" of the post.
posted by Brent Parker at 12:05 PM on November 29, 2011


Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart; the centre cannot hold;
Mere anarchy is loosed upon the world,
The blood-dimmed tide is loosed, and everywhere
The ceremony of innocence is drowned;
The best lack all conviction, while the worst
Are full of passionate intensity.


This was Yeats' poem about how we lost the <img> tag.
posted by kaibutsu at 1:17 PM on November 29, 2011 [1 favorite]


"Live Preview" shows up only in the first page, if you "Preview" repeatedly - as I do to revise posts - the next preview pages leave out the "Live Preview" of the post.

Yep, and "live preview" is for the comment threads for when people don't use the preview feature (which is almost always, very few people use preview). If you have taken the time to use Preview, there's no reason for us to show you an approximated "live preview" since you are seeing exactly what the server will do to your comment. I suspect you didn't preview a last time after putting in some HTML, not a big deal, but doing warnings is close to impossible.
posted by mathowie (staff) at 2:01 PM on November 29, 2011


desjardins: Is it still $20?

Only if you're in town.
posted by deborah at 4:02 PM on November 29, 2011 [1 favorite]

(which is almost always, very few people use preview)
Am I the only one who has no option other than to hit preview first? I don't even see the post button till I've previewed.

(I'm also curious if there's some way of seeing that the poster used Preview, and if so, does that mean I'm accruing special brownie points with the mods each time I do it?)
posted by SMPA at 6:49 PM on November 29, 2011


While we're doing this, why are multiple spaces stripped? I often want to do a table thing, and I run into all sorts of problems.
posted by Joe in Australia at 11:13 PM on November 29, 2011


Multiple spaces aren't stripped at all, but browsers routinely collapse uninterrupted runs of whitespace down to a single space outside of specific contexts.

The <pre> tag will help you here:

This is your ample whitespace.
This is your       ample whitespace on pre.
Check the source, you'll see just as many spaces between "your" and "ample" in those two examples.
posted by cortex (staff) at 7:07 AM on November 30, 2011
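
A rough Python approximation of the collapsing described above. The function is a made-up stand-in for default browser text rendering, not a faithful model of it; the point is only that the source keeps its spaces and the rendering step is what squeezes them.

```python
import re

def collapse_like_browser(text: str) -> str:
    """Rough stand-in for default rendering: whitespace runs become one space."""
    return re.sub(r"[ \t\r\n]+", " ", text)

source = "This is your       ample whitespace."

print(collapse_like_browser(source))  # roughly what normal text shows
print(source)                         # what <pre> would preserve
```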


SMPA, we offer the option to hide Live Preview if you don't like it. If someone hides it, we force them to use the Preview button so they have at least some sense of what they're going to post before they hit the Post Comment button.

If you have hidden live preview, you should have a "Show Preview" link under the "HTML help" link to the left of the comment textarea. Click that link and you'll have Live Preview back.
posted by pb (staff) at 8:06 AM on November 30, 2011


Oh, and to answer your question, no—we don't have any way to see if someone used Preview vs. Live Preview.
posted by pb (staff) at 8:14 AM on November 30, 2011


Upon hitting a preview, having a one-line warning - something like -
Note: Some unallowed HTML was filtered from your comment. (More info)
- could be useful, without being overly helpful or obnoxious. This could help with subtle errors in lengthy posts, too.
posted by Pronoiac at 10:39 AM on November 30, 2011

SMPA, we offer the option to hide Live Preview if you don't like it.
AHA. Clearly I hid it at some point without meaning to. Neat. Oh, and look, there's a little link that says "HTML help." Handy!

And I'm sad I can't get points for all of those times I've previewed. Because I'm one of those people who only feel worthwhile when they get some kind of external validation, I guess.
posted by SMPA at 3:22 PM on November 30, 2011

