Smart Quotes too smart for MetaFilter? May 3, 2012 8:57 AM

Are character entities for smart quotes supported?

I write my posts with Markdown and convert them into HTML for posting. However, whenever I write
"That's an interesting question," said Lord Dingus.
and convert it from Markdown to HTML, the preview window shows it as
Thats an interesting question, said Lord Dingus.
Yet if I paste formatted text with smart quotes there's no problem, as in
“That’s an interesting question,” said Lord Dingus.
Oddly enough, the entities seem to work fine on preview in MetaTalk.

It seems that the parser edits out character entities, including those used by Markdown. (I think the mods mentioned this was for security reasons.) Is there any way these particular ones can be whitelisted? Alternatively, is there a way to disable smart quotes in TextMate or Notational Velocity?
posted by modernserf to Bugs at 8:57 AM (37 comments total) 1 user marked this as a favorite

Character entities for smart quotes are already supported:

&ldquo; = “
&rdquo; = ”
&lsquo; = ‘
&rsquo; = ’

I don't know Markdown at all, but maybe it can be adjusted to use those entities?
posted by koeselitz at 9:13 AM on May 3, 2012 [1 favorite]


(Or maybe I am wrong in assuming that by "character entities" you mean "unicode character entities.")
posted by koeselitz at 9:14 AM on May 3, 2012


That's what I mean. But Markdown creates them as &#8221; instead, which seems to break in some parts of the site. In fact, in the live preview window here, it just drops the ampersand.
posted by modernserf at 9:37 AM on May 3, 2012


Are you using numeric entities? Yeah, we do filter those out for security. People have caused problems with them in the past so we completely filter them out. Can you use named entities instead? Maybe a quick replacement of the numeric entity with the named entity equivalent before you post?
posted by pb (staff) at 9:38 AM on May 3, 2012
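
The quick replacement pb suggests could look something like the sketch below. It is only an illustration, not anything MetaFilter provides: the function name `numeric_to_named` is made up here, and it assumes the converter's output uses ordinary decimal or hex numeric entities. It relies on Python's standard `html.entities.codepoint2name` table, which covers only the HTML 4 named entities, so anything without a name is left alone.

```python
import re
from html.entities import codepoint2name

# Matches decimal (&#8221;) and hexadecimal (&#x201D;) numeric entities.
NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def numeric_to_named(html_text):
    """Swap numeric entities for named ones where HTML 4 defines a name."""

    def swap(match):
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        name = codepoint2name.get(codepoint)
        # No named equivalent: keep the original entity untouched.
        return "&%s;" % name if name else match.group(0)

    return NUMERIC_ENTITY.sub(swap, html_text)


sample = "&#8220;That&#8217;s an interesting question,&#8221; said Lord Dingus."
print(numeric_to_named(sample))
```

Run over the converted HTML before pasting, this should leave only named entities for the smart quotes.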


modernserf: “That's what I mean. But Markdown creates them as &#8221; instead, which seems to break in some parts of the site. In fact, in the live preview window here, it just drops the ampersand.”

Ampersand indicates the beginning of an entity when it's not surrounded by spaces, I think – so you have to use the named entity, which is what I did in my comment above:

&amp;

There are a few other named entities that are supported, like &ndash; (–) and &mdash; (—). I don't know all of them. I think somebody put together a list once.
posted by koeselitz at 9:49 AM on May 3, 2012


Any named HTML entity is supported, so it's not a whitelist. Anything goes. Numeric entities aren't supported.
posted by pb (staff) at 9:51 AM on May 3, 2012


What is the security problem with numeric entities?
posted by jepler at 10:01 AM on May 3, 2012


The problem is that you can get around a lot of things we filter out by using the numeric entity equivalent. We look for things like "javascript:" in links and filter that out. It's possible to use numeric entities to get around content-based filters like that.
posted by pb (staff) at 10:03 AM on May 3, 2012 [1 favorite]
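
To make the bypass pb describes concrete, here is a minimal sketch. It does not reflect MetaFilter's actual filter, which isn't shown in this thread; `naive_is_safe` and `decoded_is_safe` are hypothetical names, and a real sanitizer would need to handle far more than this one string. The point is only that a filter looking at the raw text misses a scheme spelled with a numeric entity, because the browser decodes the entity before following the link.

```python
import html


def naive_is_safe(href):
    # Content-based check on the raw text only.
    return "javascript:" not in href.lower()


def decoded_is_safe(href):
    # Decode entities first, the way a browser would, then check.
    return "javascript:" not in html.unescape(href).lower()


# "&#106;" is the numeric entity for the letter "j".
sneaky = "&#106;avascript:alert(1)"

print(naive_is_safe(sneaky))    # the raw-text check is fooled
print(decoded_is_safe(sneaky))  # the decoded check catches it
```

Rejecting numeric entities outright, as the site does, sidesteps the question of whether every check decodes its input first.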


Is it too much work to decode the numeric entities on the server before applying the filter?

Oh, and markdown would be pretty neat if it was officially supported.
posted by schmod at 10:45 AM on May 3, 2012


We don't want to get into maintaining a whitelist of numeric entities that are allowed. We'd rather just have the blanket rule: hey, no numeric entities. But named entities are go! It's a pretty niche situation that someone is using numeric entities and can't do a quick replace somewhere in the process.

We've talked about markdown quite a bit here in MetaTalk in the past, it's not going to happen.
posted by pb (staff) at 10:56 AM on May 3, 2012 [2 favorites]


# Real reasons pb won't implement Markdown support:

* John Gruber once punched his cat
* Markdown is pimpin', which indeed ain't easy
* Gruber's also a Yankee fan, which, seriously, *what kind of baby killer does that*
* Dan Benjamin-related inferiority complex
* Markdown is kosher, but not kosher for Passover
* Gnomes
posted by middleclasstool at 11:02 AM on May 3, 2012 [3 favorites]


It can't be that hard to hack this Markdown thing to work. It's just a Perl script, for god's sake. Right?
posted by koeselitz at 11:21 AM on May 3, 2012


On a related note, is it because I've switched to chrome that my <q> tags put " instead of smart quotes now? Or did something change on the backend?
posted by ob1quixote at 11:48 AM on May 3, 2012


It can't be that hard to hack this Markdown thing to work. It's just a Perl script, for god's sake. Right?

I took an Excel class once in high school, I'm pretty sure we could use like VBA to make it work. I've got a copy on my Pentium.
posted by cortex (staff) at 11:55 AM on May 3, 2012 [2 favorites]


modernserf: "I write my posts with Markdown and convert them into HTML for posting."

Why? Is there a client you have or some plugin that transmogrifies markdown to html for you?

I'm not being snarky. I'm genuinely curious. It seems like this would be unwieldy.
posted by chairface at 12:20 PM on May 3, 2012 [1 favorite]


Any named HTML entity is supported, so it's not a whitelist. Anything goes. Numeric entities aren't supported.
posted by pb (staff) at 11:51 AM on May 3

Does this include blink?? Sweet. I thought blink was verboten.
posted by dios at 12:21 PM on May 3, 2012


Blink's an HTML tag, <blink>. I think tags are heavily restricted in comments.

That's different from HTML named entities, like &amp; to represent ampersand. Apparently all named HTML entities are permitted in comments.

And different still from HTML numbered entities like &26; which also represents ampersand but is apparently prohibited in comments.

crossing fingers that he got his entities right here
posted by jepler at 12:25 PM on May 3, 2012


(I didn't quite get it right, of course. The numbered entity is 26;)
posted by jepler at 12:26 PM on May 3, 2012


well now I'm just going to quit.
posted by jepler at 12:26 PM on May 3, 2012


My posts tend to be link or formatting-heavy and I write them in advance (both of these posts have been cooking for about a week.) I usually collect my links and references into a sheet in Notational Velocity, and then edit them into a post in TextMate. Both of them support Markdown.
posted by modernserf at 12:31 PM on May 3, 2012 [1 favorite]


Using Markdown Web Dingus, smart quotes like '“' in the input appear literally in the generated HTML source, not as &#8220;. The same goes for the Markdown.pl in the Markdown 1.0.1 distribution. What specific program are you using to turn your Markdown into HTML?
posted by jepler at 12:41 PM on May 3, 2012


Thanks, I hadn't seen the web dingus before. I'll use that from now on.
posted by modernserf at 2:25 PM on May 3, 2012


Can you replace &#8221; with &quot;?
posted by yerfatma at 2:57 PM on May 3, 2012


Isn't MetaFilter still on ColdFusion? You can't use numeric entities in ColdFusion because it uses pound signs to denote variables.
posted by kirkaracha at 4:17 PM on May 3, 2012


You can't use numeric entities in ColdFusion because it uses pound signs to denote variables.

No, that's not it. You can use pound signs throughout your comment text. # User input is never interpreted as ColdFusion code.
posted by pb (staff) at 4:30 PM on May 3, 2012


If you call an octothorpe a pound sign what do you call £?
posted by unliteral at 4:38 PM on May 3, 2012


Dollars.
posted by Edogy at 4:42 PM on May 3, 2012


£ is called a "pound symbol" or "an English pound symbol," at least by my high school business teachers. # = "pound sign," £ = "English pound symbol."
posted by Sidhedevil at 4:44 PM on May 3, 2012


what do you call £?

A peon's guinea! No, we call it a pound, (I've never heard English pound symbol, but sure) or sometimes I see it as pound(s) sterling first then pound(s) for additional references in the same discussion/article.
posted by BrotherCaine at 6:12 PM on May 3, 2012


what do you call £

A quid.
posted by arcticseal at 6:47 PM on May 3, 2012 [2 favorites]


Just type the damned characters. They aren’t special, because there are no “special characters” in Unicode. Type the character you want to use. It is only in exceptional cases like whitespace that entities are actually necessary.
posted by joeclark at 7:19 PM on May 3, 2012 [2 favorites]


@joeclark - Notational Velocity and TextMate automatically convert dumb quotes to smart quotes when converting Markdown to HTML. I have given my reasons for composing in Markdown here. If you know of a specific way to disable the SmartyPants.pl conversion of dumb quotes to smart quotes in these apps without breaking Markdown conversion, I would love to know it.

(I don't even like smart quotes.)
posted by modernserf at 8:53 PM on May 3, 2012 [1 favorite]
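
If the converter itself can't be changed, one fallback is to undo the smart quotes after conversion. The sketch below is a rough illustration under that assumption: `dumb_quotes` is a made-up helper, and it handles only the four quote characters, in both their literal form and the decimal numeric-entity form that SmartyPants emits.

```python
# Map both the numeric entities and the literal curly characters
# back to plain ASCII quotes.
REPLACEMENTS = {
    "&#8220;": '"',
    "&#8221;": '"',
    "&#8216;": "'",
    "&#8217;": "'",
    "\u201c": '"',
    "\u201d": '"',
    "\u2018": "'",
    "\u2019": "'",
}


def dumb_quotes(html_text):
    """Replace smart quotes in converted HTML with straight quotes."""
    for smart, plain in REPLACEMENTS.items():
        html_text = html_text.replace(smart, plain)
    return html_text


converted = "&#8220;That&#8217;s an interesting question,&#8221; said Lord Dingus."
print(dumb_quotes(converted))
```

This is a blunt text substitution, so it would also flatten any curly quotes that were typed deliberately.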


Here's how you can do it in TextMate if you're using the Markdown Bundle command "Convert Document to HTML". You want to edit that bundle command to take out the smart quotes conversion, like so:

1.) From the top TextMate menu: Bundles, Bundle Editor, Edit Commands.
2.) Find Markdown, select "Convert Document / Selection to HTML"
3.) Remove this: |"${TM_SMARTYPANTS:-SmartyPants.pl}" on around line 6-9 depending on how things are wrapping for you.
4.) Close the Bundle Editor, try it out.

That will keep the conversion process from piping the output through the smart quotes command. I can't guarantee it won't do anything else to your markup, but it worked when I just converted a document and checked the quotation marks.
posted by pb (staff) at 10:01 PM on May 3, 2012 [1 favorite]


Thank you!!! I wish I could mark "best answers" in MetaTalk...

pb, can you get on that? ;)
posted by modernserf at 10:27 PM on May 3, 2012


angly quotes!!!! how do!???
posted by TwelveTwo at 10:57 PM on May 3, 2012


I'll leave this other (online, Javascript based) Showdown converter here.

Showdown is a Markdown flavor and this one has no smart quotes.
posted by zengargoyle at 10:36 AM on May 4, 2012


TwelveTwo: "Angly quotes!!!! how do!???"

Like this:

&laquo; = «
&raquo; = »

I'm on an iPhone, here's hoping that works
posted by koeselitz at 7:00 PM on May 4, 2012

