Please don't use Unicode to make fancy fonts in posts May 19, 2020 8:08 AM

I've noticed more and more posts that make use of Unicode characters to simulate fancy fonts in FPPs. This breaks screen readers, often in very annoying ways. For example, “𝒎” is read out by my screen reader as "mathematical bold italic small m". So whole words or sentences of this make a post impossible to understand. Please avoid.

Here's a tweet containing a video to illustrate what it's like.

And just in case it comes up, emojis are fine as long as you don't use long strings of them. But screen readers describe their contents and they're much better to use than other types of emoticons.

I've brought this up in one of the disability threads before but I'd like to surface it so it can be made official(ish) policy.
posted by Space Coyote to Etiquette/Policy at 8:08 AM (39 comments total) 74 users marked this as a favorite

A good reminder Space Coyote. I feel like I've been guilty of this in the past for purely aesthetic reasons and it's not necessary. So I'll incorporate this into how I consider making posts further down the road. Cheers.
posted by Fizz at 8:37 AM on May 19, 2020 [1 favorite]


I did not know this, though am unlikely to use fancy characters. Thanks for posting. This would be a useful item for the sidebar, which doesn't always get attention.
posted by theora55 at 9:15 AM on May 19, 2020 [2 favorites]


Would it be practical to detect such characters (or large numbers of them, or long sequences of them) during the post drafting process, similar to the check for URLs that have been posted before?
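
As a rough idea of what such a check could look like, here is a minimal Python sketch. It is an illustration only, not how MetaFilter's posting form works. The function name, the threshold of three characters, and the choice to look only at the Mathematical Alphanumeric Symbols block (U+1D400 to U+1D7FF) are all assumptions made for the example; fancy-font generators also draw on other blocks that this does not cover.

```python
def styled_letter_runs(text: str, min_run: int = 3) -> list[str]:
    """Return every run of at least `min_run` consecutive characters that
    fall in the Mathematical Alphanumeric Symbols block (U+1D400-U+1D7FF).

    Hypothetical helper: the block range and threshold are assumptions.
    """
    runs: list[str] = []
    current: list[str] = []
    for ch in text:
        if 0x1D400 <= ord(ch) <= 0x1D7FF:
            current.append(ch)
        else:
            # A normal character ends the current run; keep it if long enough.
            if len(current) >= min_run:
                runs.append("".join(current))
            current = []
    # Handle a run that reaches the end of the text.
    if len(current) >= min_run:
        runs.append("".join(current))
    return runs


# "fancy" spelled with mathematical bold italic letters.
fancy = "\U0001D487\U0001D482\U0001D48F\U0001D484\U0001D49A"
# A lone mathematical italic "x" (U+1D465) is left alone: it is a short run.
draft = "A " + fancy + " title with x = \U0001D465"
print(styled_letter_runs(draft))
```

The threshold is there so that a single symbol used as actual math does not trigger a warning, while a whole styled word does.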
posted by jedicus at 9:35 AM on May 19, 2020 [2 favorites]


A related question - how do screen readers and their users interpret the standard bold and italicized tags? I am guilty of using them both for emphasis, but does it have that effect or is it more of a barrier?
posted by Think_Long at 9:52 AM on May 19, 2020 [3 favorites]


This is a good accessibility reminder, thanks Space Coyote.

Would it be practical to detect such characters.... during the post drafting process

The site's not so busy that mods can't just fix it when they see it. Simpler than building a thing at the moment (imnsho). I love that not every good idea here has to be tool-mediated so that it can scale 1000x.
posted by jessamyn (retired) at 9:58 AM on May 19, 2020 [12 favorites]


A related question - how do screen readers and their users interpret the standard bold and italicized tags?

The default MeFi B and I buttons actually use the strong and em tags. We're probably using them wrong, but so is everybody else.

Two common screen readers are NVDA and JAWS.

It sounds like NVDA has been able to understand strong and em since 2015, but defaults to ignoring them. As this GitHub comment explains:

"Having emphasis reported by default has been extremely unpopular with users and resulted in a lot of complaints about NVDA 2015.4. The unfortunate reality is that emphasis is very much over-used in the wild. I had serious misgivings that this would be the result when we implemented this and it seems these unfortunately turned out to be quite warranted. As such, we've now disabled this by default, though the option is still there for those that want it."

According to this page that tests JAWS HTML element support, JAWS does not support strong, and probably doesn't support em. I couldn't find any definitive documentation, though.
posted by zamboni at 12:31 PM on May 19, 2020 [15 favorites]


Thank you, Space Coyote.

On an only tangentially related note, like Think_Long I'm also curious to know what the screen reader experience is like with the strong and em tags, particularly as I (and many others) use em to demarcate quoted text from another user's post, and I (and others) often use strong to highlight a user name I'm mentioning (as I did above). Visually this works fine I think, but semantically it doesn't really make sense and it occurs to me that a screen reader may not parse that very well.

Would it improve the screen reader experience to enclose quoted sections from other users' posts in literal quotes? As in the following:

"I've noticed more and more posts that make use of Unicode characters to simulate fancy fonts in FPPs."

Or is using the em tag sufficient when parsed by the screen reader?
posted by biogeo at 1:19 PM on May 19, 2020 [2 favorites]


Also, zamboni, that article on how we're all using strong and em incorrectly is really fascinating, thanks for that.
posted by biogeo at 1:28 PM on May 19, 2020


Is there a reason screen readers don't have an option to treat these as regular characters? I can't imagine how it would be useful to read out the weight, style, and case of each individual letter like that. Searching unicode letters on Google, for instance, brings up the same results as if you'd typed it in normally.
posted by Rhaomi at 2:27 PM on May 19, 2020 [1 favorite]


I can't imagine how it would be useful to read out the weight, style, and case of each individual letter like that.

The screen reader is reading the Unicode name of each character. Sometimes the name conveys useful information. There are a lot of Unicode characters. See this NVDA github issue for some of the details of dealing with them.
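
To see the names in question, here is a small Python sketch using the standard library's unicodedata module. It only looks up the official Unicode names; what a given screen reader actually announces depends on its own symbol tables, so treat this as an approximation. The helper name spoken_names is made up for the example.

```python
import unicodedata


def spoken_names(text: str) -> list[str]:
    """Return the lowercased Unicode name of each character in `text`.

    Hypothetical helper: a stand-in for what a screen reader might announce.
    Characters without a name get a placeholder instead of raising an error.
    """
    return [unicodedata.name(ch, "unnamed character").lower() for ch in text]


# A plain "m" versus the styled U+1D48E character from the original post.
print(spoken_names("m"))           # ['latin small letter m']
print(spoken_names("\U0001D48E"))  # ['mathematical bold italic small m']
```

A five-letter word written in these characters produces five such names in a row, which is why styled text becomes so hard to follow.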

Searching unicode letters on Google, for instance, brings up the same results as if you'd typed it in normally.

This is called Unicode normalization. It can be complicated. For screen readers, you’d have to do a weird lookahead thing to figure out where the effect stops, keep track of which Unicode block you’re in if you’re grouping things by character type, and have some way of knowing whether the user wants to know the correct name (say, in a math paper) or the normalized form (in a lulzy tweet).
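
For a feel of what normalization does to these characters, here is a minimal Python sketch using unicodedata.normalize with the NFKC form, which replaces compatibility characters such as styled math letters with their ordinary equivalents. Whether Google or any screen reader uses exactly this form is an assumption; the helper name plain_form is made up for the example.

```python
import unicodedata


def plain_form(text: str) -> str:
    """Fold compatibility characters into their ordinary equivalents
    using NFKC normalization (hypothetical helper for illustration)."""
    return unicodedata.normalize("NFKC", text)


# "fancy" spelled with mathematical bold italic letters.
fancy = "\U0001D487\U0001D482\U0001D48F\U0001D484\U0001D49A"
print(plain_form(fancy))    # fancy
print(plain_form("plain"))  # plain (already normal, so unchanged)
```

Note that the call throws away exactly the information a reader of a math paper would want, which is the trade-off described above.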

Allegedly you might be able to hack in manual support for normalization in NVDA, but JAWS and VoiceOver users are out of luck. I haven’t been able to find any details about how you’d go about doing that with NVDA. There is an existing feature request for it as an integrated feature.
posted by zamboni at 3:35 PM on May 19, 2020 [9 favorites]


I started writing the below and then got interrupted by dinner, during which time zamboni answered, but maybe it's still useful.

If you're using these characters as they're intended, in mathematical formulas, then the weight, style, and case of each letter matter. If I needed a screen reader, I would definitely want it to read out those characters in exactly that way. For example, it's entirely normal to encounter something like "ℱ𝒇=(Fx(kx),Fy(ky),Fz(kz))", which in certain contexts might be read as "the Fourier transform of the vector-valued function 𝒇 is equal to a vector of functions over the components of its conjugate domain" or something like that. Of course a screen reader can't produce that, but it can tell you that the first "ℱ" is "mathematical calligraphy capital f" and the second "𝒇" is "mathematical bold italic small f" and so on, which is enough to let you understand the equation if those symbols mean anything to you.

Google can automatically convert Unicode mathematical symbols to normal text because no one is going to search using mathematical symbols, as their meanings are too context-dependent to be useful for search. A screen reader can't necessarily make the assumption that the user doesn't want the true symbol instead of a substitute from the normal Latin alphabet.
posted by biogeo at 4:05 PM on May 19, 2020 [13 favorites]


Speaking as a screen reader user, I want to enthusiastically second this request. :)

As far as emphasis is concerned, it's unfortunately hard to do accessibly. Screen readers historically haven't supported many of the tags used for it, and it can be disorienting for folks who don't know what is meant by "bold," or whatever. Unfortunately I don't know of a great solution in this case that will work for everyone.
posted by Alensin at 9:38 PM on May 19, 2020 [8 favorites]


Quoting biogeo:

"Would it improve the screen reader experience to enclose quoted sections from other users' posts in literal quotes?"

I also wonder whether quote characters would help, or if it would be better to use words like "Quoting user X" to indicate what we are doing.
posted by a snickering nuthatch at 12:31 AM on May 20, 2020


There's also the actual quote HTML element, which works in the MeFi editor: <q></q>
posted by XMLicious at 2:35 AM on May 20, 2020 [5 favorites]


If I could ask a slightly off-topic question -
I format ebooks. Fiction. And occasionally I will use unicode symbols as scene separators. Now it occurs to me that this is probably not a very good idea, for readers with screen readers. The standard for scene separators in fiction seems to be a series of asterisks but I imagine those are also not particularly pleasant to encounter in the middle of a text?
posted by Zumbador at 3:43 AM on May 20, 2020


I think bolded text, quotes, etc. should be done with the semantically specific HTML tags. Most of the time it gets ignored, which is actually the most pleasant reading experience. Interrupting a sentence to indicate that a word is bold is usually pretty unnecessary, since it's often just obvious from context that that word is what the writer wanted to emphasize.

The best guide rule for accessibility is just use the semantically correct HTML tag. This is surprisingly difficult for a lot of modern web frameworks that just throw everything in divs, but it lets the screen reader make the most useful decisions about how to convey the page's contents.
posted by Space Coyote at 5:18 AM on May 20, 2020 [6 favorites]


The standard for scene separators in fiction seems to be a series of asterisks but I imagine those are also not particularly pleasant to encounter in the middle of a text?

The row of asterisks is a pretty well-known convention, so breaking from it to convey the same thing would do less to aid understanding.
posted by Space Coyote at 6:49 AM on May 20, 2020 [3 favorites]


I want to both second this as a supporter of screen reader users and support it from the point of view of someone who hates terrible crazy font games in posts and comments.

Selfishly hoping this gets implemented (or unimplemented or modded or whatever) as a thing in the future.
posted by RolandOfEld at 7:44 AM on May 20, 2020 [6 favorites]


Thanks, Space Coyote. I've actually been wondering about this for a long time, and if there are choices between options that are more or less indifferent for me writing a comment but make the screen reader experience slightly more pleasant, I'm glad to know them.

I honestly didn't realize the <q> tag was still part of the HTML standard. It looks like it formats text with opening and closing quotes, as in the following:

“The best guide rule for accessibility is just use the semantically correct HTML tag.”

(I'm not sure if this is browser-specific formatting or not so it would be interesting to know if others see that quote rendered differently.) What would people think about also adding italic styling into the Metafilter stylesheet for the <q> tag? Since the unofficial Metafilter stylebook seems to currently favor italicizing text quoted from other users' comments, this would allow people to pretty seamlessly switch to using <q> for this purpose instead of <em> or <i> without changing the "look and feel" of Mefi comment threads. It seems like a pretty minor change to implement but I know I'd like it.
posted by biogeo at 11:11 AM on May 20, 2020 [3 favorites]


What would people think about also adding italic styling into the Metafilter stylesheet for the q tag?

I think it'd be an improvement "improvement" that would mess with twenty years of text that was formatted under different rules, for limited gain. I use the q tag reasonably often for quotes, and would not be excited about having things italicized when that wasn't what I intended.
Usually, I work like this:
quote from upthread gets em or i
quote from the article gets blockquoted
inline quote, or so-called, gets q
"straight double quotes": I'm either quoting code, or I have been kidnapped and am signalling for help via MeFi.
posted by zamboni at 11:48 AM on May 20, 2020 [1 favorite]


Also, ever since Apple launched their own set of emoji (a few years back), they just show up as squares or squares with x’s in them on non-apple devices. It just shows up as nonsense.
posted by sexyrobot at 12:26 PM on May 20, 2020 [1 favorite]


> Since the unofficial Metafilter stylebook seems to currently favor italicizing text quoted from other users' comments, this would allow people to pretty seamlessly switch to using <q> for this purpose instead of <em> or <i> without changing the "look and feel" of Mefi comment threads.

As the author of one of the slightly popular quote-generating Javascripts for Mefi threads (used for creating this comment), I find this alluring, since the text above, quoted from somebody else's comment, could be tagged semantically while looking like it does now. But I don't like that idea much. Dramatically changing the default behavior of a semantic tag can cause its own complications for sighted users.

The practice of using a less-than symbol or other punctuation followed by styled text inherits from old-timey block quotes generated by terminal programs for email or Usenet. It's not the most accessible way to do it, but it's understandable at a glance. That doesn't help users of screen readers and isn't forward-compatible, as they say, towards a farther-flung future where applications run from the terminal are even more obscure.

Metafilter could create its own custom HTML element that extends the quote element. This technology's been available for a few years despite not being a ratified standard DOM API, so it's generally available to browsers with Javascript support other than Internet Explorer. This way the site can have its own quote element, such as <mefi-qi>, inheriting the semantics of <q> and its cite attribute but having a unique default style.
posted by ardgedee at 2:11 PM on May 20, 2020 [3 favorites]


The MeTa request is a good one.

I think it'd be an improvement "improvement" that would mess with twenty years of text that was formatted under different rules, for limited gain. I use the q tag reasonably often for quotes, and would not be excited about having things italicized when that wasn't what I intended.

Random data point here: in Voiceover, the strike tag breaks this paragraph into three discrete sections -- one before the strike, one after the strike, ending after "improvement" and then the rest.

Also, Voiceover doesn't recognize the strike tag.

One more thing: Voiceover (at least while running in English) pronounces "MeFi" as "Mee-fee," with the emphasis on the "fee."
posted by mandolin conspiracy at 2:18 PM on May 20, 2020 [5 favorites]


I did not know this was an issue. Thanks for bringing it up.
posted by They sucked his brains out! at 2:43 PM on May 20, 2020


Huh. I had no idea anyone actually used the <q> tag, least of all for inline quotes.
posted by biogeo at 3:06 PM on May 20, 2020


Also, ever since Apple launched their own set of emoji (a few years back), they just show up as squares or squares with x’s in them on non-apple devices. It just shows up as nonsense.

Can you please provide more information on this? I think you might be mistaken. As far as I know, all Apple emoji are Unicode compliant, without any additional presentations like Microsoft, Samsung and HTC. If the non-Apple device doesn’t implement characters from recent Unicode releases, that’s an issue with that device, not Apple.
posted by zamboni at 4:00 PM on May 20, 2020 [1 favorite]


Huh. I had no idea anyone actually used the q tag, least of all for inline quotes.

spiders zamboni is an outlier adn should not have been counted
posted by zamboni at 4:03 PM on May 20, 2020 [3 favorites]


The only custom Apple "emoji" I can think of is the Apple logo character, but yeah I think they just mean they aren't updating their mobile OS to keep up with new Emoji.

As for the side discussion, it is basically pointless changing since all we are asking for is for actively unusable stuff to be discouraged. Quote styles don't make a difference to one's ability to use the site.

Now if I were getting ambitious, I'd say it's time to ditch some of the ascii clutter below each post and / or occlude it with aria-hidden="true" so it stops reading out the left and right square brackets around the flag link.
posted by Space Coyote at 4:50 PM on May 20, 2020 [6 favorites]


Now if I were getting ambitious, I'd say it's time to ditch some of the ascii clutter below each post and / or occlude it with aria-hidden="true" so it stops reading out the left and right square brackets around the flag link.

Could the button role be helpful?
posted by zamboni at 5:18 PM on May 20, 2020 [2 favorites]


Are screen readers sophisticated enough to pronounce words differently when they're marked with the <em> tag? It's supposed to indicate "emphasis".
posted by Joe in Australia at 5:00 AM on May 21, 2020


Are screen readers sophisticated enough to pronounce words differently when they're marked with the em tag?

See above.
posted by zamboni at 5:40 AM on May 21, 2020




I love that not every good idea here has to be tool-mediated

I dunno, a mention of the concern on the "new post" screen, or, if that makes the posting page too busy, a warning on preview akin to the "YOU FORGOT A TITLE!" thing when someone uses a long string of characters that mess with screen readers doesn't seem like a hugely tool-mediated solution.
posted by mediareport at 1:20 PM on May 22, 2020


Oh, also, I remember seeing a comment years ago that the placement of the period before or after a closing html tag can be annoying. I think the idea was that placing the punctuation inside the closing tag makes for a better experience for a screen reading member, but can't recall exactly. Is that still a thing worth considering as we post?
posted by mediareport at 1:29 PM on May 22, 2020 [1 favorite]


mediareport, you might be thinking of contraption and alasdair?
posted by cgc373 at 2:28 PM on May 22, 2020 [3 favorites]


Yes, exactly; thanks. Does putting punctuation inside closing html tags still yield a noticeably more fluid experience for folks using screenreaders? Happy to do it if it matters to folks.
posted by mediareport at 5:56 PM on May 22, 2020


Oh, also, I remember seeing a comment years ago that the placement of the period before or after a closing html tag can be annoying. I think the idea was that placing the punctuation inside the closing tag makes for a better experience for a screen reading member, but can't recall exactly. Is that still a thing worth considering as we post?

It just says "period" at the end, entirely innocuous. I would say not to include it since if a user wanted to copy the text of the link, they probably won't want the period to come along. Far far beneath the threshold of problems that would compel someone to disclose that they use a screen reader to ask for people to do something differently.
posted by Space Coyote at 7:48 PM on May 22, 2020 [2 favorites]


OP here, from the FPP that kicked this off. Sincere apologies for this, and I really appreciate you bringing it to my attention Space Coyote. I agree wholeheartedly that accessibility shouldn't be an ideal but rather the norm, and it completely escaped me that this would be an issue here. Lesson learned and welcome.
posted by Ten Cold Hot Dogs at 3:56 AM on May 26, 2020 [3 favorites]


The standard for scene separators in fiction seems to be a series of asterisks but I imagine those are also not particularly pleasant to encounter in the middle of a text?

It's come up on Tumblr, with screen-reader people complaining about fanfic writers' choices of scene separators. Consensus was that a single symbol was much preferred: One asterisk, section-break symbol, caret, dingbat, or whatever. Three is annoying, especially for stories that have a lot of them. Imagine this:

"He left her there, alone in the room.
Asterisk asterisk asterisk
Six months later, they met at..."

(Now imagine OoOoOo or |'~*.*~'| being read character-by-character.)
posted by ErisLordFreedom at 1:45 AM on May 30, 2020

