Spell Check Problem September 20, 2002 12:27 AM

Spell check problem - if the text being checked has character entity sequences > or < when spell check completes they will be replaced in the textarea with greater-than or less-than signs. My browser is: Mozilla 1.1 -- Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020826.
posted by chipr to Bugs at 12:27 AM (9 comments total)

Hmmm ... that should read:
... has character entity sequence ampersand-g-t-semicolon or ampersand-l-t-semicolon when spell check ...
It previewed differently from the post.

WHAT.
THE.
....oh never mind.... :-)
posted by chipr at 12:31 AM on September 20, 2002


This bug has been MetaTalked repeatedly. See here, here, here, here, here, and here, to name but a few.
posted by timeistight at 1:51 AM on September 20, 2002


it's a different bug - chipr is saying that spellcheck replaces angle bracket "escaped" characters (entity references) with the signs that they represent.

for example (if i understand correctly - i've not checked) spell checking text that includes &lt; will replace it by < (and if i preview this message it munges what i'm typing, so this post may be unintelligible...)
posted by andrew cooke at 5:54 AM on September 20, 2002


I'm pretty sure it's all part of the same character-handling quirk in Cold Fusion MX.
posted by timeistight at 8:17 AM on September 20, 2002


no, i don't think so, because this problem is related to escaped characters which, in the case of angle brackets, are pure ascii (ie 7 bit). the little box problem, on the other hand, comes (i presume) from people using 8 bit characters (or unicode or whatever), when the system (ie cf) is either missing, or hasn't been properly configured to use, information about the character set in use. also, he/she/it isn't complaining about little boxes, but that the escaped sequence is being replaced by the appropriate character.
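
to make the distinction concrete, here's a rough python sketch (not anything the site actually runs; the sample strings and the helper names `is_seven_bit` and `misread_as_latin1` are made up for illustration). it shows that an entity reference is plain 7 bit ascii and survives a character set mix-up untouched, while an 8 bit character does not:

```python
# hypothetical sample strings, chosen only for illustration
escaped = "&lt;"    # an entity reference: four plain ascii characters
accented = "café"   # contains one non-ascii character


def is_seven_bit(text: str) -> bool:
    """Return True if every character fits in 7-bit ascii."""
    return all(ord(ch) < 128 for ch in text)


def misread_as_latin1(text: str) -> str:
    """Encode as utf-8, then decode with the wrong charset (latin-1).

    This stands in for a system that has lost track of the character
    set in use; it is an assumption, not the site's real behaviour.
    """
    return text.encode("utf-8").decode("latin-1")


print(is_seven_bit(escaped))        # the entity is pure ascii
print(is_seven_bit(accented))       # the accented word is not
print(misread_as_latin1(escaped))   # unchanged by the charset mix-up
print(misread_as_latin1(accented))  # the accented character is garbled
```

so a character set misconfiguration on its own wouldn't be expected to turn the escaped sequence into an angle bracket; something has to actually decode the entity.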

it seems much more likely that somewhere in the spellchecking chain (which, incidentally, appears, from the file extensions, to be via asp, not cf) the data in the text are being treated as "url encoded" (i'm not sure that's a universal term, but it's the java terminology, iirc) when they weren't. or, more exactly, they're being decoded more times than they were encoded. this is a common (at least, i've battled with it before today) problem, because it's never completely obvious (to me) what does escape data and what doesn't.
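
a minimal python sketch of that "decoded more times than it was encoded" failure, using html entity escaping from the standard library (`html.escape` and `html.unescape`). the helper `decode_n_times` and the sample text are hypothetical, and this is only a guess at the kind of mismatch involved, not the spell checker's actual code:

```python
import html


def decode_n_times(text: str, n: int) -> str:
    """Apply html entity decoding n times in a row."""
    for _ in range(n):
        text = html.unescape(text)
    return text


# what the user typed into the textarea: a literal entity reference
typed = "a &lt; b"

# the page escapes it once so the textarea shows it verbatim
encoded = html.escape(typed)

# one decode matches the one encode: the user's text comes back intact
round_trip = decode_n_times(encoded, 1)

# one decode too many: the entity collapses into the sign it represents
over_decoded = decode_n_times(encoded, 2)

print(encoded)       # a &amp;lt; b
print(round_trip)    # a &lt; b
print(over_decoded)  # a < b
```

the number of decodes has to match the number of encodes; any extra step in the chain that decodes "helpfully" gives exactly the symptom chipr describes.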

however, it is possible, i guess, that the code used to mitigate the second problem (non-ascii) is responsible for the first, but it seems unlikely (although it's possible to mix the two problems, i suppose, by using an escape sequence for a non-ascii character). all the above is just speculation given what i understand of html and character sets - i know diddly squat about cf (used it only once and not in much detail).
posted by andrew cooke at 8:00 PM on September 20, 2002


I stand corrected.
posted by timeistight at 8:22 PM on September 20, 2002


sorry - did i go on too long? i can't sleep... (i should add that i'm not in the pst time zone).
posted by andrew cooke at 8:41 PM on September 20, 2002


Not at all. You've obviously given this a great deal of thought. I probably spoke up too soon.
posted by timeistight at 8:47 PM on September 20, 2002


Ohhh, I got a 'post a comment' link. But here, not on the main site.

And here I have a Post button. 1st time I've ever seen this. Is this a bug or a feature?
posted by rough ashlar at 8:17 PM on September 22, 2002

