Special characters October 8, 2002 1:18 PM

I “think” spéciàl chàràctérs shöuld wörk nöw, thöugh it's pröbably töö éàrly tö sày the böxés and quéstiön marks will be göné. That'll be £5 or 1000¥ or €3.5.
posted by mathowie (staff) to Bugs at 1:18 PM (54 comments total)

hmm. everything seems to be working.

Here's some hebrew:
?? ???? ?? ??????? ??

Here's a trademark symbol: ™
posted by mathowie (staff) at 1:22 PM on October 8, 2002


Ì ßènt ?? Woo??e.

Excellent.
posted by thewittyname at 1:22 PM on October 8, 2002


dang, is hebrew beyond utf-8?
posted by mathowie (staff) at 1:22 PM on October 8, 2002


Uh....fill in the blanks
posted by thewittyname at 1:22 PM on October 8, 2002


Let's see if my favorite — the emdash (the — version) — works. How about the correct version (the — one) — does it work?
posted by dayvin at 1:27 PM on October 8, 2002


Do you prefer ¢ or £?

[I prefer £, I have enough sense already!]
posted by dash_slot- at 1:29 PM on October 8, 2002


old metatalk threads seem to work now, as do metafilter threads that had boxes in them.
posted by mathowie (staff) at 1:33 PM on October 8, 2002


Me – I like the en-dash – either version.

Looks good. I hope the fix isn't what's slowing down the site.
posted by timeistight at 1:34 PM on October 8, 2002


thanks, matt.
posted by moz at 1:38 PM on October 8, 2002


Sniff. Well, very good work anyway, Matt!
posted by yhbc at 1:39 PM on October 8, 2002


Excellent. Thanks for the improvement, Matt.
posted by rocketman at 1:45 PM on October 8, 2002


é è ç ì ù

posted by matteo at 1:47 PM on October 8, 2002


[I prefer £, I have enough sense already!]

/me £'s dash_slot
posted by bradlands at 1:49 PM on October 8, 2002


Let’s see if it’s fixed what happened last time I tried to use typographer’s punctuation.
posted by Firefly at 1:50 PM on October 8, 2002


Another attempt at hebrew: (just letters, works in preview)
??????

? Œ‰†‡æ¡™£¢8§¶£¢8°·?‡‹Ô?¯¿

Preview is working, at least in Mozilla/Chimera... here goes...
posted by joemaller at 2:07 PM on October 8, 2002


and failure.
posted by joemaller at 2:08 PM on October 8, 2002


Well, I'm glad the boxes are gone anyway. Was it something you did or a patch to the server software? Any geeky details you'd care to share?
posted by joemaller at 2:09 PM on October 8, 2002


¡Excelente, puedo ahora utilizar la primera marca del exclamation!

[Excellent, now I can use the opening exclamation mark!]
posted by MrBaliHai at 2:25 PM on October 8, 2002


I had to change the database field types to "n" types (nvarchar instead of varchar, ntext instead of text, etc.), force utf-8 encoding, and install the latest service packs on the application server. It seems to be mostly working; the only problem I've found in testing is Hebrew.
posted by mathowie (staff) at 2:45 PM on October 8, 2002
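
A minimal sketch of why the switch from varchar to nvarchar matters, in Python rather than the site's actual Cold Fusion and SQL Server stack. It assumes the old columns behaved like a single-byte code page such as Windows-1252 and that the "n" types store UTF-16; both function names are hypothetical and only simulate a round trip through storage.

```python
def store_in_single_byte_column(text: str, codepage: str = "cp1252") -> str:
    """Simulate a round trip through a hypothetical varchar-like column.

    Characters the code page cannot represent are replaced with '?'.
    """
    stored = text.encode(codepage, errors="replace")
    return stored.decode(codepage)


def store_in_unicode_column(text: str) -> str:
    """Simulate a round trip through a hypothetical nvarchar-like column."""
    stored = text.encode("utf-16-le")
    return stored.decode("utf-16-le")


# Accented Latin text and currency signs survive both columns;
# Hebrew only survives the Unicode one.
for sample in ["spéciàl", "£5 or 1000¥ or €3.5", "™", "אבג"]:
    print(sample, "->", store_in_single_byte_column(sample),
          "|", store_in_unicode_column(sample))
```

Under those assumptions, the accented characters and currency symbols round-trip cleanly while each Hebrew letter becomes a question mark in the single-byte column. That matches the symptoms reported in this thread, though it is only one plausible mechanism.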


Let's try Korean! :-)

? ?? ?? ???.

Works in the preview, now the moment of truth...
posted by Plunge at 2:58 PM on October 8, 2002


grrrrr...
posted by Plunge at 2:59 PM on October 8, 2002


¿Que?
posted by mr_crash_davis at 3:07 PM on October 8, 2002


Here is some Hebrew as character entities:
א ב ג
And just as characters:
? ? ?
posted by chrismear at 3:10 PM on October 8, 2002


Hmm..what about japanese?

?????????

looks ok in preview...


posted by puffin at 3:13 PM on October 8, 2002


Nevermind, then..heh :)
posted by puffin at 3:14 PM on October 8, 2002


ok, so language issues still persist. Most problems are solved though. If you're seeing the characters in preview, the problem lies with the database drivers still.
posted by mathowie (staff) at 3:15 PM on October 8, 2002


The difference between the character entity versions and the Unicode ones is that your browser interprets the entities, whereas SQL Server and Cold Fusion have to interpret the Unicode versions. Hence the ???? when you try to do it without the entities.
posted by eyeballkid at 3:44 PM on October 8, 2002


I could make some snarky comment on how this must be a plot by mathowie against the Hebrew speaking Jews and the Asians (with Korean & Japanese not posting, Chinese most likely doesn't either) to keep us down, but instead I'll say good job for even getting it this far. I'm sure you'll get it working with all the different languages and then boy, are we in trouble.

:-)
posted by Plunge at 3:46 PM on October 8, 2002


UNICODE data types in MS SQL Server only support characters from the code page assigned at the SQL Server install (though 2000 supports the use of separate collations for each database and separate collations for queries). If I'm right, the default install code page is 1252. That page only supports, I believe, the English alphabet and variations thereof (Italian, Spanish, etc.). I believe that the UNICODE set required for multiple languages is 850.

Still, I can't find a listing for the 850 code page anywhere in the MSDN or in SQL's BOL, so I'm not positive that it's the right page. I do know that 1252 won't handle anything except the variants I mentioned above (and, I believe, the Greek alphabet).


posted by eyeballkid at 4:04 PM on October 8, 2002


Fantastic - but now all the apologies for weird characters appearing in posts are going to look silly :-)
posted by dg at 4:48 PM on October 8, 2002


mathowie.. I wouldn't worry too much about getting the language characters to show up.

As nifty as it would be to post Japanese in hiragana, most users probably wouldn't be able to see the characters anyway because their computers don't have the language support installed. And hey -- this is an english-language web site, after all.


posted by puffin at 7:04 PM on October 8, 2002


I'm guessing this might not work: 日本語? א? ⇔?
posted by bobo123 at 7:07 PM on October 8, 2002


日朮語?
posted by bobo123 at 7:16 PM on October 8, 2002


The first person that tries to post in Japanese is going to get a severe wood-shed-ass-hauled-into-metatalking-to.
posted by insomnyuk at 7:28 PM on October 8, 2002


conichi-wa, insomnyuk (couldn't resist!)
posted by amberglow at 7:30 PM on October 8, 2002


bobo: that second character ain't quite right
posted by hama7 at 8:58 PM on October 8, 2002


???
posted by reverendX at 9:21 PM on October 8, 2002


Korean test :

?????!
?? stavrosthewonderchicken ???.
posted by stavrosthewonderchicken at 9:21 PM on October 8, 2002


I know. I meant 日本語 but I typed 672E instead of 672C.

私はガラスを食べられます。それは私を傷つけません。

[I can eat glass. It doesn't hurt me.]
posted by bobo123 at 9:22 PM on October 8, 2002
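
The slip bobo123 describes is easy to reproduce. This is a small sketch in Python (not a tool anyone in the thread used) that maps the two hexadecimal code points to their characters; the variable names are illustrative only.

```python
intended = chr(0x672C)  # the code point bobo123 meant to type
typed = chr(0x672E)     # the code point that was actually typed

print(f"U+672C -> {intended}")
print(f"U+672E -> {typed}")

# The same character can be written as a hex or a decimal numeric reference.
print(f"&#x672C; and &#{0x672C}; refer to the same character")
```

The two code points are only two apart, which is why a single mistyped hex digit produced a different but similar-looking character.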


Hmmm... so Japanese does work? But Korean is still a no show. *sigh*

/me ponders a Japanese conspiracy...
posted by Plunge at 9:25 PM on October 8, 2002


This is so great—dashes work now!
posted by Yelling At Nothing at 11:19 PM on October 8, 2002


????????? ????
posted by dydecker at 11:20 PM on October 8, 2002


Regular Japanese input doesn't work.
posted by dydecker at 11:23 PM on October 8, 2002


No conspiracy, Plunge. bobo123 is using character entities, which means that instead of typing a character straight into the comment box, you type in the code for the character, in this format:

&#26412;

which would give you 本.

The database has no problem storing the character entities, because they're just numbers and punctuation, but the current software seems to choke if you enter the actual symbols, and produces all those ?s.

So if you're really desperate to put the odd foreign character in, that's how to do it (although remember it won't display in everyone's browser). Also watch out that when you type in a character entity, and then hit preview, it converts the code into a regular, typed-in character. You'll have to turn it back into a code before you post.
posted by chrismear at 11:41 PM on October 8, 2002
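
A minimal sketch of the entity workaround chrismear describes above, written in Python purely for illustration. The helper name `to_entities` is hypothetical; `html.unescape` is the standard-library function that does what the browser does when it renders the entities.

```python
import html


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


encoded = to_entities("日本語")
print(encoded)                 # only ASCII digits and punctuation remain
print(encoded.isascii())       # safe to store in a column that cannot hold Unicode
print(html.unescape(encoded))  # what the browser displays after interpreting the entities
```

Because the encoded form is plain ASCII, it passes through a non-Unicode storage layer unchanged. This also shows why the preview caveat matters: once the entities have been turned back into real characters, they have to be re-encoded before posting.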


Åwøsæme!
posted by dagny at 1:16 AM on October 9, 2002


UNICODE data types in MS SQL Server only support characters from the code page.
That's not true actually. In SQL Server, Unicode fields are Unicode, pure and simple. The collation refers to the character set/sort order of non-unicode fields (e.g. varchar instead of nvarchar).
Matt, the question marks (as opposed to blocks), and the fact that the characters work in preview, indicate to me that the text has not made it into the database correctly. Can you see Hebrew characters in there?
I know nothing of Cold Fusion, but in ASP, the first thing to check would be that the server codepage is set to UTF8. This basically tells ASP that when it is response.writing Unicode data from the database, it should convert it to UTF8, and that any input it receives from form data should be considered UTF8 and converted to Unicode accordingly.
I know nothing of your internationalization skills, so I apologize if a certain phrase about sucking eggs comes to mind.
posted by chill at 2:09 AM on October 9, 2002
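
A rough illustration of chill's point about the server codepage, in Python rather than ASP or Cold Fusion. It assumes the browser submits the form as UTF-8 bytes; `read_form_field` is a hypothetical stand-in for whatever the application server does when it decodes incoming form data.

```python
def read_form_field(raw: bytes, server_codepage: str) -> str:
    """Decode submitted form bytes using the code page the server is configured with."""
    return raw.decode(server_codepage, errors="replace")


# Bytes as a browser would send them from a UTF-8 page.
submitted = "maçãs à terça-feira".encode("utf-8")

print(read_form_field(submitted, "utf-8"))   # decoded with the matching code page
print(read_form_field(submitted, "cp1252"))  # decoded with a mismatched code page
```

With the matching code page the text comes back intact. With the mismatched one, each accented letter turns into two unrelated characters, so the damage is done before the text ever reaches the database.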


this is all just a cunning plan to reduce the maximum comment size from 8000 to 4000 characters... ;o)
posted by andrew cooke at 6:01 AM on October 9, 2002


No conspiracy, Plunge. bobo123 is using character entities, which means that instead of typing a character straight into the comment box, you type in the code for the character...

Oh sure, way to go and ruin a good conspiracy for me.
posted by Plunge at 9:31 AM on October 9, 2002


Wondering if this addresses the problem I posted last month.

&

That should display as ampersand-a-m-p-semicolon
posted by chipr at 10:35 AM on October 9, 2002


I ? Unicode.
posted by wanderingmind at 12:11 PM on October 9, 2002


Dang. That looked fine on preview (and I even typed in the character-entry thingie, not copy-and-pasted from the character map).
posted by wanderingmind at 12:14 PM on October 9, 2002


An oldie but a goodie - Metafilter : That looked fine on preview!
posted by stavrosthewonderchicken at 3:24 PM on October 9, 2002


Let's see! Here's a Portuguese sentence that uses all our accents (and which miserably failed the last test):

Os cães só têm maçãs à terça-feira.

[Dogs only have apples on Tuesday]
posted by MiguelCardoso at 4:39 PM on October 9, 2002


Whee!
posted by MiguelCardoso at 4:39 PM on October 9, 2002

