Re: A question on character sets


So, when that gets displayed by an application that doesn't support UTF-8, what you get is usually interpreted using one of the old code pages, and you see the two-character sequence Â® (Capital A with Circumflex, Registered Sign).

As an exercise for the reader, try sending ALT+0255; it should go out as the bytes 0xC3 0xBF, which show up as Ã¿ (Capital A with Tilde, Inverted Question Mark) on incompatible systems.
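
For anyone who wants to try the exercise without sending mail, here is a minimal Python sketch of the effect described above. It assumes the "old code page" on the receiving end is Windows-1252 (Python's `cp1252` codec); the helper name `mojibake` is made up for this illustration.

```python
def mojibake(text: str, legacy: str = "cp1252") -> str:
    """Encode text as UTF-8, then misread those bytes in a legacy code page."""
    return text.encode("utf-8").decode(legacy)


# U+00AE is the Registered Sign, U+00FF is Small Letter Y with Diaeresis.
for ch in ("\u00ae", "\u00ff"):
    raw = ch.encode("utf-8")  # two bytes for each of these characters
    print(ch, raw.hex(" "), mojibake(ch))
```

On Python 3.8 or later this should print `c2 ae` with Â® for the first character and `c3 bf` with Ã¿ for the second, which matches the two cases described above.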

Interesting. For ALT+0255 my Win7 system gives ÿ (Small Letter Y with Diaeresis), which is ISO/IEC 8859-1, but Windows-1251 says that should be a reversed R (Cyrillic Small Letter Ya). Maybe Microsoft changed it.
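
A quick way to check what byte 0xFF means in each code page is sketched below in Python. One assumption on my part: a Western-locale Windows system would normally use Windows-1252 rather than Windows-1251 (the Cyrillic page) for ALT+0nnn codes, which may account for the difference, so all three are compared. The helper name `decode_byte` is invented for this example.

```python
import unicodedata


def decode_byte(value: int, codec: str) -> str:
    """Interpret a single byte value in the given 8-bit code page."""
    return bytes([value]).decode(codec)


# Compare byte 0xFF (decimal 255) across three 8-bit encodings.
for codec in ("latin-1", "cp1252", "cp1251"):
    ch = decode_byte(0xFF, codec)
    print(f"{codec:8} {ch} {unicodedata.name(ch)}")
```

ISO/IEC 8859-1 (`latin-1`) and Windows-1252 agree on ÿ for this byte, while Windows-1251 maps it to the Cyrillic ya.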

-- Shal
Note: this message is composed in an email client that will send it in an 8-bit code page, not UTF-8
Thanks, Shal, for explaining this. I am understanding part of it, even though I'm not really that tech savvy. I guess part of what confuses me is why, if the message is composed in this web page window, which seems to understand UTF-8, and it goes to my machine, which also understands UTF-8, does it pick up the circumflex A [ Â ] somewhere in between? (I want to see if that circumflex A symbol stays since I put it in with an ALT code.) I'm guessing that somewhere in between is a machine that's not just passing the raw code through. Maybe it's the dread NSA trying to read our secret mail? :-)

ALT+0255 gives me this [ ÿ ]
