Text encoding Babel. Was Re: George Keremedjiev

Liam Proven lproven at gmail.com
Sun Nov 25 16:53:00 CST 2018


On Sun, 25 Nov 2018 at 23:42, Grant Taylor via cctalk
<cctalk at classiccmp.org> wrote:

> I bet you see all sorts of things that I'm ignorant of.

It's been enlightening!

Some I was ready for.

E.g. in French or Spanish, both of which I can speak to some extent,
letters like á or ó are not seen as separate letters: French would
call them a-acute, an a with an acute accent. Ç is a c with a cedilla.
Etc.

But in Swedish/Norwegian/Danish -- I speak basic Norwegian and
rudimentary Swedish -- ø and å and ä and so on are not a or o with
accents on: they are _different letters_ that come at the end of the
alphabet.

Czech is like that. Š and Č and Ž and many more that my Mac can't
readily type are _extra letters_ which come after the unmodified form
in the alphabet.
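
A rough sketch of why that matters to software, in Python. The
CZECH_ORDER string and the czech_key helper are made up for this
example and are deliberately simplified: as far as I know, real Czech
collation also treats "ch" as a letter of its own and handles long
vowels more subtly, so treat this as an illustration, not a correct
collator.

```python
# Simplified Czech alphabet order (assumption: ignores the "ch" digraph
# and treats every accented letter as a full letter in its own right).
CZECH_ORDER = "aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž"

def czech_key(word):
    """Map each letter to its position in CZECH_ORDER for sorting."""
    key = []
    for ch in word.lower():
        if ch in CZECH_ORDER:
            key.append(CZECH_ORDER.index(ch))
        else:
            # Unknown characters sort after every known letter.
            key.append(len(CZECH_ORDER) + ord(ch))
    return key

words = ["zebra", "čaj", "cukr", "car"]

by_code_point = sorted(words)                 # plain Unicode order
by_alphabet = sorted(words, key=czech_key)    # simplified Czech order

print(by_code_point)
print(by_alphabet)
```

Sorting by raw code point pushes "čaj" past "zebra", while the
alphabet-aware key keeps it next to the other c-words. For real work
a proper collation library would be the safer choice.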

Without them, you can't write correct Czech. It's worse than writing
English without the letter E.

Usually you can guess but not always.

Byt means "flat" or "apartment"; b, y-acute, t (být) is the verb "to be".

You can probably work that out, but you can't always. A restaurant
menu would be hopelessly corrupted, as "raw" and "with cheese" differ
only by an accent and both are quite likely.
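
To make the collision concrete, here is a minimal Python sketch of
what an accent-stripping filter does to these words. The strip_accents
helper is hypothetical, and the spellings syrový ("raw") and sýrový
("cheese", as an adjective) are my own reading of the menu example.

```python
import unicodedata

def strip_accents(text):
    """Decompose the text and drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

# "být" (to be) collapses onto "byt" (flat, apartment).
print(strip_accents("být"))

# Assumed menu pair: both words end up as the same ASCII string.
print(strip_accents("syrový"), strip_accents("sýrový"))
```

Once the marks are gone there is no way to recover which word was
meant, short of guessing from context.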

> > For example, right now, I am in my office in Křižíkova. I can't
> > type that name correctly without Unicode characters, because the ANSI
> > character set doesn't contain enough letters for Czech.
>
> Intriguing.  Is there an old MS-DOS Code Page (or comparable technique)
> that does encompass the necessary characters?

Don't know. But I suspect there weren't many PCs here before the
Velvet Revolution in 1989. Democracy came around the time of Windows
3.0 so there may not have been much of a commercial drive.
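
One way to check would be to ask a modern language which of its
bundled codecs can represent the name. A minimal sketch in Python,
assuming its standard codec names: as I understand it, cp852 is the
DOS "Latin-2" code page, cp1250 the Windows Central European one, and
cp1252 the Western European page usually meant by "ANSI". The
can_encode helper is made up for the example.

```python
def can_encode(text, codec):
    """Return True if every character of text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

street = "Křižíkova"

for codec in ("ascii", "latin-1", "cp1252", "cp852", "cp1250", "utf-8"):
    print(f"{codec:8} {can_encode(street, codec)}")
```

On a stock Python 3 this should report that cp852, cp1250 and utf-8
can hold the name, while ascii, latin-1 and cp1252 cannot, because
they have no r with a háček.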


> Would you please provide an example?

Sure, my office street name:  Křižíkova

> (I'm curious if my email client
> will display things properly.)

K, r háček, i, z háček, i acute, k o v a.

A háček is like an upside-down circumflex: ^

Also known as a caron.
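
If your client mangles it, the same spelling-out can be done by
machine. A small Python sketch using the standard unicodedata module;
the variable names are just for illustration.

```python
import unicodedata

street = "Křižíkova"

# Print the code point and official Unicode name of each letter.
for ch in street:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# NFD splits each accented letter into a base letter plus a
# combining mark; the caron is U+030C, the acute is U+0301.
decomposed = unicodedata.normalize("NFD", street)
print(len(street), len(decomposed))
```

Unicode's own names use "caron" rather than "háček", and the
decomposed form is three code points longer because three letters
carry a mark.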

> Oh my.  I had no idea that accent characters made such a difference. But
> I consider that to be my personal ignorance living in the U.S.A.  I do
> NOT think it's anybody's fault but my own.  I'll defend others if someone
> tries to say that their native / local regional norm is the problem.

Oh yes. It's quite a minefield.

Czech keyboards have so many extra letters, the *numbers* are on shift
combinations!

> I will say that I think everybody has their own individual prerogative
> to filter email as they see fit.  They just need to know what they are
> doing and own the fact that they might be causing unintentional harm.
>
> P.S.  Resending from the correct email address.  —  A recent Thunderbird
> update broke the Correct-Identity add-on.  :-(

Well yes.

I believe Mr Corlett here rejects all mail from gmail.com -- except mine... ;-)

-- 
Liam Proven - Profile: https://about.me/liamproven
Email: lproven at cix.co.uk - Google Mail/Hangouts/Plus: lproven at gmail.com
Twitter/Facebook/Flickr: lproven - Skype/LinkedIn: liamproven
UK: +44 7939-087884 - ČR (+ WhatsApp/Telegram/Signal): +420 702 829 053

