In this article, I would like to explain the basic issues with Central European languages and how to avoid making mistakes. The first step in making Linux "your-language-friendly" is to create a locale for your language. A locale is a set of definitions of how to represent and process various data types like time, date, monetary symbols, special characters, and so on. One of the important parts of a locale is the so-called message translation definition, a set of files which define how certain messages are translated to that particular language. There's usually one such file for an application, a hash table which contains all the application's messages, so it's generally the translation of the program's user interface.
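The catalog lookup described above can be sketched in a few lines. This is only an illustration of the idea, not of how GNU gettext stores its data on disk: the dictionary `CATALOG_SK`, the function `translate`, and the sample Slovak strings are all made up for this example.

```python
# Hypothetical in-memory message catalog for one application:
# original (English) message -> Slovak translation.
CATALOG_SK = {
    "Open file": "Otvoriť súbor",
    "Save": "Uložiť",
}


def translate(message, catalog):
    """Return the translation of message, or message itself if none exists."""
    # Falling back to the untranslated text keeps the interface usable
    # even when the catalog is incomplete.
    return catalog.get(message, message)


print(translate("Save", CATALOG_SK))    # found in the catalog
print(translate("Cancel", CATALOG_SK))  # not in the catalog, stays English
```

A message missing from the catalog is shown untranslated, which is why a partly translated application ends up with a mixed-language interface.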
The problem is not getting these locales into a distribution (they are part of glibc, and it is quite easy to add one to glibc if you need to). The problem is setting the locale parameters for each user, which almost no distribution considers when setting up users. There should not be a system-wide default, because Linux is a multiuser environment. Each user should be able to set his own language variables. At present, a user who wants to do so has to edit his .bashrc or a similar file by hand to set the proper values. This is not very user-friendly.
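To make the per-user setting concrete, here is a rough sketch of how a program might pick the locale for its messages from a user's environment. The precedence shown (LC_ALL, then LC_MESSAGES, then LANG) follows the usual POSIX convention. The function name `message_locale` and the sample value sk_SK are assumptions for illustration, and the sketch ignores extras such as the GNU LANGUAGE variable.

```python
import os


def message_locale(environ=None):
    """Pick the locale name used for messages from environment variables."""
    if environ is None:
        environ = os.environ
    # The first variable that is set and non-empty wins.
    for name in ("LC_ALL", "LC_MESSAGES", "LANG"):
        value = environ.get(name)
        if value:
            return value
    # Nothing set: fall back to the untranslated "C" locale.
    return "C"


# Roughly what a user gets after adding "export LANG=sk_SK" to .bashrc.
print(message_locale({"LANG": "sk_SK"}))
```

Because these are ordinary environment variables, two users on the same machine can have different values without affecting each other.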
Locales and translations themselves cause almost no problems: in most distributions, you see the messages in your language once you set the correct locale and have the message files installed. Next we want to type our characters and see them, so we need fonts for displaying them. There are not enough free fonts, but most distributions include those which are available for each character encoding.
Keyboards are more difficult; there's no general way for users to configure them. Many distributions with graphical installations used to read a directory called rulesets, which is outdated and contains misleading information (for example, a "Czechoslovakian keyboard", which is complete nonsense, since Czechs and Slovaks use different keyboards and different characters). There is also the problem of the correct locale not being set, which causes the keymap to work incorrectly. These are major internationalization problems, yet they are not difficult to solve. All it takes is a bit of goodwill from the distribution creators (they can contact me if they want to discuss something; I really want Linux to be usable in my country and with my language).
There are also more difficult problems to solve. One is the interaction between the locale and switching keyboards. When I went to Norway last year to visit a friend, I wanted to switch between Slovak, English, and Norwegian keyboards. Since this is quite easy under some other systems, I thought it would be no problem with Linux. I launched xkbsel, which switches keyboards on the fly. The problem was that a keyboard doesn't work without the correct locale. If I started xterm with the locale set to Slovak, I could not type Norwegian characters. If I set it to Norwegian and started another xterm, the Slovak characters did not work. The cause was that the Slovak keyboard mapped keys to ISO8859-2 characters, while the Norwegian keyboard used the ISO8859-1 charset, and each charset contains characters which are not present in the other. It was not possible to use a particular keyboard without setting the corresponding locale. Currently, this means restarting the application with the correct locale set.
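The charset mismatch can be checked directly. The following sketch uses Python's built-in ISO8859-1 and ISO8859-2 codecs. The helper name `representable` and the choice of sample letters are mine, picked only to illustrate the point.

```python
def representable(char, charset):
    """Return True if char has a code in the given 8-bit charset."""
    try:
        char.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Two Slovak letters, two Norwegian letters, and one present in both charsets.
for char in "ľčøåá":
    print(char,
          "ISO8859-2:", representable(char, "iso8859-2"),
          "ISO8859-1:", representable(char, "iso8859-1"))

# The same byte value is a different letter in each charset.
byte = bytes([0xF8])
print(byte.decode("iso8859-1"), byte.decode("iso8859-2"))
```

An application that interprets every byte according to a single 8-bit charset therefore has no way to represent both sets of letters at once.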
The next problem arises when translating applications. Currently, we mostly use GNU gettext to do the translation. It is quite nice but, in some cases, not sufficient. In many languages, the translation of a sentence can differ according to the context. Since gettext carries no context information, it is difficult to make correct translations. The KDE team solved this issue by putting the context information into the message identifiers, so it works correctly with a few workarounds, but that's not a real solution. Plural forms are a second problem. In English, a noun has just a singular and a plural form (you have "one file" and "two files"). In Slavic languages, there is often more than one plural form, and the number decides which one is used (in Slovak: "1 súbor", "2, 3, 4 súbory", "5, ... súborov"). This is another issue to be considered when creating an application (currently, the programmer has to think about this, but the easy solution would be to create a framework).
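Both issues can be sketched briefly. The plural rule below encodes only the Slovak example given above, and treating 0 like "5 or more" is my assumption. The "context|message" key format illustrates the idea of carrying context in the identifier and is not KDE's actual convention. All names and sample translations are hypothetical.

```python
def slovak_plural_index(n):
    """Map a count to one of the three Slovak forms from the example."""
    if n == 1:
        return 0
    if 2 <= n <= 4:
        return 1
    return 2  # 5 or more; 0 is assumed to behave the same way


FILE_FORMS = ("súbor", "súbory", "súborov")


def files_message(n):
    """Build a 'N files' message with the matching Slovak noun form."""
    return f"{n} {FILE_FORMS[slovak_plural_index(n)]}"


# Context carried inside the message identifier, here as "context|message".
CONTEXT_CATALOG_SK = {
    "verb|Open": "Otvoriť",
    "state|Open": "Otvorené",
}


def translate_in_context(context, message, catalog):
    """Look up message under a context; fall back to the English text."""
    return catalog.get(f"{context}|{message}", message)


for count in (1, 3, 5):
    print(files_message(count))
print(translate_in_context("verb", "Open", CONTEXT_CATALOG_SK))
print(translate_in_context("state", "Open", CONTEXT_CATALOG_SK))
```

A framework of the kind suggested above would keep a rule like `slovak_plural_index` with each language's translation, so the programmer would not have to write it by hand.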
The KDE team is developing workarounds for most of the problems I describe here, but I also want other developers and distribution manufacturers to be aware of these problems and to try to solve them. Otherwise, Linux will stay English-centric, and that would be bad for Linux itself.
Juraj Bednar (http://www.darkie.sk/index.en.php) is a security consultant and a columnist for a Slovak computer magazine. He has been a member of the KDE i18n team since the 1.0 release. He can be reached at firstname.lastname@example.org.