Linux Key Combinations

This page explains how to type, in Linux, characters from the encodings used for the alphabets of some languages.

Terminology

character encoding

The data in a document can be thought of as a string of 0’s and 1’s: binary digits, or bits. These bits are always grouped into bytes, each of which is a set of 8 bits. A byte can be interpreted as a number from 0 to 255.

A character encoding is a re-interpretation of these numbers as characters from human language alphabets.
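As a small illustration of the three views described above (bits, numbers, characters), here is a minimal Python sketch. The sample bytes are an arbitrary choice made for this example, and ASCII is used only because it is the simplest encoding to demonstrate with.

```python
# Three views of the same data: bits, numbers, and characters.
data = b"Key"  # arbitrary sample bytes chosen for this sketch

# View 1: each byte written out as 8 bits.
bits = " ".join(format(byte, "08b") for byte in data)

# View 2: each byte as a number from 0 to 255.
numbers = list(data)

# View 3: the numbers re-interpreted as characters by an encoding.
text = data.decode("ascii")

print(bits)
print(numbers)
print(text)
```

The bytes never change between the three views; only the interpretation does.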

For instance, the old ASCII encoding interprets numbers in the range 32-126 as characters that would be found on an American typewriter (32 being the space). (The remaining numbers, 0-31 and 127, are used as device control characters: carriage return, tab, newline, delete, etc.)

ASCII uses only the first half of the possible numbers in a byte, so it is called a “7-bit” encoding.
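The split between printable and control codes can be checked with a few lines of Python. This is only a sketch: it counts the space (code 32) as printable, and the variable names are made up for the example.

```python
# Partition the 128 ASCII codes into printable characters and control codes.
printable = [n for n in range(128) if 32 <= n <= 126]  # space through tilde
control = [n for n in range(128) if n < 32 or n == 127]  # includes tab, newline, delete

# Every ASCII code fits in 7 bits, which is why the top bit of the byte is unused.
fits_in_7_bits = all(n.bit_length() <= 7 for n in range(128))

print(len(printable), "printable codes,", len(control), "control codes")
print("first printable characters:", "".join(chr(n) for n in printable[:16]))
print("all codes fit in 7 bits:", fits_in_7_bits)
```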

ASCII only barely suffices for American English, however. Many other encodings have been developed to encode other languages.

The first 128 characters (numbers 0-127) of most (but not all) encodings are ASCII. This is a good thing because it means that the US English part of any text can be understood (almost) independently of the encoding.
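One way to see this compatibility is to encode the same plain English text with several encodings and compare the bytes. The sketch below uses a handful of codecs that ship with Python; the selection is an assumption made for illustration, not a complete survey. UTF-16 is included as an example of an encoding that is not ASCII-compatible.

```python
# Encode plain English text with several encodings and compare the bytes.
text = "Linux key combinations"

# An arbitrary selection of ASCII-compatible codecs from Python's standard library.
compatible = ["ascii", "iso8859-1", "iso8859-15", "cp1252", "koi8-r", "mac_roman", "utf-8"]
encoded = {name: text.encode(name) for name in compatible}

# If all encodings agree on the ASCII range, there is only one distinct byte string.
all_same = len(set(encoded.values())) == 1

# UTF-16 uses two bytes even for ASCII characters, so its bytes differ.
utf16 = text.encode("utf-16-le")

print("identical bytes in all compatible encodings:", all_same)
print("UTF-16 differs:", utf16 != encoded["ascii"])
```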

The character encoding to end all character encodings is Unicode, whose goal is to encode the writing systems of all the world’s languages (as well as many other symbolic systems). To do this, it is insufficient to associate the 256 possible byte values with characters, so Unicode uses multiple bytes to encode characters. It is therefore called a multi-byte encoding. UTF-8 and UTF-16 are encoding forms of Unicode, and ISO-10646 is the corresponding international standard.
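To make "multi-byte" concrete, the following sketch counts how many bytes a few sample characters need in UTF-8 and UTF-16. The sample characters are arbitrary choices for this example.

```python
# Count the bytes needed per character in two Unicode encoding forms.
samples = ["A", "é", "€", "中"]  # arbitrary characters chosen for this sketch

rows = []
for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, without a byte order mark
    rows.append((ch, ord(ch), len(utf8), len(utf16)))

for ch, code_point, n8, n16 in rows:
    print(f"{ch}  U+{code_point:04X}  UTF-8: {n8} byte(s)  UTF-16: {n16} byte(s)")
```

UTF-8 keeps ASCII characters at one byte each and spends more bytes only on the others, while UTF-16 uses at least two bytes for every character.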

Computer systems have mostly completed the move to Unicode. There are many weak spots, and older systems often don't support it at all.

Before Unicode, dozens of other encoding systems were developed. Most of them were 8-bit encodings, specialized for a particular language, or for some small group of languages. These include the international standard ISO-8859 series, which enables one to type in both English and some other language or languages. Then there were series of encodings meant for a particular computer architecture, such as the PC CodePage (CP) series for IBM PC compatibles and Microsoft Windows, and the Macintosh encodings.
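The practical consequence of having many 8-bit encodings is that the same byte means different things depending on which one is assumed. Here is a minimal sketch using codecs from Python's standard library; the byte value 0xA4 and the list of codecs are arbitrary choices for the example.

```python
# Decode one and the same byte under several 8-bit encodings.
raw = bytes([0xA4])  # a single byte above the ASCII range

codecs_to_try = ["iso8859-1", "iso8859-15", "cp437", "cp1252", "mac_roman"]
meanings = {name: raw.decode(name) for name in codecs_to_try}

for name, ch in meanings.items():
    print(f"{name:12} 0xA4 -> {ch}")
```

A document that does not say which encoding it uses can therefore be displayed with the wrong characters wherever it leaves the ASCII range.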

An 8-bit encoding is insufficient for ideogrammatic writing systems such as Chinese, Japanese, and Korean. So multi-byte encoding systems were developed for these: Big-5, JIS, and KSC.
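The sketch below encodes one character from each of these writing systems. It assumes Python's `big5`, `shift_jis`, and `euc_kr` codecs as stand-ins: the latter two are encodings built on the JIS and KSC character sets, not the only ones that exist. The sample characters are arbitrary.

```python
# Encode one sample character with a legacy multi-byte encoding for its language.
samples = {
    "big5": "中",       # Chinese (traditional)
    "shift_jis": "日",  # Japanese, an encoding based on the JIS character set
    "euc_kr": "한",     # Korean, an encoding based on the KSC character set
}

lengths = {}
for codec, ch in samples.items():
    raw = ch.encode(codec)
    lengths[codec] = len(raw)
    print(f"{codec:10} {ch} -> {raw.hex()} ({len(raw)} bytes)")

# An 8-bit encoding such as ISO-8859-1 has no slot for these characters.
try:
    "中".encode("iso8859-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False
print("fits in ISO-8859-1:", fits_in_latin1)
```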

key code

A key code identifies a particular key on the computer keyboard. Unfortunately, different manufacturers numbered their keyboards in incompatible ways. However, with knowledge of your keyboard brand and model, X-windows can sort this out.

input method

For English, we have to use the shift key to obtain upper-case letters. In other languages, it is necessary to use other combinations of keys to obtain accent marks. In many languages, complex combinations of keys may be required to obtain a given character. For a given writing system, an input method is the computer algorithm which obtains a character from a combination of key strokes. Notice that some languages may have several input methods.

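The idea of an input method as an algorithm can be sketched as a lookup table from key combinations to characters. The code below is a toy model, not how X or KDE actually implement it: the `META` token, the `COMPOSE` table, and the `type_keys` function are all hypothetical names invented for this example, and the table holds only a few two-key combinations.

```python
# A toy input method: a meta key followed by two keys produces one character.
META = "<meta>"  # hypothetical token standing for the meta key

# A tiny, hypothetical combination table.
COMPOSE = {
    ("'", "e"): "é",
    ("`", "a"): "à",
    ("~", "n"): "ñ",
    ("s", "s"): "ß",
    ("<", "<"): "«",
}

def type_keys(keystrokes):
    """Turn a sequence of key strokes into text, resolving meta combinations."""
    out = []
    keys = iter(keystrokes)
    for key in keys:
        if key == META:
            # Consume the next two keys and look the pair up in the table.
            pair = (next(keys, ""), next(keys, ""))
            out.append(COMPOSE.get(pair, ""))  # unknown combinations produce nothing
        else:
            out.append(key)
    return "".join(out)

print(type_keys(["c", "a", "f", META, "'", "e"]))
print(type_keys(["s", "e", META, "~", "n", "o", "r"]))
```

A real input method for a language with a large character set is far more elaborate, but the principle of mapping key sequences to characters is the same.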
keyboard layout

There is more to this. Probably something in X configuration turns on the sticky keys.

KDE

In KDE 3, one first must configure the keyboard with an appropriate layout. This is done with the Keyboard Layout module of the KDE Control Center. Under the Layout tab, click “Enable keyboard layouts”. This places a “KDE Keyboard Tool” in the KDE System Tray. To type primarily in U.S. English, you will need the layout called “U.S. English w/ ISO9995-3”. The layout “U.S. English with deadkeys” also works, but I find it annoying. To type in many other languages, click on the appropriate items. This will put that language on the list for the Keyboard Tool. The “Switching Policy” determines where the KDE Keyboard Tool has its effect: globally, only for the current application, or only for the current window.

Applications

Setting the keyboard layout only makes it possible for X-windows to deliver special symbols. It does not mean that a given application can accept them.

Some older applications such as Kterm have built-in input methods.

Applications linked with the KDE 3 or Gnome 2 libraries should be able to accept general Unicode typing. That is, you should be able to type any language, given you have the fonts and know how to work the input method for that language.

Recent versions of the text editors KWrite and Gedit are both Unicode enabled (but in different ways: Gedit assumes everything is Unicode; KWrite will open and save files in a variety of encodings). The mail client Balsa is also Unicode enabled.

There is a project under way to make Unicode xterms, but most xterm clones are still very much 8-bit devices. So long as you are willing to type only in certain combinations of languages with alphabetic writing systems, this is not a big problem. You have to set the encoding to the one you want, then find a font that supports the encoding.

Typing with ISO9995-3

See man iso-8859-1 and man iso-8859-15 for more info about these encodings.

Especially note: Latin-9 (ISO 8859-15) is a modification of Latin-1 (ISO 8859-1).
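Exactly where Latin-9 departs from Latin-1 can be listed by decoding every byte value under both encodings and keeping the ones that disagree. This sketch relies on Python's `iso8859-1` and `iso8859-15` codecs.

```python
# List every byte value that Latin-1 and Latin-9 interpret differently.
differences = []
for n in range(256):
    raw = bytes([n])
    latin1 = raw.decode("iso8859-1")
    latin9 = raw.decode("iso8859-15")
    if latin1 != latin9:
        differences.append((n, latin1, latin9))

for n, latin1, latin9 in differences:
    print(f"0x{n:02X}: {latin1} (8859-1) -> {latin9} (8859-15)")
```

The differing positions are where the 8859-15 characters noted in the list of key combinations come from.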

On my keyboard, the right “Windows” key is the meta key. To produce a character, hold down the meta key together with the key in the meta column below, then type the key in the combo column to get the result.

Time is of the essence between pressing the meta key and the first key of the combination.

When shift is required with meta, hold it down before holding down meta.

Some other useful commands in this connexion: dumpkeys, showkey.

This list is not exhaustive of key combinations that produce 8859-1 characters. I’ve only listed the ones that seemed most natural to me. On the other hand, dumpkeys lists many characters that can’t be made by any combination listed. In fact, most of the combinations dumpkeys lists don’t work.

meta  combo         result        description

Punctuation
<     <             «             chevron or guillemet
>     >             »             chevron or guillemet
?     ?             ¿             inverted question mark
!     !             ¡             inverted exclamation point
-     ^             ¯             overbar or macron
0     ^             °             degree
-     -             ­             soft hyphen
                                  non-breaking space

Superscripts
^     1             ¹
^     2             ²
^     3             ³

Math
x     x             ×             multiplication
-     :             ÷             division
.     .             ·             middle dot
-     +             ±             plus or minus
-     ,             ¬             negation

Editing
0     s             §             section
P     !             ¶             paragraph or pilcrow

Foreign
a     _             ª             feminine ordinal
o     _             º             masculine ordinal
`     aAeEiIoOuU    àÀèÈìÌòÒùÙ    grave accents
'     aeiouyAEIOU   áéíóúýÁÉÍÓÚ   acute accents
^     aAeEiIoOuU    âÂêÊîÎôÔûÛ    circumflex
"     aeiouyAEIOU   äëïöüÿÄËÏÖÜ   umlaut or dieresis
,     ,cC           žçÇ           z-caron; cedilla
~     nNaAoO        ñÑãÃõÕ        tilde
s     s             ß             sharp s
t     h             þ             small thorn
T     H             Þ             capital thorn
-     dD            ðÐ            eth
/     oO            øØ            O with stroke
/     u             µ             micro or mu
a     a             å             a with ring
A     A             Å             capital A with ring
a     e             æ             ligature ae
A     E             Æ             capital ligature AE
^     !             Š
'     '             Ž             acute accent (8859-1), capital Z with caron (8859-15)
1     2             ½             fraction 1/2
o     e             œ             ligature oe
1     4             ¼             fraction 1/4 (8859-1)
O     E             Œ             ligature OE (8859-15)
3     4             ¾             fraction 3/4 (8859-1)
"     Y             Ÿ             capital Y with dieresis (8859-15)

Business
L     =             £             pounds
Y     =             ¥             yen
c     /             ¢             cents
=     C             ¤             currency (8859-1), Euro (8859-15)
o     c             ©             copyright (or 0 c)
o     r             ®             registered
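Since some results in the list exist only in 8859-1 and others only in 8859-15, it can help to check which encoding a given character belongs to. The helper below is a hypothetical sketch (the name `encodings_for` is invented here) built on Python's `iso8859-1` and `iso8859-15` codecs.

```python
# Report which of the two Latin encodings can represent a character.
def encodings_for(ch):
    """Return the names of the encodings (of the two checked) that contain ch."""
    found = []
    for name in ("iso8859-1", "iso8859-15"):
        try:
            ch.encode(name)
            found.append(name)
        except UnicodeEncodeError:
            pass  # this encoding has no byte for the character
    return found

for ch in ["é", "½", "œ", "€", "¤"]:
    print(ch, encodings_for(ch))
```

A character reported for only one encoding will not display correctly in a terminal or font set up for the other.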