Single-byte codepages (which are used in Western European and American countries) assign the codepoints from 0 to 255, while multi-byte codepages that are primarily used in Asian countries can assign codepoints above that to support the many characters that are used in Asian languages.
Historically, practically all codepages used on PCs are based on the American Standard Code for Information Interchange (ASCII). This standard defines only the codepoints from 0x00 to 0x7F (0--127), of which 0x20 to 0x7E (32--126) are printable characters. As a result, those codepoints are the same in all of these codepages, which means that English text without any special characters will display correctly under any of them. For example, 0x41 (65) is always a capital letter A, and 0x20 (32) is always a space.
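The shared ASCII range can be checked with a few lines of Python. This is only a sketch: the four codepages listed are an arbitrary selection of codecs that ship with Python, and the helper name decodes_identically is made up for this illustration.

```python
# Codecs that ship with Python; the selection is illustrative only.
CODEPAGES = ["cp437", "cp850", "latin-1", "cp1252"]

def decodes_identically(byte_value: int) -> bool:
    """Return True if every codepage above maps byte_value to the same character."""
    raw = bytes([byte_value])
    return len({raw.decode(cp) for cp in CODEPAGES}) == 1

# The printable ASCII range 0x20..0x7E is shared by all of them.
ascii_shared = all(decodes_identically(b) for b in range(0x20, 0x7F))

# A byte above 127 is not: 0x94 means different things in different codepages.
high_shared = decodes_identically(0x94)

print(ascii_shared, high_shared)  # True False
```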
However, the assignments for codepoints above 127 vary greatly. This is why some characters are garbled when text is displayed with the wrong codepage. For example, codepage 850 assigns the German o-umlaut character to 0x94 (148), but codepage 1004 uses 0xF6 (246) instead. Things become even more complicated with Asian languages, which have too many characters to fit in a single byte.
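The o-umlaut example can be reproduced roughly as follows. Python has no codec for codepage 1004, so this sketch uses Latin-1 as a stand-in for the 0xF6 assignment and codepage 1252 as the "wrong" codepage; both substitutions are assumptions made for the illustration.

```python
text = "ö"  # German o-umlaut

as_cp850 = text.encode("cp850")      # one byte: 0x94
as_latin1 = text.encode("latin-1")   # one byte: 0xF6 (stand-in for codepage 1004)

# Reading the codepage 850 byte under a different codepage loses the umlaut.
garbled = as_cp850.decode("cp1252")

print(as_cp850, as_latin1, garbled == text)  # b'\x94' b'\xf6' False
```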
To overcome this historical mess, which has led to so much confusion and so many garbled e-mails, the Unicode standard was created. It no longer relies on byte values, but assigns codepoints well beyond 255, into the tens of thousands. As of 2002, Unicode 3.1 defines 94,140 codepoints for almost any language and country that you can think of (see www.unicode.org for details). As a result, a Unicode-enabled application no longer needs to care about codepages. To display a glyph, it only needs the Unicode codepoint.
Unlike codepages, Unicode codepoints are not limited to what fits in a single byte (the codespace extends up to 0x10FFFF), so they need to be encoded before they can be stored as bytes. WarpIN uses UTF-8 internally, which defines how to encode each codepoint as a sequence of one or more single bytes. Since UTF-8 encodes the ASCII characters unchanged as single bytes, this has the advantage that, from a programmer's point of view, much of the traditional character processing will still work.
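To make the encoding step concrete, here is a minimal sketch of how UTF-8 packs one codepoint into bytes. The function utf8_encode is a hypothetical helper written for this illustration and is not taken from the WarpIN sources; it assumes the current Unicode upper limit of 0x10FFFF and, for brevity, does not reject the surrogate range.

```python
def utf8_encode(codepoint: int) -> bytes:
    """Encode one Unicode codepoint as UTF-8 (illustrative helper only)."""
    if codepoint < 0x80:            # up to 7 bits: plain ASCII, one byte
        return bytes([codepoint])
    if codepoint < 0x800:           # up to 11 bits: two bytes
        return bytes([0xC0 | (codepoint >> 6),
                      0x80 | (codepoint & 0x3F)])
    if codepoint < 0x10000:         # up to 16 bits: three bytes
        return bytes([0xE0 | (codepoint >> 12),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    if codepoint <= 0x10FFFF:       # up to 21 bits: four bytes
        return bytes([0xF0 | (codepoint >> 18),
                      0x80 | ((codepoint >> 12) & 0x3F),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    raise ValueError("codepoint out of range")

print(utf8_encode(0x41))  # b'A'
print(utf8_encode(0xF6))  # b'\xc3\xb6'
```

ASCII characters come out as the same single byte, which is why byte-oriented code that only looks for ASCII delimiters keeps working on UTF-8 strings.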
Under OS/2, the default codepage is set with the CODEPAGE statement in the CONFIG.SYS file.
In the U.S. and Western European countries, commonly used codepages are:
Note that the Unicode codepoints from 0 to 0xFF (255) are the same as under Latin-1.
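The Latin-1 correspondence can be verified with Python's built-in latin-1 codec. This is a small sketch, and the variable name is chosen only for this illustration.

```python
# Decode every possible byte value as Latin-1 and compare the result
# with the Unicode character that has the same codepoint.
latin1_matches_unicode = all(
    bytes([value]).decode("latin-1") == chr(value)
    for value in range(0x100)
)

print(latin1_matches_unicode)  # True
```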
COUNTRY command in the OS/2 Command Reference.