A-code and UTF8

As of version 12.83, A-code is sufficiently UTF8 compliant to handle games with messages, vocabulary and entity names in languages other than English, including ones which use non-ASCII characters, provided that (a) words are separated by ASCII blanks (decimal 32, octal 40) and (b) text is parsed left to right. All you need is a UTF8-compliant text editor. While this degree of UTF8 compliance has only been possible since A-code version 12.83, it is available in games using A-code styles from 10 upwards.
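The requirement for ASCII blanks as word separators fits UTF8 well: every byte of a multi-byte UTF8 sequence has its high bit set, so the ASCII space byte (0x20) can never occur inside an encoded non-ASCII character. The Python sketch below illustrates that general property only. It is not the A-code kernel's parser, and the function name split_words is hypothetical.

```python
def split_words(raw: bytes) -> list[str]:
    """Split UTF-8 encoded input on ASCII space bytes and decode each word.

    Splitting at the byte level is safe because 0x20 never appears inside
    a multi-byte UTF-8 sequence (all such bytes are >= 0x80).
    """
    # Empty chunks come from repeated spaces and are dropped.
    return [chunk.decode("utf-8") for chunk in raw.split(b" ") if chunk]


# A Czech command containing non-ASCII letters, encoded as UTF-8.
command = "vezmi klíč a jdi na západ".encode("utf-8")
words = split_words(command)
```

Each element of words is a complete word, with accented letters intact, even though the split was done on raw bytes.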

UTF8 compliance means that you can have texts and player vocabulary, including object and place names, in any character set that can be represented in the UTF8 encoding of Unicode. To activate this feature of the current A-code engine, add the UTF8 major directive to the game source header.

There is, of course, a residual difficulty. The A-code kernel has to be able to identify words being used to structure complex commands: AND being used instead of a comma, and THEN being used instead of a semicolon. Furthermore, if the verb AGAIN is defined by the game, it is intercepted by the kernel and taken to mean a request to repeat the previous command (see the section on automatic entities and flags in the A-code language documentation). There are a few other special words that have to be known to the kernel. The current complete list is as follows:

The obvious solution when writing a game in a language other than English is to declare the appropriate synonyms for such special words. (AND and THEN are defined automatically, but can also be defined explicitly as vocabulary words.) E.g. in Czech the equivalent of AND is A, and of THEN is PAK (or POTOM). Thus

WORD AND, A
WORD THEN, PAK, POTOM

enables these words to be used in place of AND and THEN.

There is, of course, an obvious snag to this simple solution. It leaves the English version of such special words in the player vocabulary, which may be very confusing to players (e.g. because of the typo correction mechanism). Or worse, those words might mean something else in some other language. To avoid such problems, one can specify exclusion of English versions from the player dictionary in the standard manner:

WORD -AND, A
WORD -THEN, PAK, POTOM

A bit of magic happens when any synonyms are given to the excluded word: that word can then be defined in its own right, with another meaning altogether. Thus it would be perfectly legal to add e.g. WORD THEN to the above example. Any code references to THEN would then refer to that additional definition. This avoids any potential clash between a reserved English vocabulary word and some word in another language.

If no synonyms are given, the excluded word can still be used by game code, if required.
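The effect of the leading minus sign on the player dictionary can be pictured with a toy model in Python. This is an illustrative assumption about the visible behaviour described above, not the way the A-code translator or kernel is implemented. The function build_player_vocabulary and the list-based declaration format are hypothetical, and the sketch does not model redefining an excluded word in its own right.

```python
def build_player_vocabulary(declarations):
    """Map each word a player may type to the special word it stands for.

    Each declaration is a list: the primary word first, then its synonyms.
    A leading '-' on the primary word keeps it out of the player
    vocabulary while its synonyms still map to it.
    """
    vocabulary = {}
    for primary, *synonyms in declarations:
        excluded = primary.startswith("-")
        name = primary.lstrip("-")
        if not excluded:
            vocabulary[name] = name
        for synonym in synonyms:
            vocabulary[synonym] = name
    return vocabulary


# Mirrors "WORD AND, A" and "WORD THEN, PAK, POTOM".
plain = build_player_vocabulary([["AND", "A"], ["THEN", "PAK", "POTOM"]])

# Mirrors "WORD -AND, A" and "WORD -THEN, PAK, POTOM".
excluded = build_player_vocabulary([["-AND", "A"], ["-THEN", "PAK", "POTOM"]])
```

In the first dictionary a player can type either THEN or PAK. In the second, only the Czech words remain available to the player, while both still resolve to the same special word.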


Mike Arnautov (27 March 2023)