Feedback to 'Parlez-Vous Francais?' - Murphy's Law, Feb 2001
This column led to considerable feedback - I knew I was not the
only one who had battled with this area. Some of our readers' solutions
are described below.
James Pettinato gives a detailed description of his method
of handling char *'s - mostly as a response to Nigel Jones' article.
He also describes an interesting feature which allows the sales
reps in each country to update the strings on the products with
no interaction with the original designers - it won't fit every
application, but sounds like it could save a lot of trouble when
it works.
Gentlemen,
I'd like to extend my appreciation for the articles in the Feb
2001 issue of Embedded Systems Programming (Vol 14, #2) related
to the support of multiple languages in embedded systems. This
is a concern that we have wrestled with here for some time, as
many of our products are marketed worldwide. Please excuse the
length of this email; obviously this is a topic that I've been
interested in for some time!
I was pleased to see our group had reached similar solutions
to many of the problems raised by Mr. Jones related to the issues
of translations, such as the use of enums to match the strings
in the array, etc. However, we took an alternate approach to the
problem of run time language swapping. In our case, we were working
from scratch on a new product and had ROM and also RAM to spare
(for a change!). Our approach used the same const char * const []
construct in ROM as described in the article for the default English
language pointers, but added an array of pointers in RAM which
were initialized to point to the English-language defaults:
enum {
string1,
string2,
/* ... */
LAST_STRING
};
static const char * const default_strings[LAST_STRING+1] =
{
"String 1",
"String 2",
/* ... */
"Last String"
};
static char *stringTable[LAST_STRING+1];
// initialize function called at powerup
// can also be called to reinstate default strings
// via diagnostic menu selection (for servicing)
void strTableInit(void)
{
int i;
for (i=0; i<=LAST_STRING; i++)
stringTable[i] = (char *) default_strings[i];
ASSERT (strcmp(stringTable[LAST_STRING], "Last String") == 0);
}
// GetString()... exported getter function
// (Use if desired to encapsulate stringTable array,
// alternately stringTable could be made global)
char *GetString(int index)
{
ASSERT (index < LAST_STRING);
return stringTable[index];
}
This approach adds one more benefit... it allowed us to implement
'run time' translation updates and additions. In fact, we give
our distributors and customers the ability to change any or all
of the strings in the table using a Windows-based companion application.
This program provides for the actual editing of the string table
using any of the character sets displayable on the embedded system,
and provides a mechanism for transferring a new language file
to the device via serial communications. The translation is then
stored in flash memory. The stringTable array is re-initialized
so that any translated (non-NULL) entries in the downloaded table
are used in place of the default string. With this approach, a
subset or all of the table can be translated. Very handy for us...
'do it yourself' translations!
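A minimal sketch of how such a re-initialization pass might look. The names `downloaded_strings` and `strTableApplyTranslation`, the tiny three-entry table, and the sample translation are my assumptions for illustration, not Mr. Pettinato's actual code; the idea is simply that non-NULL downloaded entries override the ROM defaults:

```c
#include <stddef.h>

#define NUM_STRINGS 3  /* tiny table, just for illustration */

/* ROM defaults, as in the listing above */
static const char * const default_strings[NUM_STRINGS] =
{
    "Start", "Stop", "Error"
};

/* RAM pointer table actually consulted by the display code */
static char *stringTable[NUM_STRINGS];

/* hypothetical table read back from flash after a download;
   NULL means "no translation supplied, keep the English default" */
static char *downloaded_strings[NUM_STRINGS] =
{
    NULL, (char *) "Arret", NULL
};

/* Re-initialize the table: translated (non-NULL) entries replace
   the defaults, the rest fall back to the ROM strings. */
static void strTableApplyTranslation(void)
{
    int i;
    for (i = 0; i < NUM_STRINGS; i++)
    {
        if (downloaded_strings[i] != NULL)
            stringTable[i] = downloaded_strings[i];
        else
            stringTable[i] = (char *) default_strings[i];
    }
}
```

Because untranslated slots simply fall through to ROM, a partial download works exactly like a complete one.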
This brings me to the area of most interest to me currently...
handling languages represented best by character sets other than
Latin-1 in embedded systems. I think that we have implemented
a rather novel approach to this problem as well. I will attempt
to briefly characterize the design. We first built a display driver
that allows text drawing in several 'fonts'. The fonts are actually
each a bitmap cache of character glyphs based on a rendering of
a font at a specific point size and weight. A tool I wrote allows
any font displayable by Windows (TrueType or raster) to be used
to produce an embedded font (the tool outputs 'C' source). For
the first release of our latest project, we included a small non-proportional
handcrafted font for the menus and a larger bold font for data
displays, both using the Latin-1 code page. For demonstration
purposes, we also included the equivalent typefaces using code
page PC-866 (the pre-Windows Russian Cyrillic) since we had already
done that font for another project and had the handcrafted version
done. The distributed companion application then provides the
ability for any language that can be represented in Latin-1 or
PC-866 Cyrillic to be represented on our device. Note that the
download does not have to consist of a complete set of strings.
If someone wants to change one string on one menu, they can. Since
our market (custody transfer of petroleum) is often tightly regulated
by local governing bodies, terminology requirements can be stringent.
This flexibility allows local agencies' requirements to be satisfied
without a custom build.
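One plausible shape for such a tool-generated font is sketched below. The structure layout, field names, and the two-glyph demo data are entirely my assumptions about what a bitmap-glyph-cache generator might emit, not the actual format described in the letter; the point is that a font is just a table of pre-rendered glyphs indexed by an 8-bit code-page value, so Latin-1 and PC-866 tables can share one display engine:

```c
#include <stdint.h>
#include <stddef.h>

/* one cached glyph: a small monochrome bitmap plus metrics */
typedef struct {
    uint8_t width;          /* advance width in pixels */
    uint8_t height;         /* rows in the bitmap */
    const uint8_t *bitmap;  /* one byte per row, MSB = leftmost pixel */
} Glyph;

/* a font: a run of glyphs covering part of an 8-bit code page */
typedef struct {
    const char *name;       /* e.g. "menu-latin1" or "menu-pc866" */
    uint8_t first_code;     /* first code-page value covered */
    uint8_t glyph_count;    /* number of consecutive glyphs */
    const Glyph *glyphs;    /* indexed by (code - first_code) */
} Font;

/* look up the glyph for a code-page byte, or NULL if absent */
static const Glyph *fontGetGlyph(const Font *f, uint8_t code)
{
    if (code < f->first_code ||
        (uint8_t)(code - f->first_code) >= f->glyph_count)
        return NULL;
    return &f->glyphs[code - f->first_code];
}

/* a hypothetical two-glyph demo font ('A' and 'B' only) */
static const uint8_t bmp_A[] = { 0x20, 0x50, 0x88, 0xF8, 0x88 };
static const Glyph demo_glyphs[2] = {
    { 5, 5, bmp_A }, { 5, 5, bmp_A }
};
static const Font demo_font = { "demo", 'A', 2, demo_glyphs };
```

Swapping languages then means no more than pointing the text-drawing code at a different `Font` table.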
Mr. Murphy described to some extent the difficulty in arranging
translation of source text via external resources. Many of the
described difficulties are avoided (as suggested in the article)
by allowing the translator to see the results of his work. We
had also implemented a PC-based emulation of our display, as he
suggests. Our German distributor simply took the emulator and
utility, translated sections of the string table, then downloaded
and verified the appearance on the display, with no intervention
by us. The coordination effort is quite simplified, as the translator
can actually see for himself how the new translations appear in
context. Even if a translator unfamiliar with the product or application
is used, fewer iterations have been required since they still
can 'see' how the translation looks as they progress.
The portion of this scheme that I think really makes this approach
powerful is the Windows companion application that allows for
run time translations. (I may be biased, as I wrote it). The utility
uses the same bitmap cache 'fonts' as the actual display so you
see what you get. Since ISO standard code pages are used, selecting
the proper keyboard mapping using Windows' built-in language support
results in the proper characters being inserted by just typing
normally. Additional code pages can be added to the system as
required, although this still requires a firmware change at this
time. It would be possible for even the bitmap cache 'fonts' to
be uploaded at run time if that was desired. (The original project
had a separate display processor board to allow for remote display
placement via EIA232, and so transferring the font images would
have been a two-step process, which we decided was not worth the
effort on that particular project.) The companion application
can read out previously downloaded translations, and also does
a nice job of importing translations from a previous revision
and keeping everything in the right location, so it is easy for
users who upgrade their firmware to keep their custom translations
accurate and up-to-date. Strings that are inserted into a new
revision need to be translated during the upgrade process by someone,
of course.
We are now in the process of adding additional code pages (Windows-1251
Cyrillic, Windows-1250 Eastern European, and Windows-1253 Greek)
to the third generation of our flagship product, since our European
distributors are clamoring for language support in those markets.
It will be simply a matter of adding some fonts to the display
firmware and updating the utility to be aware of them. The distributor
will happily handle the translation details if it will sell units
for him. We feel this is a vast improvement over the prior generations,
where custom versions were released for each language, and each
had to be maintained.
Some drawbacks to this approach include the fact that the companion
application must match the revision of the embedded system's firmware
(to ensure that the default string tables are synced). Currently
there is no CJK or multibyte support and the display engine also
does not support right-to-left scripts or combining of glyphs,
but our marketing requirements have not forced us to address these
issues as yet. I see no reason why this same architecture would
not work with these features added in as well.
Currently I'm working on updating the font conversion utility
to a 32-bit app (it was originally done as a 16-bit Win3.x app)
and enhancing it to be able to access characters in a Unicode
font from code pages other than the default. Then the utility
will be able to (for example) create a 'Greek' Arial from the
Unicode Arial font rather than having to dig up a Greek version
of Arial produced by someone else that may or may not be mapped
to the ISO standard.
I was also pleased to see Mr. Murphy include Roman Czyborra's
excellent site as a reference... I have found it to be invaluable
on numerous occasions and have referred overseas colleagues to
this site as well.
I am looking forward to Mr. Murphy's coming column on double-byte
character sets since I am sure that it will not be long before
I am asked to address these issues with the emerging Asian market
for our products.
Thanks again, keep them coming!
Jim Pettinato --
James M. Pettinato, Jr.
FMC Measurement Solutions
Smith Meter - Erie Operation
Cliff Smith gives another example of good use of the PC platform
to keep translators in check! His mail follows:
Having recently completed a project in which I was responsible
for obtaining translations, I enjoyed your article in Feb. 2001
'Embedded Systems Programming'.
I ran into all the problems you mentioned when getting translations
for our product which used a 96x48 pixel display and our own proportional
fonts. We 'solved' (if I can say that with a straight face) our
problems with the following strategy.
I wrote a PC app that had access to a database containing the
strings, field specifications, and font specifications. This program
presented the translator with the English string and the space
available (width and number of lines). As the translator entered
the translation, the program kept a running total of space used
so the translator could try different combinations to get the
best translation that would fit the available space.
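A sketch of the width bookkeeping such a tool might do for a proportional font on a 96-pixel-wide display. The `glyph_widths` table and its values are invented for illustration (the real program pulled font specifications from a database, which is not shown here):

```c
#include <stddef.h>

#define DISPLAY_WIDTH 96  /* pixels per line, per the letter */

/* assumed per-character advance widths, in pixels, for a
   proportional font covering ASCII 32..127 */
static const unsigned char glyph_widths[96] = {
    3, 2, 4, 6, 5, 7, 6, 2, 3, 3, 5, 5, 2, 4, 2, 4,
    5, 3, 5, 5, 5, 5, 5, 5, 5, 5, 2, 2, 4, 5, 4, 5,
    7, 6, 6, 6, 6, 5, 5, 6, 6, 2, 4, 6, 5, 8, 6, 6,
    6, 6, 6, 5, 6, 6, 6, 8, 6, 6, 5, 3, 4, 3, 5, 5,
    3, 5, 5, 5, 5, 5, 4, 5, 5, 2, 3, 5, 2, 8, 5, 5,
    5, 5, 4, 4, 4, 5, 5, 7, 5, 5, 4, 4, 2, 4, 5, 3
};

/* pixel width of a candidate translation in this font */
static int stringPixelWidth(const char *s)
{
    int width = 0;
    while (*s)
    {
        unsigned char c = (unsigned char) *s++;
        if (c >= 32 && c < 128)
            width += glyph_widths[c - 32];
    }
    return width;
}

/* does the candidate still fit the field? */
static int fitsField(const char *s)
{
    return stringPixelWidth(s) <= DISPLAY_WIDTH;
}
```

Recomputing `stringPixelWidth` on every keystroke is cheap enough that the translator gets live feedback while trying alternative wordings.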
Also, I made arrangements with the translation agency to have
the translators come to our site to do the translations. This
gave the translators a chance to ask questions and get clarifications
quickly. We had about 540 words and phrases to translate, and each
translation was done in one 8-hour day. We are still in the review
process, but the quality of the translations has been very satisfying.
We did have to pay a little extra but it was money well spent.
Not all agencies were willing to do this. Several agencies turned
us down because "they don't work that way". It requires a degree
of trust that the agency will supply quality people, and that
you won't subsequently attempt to cut the agency out.
After having done it this way once, we will never do translations
any other way.
Best regards,
Cliff Smith
Mobile Communications Design Center
SY Wong gives an interesting, if not very directed, response
which visits topics as diverse as programming language choice
and architectures of various bit-widths. The mail follows:
Your language articles in ESP touched a subject of much interest
to me. There were about 20,000 Chinese characters in the authoritative
Kangxi dictionary, named after the emperor who commissioned the
work. I dare say a 4000-character set can include the English,
Latin, most commonly used Chinese words/characters plus the very
useful original IBM PC character set of 1980. No more than 16
escape characters can cover the entire ISO 8859 standard.
A bit of history: there never was any reason that computer word
lengths need be powers of 2 or divisible by 8. The 8-bit byte originated
with the STRETCH supercomputer project at IBM in the '50s; its 64-bit
word length was both a power of two and divisible by 8. The 12-bit
column on IBM cards may have similarly influenced the IBM 701
and 704, with their 36-bit word size, prior to STRETCH. About that time,
I also designed a 36-bit machine as specified by the funding agency
that used Teletype I/O with 6-bit codes which may have influenced
the choice of word-length. The two address (4k words) instruction
included a 12-bit command code. I next designed a 48-bit machine
for the Navy to control displays. 48 bits was chosen because it
is divisible by a large number of sub-groups. It was sawed in half
for a drone control system with 24-bit words. CDC later made similar
48- and 24-bit machines commercially. Even earlier, the Institute
for Advanced Study computer, now in the Smithsonian museum, used
40 bits. The binary keyboard was grouped in 4 buttons for one-hand
entry of hex symbols using neons on the computer register visible
through a glass window as display.
My conclusion is that for a 4k basic character set for Chinese
or Japanese, there is no need to slavishly hang on to the 16 bits
of the ISO standard, especially for embedded application-specific
appliances such as a web-email box. You cannot make a box to sell
for less than $100 by wasting memory space, however cheap memory
is. National Semiconductor used to have a 12-bit microcontroller.
If processors cost nothing, language processing can be a series
of fixed translators or filters that eliminates the resource-wasteful
multitasking operating system.
What do you think about the 4k character subsets for Asian languages?
Trade groups can define compatible subsets by standardizing the
compatibility part. ASCII, Latin and the original IBM PC fonts
of 1980 can be the upper part of 8-bit sub-subsets of the 12-bit
subset.
Several years ago, the Electronic Design editor sent me a few
samples when their Chinese editions first started. They were excellent
in quality. I don't think I will see practical machine translators
in my lifetime. GUI alone may not be optimum for Chinese data
inputs. The system of spelling English sounds (pin-yin) of the
roughly 100 Chinese phonemes and letting the user select from a row
of similar-sounding characters is a tedious process which even
I cannot stomach. It seems a trainable voice recognition of the 100
pin-yins might be achievable with current VR techniques. A bit of
extra intelligence using context, letting the machine move the
cursor and highlight the most likely character to reduce finger
movements, might also be achievable.
What do you think about the above conjecture?
I use a small safety critical subset of Ada as my hw/sw design
language to define an "almost zero cost" core processor design.
There is a similar subset in IEEE VHDL for IC design. This Ada
subset is really a language to define languages that supports
reusable components and should be also useful in the language
processing field. That subset I use is restricted according to
the Safety Annex in the ISO standard and not C extensions or YAL
(yet another language) so frequently suggested in magazines. Unfortunately,
Ada leaves a bad taste in the software community's mouth, and subsetting
is generally frowned upon.
Can the CS community ditch the assumption that C/C++ are too
entrenched to displace, and consider better alternatives that
remedy well-known C shortcomings?
SY Wong, Tarzana CA
Hello Niall,
I just read your article in Embedded Systems Programming magazine.
I found it quite helpful.
You may already know about Win-Trans, but if not, it is worth
checking out. The feature I have used myself is the ability to
convert .rc files to/from .xls files.
I guess that's only helpful if you're using Microsoft tools
for development. But if you are, Excel supports text entry in
nearly every language you might need.
Best regards,
Rex Baldon Sr.
Software Engineer
Newport Corporation
This note from Eric Lukac-Kuruc confirms everything that my
French teacher ever said about my command of that language!
Hello, Just a few notes about your French translation examples
in Embedded Systems.
In French, "a" used as a preposition, and not as the verb "avoir"
(to have), requires an accent (à), which is the case in your example.
Moreover, the sequence "à le" is not allowed in French.
The contraction for this meaning is "au", so that the sentence
becomes "Bienvenue au gadget".
On the other hand, the translation of "Welcome to this gadget"
is not "Bienvenue au gadget" but "Bienvenue à ce gadget". "Bienvenue
au gadget" would come from "Welcome to the gadget".
French is a tricky language, with traps at every corner. I wish
you a nice day.
Best regards,
Eric Lukac-Kuruc, R&D Manager Klavis Technologies, Belgium
Hello Mr. Murphy,
Thank you for your excellent article on embedded systems translations.
We run into this issue often with our products.
As you noted, translation context is essential. We have also
considered the feasibility of PC prototypes as substitutes for
reference products, but time & schedule constraints have limited
this approach. I look forward to your future articles on this
topic. What we have done is to create templates that show typical
screens, with screen elements clearly identified (e.g. "Menu item",
"Item status", "Help text"). The translator can then understand
the basic structure of the interface, as well as our internal
vocabulary for referring to each item. These templates are not
a substitute for an actual reference product or PC simulation,
but for very little time, they do provide a little context for
the words in a spreadsheet. In the spreadsheet, the words are
identified using the vocabulary on the template.
Another issue we sometimes come across is consistency across
products. If we have a translation in one product, we can often
re-use it in another (assuming the original was
error-free). A sort of "translation history" may be a useful starting
point for some translations, but without a good system, it can
be difficult to track, and also may impose unnecessary constraints,
since there's no reason to limit a 30-character interface to the
translation used for a 20-character interface, unless the shorter
translation cannot be improved.
Thanks again for your thoughts. So often, these types of issues
are faced and solved again and again, but the information isn't
shared as freely or clearly as you have done.
Best regards,
Sabrina Yeh
Sony Electronics
San Diego,
California
Niall's reply:
You are right about the importance of being consistent across
products. This can be an awkward issue if you change translation
company, or if the translation company changes the translator from
one product to the next (or even from one version of a product
to the next).
On the PC simulations point, I have never built them just
for translations work (and I do not think the effort would be
justified for that alone), but I generally do the prototypes to
allow the user interaction to be investigated before building
the final hardware. I also use the PC prototype to develop some
of the production code. This is a topic I will return to in my
column before the year is out.