PanelSoft User Interface and Safety Consulting

User Interfaces and Usability for Embedded Systems

Feedback to "Forget Me Not!" - Murphy's Law, June 2001

Errata

The article refers to Ross Williams's "A Painless Guide to CRC Error Detection", but the citation was omitted. The guide is an on-line document, available in a number of slightly different formats at any one of:

ftp://ftp.rocksoft.com/papers/crc_v3.txt
http://www.repairfaq.org/filipg/LINK/F_crc_v3.html
http://www.acte.no/teknisk/html/crcguide.htm

Feedback

A number of readers mentioned various approaches to handling EEPROMs that can tolerate only a limited number of write cycles. I neglected this topic in my original column, since I was trying to stick to the higher-level software issues and avoid the hardware and technology-specific stuff, but it is an important issue and probably should have gotten a mention. Some of the emails follow:

Dear Niall Murphy,

I can relate to your article, Forget Me Not, in ESP June 2001. I would like to add that you cannot assume only a single byte will be corrupted when power fails during an EEPROM write. We have experienced multiple-byte EEPROM corruption in the Atmel ATmega103 microcontroller. For the revision of the ATmega103 we used, the address registers EEARH and EEARL could change randomly as power drops out, corrupting multiple random locations. Our solution for this was to store factory calibration data in triplicate and implement a voting algorithm to repair corruptions.

As an interesting side note, Atmel recommends "Avoid using [EEPROM] address 0 for storage, unless you can guarantee that you will not get a reset during EEPROM write." This is because an unexpected reset during an EEPROM write will zero the EEPROM address registers.

Karl Knauf
Datex-Ohmeda
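
The letter above does not say how the voting was implemented, so the following is only a minimal sketch of one common way to do it: a bitwise 2-of-3 majority vote over three copies of a block, writing back only the bytes that disagree. The names `vote3`, `cal_repair` and `CAL_SIZE` are hypothetical, and plain RAM arrays stand in for the three EEPROM regions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size of one calibration block, in bytes. */
#define CAL_SIZE 8u

/* Bitwise 2-of-3 majority: each output bit takes the value that at
 * least two of the three inputs agree on. */
static uint8_t vote3(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint8_t)((a & b) | (a & c) | (b & c));
}

/* Votes across three copies of a block and rewrites only the bytes that
 * differ from the majority value. Returns the number of bytes rewritten.
 * In a real system the three arrays would be separate EEPROM regions
 * accessed through the part's read/write routines (an assumption here).
 * A bit that is wrong in two of the three copies cannot be recovered. */
static size_t cal_repair(uint8_t copy0[], uint8_t copy1[], uint8_t copy2[],
                         size_t len)
{
    size_t repaired = 0;

    assert(copy0 != NULL && copy1 != NULL && copy2 != NULL);

    for (size_t i = 0; i < len; i++) {
        uint8_t good = vote3(copy0[i], copy1[i], copy2[i]);

        if (copy0[i] != good) { copy0[i] = good; repaired++; }
        if (copy1[i] != good) { copy1[i] = good; repaired++; }
        if (copy2[i] != good) { copy2[i] = good; repaired++; }
    }
    return repaired;
}
```

Rewriting only the bytes that lost the vote keeps the repair pass from spending write cycles on cells that are already correct. Placing the three copies far apart in the address space is a reasonable precaution against a single stray write damaging two of them, though the letter does not say whether that was done.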

Some boxes I have helped design have made use of non-volatile memory for storage of fault information. At first, the technology employed was NVRAM. More recently, we have used EEPROM. As you noted in your article, the time available after a power-down indication is often a factor in the design of the non-volatile storage software. These days, one can get EEPROM having write cycles measured in microseconds, and lifetimes of more than a million writes. It has not always been thus. In our early use of EEPROM, we had to cope with its limited write-cycle capability, and lengthy write times.

Since our interest is primarily in fault data, we generally write it only when a fault is detected. Sometimes, there are many of these computed in a single computation "frame". More often, the box goes for days, weeks, or longer between faults. In any event, the required 30+ year service life of the box dictates some provision be made for limiting the number of writes to a given EEPROM address when a fault is recorded. This means, for example, that storage area pointers, or flags such as your "best" flag, must be distributed over several different addresses. Likewise, operating hour information, power-up counts, and other frequently-written data are arranged to occupy several bytes.

As you suggest, each fault record has its own checksum. In fact, it has several--one for each of the fields. Some fields actually use "checksums" that permit bit-error correction, though the error correction capability has never been used as far as I know. (These Hamming codes are easier to compute than CRCs and you get the error correction capability for free.) Our "current record" identification and power-up count fields use a "traveling indicator" scheme.

Our fault memory stores information about the most recent 64 "equipment cycles." A corresponding block of 64 words is set aside for pointers to the "current equipment cycle" data. As each equipment cycle pointer is written, a "current record" indicator in the pointer word is "toggled".

Larry D. Morris
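
The "traveling indicator" is described only briefly above, so this sketch is one possible reading of it rather than the scheme actually used: each word in the ring carries a toggle bit, the value written flips on every pass through the ring, and the current record is the last word before the toggle bit changes. The bit position, the 16-bit word size, and the names `RING_SLOTS`, `TOGGLE_BIT`, `ring_find_current` and `ring_write_next` are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 64u      /* one pointer word per equipment cycle */
#define TOGGLE_BIT 0x8000u  /* assumed: top bit of each 16-bit word */

/* Returns the index of the most recently written word. Words are written
 * in order 0..63 and the toggle value flips on each new pass, so the
 * newest word is the last one whose toggle bit still matches word 0.
 * A ring in any uniform state (all erased, all zero) reports word 63,
 * which makes word 0 the next one to be written. */
static unsigned ring_find_current(const uint16_t ring[RING_SLOTS])
{
    uint16_t phase = (uint16_t)(ring[0] & TOGGLE_BIT);
    unsigned i = 1;

    while (i < RING_SLOTS && (ring[i] & TOGGLE_BIT) == phase) {
        i++;
    }
    return i - 1u;
}

/* Writes a 15-bit payload into the word after the current one, tagging
 * it with the toggle value of the pass in progress. Returns the index
 * that was written. Only one word is touched per update. */
static unsigned ring_write_next(uint16_t ring[RING_SLOTS], uint16_t payload)
{
    unsigned cur = ring_find_current(ring);
    unsigned next = (cur + 1u) % RING_SLOTS;
    uint16_t phase = (uint16_t)(ring[cur] & TOGGLE_BIT);

    if (next == 0u) {
        phase ^= TOGGLE_BIT;  /* wrapped around: start a new pass */
    }
    ring[next] = (uint16_t)((payload & ~TOGGLE_BIT) | phase);
    return next;
}
```

The appeal of this arrangement is that there is no separate "current index" cell to wear out: the position is recovered by scanning the ring at power-up, and each of the 64 words is written only once per 64 updates.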

There is a fourth consideration in planning when to store non-volatile data, and that is the lifetime write ability of the hardware. There are some EEPROM technologies that have unlimited read capabilities but a limited number of writes, such as 100,000 or 1,000,000. Writing once per minute will render the EEPROM useless in just under 2 years for a part with 1,000,000 writes, assuming round-the-clock operation. If your hardware uses such a limited write-cycle device, some checking of the expected usage and lifetime of the device and a little arithmetic is in order when designing the non-volatile scheme.

Thanks for some good columns. You have a knack for summing up all the issues. Your columns could have saved me some grief if I'd read them at the beginning of some of my designs.

Nancy Goering
Clarity Visual Systems
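
The "little arithmetic" mentioned in the letter above can be sketched as two small helpers. The function names are hypothetical, and the model is deliberately simple: one cell, a fixed write interval, continuous operation, and a rated endurance taken at face value from the data sheet.

```c
#include <assert.h>

/* Average year length in seconds, including leap years. */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 3600.0)

/* Years until a single cell reaches its rated endurance, assuming one
 * write every write_interval_s seconds, around the clock. */
static double eeprom_lifetime_years(double rated_writes,
                                    double write_interval_s)
{
    assert(rated_writes > 0.0 && write_interval_s > 0.0);
    return rated_writes * write_interval_s / SECONDS_PER_YEAR;
}

/* Shortest write interval, in seconds, that still lets a single cell
 * last for the required service life. */
static double min_write_interval_s(double rated_writes, double service_years)
{
    assert(rated_writes > 0.0 && service_years > 0.0);
    return service_years * SECONDS_PER_YEAR / rated_writes;
}
```

For a 1,000,000-write part written once per minute, `eeprom_lifetime_years(1e6, 60.0)` gives about 1.9 years, which matches the figure in the letter. Turning the question around, `min_write_interval_s(1e6, 30.0)` gives roughly 947 seconds, so a single cell in such a part could be rewritten about once every 16 minutes over a 30-year life.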

Mr. Murphy,

I just read your June column. I think you did a nice job of summarizing many of the issues and methods involved with storing data in a non-volatile manner. I have faced many of these myself.

One thing I did not see mentioned is the issue of wearing out memory by writing to it too often. Most EEPROM and Flash devices are limited to 10,000, 100,000, or 1,000,000 write cycles. This must be considered in the context of the intended life of a product to determine the maximum allowable update rate. At my company, we create products that are intended to last for 30 years, so this is a significant consideration. We've come up with a handful of techniques for dealing with data that needs to be stored more often than the write-cycle limit would allow, primarily by spreading the storage of such data over multiple memory locations.

I look forward to your next column regarding maintaining non-volatile data through firmware upgrades. This is another issue I have dealt with, but have not been completely satisfied with the methods we used. I hope I will find some new ideas in your article.

Dave Wood
Schweitzer Engineering Laboratories, Inc.
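
The letter above does not describe its techniques in detail, so the following is only an illustrative sketch of spreading one frequently updated value over several locations. It assumes the value is a counter that only ever increases (such as operating hours or a power-up count), so the newest slot is simply the valid slot holding the largest value. The names `slot_t`, `N_SLOTS`, `counter_find_latest`, `counter_read` and `counter_increment` are hypothetical, and a RAM array stands in for the EEPROM.

```c
#include <assert.h>
#include <stdint.h>

#define N_SLOTS 8u  /* assumed; each cell sees 1/N_SLOTS of the writes */

typedef struct {
    uint32_t value;  /* monotonically increasing counter */
    uint32_t check;  /* bitwise complement of value, marks the slot valid */
} slot_t;

static int slot_valid(const slot_t *s)
{
    return s->check == (uint32_t)~s->value;
}

/* Returns the index of the valid slot holding the largest value,
 * or -1 if no slot is valid (for example, a freshly erased part). */
static int counter_find_latest(const slot_t slots[N_SLOTS])
{
    int best = -1;

    for (unsigned i = 0; i < N_SLOTS; i++) {
        if (slot_valid(&slots[i]) &&
            (best < 0 || slots[i].value > slots[best].value)) {
            best = (int)i;
        }
    }
    return best;
}

/* Returns the current counter value, or 0 if nothing valid is stored. */
static uint32_t counter_read(const slot_t slots[N_SLOTS])
{
    int latest = counter_find_latest(slots);

    return (latest < 0) ? 0u : slots[latest].value;
}

/* Stores counter + 1 in the slot after the latest one, so successive
 * updates rotate through all N_SLOTS locations. Counter wrap-around at
 * 2^32 is not handled in this sketch. */
static uint32_t counter_increment(slot_t slots[N_SLOTS])
{
    int latest = counter_find_latest(slots);
    unsigned next = (latest < 0) ? 0u : ((unsigned)latest + 1u) % N_SLOTS;
    uint32_t value = counter_read(slots) + 1u;

    slots[next].value = value;
    slots[next].check = (uint32_t)~value;
    return value;
}
```

With eight slots, a part rated for 1,000,000 writes per cell can absorb about 8,000,000 updates of the counter. A useful side effect of the complement check is that a slot left half-written by a power failure is ignored, and the reader falls back to the previous slot, losing at most one increment.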
